# NEURAL AND COMPUTATIONAL MODELING OF MOVEMENT CONTROL

EDITED BY: Ning Lan, Vincent C. K. Cheung and Simon C. Gandevia PUBLISHED IN: Frontiers in Computational Neuroscience

#### *Frontiers Copyright Statement*

*© Copyright 2007-2017 Frontiers Media SA. All rights reserved. All content included on this site, such as text, graphics, logos, button icons, images, video/audio clips, downloads, data compilations and software, is the property of or is licensed to Frontiers Media SA ("Frontiers") or its licensees and/or subcontractors. The copyright in the text of individual articles is the property of their respective authors, subject to a license granted to Frontiers.*

*The compilation of articles constituting this e-book, wherever published, as well as the compilation of all other content on this site, is the exclusive property of Frontiers. For the conditions for downloading and copying of e-books from Frontiers' website, please see the Terms for Website Use. If purchasing Frontiers e-books from other websites or sources, the conditions of the website concerned apply.*

*Images and graphics not forming part of user-contributed materials may not be downloaded or copied without permission.*

*Individual articles may be downloaded and reproduced in accordance with the principles of the CC-BY licence subject to any copyright or other notices. They may not be re-sold as an e-book.*

*As author or other contributor you grant a CC-BY licence to others to reproduce your articles, including any graphics and third-party materials supplied by you, in accordance with the Conditions for Website Use and subject to any copyright notices which you include in connection with your articles and materials.*

> *All copyright, and all rights therein, are protected by national and international copyright laws.*

*The above represents a summary only. For the full conditions see the Conditions for Authors and the Conditions for Website Use.*

ISSN 1664-8714 ISBN 978-2-88945-130-2 DOI 10.3389/978-2-88945-130-2

# About Frontiers

Frontiers is more than just an open-access publisher of scholarly articles: it is a pioneering approach to the world of academia, radically improving the way scholarly research is managed. The grand vision of Frontiers is a world where all people have an equal opportunity to seek, share and generate knowledge. Frontiers provides immediate and permanent online open access to all its publications, but this alone is not enough to realize our grand goals.

# Frontiers Journal Series

The Frontiers Journal Series is a multi-tier and interdisciplinary set of open-access, online journals, promising a paradigm shift from the current review, selection and dissemination processes in academic publishing. All Frontiers journals are driven by researchers for researchers; therefore, they constitute a service to the scholarly community. At the same time, the Frontiers Journal Series operates on a revolutionary invention, the tiered publishing system, initially addressing specific communities of scholars, and gradually climbing up to broader public understanding, thus serving the interests of the lay society, too.

# Dedication to Quality

Each Frontiers article is a landmark of the highest quality, thanks to genuinely collaborative interactions between authors and review editors, who include some of the world's best academicians. Research must be certified by peers before entering a stream of knowledge that may eventually reach the public - and shape society; therefore, Frontiers only applies the most rigorous and unbiased reviews.

Frontiers revolutionizes research publishing by freely delivering the most outstanding research, evaluated with no bias from both the academic and social point of view. By applying the most advanced information technologies, Frontiers is catapulting scholarly publishing into a new generation.

# What are Frontiers Research Topics?

Frontiers Research Topics are very popular trademarks of the Frontiers Journals Series: they are collections of at least ten articles, all centered on a particular subject. With their unique mix of varied contributions from Original Research to Review Articles, Frontiers Research Topics unify the most influential researchers, the latest key findings and historical advances in a hot research area! Find out more on how to host your own Frontiers Research Topic or contribute to one as an author by contacting the Frontiers Editorial Office: researchtopics@frontiersin.org

# **NEURAL AND COMPUTATIONAL MODELING OF MOVEMENT CONTROL**

Topic Editors:

**Ning Lan,** Shanghai Jiao Tong University, China and University of Southern California, USA **Vincent C. K. Cheung,** The Chinese University of Hong Kong, China **Simon C. Gandevia,** Neuroscience Research Australia, Australia

Computational models translate functions of sensorimotor control system. Cover image by N. Lan, V. C. K. Cheung and S. C. Gandevia

In the study of sensorimotor systems, an important research goal has been to understand the way neural networks in the spinal cord and brain interact to control voluntary movement. Computational modeling has provided insight into the interaction between centrally generated commands, proprioceptive feedback signals and the biomechanical responses of the moving body. Research in this field is also driven by the need to improve and optimize rehabilitation after nervous system injury and to devise biomimetic methods of control in robotic devices.

This research topic focuses on efforts to identify and model the neuromechanical control of movement. Neural networks in the brain and spinal cord are known to generate patterned activity that mediates coordinated activation of multiple muscles in both rhythmic and discrete movements, e.g. locomotion and reaching. Commands descending from the higher centres in the CNS modulate the activity of spinal networks, which control movement on the basis of sensory feedback of various types, including that from proprioceptive afferents. Computational models will continue to shed light on the central strategies and mechanisms of sensorimotor control and learning.

This research topic demonstrated that computational modeling is playing a more and more prominent role in the studies of postural and movement control. With increasing ability to gather data from all levels of the neuromechanical sensorimotor systems, there is a compelling need for novel, creative modeling of new and existing data sets, because computational modeling offers the more systematic means of extracting knowledge and insights about the neural computations of sensorimotor systems from these data. While models should be based on experimental data and validated with experimental evidence, they should also be flexible enough to provide a conceptual framework for unifying diverse data sets, to generate new insights into neural mechanisms, to integrate new data sets into the general framework, to validate or refute hypotheses and to suggest new testable hypotheses for future experimental investigation. It is thus expected that neural and computational modeling of the sensorimotor system should create new opportunities for experimentalists and modelers to collaborate in a joint endeavor to advance our understanding of the neural mechanisms for postural and movement control.

The editors would like to thank Professor Arthur Prochazka, who helped initially to set up this research topic, and all authors who contributed their articles to this research topic. Our appreciation also goes to the reviewers, who volunteered their time and effort to help achieve the goal of this research topic. We would also like to thank the staff members of the editorial office of Frontiers in Computational Neuroscience for their expertise in manuscript handling and publishing, and in bringing this ebook to the readers. The support from the Editor-in-Chief, Dr. Misha Tsodyks and Dr. Si Wu, was crucial for this research topic to come to a successful conclusion. We are indebted to Dr. Si Li and Ms. Ting Xu, whose assistance was important for this ebook to become a reality. Finally, this work is supported in part by grants to Dr. Ning Lan from the Ministry of Science and Technology of China (2011CB013304), the Natural Science Foundation of China (No. 81271684, No. 61361160415, No. 81630050), and the Interdisciplinary Research Grant cross Engineering and Medicine by Shanghai Jiao Tong University (YG20148D09). Dr. Vincent Cheung is supported by startup funds from the Faculty of Medicine of The Chinese University of Hong Kong.

#### Guest Associate Editors

Ning Lan, Vincent Cheung, and Simon Gandevia

**Citation:** Lan, N., Cheung, V. C. K., Gandevia, S. C., eds. (2017). Neural and Computational Modeling of Movement Control. Lausanne: Frontiers Media. doi: 10.3389/978-2-88945-130-2

# Table of Contents


Reza Sharif Razavian, Naser Mehrabi and John McPhee

*Coordinated alpha and gamma control of muscles and spindles in movement and posture*

Si Li, Cheng Zhuang, Manzhao Hao, Xin He, Juan C. Marquez, Chuanxin M. Niu and Ning Lan

*A Computational Model for Aperture Control in Reach-to-Grasp Movement Based on Predictive Variability*

Naohiro Takemura, Takao Fukui and Toshio Inui

*An Assessment of Six Muscle Spindle Models for Predicting Sensory Information during Human Wrist Movements*

Puja Malik, Nuha Jabakhanji and Kelvin E. Jones

# Editorial: Neural and Computational Modeling of Movement Control

Ning Lan<sup>1,2</sup>\*, Vincent C. K. Cheung<sup>3</sup> and Simon C. Gandevia<sup>4</sup>

*<sup>1</sup> School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China, <sup>2</sup> Division of Biokinesiology and Physical Therapy, University of Southern California, Los Angeles, CA, USA, <sup>3</sup> School of Biomedical Sciences, The Chinese University of Hong Kong, Hong Kong, China, <sup>4</sup> Neuroscience Research Australia, Sydney, NSW, Australia*

Keywords: neural circuits, computational modeling, sensorimotor control, movements, postures

**The Editorial on the Research Topic**

**Neural and Computational Modeling of Movement Control**

# INTRODUCTION


There exists a gap between experimental data and the understanding of the neural control of movements. This research topic was dedicated to promoting computational modeling approaches that can facilitate data interpretation (Niu et al.; Ranjbaran and Galiana; Pearson et al.; Sharif Razavian et al.; Malik et al.), elucidate control theories (Ueyama; Ota et al.; Takemura et al.), shed light on systemic mechanisms (Buhrmann and DiPaolo; Pearson et al.; Li et al.), suggest testable hypotheses (Loeb and Tsianos; Jiang et al.), and aid the design of rehabilitation or therapeutic strategies (Zitella et al.). The 14 articles reflected these different aspects of computational modeling in bridging this gap between functions of neural circuits and observable behaviors. This research topic demonstrated that computational modeling is playing a more and more prominent role in sensorimotor control studies.

#### Edited by:

*Si Wu, Beijing Normal University, China*

> Reviewed by: *Sheng Li, Peking University, China*

\*Correspondence: *Ning Lan ninglan@sjtu.edu.cn*

Received: *01 July 2016* Accepted: *10 August 2016* Published: *31 August 2016*

#### Citation:

*Lan N, Cheung VCK and Gandevia SC (2016) Editorial: Neural and Computational Modeling of Movement Control. Front. Comput. Neurosci. 10:90. doi: 10.3389/fncom.2016.00090*

Our knowledge of the neural mechanisms of movement generation is mostly derived from experimental data obtained in animals and humans. For a more comprehensive and holistic understanding of motor control, the ever-mounting experimental information must be integrated to allow general principles of sensorimotor control to emerge. Progress in this integration has been stagnant owing to the fragmented nature of many available data sets, usually recorded from constrained preparations or under very specific behavioral conditions. One mathematical framework for consolidating data is to fit the data using mathematical equations with optimized, "best-fit" parameters (e.g., choosing parameters that maximize the data variance accounted for by the equations). This approach has evolved from a "black-box" type of modeling to building biologically and neurophysiologically realistic, multi-scale models. The success of the latter approach hinges on the assumption that the models represent the underlying computations of neural signal processing in the central sensorimotor system.
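The "best-fit" idea mentioned above can be illustrated with a very small example. The sketch below is not taken from any article in this research topic: it assumes a straight-line model, ordinary least squares as the fitting criterion, and made-up data, and it reports the variance accounted for (VAF) as one common measure of fit quality. The function names `fit_line` and `variance_accounted_for` are hypothetical.

```python
# Illustrative only: fit y ~ slope * x + intercept by ordinary least squares
# and report the variance accounted for (VAF). The data are made up.

def fit_line(xs, ys):
    """Closed-form least-squares slope and intercept for a straight line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def variance_accounted_for(ys, predictions):
    """VAF = 1 - (residual sum of squares) / (total sum of squares)."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.1, 1.9, 4.2, 5.8, 8.1]  # hypothetical measurements
slope, intercept = fit_line(xs, ys)
predictions = [slope * x + intercept for x in xs]
vaf = variance_accounted_for(ys, predictions)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, VAF={vaf:.4f}")
```

For a model that is linear in its parameters, minimizing the squared residuals is equivalent to maximizing the VAF, which is why the two criteria are often used interchangeably in this kind of "black-box" fitting.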

The modeling approach advocated here is knowledge-based deduction with simulations using computational models. Simulations must be compared to observable states of the system to validate the hypothesis advanced (Cordo et al., 2002; Stein et al., 2004; Lan and He), or to propose new testable hypotheses (Bullock, 1993). Building a multi-scale model of the sensorimotor system from muscles, proprioceptors to skeletal joints, spinal regulating centers, and central control circuits exemplifies part of this endeavor (Cheng et al., 2000; Lan et al., 2005; Mileusnic et al., 2006; Alstermark et al., 2007; Song et al., 2008a,b; Hao et al., 2013; He et al., 2013). The review article by Loeb and Tsianos outlined the necessary elements and challenges in this approach, which is the first step toward integrating a vast body of experimental data into a general mathematical framework for simulation.

Experimental mapping of neural circuits, or neural modeling, provides the essential foundation upon which mathematical descriptions of the neural system are formulated (Baldissera et al., 2011; Prochazka and Ellaway, 2012). New technologies, such as optogenetics (Bernstein and Boyden, 2011; Fenno et al., 2011), have added to our ability to dissect neural circuits in the brain and spinal cord. Alstermark and Ekerot described their work in identifying the spino-cerebellar closed-loop circuit via the brainstem lateral reticular nucleus. Jiang et al. reviewed the anatomy and physiology of the direct and indirect spinocerebellar tracts and illustrated how these pathways, originating in the spinal cord, may be the neural substrates for the transmission of internal feedback signals in control models. They proposed a new, testable hypothesis, that the direct pathway is primarily involved in rhythmic motor acts such as locomotion, while the indirect pathway provides the neural substrates for precerebellar sensorimotor integration required for dexterous limb movement.

The complexity of a model depends on the specific question one wishes the model to address. Niu et al. developed a hardware model of a spinal reflex that demonstrated real-time capability in simulation. Ranjbaran and Galiana used a context-dependent model to shed light on potential underlying neural mechanisms of the vestibulo-ocular reflex. Pearson et al. created a four-link biomechanical model of a cat hind leg to determine how mechanical and neural factors contribute to the updating of working memory of barrier location during locomotion. Sharif Razavian et al. developed an alternative method to understand muscle synergy by using a biomechanical model that is associated with an optimal solution for task control. Malik et al. constructed a bioinformatic model that incorporated limb biomechanics embedded with six muscle spindles to predict sensory outputs.

Behaviors are often the outcome of a complicated process of neural computations in the brain and spinal cord (Shadmehr and Wise, 2005). Computational models can be of much help in elucidating the roles of individual neural computations in movements. Buhrmann and DiPaolo used a simple two-link model to examine whether peripheral feedback is sufficient to coordinate multi-joint motion, i.e., motion in the presence of intersegmental interaction torques. Their simulation showed that it is plausible that spinal circuitry can control multi-joint movements even in the absence of internal models of intersegmental dynamics or learned compensatory motor signals. Ranjbaran and Galiana presented a hybrid nonlinear bilateral model for the horizontal angular vestibulo-ocular reflex (AVOR), and investigated a viable switching strategy for the timing of nystagmus. Simulation results replicated experimental data well in all conditions. Li et al. used a corticospinal, virtual-arm model to investigate the central coordination of alpha and gamma controls to muscles and their muscle spindles for movement generation. Simulation results indicated that simple patterns of alpha and gamma drives are sufficient to control a range of movements, and that propriospinal neurons (Alstermark et al., 2007; Hao et al., 2013) may play an essential role in pre-motor processing of descending commands for movements.

It has long been presumed in motor neuroscience that movement generation begins with motor planning, followed by motor execution (Hogan, 1988). How motor planning and execution are accomplished has been the subject of much theorizing (Ajemian and Hogan, 2010). Computational models have been used to address these theoretical questions. Ueyama proposed a new control scheme, called mini-max feedback control, in which motor commands are generated by minimizing the maximal cost to the action resulting from worst-case uncertainty; this scheme outperformed the popular optimal feedback control scheme (Todorov and Jordan, 2002) both in stability and task-goal achievement. Ota et al. also studied the question of motor optimality. They argued that human motor planning is suboptimal when the gain associated with the action is "asymmetric." Takemura et al. followed up on the question of motor planning in light of uncertainty by studying a human reach-to-grasp task when the target was visually occluded, a condition that led to a larger peak grip aperture when compared with conditions with vision. To account for the increased grip aperture, they formulated a model based on the assumption that grip aperture is controlled to compensate for motor variability and sensory uncertainty.

An important motivator of computational modeling is the potential use of this body of knowledge to design new, efficacious interventions for treating movement disorders (Reinkensmeyer et al., 2016). This use of computational models is exemplified in the article by Zitella et al. They employed a computational model to evaluate the therapeutic potential and side effects of deep brain stimulation (DBS) of the pedunculopontine tegmental nucleus (PPTg) in a Parkinsonian monkey. This model predicted how different DBS stimulation parameters produced different activations of the nerve fibers surrounding the PPTg.

# CONCLUSIONS

This research topic demonstrated that computational modeling is playing a more and more prominent role in the studies of postural and movement control. With increasing ability to gather data from all levels of the neuronal sensorimotor system, there is a compelling need for novel, creative modeling of new and existing data sets, because computational modeling offers the more systematic means of extracting knowledge and insights about neural computations from these data. While models should be based on experimental data and validated with experimental evidence (Ajemian and Hogan, 2010), they should also be flexible enough to provide a conceptual framework for unifying diverse data sets, to generate new insights into neural mechanisms, to integrate new data sets into the general framework, to validate or refute hypotheses and to suggest new testable hypotheses for future experimental investigation (Bullock, 1993). It is thus expected that neural and computational modeling of the sensorimotor system should create new opportunities for experimentalists and modelers to collaborate in a joint endeavor to advance our understanding of the neural mechanisms for postural and movement control.

# AUTHOR CONTRIBUTIONS

NL and VC drafted and edited the final version of the editorial. SG edited the final version.


# FUNDING

NL is funded by The Natural Science Foundation of China (No. 81271684 and No. 61361160415) and The Ministry of Science and Technology of China (No. 2011CB013304). VC is supported by startup funds from The Chinese University of Hong Kong.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Lan, Cheung and Gandevia. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Mini-max feedback control as a computational theory of sensorimotor control in the presence of structural uncertainty

*Yuki Ueyama\**

*Department of Rehabilitation Engineering, Research Institute of National Rehabilitation Center for Persons with Disabilities, Tokorozawa, Japan*

#### *Edited by:*

*Arthur Prochazka, University of Alberta, Canada*

#### *Reviewed by:*

*David W. Franklin, University of Cambridge, UK; J. Andrew Pruszynski, Umea University, Sweden*

#### *\*Correspondence:*

*Yuki Ueyama, Department of Rehabilitation Engineering, Research Institute of National Rehabilitation Center for Persons with Disabilities, 4-1 Namiki, Tokorozawa, Saitama 359-8555, Japan e-mail: ueyama-yuki@rehab.go.jp*

We propose a mini-max feedback control (MMFC) model as a robust approach to human motor control under conditions of uncertain dynamics, such as structural uncertainty. The MMFC model is an expansion of the optimal feedback control (OFC) model. According to this scheme, motor commands are generated to minimize the maximal cost, based on an assumption of worst-case uncertainty, characterized by familiarity with novel dynamics. We simulated linear dynamic systems with different types of force fields–stable and unstable dynamics–and compared the performance of MMFC to that of OFC. MMFC delivered better performance than OFC in terms of stability and the achievement of tasks. Moreover, the gain in positional feedback with the MMFC model in the unstable dynamics was tuned to the direction of instability. It is assumed that the shape modulations of the gain in positional feedback in unstable dynamics played the same role as that played by end-point stiffness observed in human studies. Accordingly, we suggest that MMFC is a plausible model that predicts motor behavior under conditions of uncertain dynamics.

**Keywords: reaching, motor control, H∞ control, robust control, optimal feedback control, feedback gain, stiffness tuning, force field**

# **INTRODUCTION**

It is necessary to interact with various environments to learn how to use tools and to participate in unfamiliar sports, such as tennis and swimming. Skilled actions are achieved via interactive forces, based on human compensation. A considerable amount of research has focused on arm movement, to investigate learning mechanisms for adaptation to perturbed limb dynamics. It has been suggested that there are different mechanisms for adapting to stable and unstable dynamics (Franklin et al., 2003a,b; Osu et al., 2003). Under conditions where the dynamics are stable, it is possible to learn the forces necessary to compensate for perturbed dynamics in a feed-forward manner (Shadmehr and Mussa-Ivaldi, 1994). However, unstable dynamics make it necessary to learn the optimal mechanical impedance as the magnitude, shape, and orientation of the end-point stiffness (**Figure 1A**) (Burdet et al., 2001). Although an internal model can compensate for both stable and unstable dynamics, mechanisms have been identified for adapting to different approaches (Franklin et al., 2003b; Osu et al., 2003). Osu et al. reported that an inverse dynamics model that controlled the net joint torque performed well in a stable environment. However, in an unstable environment, the inverse dynamics model functions in parallel with an impedance controller to compensate for a consistent perturbing force (Osu et al., 2003). It has also been suggested that the impedance controller assists in the formation of the inverse dynamics model and contributes to improved stability (Franklin et al., 2003b). Both approaches are used selectively and combined in accordance with environmental dynamics.

Optimal feedback control (OFC) theory (Todorov and Jordan, 2002), which has been supported by the results of experimental and simulation studies (Liu and Todorov, 2007; Lockhart and Ting, 2007; Izawa and Shadmehr, 2008; Izawa et al., 2008; Nagengast et al., 2009; Pruszynski et al., 2011; Ueyama and Miyashita, 2013, 2014), suggests that the central nervous system sets up feedback controllers that continuously convert sensory input into motor output, optimally tuned to the task at hand, by trading off energy consumption with constraints, such as accuracy, on performance. According to OFC, trajectory planning is not required because the problems of motor planning and control are combined. An important feature of the model is the concept of minimum intervention: i.e., setting up feedback controllers only to correct variation deleterious to the task (Wolpert and Flanagan, 2010). For example, in a tennis serve, variation in the azimuth angle of the racket head should be corrected far less strongly than variation in the elevation angle, because the azimuthal variation has little effect on whether the ball will land in the court, whereas elevation variability can threaten the goal of landing the ball in the court. OFC is based on a linear-quadratic-Gaussian (LQG) design, which is used to describe uncertain linear systems disturbed by additive white Gaussian noise with imperfect state information (Todorov, 2005). However, a precise forward dynamics model is required. The control and sensory noise must be modeled with Gaussian statistics; however, real-world sensorimotor uncertainties are represented by non-Gaussian distributions (Orban and Wolpert, 2011). In the engineering field, robust control design has been used in various situations, because it does not require precise dynamic models for the control objects. It is necessary to represent the uncertainties of the dynamics model in a quantitatively expressible form, because the objective of robust control is to configure a control system to allow for such uncertainties. There are essentially two ways of representing the uncertainties: as unstructural or structural uncertainties. An unstructural uncertainty is represented as a perturbation of the transfer function in the frequency domain. In contrast, a structural uncertainty is represented by an additive disturbance combined with the process and sensory noise, such as environmental dynamics, in the state-space model. H<sub>∞</sub> control is a robust control technique that addresses the issue of worst-case controller design for linear plants subjected to unknown additive disturbances and plant uncertainties, including problems of disturbance attenuation and model matching and tracking (Djouadi and Zames, 2002). Furthermore, the role of game theory in the design of robust controllers, such as H<sub>∞</sub> control, has also been recognized (Anderson and Moore, 1979; Bernhard, 1995), with the terminology "mini-max controller" adapted from statistical decision theory (Savage, 1955). Moreover, the brain might also be treated as an integrated robust control system in which components for sensing, computation, and decision are useful primarily to the extent that they affect action (Doyle and Csete, 2011).
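To make the LQG basis of OFC mentioned above more concrete, the following is a minimal sketch of its deterministic core, the finite-horizon linear-quadratic regulator. It is an illustration under strong assumptions and not the model used in this paper: the plant is scalar, the state is observed perfectly, there is no noise (so neither the state estimator nor the signal-dependent noise of OFC is included), and all numerical values and the function name `lqr_gains` are hypothetical.

```python
# Illustrative scalar finite-horizon LQR (the noise-free core of LQG).
# Plant: x[k+1] = a * x[k] + b * u[k]
# Cost:  sum_k (q * x[k]**2 + r * u[k]**2) + q_final * x[N]**2

def lqr_gains(a, b, q, r, q_final, horizon):
    """Backward Riccati recursion; returns time-varying gains K[0..N-1]."""
    p = q_final                      # cost-to-go weight at the final step
    gains = [0.0] * horizon
    for k in reversed(range(horizon)):
        gain = (b * p * a) / (r + b * p * b)
        p = q + a * p * (a - b * gain)
        gains[k] = gain
    return gains


a, b = 1.2, 1.0                      # hypothetical open-loop unstable plant
gains = lqr_gains(a, b, q=1.0, r=1.0, q_final=1.0, horizon=50)

# Apply the feedback law u[k] = -K[k] * x[k] from an initial error of 1.0.
x = 1.0
trajectory = [x]
for gain in gains:
    u = -gain * x
    x = a * x + b * u
    trajectory.append(x)

print(f"first gain={gains[0]:.4f}, final state={trajectory[-1]:.3e}")
```

The gains depend only on the plant and cost parameters, not on a planned trajectory, which reflects the statement above that planning and control are combined in this framework. The recursion also relies on `a` and `b` being known exactly, which is the limitation that motivates the robust formulation discussed in the text.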

Here, we applied a mini-max feedback controller (MMFC) to a sensorimotor control problem with environmental dynamics as a structural uncertainty (**Figure 1B**). MMFC operates as an extended model of OFC, by incorporating prior influence characterized by familiarity with novel dynamics. Such expansion of motor control and planning models has been recognized as a major factor in movement generalization (Yan et al., 2013). We performed numerical simulations and compared the performance of MMFC with that of OFC, as a reference, in different types of force fields with stable and unstable dynamics. In our simulations, mini-max feedback control showed better performance than OFC under conditions of dynamics, and could predict the impedance modulation in unstable dynamics to improve stability. These observations suggest that MMFC is a plausible model that predicts behaviors under structural uncertainty. Preliminary results of this study were presented in the proceedings of a conference (Ueyama and Miyashita, 2011).

#### **MATERIALS AND METHODS**

In this paper, the solution to a robust control problem was obtained via a mini-max approach applied to dynamic game problems (Başar and Bernhard, 1995). We modeled the dynamics as structural uncertainties to apply the mini-max approach; the simulations used simple Euler integration with a 5 ms sampling time.

#### **MINI-MAX CONTROL PROBLEM**

The MMFC problem requires a control object to be represented as a generalized plant model. We provide the model with structural uncertainty, and the solution is obtained by minimizing energy consumption under conditions of maximal uncertainty.

#### *Problem definition*

The dynamics of a system are described by the following equation:

$$\begin{cases} \mathbf{x}_{k+1} = \mathbf{A}\mathbf{x}_k + \mathbf{B}\mathbf{u}_k + \mathbf{D}\bar{\mathbf{w}}_k\\ \mathbf{y}_k = \mathbf{C}\mathbf{x}_k + \bar{\mathbf{v}}_k \end{cases},\tag{1}$$

where $\mathbf{x}_k$, $\mathbf{u}_k$, and $\mathbf{y}_k$ are the state, input, and output vectors, respectively, at time step $k$ ($k = 0, 1, \ldots, N - 1$), and the dynamics are described by the three matrices $\mathbf{A}$, $\mathbf{B}$, and $\mathbf{C}$. $\bar{\mathbf{w}}_k$ denotes a disturbance vector, and $\bar{\mathbf{v}}_k$ is a sensory noise vector, represented as zero-mean Gaussian white noise with unity covariance. The system can be rewritten as follows when a disturbance, such as an environmental perturbation or motor noise, affects the dynamics:

$$\begin{cases} \mathbf{x}_{k+1} = (\mathbf{A} + \Delta_{\mathbf{A}}) \, \mathbf{x}_k + (\mathbf{B} + \Delta_{\mathbf{B}}) \, \mathbf{u}_k \\ \quad \mathbf{y}_k = (\mathbf{C} + \Delta_{\mathbf{C}}) \, \mathbf{x}_k + \bar{\mathbf{v}}_k \end{cases}, \tag{2}$$

where $\Delta_{\mathbf{A}}$, $\Delta_{\mathbf{B}}$, and $\Delta_{\mathbf{C}}$ represent disturbances corresponding to the state, input, and output, respectively. A disturbance should be modeled as an uncertainty in the internal model. Thus, these disturbances are assumed to have the following form:

$$
\begin{aligned}
\Delta_{\mathbf{A}} &= \mathbf{D}_a \mathbf{F}_{ak} \mathbf{E}_a,\\
\Delta_{\mathbf{B}} &= \mathbf{D}_b \mathbf{F}_{bk} \mathbf{E}_b,\\
\Delta_{\mathbf{C}} &= \mathbf{D}_c \mathbf{F}_{ck} \mathbf{E}_c,
\end{aligned}
$$

where $\mathbf{D}_a$, $\mathbf{D}_b$, $\mathbf{D}_c$, $\mathbf{E}_a$, $\mathbf{E}_b$, and $\mathbf{E}_c$ are constant matrices, and $\mathbf{F}_{ak}$, $\mathbf{F}_{bk}$, and $\mathbf{F}_{ck}$ are time-varying matrices satisfying the following conditions: $\mathbf{F}_{ak}^T \mathbf{F}_{ak} < \mathbf{I}$, $\mathbf{F}_{bk}^T \mathbf{F}_{bk} < \mathbf{I}$, $\mathbf{F}_{ck}^T \mathbf{F}_{ck} < \mathbf{I}$. The system can be transformed to the equivalent system as follows:

$$\begin{cases} \mathbf{x}\_{k+1} = \mathbf{A}\mathbf{x}\_{k} + \mathbf{B}\mathbf{u}\_{k} + \left[\mathbf{D}\_{a}\ \mathbf{D}\_{b}\ \mathbf{0}\right] \bar{\mathbf{w}}\_{k} \\ \mathbf{y}\_{k} = \mathbf{C}\mathbf{x}\_{k} + \left[\mathbf{0}\ \mathbf{0}\ \mathbf{D}\_{c}\right] \bar{\mathbf{w}}\_{k} + \mathbf{D}\_{\gamma}\mathbf{v}\_{k} \end{cases},$$

where,

$$
\bar{\mathbf{w}}\_{k} = \begin{bmatrix} \mathbf{F}\_{ak} & \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \mathbf{F}\_{bk} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} & \mathbf{F}\_{ck} \end{bmatrix} \mathbf{z}\_{k}, \; \mathbf{z}\_{k} = \begin{bmatrix} \mathbf{E}\_{a} \\ \mathbf{0} \\ \mathbf{E}\_{c} \end{bmatrix} \mathbf{x}\_{k} + \begin{bmatrix} \mathbf{0} \\ \mathbf{E}\_{b} \\ \mathbf{0} \end{bmatrix} \mathbf{u}\_{k}.
$$

Here, **z***<sup>k</sup>* denotes the regulated output vector. Then, a system with structural uncertainty can be reduced to the following form, known as a generalized plant model (Zhou and Doyle, 1998):

$$\begin{cases} \mathbf{x}\_{k+1} = \mathbf{A}\mathbf{x}\_k + \mathbf{B}\mathbf{u}\_k + \mathbf{D}\mathbf{w}\_k\\ \quad \mathbf{z}\_k = \mathbf{H}\mathbf{x}\_k + \mathbf{G}\mathbf{u}\_k\\ \quad \mathbf{y}\_k = \mathbf{C}\mathbf{x}\_k \qquad + \mathbf{E}\mathbf{w}\_k \end{cases}, \tag{3}$$

where,

$$\mathbf{H} = \begin{bmatrix} \mathbf{E}\_a \\ \mathbf{0} \\ \mathbf{E}\_c \end{bmatrix}, \mathbf{G} = \begin{bmatrix} \mathbf{0} \\ \mathbf{E}\_b \\ \mathbf{0} \end{bmatrix}, \mathbf{D} = \begin{bmatrix} \mathbf{D}\_a \ \mathbf{D}\_b \ \mathbf{0} \ \mathbf{0} \end{bmatrix},$$

$$\mathbf{E} = \begin{bmatrix} \mathbf{0} \ \mathbf{0} \ \mathbf{D}\_c \ \mathbf{D}\_{y} \end{bmatrix}, \mathbf{w}\_k = \begin{bmatrix} \bar{\mathbf{w}}\_k \\ \mathbf{v}\_k \end{bmatrix}.$$
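The block structure of the generalized plant in Equation (3) can be illustrated with a minimal NumPy sketch. The function name `generalized_plant`, the generic block shapes, and the toy dimensions and values are assumptions made only for this example; they are not the matrices used in the simulations.

```python
import numpy as np

def generalized_plant(Da, Db, Dc, Dy, Ea, Eb, Ec):
    """Stack the uncertainty factors into H, G, D, E of Equation (3).

    Shapes assumed for this illustration (hypothetical):
      Ea: (ra, n), Eb: (rb, m), Ec: (rc, n)
      Da: (n, ra), Db: (n, rb), Dc: (p, rc), Dy: (p, q)
    """
    n, ra = Da.shape
    rb, m = Eb.shape
    p, rc = Dc.shape
    q = Dy.shape[1]
    H = np.vstack([Ea, np.zeros((rb, n)), Ec])                 # z_k = H x_k + G u_k
    G = np.vstack([np.zeros((ra, m)), Eb, np.zeros((rc, m))])
    D = np.hstack([Da, Db, np.zeros((n, rc)), np.zeros((n, q))])   # [Da Db 0 0]
    E = np.hstack([np.zeros((p, ra)), np.zeros((p, rb)), Dc, Dy])  # [0 0 Dc Dy]
    return H, G, D, E

# Toy example: n = 3 states, m = 2 inputs, p = 3 outputs (arbitrary values).
n, m, p = 3, 2, 3
H, G, D, E = generalized_plant(
    Da=0.1 * np.eye(n), Db=np.ones((n, m)),
    Dc=np.zeros((p, n)), Dy=0.5 * np.eye(p),
    Ea=np.eye(n), Eb=np.eye(m), Ec=np.zeros((n, n)),
)

# With Eb = I the stacked matrices satisfy G^T G = I and H^T G = 0.
orthonormal_G = np.allclose(G.T @ G, np.eye(m))
decoupled_HG = np.allclose(H.T @ G, 0.0)
```

Because **H** and **G** occupy disjoint row blocks, **H**<sup>*T*</sup>**G** = **0** holds by construction in this sketch, whatever values **E**<sub>*a*</sub> and **E**<sub>*c*</sub> take.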

According to OFC, the cost function *J*(**u**) is given by,

$$J(\mathbf{u}) = \sum\_{k=1}^{N-1} \mathbf{z}\_k^T \mathbf{z}\_k + \mathbf{x}\_N^T \mathbf{Q}\_N \mathbf{x}\_N,\tag{4}$$

where **Q***<sup>N</sup>* denotes a terminal state cost weight matrix. Instead, our proposed model adopts the cost function *J*γ (**u**,**w**), given by the following equation:

$$J\_{\gamma}(\mathbf{u}, \mathbf{w}) = J(\mathbf{u}) - \gamma^2 \sum\_{k=1}^{N-1} \mathbf{w}\_k^T \mathbf{w}\_k,\tag{5}$$

where γ is a scalar parameter representing the level of disturbance attenuation. The objective of robust control is to determine the appropriate input for a worst-case disturbance. Thus, the robust control problem is posed as a mini-max problem, in which the cost is minimized over the input **u** while the disturbance **w** maximizes it:

$$\inf\_{\mathbf{u}} \sup\_{\mathbf{w}} J\_{\gamma}(\mathbf{u}, \mathbf{w}).$$

This cost function requires the task to be achieved with minimal energy consumption under the worst case of uncertainty, represented by the maximizing disturbance. The problem is analogous to the LQG design used in OFC, which is described by a quadratic cost and whose solution combines a feedback control law with a state estimator.

#### *Solution*

As in the LQG design, a solution of the MMFC problem can be written in a state feedback form:

$$\mathbf{u}\_{k} = -\mathbf{L}\_{k}\hat{\mathbf{x}}\_{k},\tag{6}$$

where **x̂**<sub>*k*</sub> and **L**<sub>*k*</sub> are the estimated state and feedback gain, respectively. The estimated state and feedback gain are computed from two discrete-time Riccati difference equations of the following form:

$$\begin{aligned} \mathbf{M}\_k &= \mathbf{Q}\_k + \mathbf{A}^T (\mathbf{M}\_{k+1}^{-1} + \mathbf{B}\mathbf{B}^T - \gamma^{-2} \mathbf{D}\mathbf{D}^T)^{-1} \mathbf{A} \\ \text{with } \mathbf{M}\_N &= \mathbf{Q}\_N, \\ \boldsymbol{\Sigma}\_{k+1} &= \mathbf{A} (\boldsymbol{\Sigma}\_k^{-1} + \mathbf{C}^T \mathbf{N}^{-1} \mathbf{C} - \gamma^{-2} \mathbf{Q}\_k)^{-1} \mathbf{A}^T + \mathbf{D} \mathbf{D}^T, \\ \text{with } \boldsymbol{\Sigma}\_1 &= \mathbf{Q}\_0^{-1}, \end{aligned}$$

where **M**<sub>*k*</sub> and **Σ**<sub>*k*</sub> denote the solutions of the Riccati equations, obtained by backward and forward recursion in time, respectively. Here, we adopt the following assumptions to simplify the derivations:

$$\mathbf{G}^T \mathbf{G} = \mathbf{I}, \mathbf{H}^T \mathbf{G} = \mathbf{0}, \mathbf{E} \mathbf{E}^T = \mathbf{N}, \mathbf{H}^T \mathbf{H} = \mathbf{Q}\_k.$$

These assumptions do not restrict generality; they allow the equations to be written in simple forms while maintaining consistency with OFC. Using these solutions, the feedback gain and estimated state are given by the following:

$$\mathbf{L}\_{k} = \mathbf{B}^{T} \left( \mathbf{M}\_{k+1}^{-1} + \mathbf{B} \mathbf{B}^{T} - \gamma^{-2} \mathbf{D} \mathbf{D}^{T} \right)^{-1} \mathbf{A} \left( \mathbf{I} - \gamma^{-2} \boldsymbol{\Sigma}\_{k} \mathbf{M}\_{k} \right)^{-1}, \tag{7}$$

$$\begin{aligned} \hat{\mathbf{x}}\_{k+1} &= \mathbf{A} \hat{\mathbf{x}}\_{k} + \mathbf{B} \mathbf{u}\_{k} + \mathbf{A} \left( \boldsymbol{\Sigma}\_{k}^{-1} + \mathbf{C}^{T} \mathbf{N}^{-1} \mathbf{C} - \gamma^{-2} \mathbf{Q}\_{k} \right)^{-1} \\ &\quad \cdot \left\{ \gamma^{-2} \mathbf{Q}\_{k} \hat{\mathbf{x}}\_{k} + \mathbf{C}^{T} \mathbf{N}^{-1} \left( \mathbf{y}\_{k} - \mathbf{C} \hat{\mathbf{x}}\_{k} \right) \right\}. \end{aligned} \tag{8}$$

The estimated disturbance diverges to infinity if the level of disturbance attenuation γ is close to zero. Thus, the level of γ cannot be chosen freely and must satisfy the following constraints:

$$\mathbf{M}\_{k+1}^{-1} - \gamma^{-2} \mathbf{D} \mathbf{D}^T > \mathbf{0} \text{ and } \boldsymbol{\Sigma}\_k^{-1} - \gamma^{-2} \mathbf{Q}\_k > \mathbf{0}. \tag{9}$$

The strong concavity condition is given by

$$
\tilde{\boldsymbol{\Sigma}}\_{k+1}^{-1} - \gamma^{-2} \mathbf{M}\_{k+1} > \mathbf{0} \text{ or } \boldsymbol{\Sigma}\_k^{-1} - \gamma^{-2} \tilde{\mathbf{M}}\_k > \mathbf{0}, \tag{10}
$$

where

$$
\tilde{\mathbf{M}}\_k = \mathbf{A}^T (\mathbf{M}\_{k+1}^{-1} - \gamma^{-2} \mathbf{D} \mathbf{D}^T)^{-1} \mathbf{A} + \mathbf{Q}\_k,\\
\tilde{\mathbf{M}}\_N = \mathbf{Q}\_N,
$$

$$
\tilde{\boldsymbol{\Sigma}}\_{k+1} = \mathbf{A} (\boldsymbol{\Sigma}\_k^{-1} - \gamma^{-2} \mathbf{Q}\_k)^{-1} \mathbf{A}^T + \mathbf{D} \mathbf{D}^T,\\
\tilde{\boldsymbol{\Sigma}}\_1 = \mathbf{Q}\_0^{-1}.
$$

The constraints above can be translated into equivalent conditions on the spectral radius (i.e., the maximum of the absolute values of the eigenvalues), because the spectral radius is equal to the norms of **M**<sub>*k*+1</sub>**DD**<sup>*T*</sup>, **Σ**<sub>*k*</sub>**Q**<sub>*k*</sub>, **Σ̃**<sub>*k*+1</sub>**M**<sub>*k*+1</sub>, and **Σ**<sub>*k*</sub>**M̃**<sub>*k*</sub>. Thus, Equations (9) and (10) require that the following conditions be satisfied: ρ(**M**<sub>*k*+1</sub>**DD**<sup>*T*</sup>) < γ<sup>2</sup>, ρ(**Σ**<sub>*k*</sub>**Q**<sub>*k*</sub>) < γ<sup>2</sup>, and ρ(**Σ̃**<sub>*k*+1</sub>**M**<sub>*k*+1</sub>) < γ<sup>2</sup> or ρ(**Σ**<sub>*k*</sub>**M̃**<sub>*k*</sub>) < γ<sup>2</sup>, where ρ(·) denotes the spectral radius.
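To make the backward recursion and the first condition of Equation (9) concrete, the following NumPy sketch iterates **M**<sub>*k*</sub> and rejects values of γ that violate ρ(**M**<sub>*k*+1</sub>**DD**<sup>*T*</sup>) < γ<sup>2</sup>. It is a sketch under stated assumptions: the names `spectral_radius` and `minimax_backward` are hypothetical, only the state-feedback factor of Equation (7) is computed (the factor involving **Σ**<sub>*k*</sub> is omitted), and the inverse is evaluated through the identity (**M**<sup>−1</sup> + **X**)<sup>−1</sup> = (**I** + **MX**)<sup>−1</sup>**M** so that a singular **M** does not need to be inverted.

```python
import numpy as np

def spectral_radius(X):
    """Largest absolute eigenvalue of a square matrix."""
    return float(np.max(np.abs(np.linalg.eigvals(X))))

def minimax_backward(A, B, D, Q_list, gamma):
    """Backward recursion for M_k with the gamma check of Equation (9).

    Q_list holds Q_0 ... Q_N, where Q_N is the terminal weight.
    Returns M_0 ... M_N and, for k = 0 ... N-1, the factor
    B^T (M_{k+1}^{-1} + B B^T - gamma^-2 D D^T)^{-1} A.
    """
    N = len(Q_list) - 1
    n = A.shape[0]
    X = B @ B.T - D @ D.T / gamma**2
    M = [None] * (N + 1)
    F = [None] * N
    M[N] = Q_list[N]
    for k in range(N - 1, -1, -1):
        if spectral_radius(M[k + 1] @ D @ D.T) >= gamma**2:
            raise ValueError(f"gamma is too small at step {k}")
        # Equals (M^{-1} + X)^{-1} whenever M is invertible; this form
        # stays defined when M is singular.
        core = np.linalg.solve(np.eye(n) + M[k + 1] @ X, M[k + 1])
        F[k] = B.T @ core @ A
        M[k] = Q_list[k] + A.T @ core @ A
    return M, F

# Scalar toy problem (illustrative values only).
one = np.array([[1.0]])
Q_list = [np.zeros((1, 1)), one]
_, F_loose = minimax_backward(one, one, one, Q_list, gamma=1e6)
_, F_tight = minimax_backward(one, one, one, Q_list, gamma=2.0)
```

In this toy problem the factor grows as γ decreases, and for very large γ it approaches the ordinary LQ gain, which matches the interpretation of γ as the level of disturbance attenuation.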

#### **APPLICATION TO A SENSORIMOTOR SYSTEM**

We applied the MMFC approach to a sensorimotor system. The dynamics model is based on previous studies (Todorov, 2005; Izawa and Shadmehr, 2008; Izawa et al., 2008; Braun et al., 2009). The dynamics were simulated with uncertainties, represented by force fields: a velocity-dependent force field (VF) and a divergent force field (DF), representing stable and unstable environments, respectively (Franklin et al., 2003a,b; Osu et al., 2003). We also designed both optimal and mini-max feedback controllers for the problem and compared their performances.

#### *Sensorimotor system*

*Dynamics model.* We modeled a movement with two degrees of freedom, such as multi-joint flexion and extension of the shoulder and elbow joints, as cursor movements on a screen, described by shifting the position **p**(*t*) = [*x*(*t*), *y*(*t*)]<sup>*T*</sup> to designated targets **p**<sup>∗</sup> = [*x*<sup>∗</sup>, *y*<sup>∗</sup>]<sup>*T*</sup>:

$$m\ddot{\mathbf{p}}(t) = \mathbf{f}(t) - b\dot{\mathbf{p}}(t),\tag{11}$$

where *m* and *b* are the end-point mass and viscosity, respectively, and are set equal to *m* = 1.0 (kg) and *b* = 10 (Ns/m). The combined action of all muscles is represented by the force vector **f**(*t*) ∈ *R*<sup>2</sup> acting on the hand. The motor command **u**(*t*) ∈ *R*<sup>2</sup> is transformed into the force **f**(*t*) by adding control-dependent multiplicative noise and by applying a simplified first-order muscle-like low-pass filter of the following form:

$$\dot{\mathbf{f}}(t) = \frac{(\mathbf{I} + \sigma\_{\mathbf{u}}\boldsymbol{\varepsilon}(t))\mathbf{u}(t) - \mathbf{f}(t)}{\tau},\tag{12}$$

with time constant τ = 0.05 (s). The motor command **u**(*t*) is disturbed by signal-dependent multiplicative noise that exists in the neural system (Matthews, 1996) and plays an important role in motor planning (Harris and Wolpert, 1998). The signal-dependent noise (SDN) is given by the Gaussian white noise **ε**(*t*) ∼ *N*(**0**, **I**), and its magnitude σ<sub>**u**</sub> is set equal to 0.5.

*Observation model.* In our model, the state variables cannot be observed directly. The sensory output **y**(*t*) ∈ *R*<sup>8</sup> is the position, velocity, force, and target position disturbed by sensory noise, and is given by:

$$\mathbf{y}(t) = \begin{bmatrix} \mathbf{p}(t) \\ \dot{\mathbf{p}}(t) \\ \mathbf{f}(t) \\ \mathbf{p}^\* \end{bmatrix} + \sigma\_\mathbf{y} \mathbf{v}(t), \tag{13}$$

where **v**(*t*) ∈ *R*<sup>8</sup> and **σ**<sub>**y**</sub> ∈ *R*<sup>8×8</sup> are the Gaussian white noise **v**(*t*) ∼ *N*(**0**, **I**) and the diagonal matrix defined by **σ**<sub>**y**</sub> = *diag*([0.02*c*, 0.02*c*, 0.2*c*, 0.2*c*, *c*, *c*, 0, 0]), respectively. Here, *c* is a scaling parameter, set equal to the SDN magnitude, i.e., *c* = σ<sub>**u**</sub> = 0.5, similar to a previous study (Todorov, 2005). The task is to move the hand from the starting position **p**(0) = [0, 0]<sup>*T*</sup> to the target position **p**<sup>∗</sup>, which is located at a distance of 25 cm, and to stop during the terminal period between 600 and 700 ms, in accordance with experiments (Franklin et al., 2003a,b; Osu et al., 2003).

#### *Environmental uncertainty*

We assumed two different types of force field, VF and DF, as uncertain environments. The force fields exert a force **F**<sub>*ext*</sub>(*t*) ∈ *R*<sup>2</sup> on the hand. The force generated by the VF is

$$\mathbf{F}\_{\rm ext}(t) = \mathbf{F}\_{VF} \,\dot{\mathbf{p}}(t), \,\mathbf{F}\_{VF} = \alpha \begin{bmatrix} 13 & -18 \\ 18 & 13 \end{bmatrix}, \tag{14}$$

where α is a scaling parameter, set equal to 0.1 to generate an effective perturbation of the trajectory. When reaching forward, the force is directed forward and to the left as the velocity along the *y*-axis increases (**Figure 2A**). DF produces a negative elastic force perpendicular to the target direction, with a value of zero along the *y*-axis: i.e., no force is applied when the path of the hand follows the *y*-axis, but the hand is pushed away whenever it deviates from the *y*-axis (**Figure 2B**). Subjects adapting to DF learn to move in a straight line but show no after-effects on removal of the field. The task is achieved by increasing the stiffness of the arm, but only in the direction of maximum instability (**Figure 1A**). The force generated by DF is described by

$$\mathbf{F}\_{\text{ext}}(t) = \mathbf{F}\_{\text{DF}} \mathbf{p}(t), \mathbf{F}\_{\text{DF}} = \beta \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \tag{15}$$

where β is a scaling parameter, set equal to 100 to generate an effective perturbation of the trajectory. Although end-point trajectories were almost straight without external dynamics, the initial movement direction varied slightly from trial to trial because of motor output variability (Burdet et al., 2001). Because DF produces an unstable interaction with the arm, amplifying such variation by pushing the hand with a force proportional to its deviation from the *y*-axis, the initial trials in DF exhibited unstable behavior, diverging widely to the right or left of the *y*-axis. We also examined an additional DF case, a rotated divergent force field (rDF), in which the target is placed at a rotated position (**Figure 2C**) and the exerted force is rotated accordingly. If the target is realigned at an angle θ in the clockwise direction, the force is then given by

$$\mathbf{F}\_{DF} = \beta \begin{bmatrix} \cos^2 \theta & \sin \theta \cos \theta \\ \sin \theta \cos \theta & \sin^2 \theta \end{bmatrix}. \tag{16}$$

In this study, the rotational angle θ was set equal to 30°. With these force fields, the dynamic uncertainties can be expressed as follows:

$$
\Delta\_{\mathbf{A}} = \begin{bmatrix} \mathbf{0}\_{4 \times 2} & \mathbf{0}\_{4 \times 2} & \mathbf{0}\_{4 \times 6} \\ \mathbf{0}\_{2 \times 2} & \mathbf{F}\_{\mathrm{VF}} & \mathbf{0}\_{2 \times 6} \\ \mathbf{0}\_{2 \times 2} & \mathbf{0}\_{2 \times 2} & \mathbf{0}\_{2 \times 6} \end{bmatrix} \text{ or } \Delta\_{\mathbf{A}} = \begin{bmatrix} \mathbf{0}\_{4 \times 2} & \mathbf{0}\_{4 \times 2} & \mathbf{0}\_{4 \times 6} \\ \mathbf{F}\_{\mathrm{DF}} & \mathbf{0}\_{2 \times 2} & \mathbf{0}\_{2 \times 6} \\ \mathbf{0}\_{2 \times 2} & \mathbf{0}\_{2 \times 2} & \mathbf{0}\_{2 \times 6} \end{bmatrix}.
$$

In both cases, the environmental uncertainties do not depend on the motor command; however, the motor command is disturbed by the SDN. Thus, the uncertainty of the motor command is represented by Δ<sub>**B**</sub> = σ<sub>**u**</sub> · **B** · **ε**(*t*).
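As a quick numerical illustration of Equations (14)–(16), the force field matrices can be written out in NumPy. The function names `vf_matrix` and `df_matrix` and the two probe states are assumptions made for this sketch; the parameter values α = 0.1, β = 100, and θ = 30° are those given in the text.

```python
import numpy as np

def vf_matrix(alpha=0.1):
    """Velocity-dependent field of Equation (14): F_ext = F_VF @ p_dot."""
    return alpha * np.array([[13.0, -18.0],
                             [18.0, 13.0]])

def df_matrix(beta=100.0, theta_deg=0.0):
    """Divergent field of Equations (15)-(16): F_ext = F_DF @ p.

    theta_deg = 0 gives the unrotated DF; theta_deg = 30 gives the rDF.
    """
    theta = np.deg2rad(theta_deg)
    c, s = np.cos(theta), np.sin(theta)
    return beta * np.array([[c * c, s * c],
                            [s * c, s * s]])

F_vf, F_df, F_rdf = vf_matrix(), df_matrix(), df_matrix(theta_deg=30.0)

# Force on a hand moving straight ahead with unit forward velocity in the VF.
force_vf = F_vf @ np.array([0.0, 1.0])
# Force on a hand displaced laterally by 0.01 from the y-axis in the DF.
force_df = F_df @ np.array([0.01, 0.0])
```

For forward motion the VF force has a negative *x*-component and a positive *y*-component, consistent with the description of a force directed forward and to the left, and the DF force vanishes for any position on the *y*-axis.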

#### *Controller design*

We carried out numerical simulations using both OFC and MMFC to compare their performances. In our simulations, the dynamics model was rewritten as a discrete-time system, using a state-space formulation:

$$\mathbf{x}\_{k+1} = \mathbf{A}\mathbf{x}\_k + \mathbf{B} \left(\mathbf{I} + \sigma\_\mathbf{u} \boldsymbol{\varepsilon}\_k\right) \mathbf{u}\_k,\tag{17}$$

$$\mathbf{y}\_k = \mathbf{C}\mathbf{x}\_k + \sigma\_\mathbf{y}\mathbf{v}\_k,\tag{18}$$

where **x**<sub>*k*</sub> ∈ *R*<sup>8</sup> is the state vector at time step *k*, defined by **x**<sub>*k*</sub> = [**p**<sub>*k*</sub><sup>*T*</sup>, **ṗ**<sub>*k*</sub><sup>*T*</sup>, **f**<sub>*k*</sub><sup>*T*</sup>, **p**<sup>∗*T*</sup>]<sup>*T*</sup>. The matrices describing the system, **A** ∈ *R*<sup>8×8</sup>, **B** ∈ *R*<sup>8×2</sup>, and **C** ∈ *R*<sup>8×8</sup>, are expressed as follows:

$$\mathbf{A} = \begin{bmatrix} \mathbf{I}\_{2 \times 2} & \Delta \cdot \mathbf{I}\_{2 \times 2} & \mathbf{0}\_{2 \times 2} & \mathbf{0}\_{2 \times 2} \\ \mathbf{0}\_{2 \times 2} & \mathbf{I}\_{2 \times 2} & \Delta / m \cdot \mathbf{I}\_{2 \times 2} & \mathbf{0}\_{2 \times 2} \\ \mathbf{0}\_{2 \times 2} & \mathbf{0}\_{2 \times 2} & (1 - \Delta / \tau) \cdot \mathbf{I}\_{2 \times 2} & \mathbf{0}\_{2 \times 2} \\ \mathbf{0}\_{2 \times 2} & \mathbf{0}\_{2 \times 2} & \mathbf{0}\_{2 \times 2} & \mathbf{I}\_{2 \times 2} \end{bmatrix},$$

$$\mathbf{B} = \begin{bmatrix} \mathbf{0}\_{2 \times 2} \\ \mathbf{0}\_{2 \times 2} \\ \Delta / \tau \cdot \mathbf{I}\_{2 \times 2} \\ \mathbf{0}\_{2 \times 2} \end{bmatrix}, \mathbf{C} = \mathbf{I},$$

where Δ is a single time step of the simulation, set equal to Δ = 0.005 s.
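The discrete-time matrices above can be assembled directly with `numpy.block`. The sketch below transcribes **A**, **B**, and **C** as printed, with Δ = 0.005 s, *m* = 1.0 kg, and τ = 0.05 s; the function name `build_system` is an assumption of this example, and the viscosity *b* of Equation (11) is not included because it does not appear in the printed **A**.

```python
import numpy as np

def build_system(dt=0.005, m=1.0, tau=0.05):
    """Assemble A, B, C for the state x = [p, p_dot, f, p_target] (8-dim)."""
    I2, Z2 = np.eye(2), np.zeros((2, 2))
    A = np.block([
        [I2, dt * I2, Z2, Z2],                   # position integrates velocity
        [Z2, I2, (dt / m) * I2, Z2],             # velocity integrates force / mass
        [Z2, Z2, (1.0 - dt / tau) * I2, Z2],     # first-order muscle filter
        [Z2, Z2, Z2, I2],                        # target position is constant
    ])
    B = np.vstack([Z2, Z2, (dt / tau) * I2, Z2]) # command enters through the filter
    C = np.eye(8)                                # all states are observed (with noise)
    return A, B, C

A, B, C = build_system()

# One noise-free step from rest with a unit command along y.
x0 = np.zeros(8)
x1 = A @ x0 + B @ np.array([0.0, 1.0])
```

With these values the filter pole is 1 − Δ/τ = 0.9, so a unit command raises the force state by 0.1 in a single step.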

In these simulations, we assumed two types of condition: (i) with structural uncertainty and (ii) without structural uncertainty. Under the condition with structural uncertainty, the system matrix **A** of the state-space Equation (3) is not equal to the actual dynamics, which include the force field. In contrast, under the condition without structural uncertainty, the force field dynamics are completely represented in the internal model; that is, the system matrix **A** in Equation (3) is replaced by **A** + Δ<sub>**A**</sub>.

*Optimal feedback controller.* An optimal feedback controller also generates motor commands, thus forming state feedback, as in Equation (6). The feedback gain is computed to minimize the following cost function:

$$\begin{aligned} J(\mathbf{u}) &= \sum\_{k=N\_s+1}^{N} \left( w\_p^2 \left\| \mathbf{p}\_k - \mathbf{p}^\* \right\|^2 + w\_v^2 \left\| \dot{\mathbf{p}}\_k \right\|^2 + w\_f^2 \left\| \mathbf{f}\_k \right\|^2 \right) \\ &\quad + \sum\_{k=1}^{N-1} \left\| \mathbf{u}\_k \right\|^2, \end{aligned} \tag{19}$$

where *w*<sub>*p*</sub>, *w*<sub>*v*</sub>, and *w*<sub>*f*</sub> are the cost weights of the end-point position, velocity, and force, respectively, with the assigned values *w*<sub>*p*</sub> = 10<sup>4</sup>, *w*<sub>*v*</sub> = 10<sup>3</sup>, and *w*<sub>*f*</sub> = 10<sup>2</sup>, chosen to achieve the reaching task adequately without external dynamics, i.e., in the null force field (NF). In addition, the terminal cost is defined to evaluate the states between *N*<sub>*s*</sub> = 0.6/Δ and *N* = 0.7/Δ steps. Thus, the cost function requires the expected state to be stabilized close to the target in the terminal period (*N*<sub>*s*</sub> < *k* < *N*). The feedback gain is determined by

$$\mathbf{L}\_{k} = (\mathbf{B}^{T}\mathbf{S}\_{k+1}\mathbf{B} + \mathbf{I})^{-1}\mathbf{B}^{T}\mathbf{S}\_{k+1}\mathbf{A},\tag{20}$$

where **S**<sub>*k*+1</sub> is found by solving the Riccati equation

$$\mathbf{S}\_{k} = \mathbf{A}^{T}\mathbf{S}\_{k+1}(\mathbf{A} - \mathbf{B}\mathbf{L}\_{k}) + \mathbf{Q}\_{k} \text{ with } \mathbf{S}\_{N} = \mathbf{Q}\_{N},$$

where **Q**<sub>*k*</sub> ∈ *R*<sup>8×8</sup> is the task cost matrix, given by **Q**<sub>*k*</sub> = **q**<sub>*k*</sub><sup>*T*</sup>**q**<sub>*k*</sub>, where

$$\mathbf{q}\_{k} = \begin{bmatrix} w\_{p} \cdot \mathbf{I}\_{2 \times 2} & \mathbf{0}\_{2 \times 2} & \mathbf{0}\_{2 \times 2} & -w\_{p} \cdot \mathbf{I}\_{2 \times 2} \\ \mathbf{0}\_{2 \times 2} & w\_{v} \cdot \mathbf{I}\_{2 \times 2} & \mathbf{0}\_{2 \times 2} & \mathbf{0}\_{2 \times 2} \\ \mathbf{0}\_{2 \times 2} & \mathbf{0}\_{2 \times 2} & w\_{f} \cdot \mathbf{I}\_{2 \times 2} & \mathbf{0}\_{2 \times 2} \\ \mathbf{0}\_{2 \times 2} & \mathbf{0}\_{2 \times 2} & \mathbf{0}\_{2 \times 2} & \mathbf{0}\_{2 \times 2} \end{bmatrix} \quad (k > N\_{s}),$$
  $\text{or } \mathbf{q}\_{k} = \mathbf{0} \text{ (}k \le N\_{s}\text{)}.$ 
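The backward pass behind Equation (20) can be sketched in a few lines of NumPy. This is a minimal sketch rather than the simulation code: the names `task_cost_factor` and `lqr_gains` are hypothetical, the default weights follow the values quoted above, and the recursion is written with **A** − **BL**<sub>*k*</sub>, which is the dimensionally consistent form for **B** ∈ *R*<sup>8×2</sup> and **L**<sub>*k*</sub> ∈ *R*<sup>2×8</sup>. The usage lines run on a scalar toy system.

```python
import numpy as np

def task_cost_factor(wp=1e4, wv=1e3, wf=1e2):
    """Matrix q_k for the terminal period, so that Q_k = q_k^T q_k."""
    I2, Z2 = np.eye(2), np.zeros((2, 2))
    return np.block([
        [wp * I2, Z2, Z2, -wp * I2],   # position error p - p_target
        [Z2, wv * I2, Z2, Z2],         # velocity
        [Z2, Z2, wf * I2, Z2],         # force
        [Z2, Z2, Z2, Z2],
    ])

def lqr_gains(A, B, Q_list):
    """Backward recursion for the gains L_0 ... L_{N-1}.

    Q_list holds Q_0 ... Q_N, where Q_N is the terminal cost S_N.
    """
    N = len(Q_list) - 1
    S = Q_list[N]
    gains = [None] * N
    for k in range(N - 1, -1, -1):
        # L_k = (B^T S_{k+1} B + I)^{-1} B^T S_{k+1} A
        L = np.linalg.solve(B.T @ S @ B + np.eye(B.shape[1]), B.T @ S @ A)
        # S_k = A^T S_{k+1} (A - B L_k) + Q_k
        S = A.T @ S @ (A - B @ L) + Q_list[k]
        gains[k] = L
    return gains

# Scalar toy system: x_{k+1} = x_k + u_k, terminal cost only, N = 2.
one, zero = np.array([[1.0]]), np.array([[0.0]])
toy_gains = lqr_gains(one, one, [zero, zero, one])

# Terminal-period cost matrix for the 8-dimensional reaching model.
Q_terminal = task_cost_factor().T @ task_cost_factor()
```

In the reaching model, `Q_list` would hold zero matrices for *k* ≤ *N*<sub>*s*</sub> and `Q_terminal` afterwards.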

The cost weights are also used in the mini-max feedback controller design, and tuned to accomplish the tasks for all conditions in the MMFC. The state of the system is estimated from noisy observation using Kalman filtering and is expressed as follows:

$$
\hat{\mathbf{x}}\_{k+1} = \mathbf{A}\hat{\mathbf{x}}\_k + \mathbf{B}\mathbf{u}\_k + \mathbf{K}\_k(\mathbf{y}\_k - \mathbf{C}\hat{\mathbf{x}}\_k),
\tag{21}
$$

where **K**<sub>*k*</sub> ∈ *R*<sup>8×8</sup> is the Kalman gain, i.e., a function of the uncertainty of the estimated state and the measurement noise. We adopted a standard technique to calculate the gain, as follows:

$$\mathbf{K}\_{k} = \mathbf{P}\_{k \mid k-1} \mathbf{C}^{T} (\mathbf{C} \mathbf{P}\_{k \mid k-1} \mathbf{C}^{T} + \boldsymbol{\sigma}\_{\mathbf{y}} \boldsymbol{\sigma}\_{\mathbf{y}}^{T})^{-1},\tag{22}$$

where **P**<sub>*k*|*k*−1</sub> ∈ *R*<sup>8×8</sup> is the predicted accuracy (error covariance) of the state estimation and is given by

$$\begin{aligned} \mathbf{P}\_{k|k-1} &= \mathbf{A} \mathbf{P}\_{k-1|k-1} \mathbf{A}^T + (\mathbf{B} \sigma\_{\mathbf{u}} \mathbf{u}\_k)(\mathbf{B} \sigma\_{\mathbf{u}} \mathbf{u}\_k)^T \\ \text{with } \mathbf{P}\_{k|k} &= (\mathbf{I} - \mathbf{K}\_k \mathbf{C}) \mathbf{P}\_{k|k-1} .\end{aligned}$$

The Kalman gain is computed concurrently at each time step in the simulation, starting with the initial condition **P**<sub>0|0</sub> = 10<sup>−3</sup> × **I**.
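One step of the gain computation in Equation (22), together with the covariance prediction and update, might look as follows in NumPy. The function name `kalman_step`, its argument order, and the scalar toy values are assumptions of this sketch; the linear solve replaces the explicit inverse and relies on the innovation covariance being symmetric.

```python
import numpy as np

def kalman_step(P_prev, A, B, C, sigma_u, sigma_y, u):
    """Return the gain K_k and the updated covariance P_{k|k}.

    P_prev  : P_{k-1|k-1}
    sigma_u : scalar magnitude of the signal-dependent noise
    sigma_y : sensory noise scaling matrix
    u       : motor command u_k
    """
    w = sigma_u * (B @ u)                                # B sigma_u u_k
    P_pred = A @ P_prev @ A.T + np.outer(w, w)           # P_{k|k-1}
    S = C @ P_pred @ C.T + sigma_y @ sigma_y.T           # innovation covariance
    K = np.linalg.solve(S, C @ P_pred).T                 # P_pred C^T S^{-1}
    P_post = (np.eye(P_pred.shape[0]) - K @ C) @ P_pred  # P_{k|k}
    return K, P_post

# Scalar toy example without motor noise (illustrative values only).
one = np.eye(1)
K, P = kalman_step(P_prev=one, A=one, B=one, C=one,
                   sigma_u=0.0, sigma_y=one, u=np.zeros(1))
```

Because the process noise term depends on **u**<sub>*k*</sub>, the gain cannot be precomputed independently of the commands, which is why it is evaluated at each time step during the simulation.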

*Mini-max feedback controller.* To apply the MMFC approach, the uncertainty must be modeled in terms of the familiarity with it. Thus, we represent the familiarity by the matrices **D**<sub>*a*</sub> ∈ *R*<sup>8×8</sup>, **D**<sub>*b*</sub> ∈ *R*<sup>8×8</sup>, **D**<sub>*c*</sub> ∈ *R*<sup>8×8</sup>, and **D**<sub>*y*</sub> ∈ *R*<sup>8×8</sup>, given by **D**<sub>*a*</sub> = κΔ<sub>**A**</sub>, **D**<sub>*b*</sub> = λσ<sub>**u**</sub>**B**, **D**<sub>*c*</sub> = **0**, and **D**<sub>*y*</sub> = λ**σ**<sub>**y**</sub>, where κ and λ are the scaling parameters of familiarity. The parameter κ was varied within the closed interval [0, 1]. When the force field dynamics cannot be predicted, i.e., **D**<sub>*a*</sub> = **0** (κ = 0), the structural uncertainty is not modeled. When the force field dynamics are modeled completely as the structural uncertainty, **D**<sub>*a*</sub> = Δ<sub>**A**</sub> (κ = 1). The controller is then designed to maximize the effect of the dynamics as the worst-case assumption. In addition, **D**<sub>*b*</sub> and **D**<sub>*y*</sub> must be sufficiently large to exceed the maximum value of the noise distribution; hence, the scaling parameter λ is set to λ = 5. This seems sufficient for the disturbances, because the SDN and sensory noise have a standard Gaussian white noise distribution.

Matrices representing the regulated outputs, **E**<sub>*a*</sub> ∈ *R*<sup>8×8</sup>, **E**<sub>*b*</sub> ∈ *R*<sup>8×8</sup>, and **E**<sub>*c*</sub> ∈ *R*<sup>8×8</sup>, were chosen to satisfy the assumptions **G**<sup>*T*</sup>**G** = **I** and **H**<sup>*T*</sup>**G** = **0**, as follows:

$$\mathbf{E}\_a = \begin{cases} \mathbf{q}\_k \ (k > N\_s) \\ \mathbf{I} \ \ (k \le N\_s) \end{cases}, \mathbf{E}\_b = \mathbf{I}, \mathbf{E}\_c = \mathbf{0}. \tag{23}$$

The terminal cost matrix **Q**<sub>*N*</sub> has already been defined in the optimal feedback controller design, and the initial error cost **Q**<sub>0</sub> ∈ *R*<sup>8×8</sup> is defined as **Q**<sub>0</sub> = **P**<sub>0|0</sub>. Finally, the disturbance attenuation level γ is set equal to 10<sup>7</sup> to satisfy Equations (9) and (10).

### **RESULTS**

We performed numerical simulations of point-mass reaching movements in different types of force fields (VF, DF, and rDF) using OFC and MMFC. The simulations were carried out 100 times for each case.

#### **COMPARISON OF TRAJECTORIES**

We compared the trajectories of OFC and MMFC. The end-point distributions were then computed from the lateral distances from the target direction (based on curvature > 0.03 mm<sup>−1</sup>), following a previous report (Osu et al., 2003). The trajectories were almost straight lines for OFC in NF (**Figure 3A**). However, under force field conditions, reaching the target was difficult. In VF, the trajectories curved to the left (**Figure 3B**). In DF and rDF, the trajectories diverged to the left and right in accordance with the directions of the targets (**Figures 3C,D**). When the force field dynamics were modeled internally, i.e., **A** ← **A** + Δ<sub>**A**</sub>, the trajectories in VF came close to the targets with a curve; however, the trajectories in DF and rDF did not reach their targets (**Figures 3E–G**). Under the rDF condition, in particular, some trajectories could not aim toward the target even immediately after the onset of movement. The difference in behavior from DF was caused by cross talk between the coordinates. In DF, deviation along the *x*-axis was independent of the hand position on the *y*-axis, because the off-diagonal components of the feedback gain were zero. In rDF, conversely, the lateral deviation affected the vertical distance between the target and hand positions through the feedback gain, and the task required more motor effort to cover the same distance as in DF because each actuator acted on only the *x*- or *y*-axis.

As with OFC, the trajectories of MMFC were almost straight lines in NF (**Figure 4A**). In VF, although the trajectories curved gradually after the onset of movement, they turned suddenly toward the target, even when the force field dynamics were not completely known (**Figure 4B**). In DF and rDF, even if the trajectories had diverged after the onset of movement, they finally converged to the target (**Figures 4C,D**). These trajectories were similar to those observed in the initial trials of adaptation to the same types of dynamics in human experiments (Osu et al., 2003). In contrast, in VF without structural uncertainty of the force field, i.e., **A** ← **A** + Δ<sub>**A**</sub> and κ = 0, the trajectories curved slightly to the right and reached the target (**Figure 4E**). Under the same condition, the trajectories were straight lines in DF and rDF (**Figures 4F,G**).

The familiarity parameter κ (0 ≤ κ ≤ 1) affects the performance of MMFC directly, because the structural uncertainty of the force fields is not reflected in the motor control when κ = 0. Thus, we evaluated its effect on the trajectories (**Figure 5**). When κ = 0, the trajectories could not reach the targets under any condition, and those in DF and rDF diverged. As κ increased, the trajectories approached the targets under all conditions. The variability of the trajectories, as well as the end-point errors, decreased in DF and rDF. In addition, the quadratic costs, given by Equation (19), decreased to a slightly greater degree than the end-point errors, and the performances saturated at around κ = 0.5 in VF and κ = 0.01 in DF and rDF.

#### **FEEDBACK GAIN GEOMETRIES**

There are mathematical difficulties in incorporating the impedance generated by non-linear muscular properties into a feedback control law. However, several studies have provided evidence that sensorimotor control systems can and do regulate feedback gains for impedance control (Franklin et al., 2007; Krutky et al., 2010; Franklin and Wolpert, 2011). Although the impedance is not actually equal to the feedback gains computed by OFC or MMFC, the gains must contribute to the modulation of impedance. Thus, we computed sensory feedback gains, which transfer sensory feedback errors to the motor command, as products of the state feedback and filter gains. The sensory feedback gains for OFC and MMFC were given by **L**<sub>*k*</sub> · **K**<sub>*k*</sub> and **L**<sub>*k*</sub> · **A**(**Σ**<sub>*k*</sub><sup>−1</sup> + **C**<sup>*T*</sup>**N**<sup>−1</sup>**C** − γ<sup>−2</sup>**Q**<sub>*k*</sub>)<sup>−1</sup>**C**<sup>*T*</sup>**N**<sup>−1</sup>, respectively. We visualized the patterns of the positional gain at the midpoint of the movement time (350 ms) as ellipses, similar to the stiffness ellipses used previously (Burdet et al., 2001; Franklin et al., 2003a, 2007; Ueyama and Miyashita, 2014). The orientation, shape, and size of each ellipse are obtained by singular value decomposition of the positional gain matrix.
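The ellipse construction can be sketched as follows, assuming the 2 × 2 positional block of the sensory feedback gain has already been extracted. The function name `gain_ellipse`, the choice to report the major-axis orientation modulo 180°, and the example matrix are assumptions made for illustration.

```python
import numpy as np

def gain_ellipse(G, n_points=64):
    """Describe a 2x2 positional gain matrix as an ellipse.

    Returns the semi-axis lengths (singular values, largest first),
    the orientation of the major axis in degrees within [0, 180),
    and points on the ellipse (image of the unit circle under G).
    """
    U, s, _ = np.linalg.svd(G)
    # The first left singular vector points along the major axis.
    angle = np.degrees(np.arctan2(U[1, 0], U[0, 0])) % 180.0
    t = np.linspace(0.0, 2.0 * np.pi, n_points)
    points = G @ np.vstack([np.cos(t), np.sin(t)])
    return s, angle, points

# Hypothetical gain that is three times larger along y than along x.
axes, angle, pts = gain_ellipse(np.array([[1.0, 0.0],
                                          [0.0, 3.0]]))
```

The singular values give the size and shape of the ellipse, and the first left singular vector gives its orientation; its sign is arbitrary, which is why the angle is reduced modulo 180°.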

In NF, the gain of OFC was a vertically long ellipse (**Figure 6A**). In VF and DF of OFC, the gains with structural uncertainty were quite similar to the gain in NF. However, the gain in VF without structural uncertainty was rather small, and its orientation varied by ∼4° in the clockwise direction; that of DF decreased in the lateral direction.

In rDF, the gain with structural uncertainty was directed to the target at 30° in a clockwise direction. However, the gain without structural uncertainty was diminished and directed to −60° in a clockwise direction. As mentioned in Section Comparison of Trajectories, the lateral deviation and the target-directed movement influenced each other through the feedback gain. In particular, the *y*-axis movement was more dependent on the *x*-position than on the *y*-position. Thus, the task required complicated cooperative action, and the gain geometry was squashed. In contrast, the gain of MMFC in NF was circular and larger than that of OFC (**Figure 6B**). The gains in VF were also circular, although they were larger than those in NF. In DF and rDF, the gains were tuned by the force field according to the direction of instability, as in the experimental measurements of stiffness (Franklin et al., 2007). In DF, only the lateral axes of the gains were expanded, whereas the anteroposterior axes were the same as those in NF. In rDF, the gains were similar to 30° rotations of those in DF. The gain in VF without structural uncertainty was slightly smaller than that with structural uncertainty. In contrast, the gains in DF and rDF also increased toward the unstable directions, as in the conditions with structural uncertainty.

#### **DISCUSSION**

In this study, MMFC is presented as an extension of OFC for use as a robust control technique. This method uses time-varying feedback control of estimated states, including worst-case disturbances expected from the familiarity with novel dynamics. The uncertainties of dynamics and noise are defined as disturbances in accordance with robust control theory. In previous research, the uncertainties were assumed to have a Gaussian distribution (Bays and Wolpert, 2007; Izawa and Shadmehr, 2008; Crevecoeur et al., 2010); however, it seems unlikely that real-world uncertainties would do so. Accordingly, we modeled the uncertainties of environmental dynamics as structural uncertainties, using the robust control design. The computational method seems adequate, because the central nervous system can minimize the uncertainty of sensory input in two ways: by combining multiple sensory signals with prior knowledge to refine sensory estimates, and by predictive filtering of sensory input to remove less informative components of the signal (Bays and Wolpert, 2007). The simulation results indicated better performance under force field dynamics in terms of robustness and stability, and also reproduced behavioral characteristics. Thus, we consider that MMFC could predict motor behavior in the presence of structural uncertainty and explain the early process of motor adaptation, because it was able to predict the behavior and achieve the task without environmental information. Furthermore, the feedback gain increased in unstable directions, like the stiffness modulation of a multi-joint arm in arm-reaching movements with unstable dynamics. This suggests that the brain modulates stiffness optimally to obtain efficient robustness, overcoming the instabilities of the environmental dynamics. Moreover, a recent study suggested that reflex gains (feedback gains) are modulated by the accumulated evidence in support of an evolving decision before the onset of movement (Selen et al., 2012). This seems to support our theory, in that the feedback gains are determined according to the uncertainty of the movement in the motor planning phase before the onset of movement.

**FIGURE 5 | Effect of the range of structural uncertainty on the trajectories in MMFC.** Green dotted, red dashed, and blue solid lines indicate lower, middle, and higher uncertainties, respectively, given by the familiarity parameter κ. The top row indicates the end-point trajectories. The middle and bottom rows are single logarithmic plots of the terminal end-point error (mean ± *SD*) and the mean quadratic costs and the 95% confidence intervals (CIs). The CIs were estimated using a bootstrapping procedure with re-sampling 10,000 times. **(A–C)** represent VF, DF, and rDF conditions, respectively.

The trajectories in VF differed somewhat between our simulations and experimental measurements (Osu et al., 2003). Our simulations of the OFC and MMFC models could not predict the straight trajectory observed in the human study. This result may give the false impression that a trajectory control strategy that reduces motor effort requires a distinct deviation from the nominal straight line. However, a theoretical framework such as OFC is not necessarily incompatible with trajectory control through a cost function that trades off the discordant requirements of target accuracy, motor effort, and kinematic invariance in an acceleration-dependent force field (Mistry et al., 2013). This approach could be considered an MMFC that represents the deviation from the straight line as a disturbance. During the period of movement (*k* ≤ *N*<sub>*s*</sub>), we defined the regulated output matrix **E**<sub>*a*</sub> as an identity matrix to generalize the MMFC model for motor adaptation problems. However, it was assumed that **E**<sub>*a*</sub> transfers the state vector into a disturbance determined by kinematic constraints, bootstrapping the process of exploration and learning. Such kinematic constraints appear reasonable for improving task performance, particularly in the early phase of motor adaptation. Thus, we carried out extra VF simulations to examine this assumption. We modified the MMFC by replacing **E**<sub>*a*</sub> (*k* ≤ *N*<sub>*s*</sub>) in Equation (23) with 100 · *diag*([1, 0, 1, 0, 0, 0, 0, 0]) as a kinematic constraint penalizing lateral deviations of the position and velocity (Mistry et al., 2013). Unsurprisingly, the modified MMFC resulted in trajectories that were close to linear (**Figure 7A**). Furthermore, the modified MMFC produced trajectories closer to a straight line than those of the other models. These results suggest that kinematic constraints may be applied to determine an MMFC with environmental dynamics to ensure kinematic invariance.

It has been suggested that a cost function should be modulated to increase the ratio of the energy cost, according to the uncertainty of the internal model (Crevecoeur et al., 2010), and standard forms for quantifying cost may not be sufficient to accurately examine whether human motor behavior abides by optimality principles (Berniker et al., 2013). In the model proposed here, the terms expressing the familiarity with the uncertainty are related to the cost values. That is, the cost function is indirectly modulated via the uncertainty of the internal model, which itself may also be reflected in the nervous system's use of impedance control to change the dynamic properties of the body (Burdet et al., 2001; Takahashi et al., 2001; Lametti et al., 2007; Mitrovic et al., 2010). These studies support our proposed model.

#### **LIMITATIONS OF THIS MODEL**

In our simulation results, the changes in the gain in the direction of instability for the DF and rDF in the MMFC model are fairly small compared with the magnitude of the experimental measurements (Franklin et al., 2007). This discrepancy may be caused by differences between actual and modeled muscle dynamics. In this study, arm dynamics was simplified as a linear point-mass model. However, biological arm movement is actually produced by many muscles with non-linear dynamics. Muscle action shapes the limb stiffness geometry depending on task requirements. Our model did not reflect actual muscle dynamics; in particular, passive muscle mechanisms were not considered. When a muscle is relaxed (the activation level is decreased), the active force disappears and the resting length is restored by the passive force (Huxley and Hanson, 1954). Limb stiffness is therefore retained during maintained posture without muscle contraction, and its magnitude is not small compared with the effect of muscle contraction on stiffness (Osu and Gomi, 1999; Shin et al., 2009). Because the passive limb stiffness acts to resist the intended movement, agonist muscles must generate an active force that overcomes the passive force maintaining the posture in order to initiate movement. Thus, actual limb stiffness may be much higher than that in our simulations.

**Figure caption (fragments):** … of the modified MMFC for VF conditions, and the after-effects of each model. **(B)** Gains for OFC. **(C)** Gains for MMFC. In **(B,C)**, each row … position gains, respectively, which are opposites in sign. Note that γ = 108 to satisfy Equations (9) and (10).

The feedback gain magnitude was also small compared with the proprioceptive and visual feedback responses measured in human subjects (Bennett, 1994; Dimitriou et al., 2013). The difference between our simulation and the proprioceptive feedback response (the reflex response) may be attributable to the rigidity of the muscle model, analogous to the magnitude of stiffness modulation. The visual feedback gains measured in humans were intended to correspond not to the sensory feedback gain but to the state feedback gain (Dimitriou et al., 2013). Moreover, the response was computed from a time window of 180–230 ms after perturbation onset, and how the state estimation was updated for feedback latency was not considered. In fact, although the magnitude did depend directly on the cost weights and our model did not separate the visual feedback response from other feedback, our simulations for the OFC and MMFC models showed sufficiently large feedback gains, exceeding the feedback response reported in humans (**Figures 7B,C**). It has been suggested that the feedback gains show different time profiles. The visual feedback gain peaked in the middle of the movement and dropped rapidly at the movement end (Liu and Todorov, 2007; Dimitriou et al., 2013). In contrast, intrinsic feedback gain, measured as stiffness, showed the opposite profile, peaking at the movement onset and end, and dropping in the middle of the movement (Gomi and Kawato, 1997; Ueyama and Miyashita, 2014).

#### **OTHER MODELS**

Although an adaptation algorithm for uncertain dynamics has been proposed (Franklin et al., 2008), it is based on a feedback-error-learning strategy and requires a desired trajectory (Kawato, 1996; Ueyama and Miyashita, 2014). Thus, the adaptation process and motor planning of the desired trajectory must be considered separately and handled as different problems. In contrast, an MMFC can deal with both issues in the same context, as does OFC.

Friston raised the question of differences between internal models in motor control and perceptual inference in OFC, and suggested that active inference, a corollary of the free-energy principle, reduces to simply suppressing proprioceptive prediction errors (Friston, 2011). Moreover, it has been reported that active inference could acquire complex and adaptive behaviors using a free-energy formulation of perception (Friston et al., 2009), and generate movement trajectories shown to be remarkably robust to perturbations on a limb (Friston et al., 2010). In active inference, the cost function is absorbed into prior beliefs about state transitions and terminal states. Thus, active inference seems attractive as a means of recognizing biological optimization mechanisms, because OFC and MMFC have many free parameters (e.g., cost function and terminal time) that intricately affect the behavior. However, the behavior of active inference seems to be influenced by the estimated probability (i.e., prior assumption of noise and uncertainty) as a substitute for the definition of cost function. We consider that active inference and OFC are not mutually exclusive, and that the free-energy principle is just a "principle" that could unify motor control theories, based on the optimization of a cost. Although the free-energy principle has not been derived from empirical evidence, it can predict neurobiological implementation from the perspective of functional anatomy (Friston et al., 2012). For motor control studies, therefore, the free-energy principle seems to be a useful tool to connect the computational level to the hardware level.

Recently, behavioral studies have focused on understanding how uncertainty, or risk, is represented in motor control tasks, as well as in economic behaviors (Trommershäuser et al., 2008). Violations of risk neutrality have been reported in various motor control tasks. For example, subjects exhibited risk-seeking behavior in a pointing task, because they systematically underestimated small probabilities and overestimated large probabilities (Wu et al., 2009). In addition, subjects exhibited risk-averse behavior in a motor task that required them to control a Brownian particle with different levels of noise, which is consistent with the notion of a trade-off between the mean and the variance of movement cost (Nagengast et al., 2010). Moreover, it has also been suggested that risk sensitivity is an important factor in motor tasks with speed/accuracy trade-offs (Nagengast et al., 2011). In contrast, when the uncertainty is assumed to have a Gaussian distribution and an exponential-quadratic error criterion, such as the expected utility function describing risk sensitivity, is used as the cost function, the MMFC problem is identified with the risk-sensitive optimal control problem of optimizing the exponential-quadratic error criterion (Speyer et al., 1992). Furthermore, an equivalence has already been established between a deterministic robust control that achieves a prescribed bound on the H<sup>∞</sup> norm of a given closed-loop transfer function and a stochastic optimal control problem (Glover and Doyle, 1988). It has also been shown that the robust control directly connects to the risk-sensitive control via results on maximizing an entropy integral (i.e., the terminal time *N* → ∞). In addition, when the risk sensitivity parameter is equal to zero (in a risk-neutral case), the risk-sensitive control has been identified as an OFC problem. Although, in contrast to previous studies, the MMFC in this paper is derived as a time-varying controller, it is the same as OFC under two conditions: *N* → ∞ and γ → ∞. Thus, a risk-sensitive OFC seems to be a specific case of MMFC with Gaussian uncertainty. However, when there is uncertainty in the equations of motion themselves (e.g., the dynamics of a power tool such as a drill or a screwdriver are different from those of a can, resulting in strikingly different relationships between states and motor commands), structural uncertainty cannot be represented by a Gaussian distribution, and these different structures must be identified and learned (Orban and Wolpert, 2011). The MMFC proposed in this paper can handle the structural uncertainty. Exploratory risk-taking is, in turn, directly related to uncertainty in decision-making modulation (Doya, 2008), and the decision making itself may directly relate to motor control systems (Selen et al., 2012). Nevertheless, the uncertainty problem may not be completely equivalent to the risk-taking problem, because the problems are distinguishable and could be identified as two independent problems (Bach et al., 2011).
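As a hedged illustration of the risk-neutral limit discussed above, the exponential-quadratic criterion can be written for a generic movement cost $C$ and a risk-sensitivity parameter $\theta$. Both symbols and the normalization are introduced here for illustration only and are not the notation of this paper:

$$
J_\theta = \frac{2}{\theta} \ln \mathbb{E}\left[ \exp\left( \frac{\theta}{2} C \right) \right] = \mathbb{E}[C] + \frac{\theta}{4} \mathrm{Var}[C] + O(\theta^2)
$$

To first order in $\theta$, minimizing $J_\theta$ trades off the mean of the cost against its variance: $\theta > 0$ penalizes variance (risk-averse), $\theta < 0$ rewards it (risk-seeking), and $\theta \to 0$ recovers the expected cost $\mathbb{E}[C]$, i.e., the risk-neutral OFC objective. Under a common normalization of the robust-control correspondence (an assumption here), $\theta$ scales as $\gamma^{-2}$, so that $\gamma \to \infty$ corresponds to $\theta \to 0$.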

#### **LEARNING PROCESS FOR MOTOR ADAPTATION**

Feedback, adaptation, learning, and evolution have been identified as instances of wide-sense adaptation, in which sensory information is integrated and used to change the control signals by various techniques and on various timescales (Karniel, 2011). Adaptive control is the change in the parameters of the control system generated after the observation of previous control and sensory signals, and learning control is a structural change in the control system to generate a new type of behavior. In human studies, when we perform new or uncertain motor tasks, performance has been found to vary in accordance with the learning process (Shadmehr et al., 2010). Smith et al. reported that adaptation exhibited multiple timescales, driven by fast and slow processes (Smith et al., 2006). They suggested that the fast process, which decays quickly, is strongly affected by errors but does not produce motor memory, whereas the slow process, which shows little decay, is weakly affected by errors but produces motor memory. On the other hand, there are different mechanisms for adapting to stable and unstable dynamics (Osu et al., 2003). It has been proposed that adaptation learning is achieved by a combination of impedance control and an inverse dynamics model. In the early phase of learning, the impedance control also contributes to the formation of the inverse dynamics model, and helps to generate the necessary stability (Franklin et al., 2003b). Previous studies have shown that the function of the fast learning process is to increase the robustness of motor control systems, thereby improving their stability, and that the internal model is obtained from multiple trials by impedance control during the slow learning process. We consider that the fast process is provided by instances of feedback and adaptation, whereas the slow process is achieved by adaptation and learning concepts.

Thus, we propose MMFC as a robust control to increase the familiarity of both the uncertainty and the impedance in the adaptation of the fast process to improve the stability and reduce the error. The internal model, if it could improve the stability while achieving the task, would learn the actual dynamics across multiple trials, thereby decreasing the uncertainty in the learning of the slow process. Indeed, it was recently proposed that complex behaviors in unstable dynamics cannot be explained in terms of a global optimization criterion, but rather require the ability to switch between different suboptimal mechanisms (Zenzeri et al., 2014). We have assumed that the difference between the adaptation and learning mechanisms of stable and unstable dynamics requires that the internal model be represented in different forms, depending on the behavioral policies, corresponding to off-policy and on-policy algorithms such as Q-learning and SARSA, respectively (Sutton, 1992). For example, unstable dynamics may require a deterministic behavior with an off-policy algorithm, because the cost (or reward) is assumed to be optimized through multiple trials fixing the policy to achieve the motor task in the unstable dynamics. That is, the estimated costs in any trials are required to converge to a value, similar to the idea of the worst-case design in the MMFC. In contrast, stable dynamics are assumed to require stochastic behavior with the on-policy algorithm, because it seems the best way to access the dynamics according to estimations by each trial. In addition, the off-policy algorithm has been recognized as an alternate strategy named "good-enough" control, in which the organism uses trial-and-error learning to acquire a repertoire of sensorimotor behaviors that are known to be useful, but not necessarily optimal (Loeb, 2012).
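The fast and slow processes attributed above to Smith et al. (2006) can be illustrated with a two-state linear model in which each state retains a fraction of its previous value and learns from the trial error. The sketch below is a minimal illustration and not part of the MMFC model: the function name, the retention and learning-rate values, and the constant perturbation are all assumptions chosen only to show a quickly decaying, error-sensitive state alongside a slowly decaying, weakly error-sensitive one.

```python
def simulate_two_rate(perturbation, a_fast=0.59, b_fast=0.21,
                      a_slow=0.992, b_slow=0.02):
    """Trial-by-trial adaptation as the sum of a fast and a slow state.

    a_* are retention factors and b_* are learning rates. The default
    values are illustrative assumptions, not parameters from this paper.
    Returns a list of (x_fast, x_slow, error) tuples, one per trial.
    """
    x_fast, x_slow = 0.0, 0.0
    history = []
    for p in perturbation:
        # Error on this trial: perturbation minus total adaptation.
        error = p - (x_fast + x_slow)
        history.append((x_fast, x_slow, error))
        # Fast state: learns strongly from error, forgets quickly.
        x_fast = a_fast * x_fast + b_fast * error
        # Slow state: learns weakly from error, forgets slowly.
        x_slow = a_slow * x_slow + b_slow * error
    return history


if __name__ == "__main__":
    trials = simulate_two_rate([1.0] * 200)  # constant unit perturbation
    for n in (1, 10, 50, 199):
        x_f, x_s, err = trials[n]
        print(f"trial {n:3d}: fast={x_f:.3f} slow={x_s:.3f} error={err:.3f}")
```

With these assumed values the fast state dominates the first few trials, whereas the slow state carries most of the adaptation late in training, which mirrors the qualitative division of labor described in the text.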

#### **ACKNOWLEDGMENTS**

A part of this work was supported by JSPS KAKENHI Grant Number 26702023.

#### **REFERENCES**


the environment. *J. Neurosci.* 27, 7705–7716. doi: 10.1523/JNEUROSCI.0968-07.2007


uncertainty. *PLoS Comput. Biol.* 6:e1000857. doi: 10.1371/journal.pcbi.1000857


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 09 July 2014; accepted: 05 September 2014; published online: 24 September 2014.*

*Citation: Ueyama Y (2014) Mini-max feedback control as a computational theory of sensorimotor control in the presence of structural uncertainty. Front. Comput. Neurosci. 8:119. doi: 10.3389/fncom.2014.00119*

*This article was submitted to the journal Frontiers in Computational Neuroscience. Copyright © 2014 Ueyama. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Spinal circuits can accommodate interaction torques during multijoint limb movements

# *Thomas Buhrmann<sup>1</sup> and Ezequiel A. Di Paolo<sup>1,2,3</sup>\**

*<sup>1</sup> Department of Logic and Philosophy of Science, IAS-Research Centre for Life, Mind and Society, UPV/EHU, University of the Basque Country, San Sebastian, Spain*

*<sup>2</sup> Ikerbasque, Basque Foundation for Science, Bilbao, Spain*

*<sup>3</sup> Centre for Computational Neuroscience and Robotics, University of Sussex, Brighton, UK*

#### *Edited by:*

*Ning Lan, Shanghai Jiao Tong University, China*

#### *Reviewed by:*

*Li-Qun Zhang, Northwestern University, USA*

*Patrick E. Crago, Case Western Reserve University, USA*

#### *\*Correspondence:*

*Ezequiel A. Di Paolo, Department of Logic and Philosophy of Science, Euskal Herriko Unibertsitateko - Universidad del Pais Vasco, Avenida de Tolosa 70, 20080 San Sebastián, Spain e-mail: ezequiel.dipaolo@ehu.es*

The dynamic interaction of limb segments during movements that involve multiple joints creates torques in one joint due to motion about another. Evidence shows that such interaction torques are taken into account during the planning or control of movement in humans. Two alternative hypotheses could explain the compensation of these dynamic torques. One involves the use of internal models to centrally compute predicted interaction torques and their explicit compensation through anticipatory adjustment of descending motor commands. The alternative, based on the equilibrium-point hypothesis, claims that descending signals can be simple and related to the desired movement kinematics only, while spinal feedback mechanisms are responsible for the appropriate creation and coordination of dynamic muscle forces. Partial supporting evidence exists in each case. However, until now no model has explicitly shown, in the case of the second hypothesis, whether peripheral feedback is really sufficient on its own for coordinating the motion of several joints while at the same time accommodating intersegmental interaction torques. Here we propose a minimal computational model to examine this question. Using a biomechanics simulation of a two-joint arm controlled by spinal neural circuitry, we show for the first time that it is indeed possible for the neuromusculoskeletal system to transform simple descending control signals into muscle activation patterns that accommodate interaction forces depending on their direction and magnitude. This is achieved without the aid of any central predictive signal. Even though the model makes various simplifications and abstractions compared to the complexities involved in the control of human arm movements, the finding lends plausibility to the hypothesis that some multijoint movements can in principle be controlled even in the absence of internal models of intersegmental dynamics or learned compensatory motor signals.

**Keywords: motor control, interaction torques, intersegmental dynamics, spinal circuits, internal model, intralimb coordination, equilibrium-point hypothesis**

# **1. INTRODUCTION**

Human multijoint reaching movements are characterized by invariants such as straight hand paths and bell-shaped velocity profiles (Morasso, 1981; Soechting and Lacquaniti, 1981; Atkeson and Hollerbach, 1985). These invariants hold independently of the amplitude, speed and direction of movement, and therefore independently also of the resulting variation in interaction torques that arise in one joint due to motion about another. The absence of a signature of these time-varying torques in observed kinematics indicates that intersegmental dynamics are compensated for in the planning or execution of arm movements (Hollerbach and Flash, 1982).

Evidence suggests that this compensation is not achieved by executing movements with high stiffness (Gomi and Kawato, 1996; Gribble et al., 1998), which would allow muscle forces to dominate over the passively emerging loads. Rather, muscle activity varies with the direction of interaction torques, such that muscle forces acting at one joint (e.g., the shoulder) are dependent on the direction of motion about another joint (elbow), even when the former joint performs the same motion or remains stationary (Cooke and Virji-Babul, 1995; Latash et al., 1995; Gribble and Ostry, 1999; Galloway and Koshland, 2002; Debicki and Gribble, 2005).

Two possibilities could explain the origin of this intralimb coordination strategy. In computational approaches to motor control the brain is assumed to calculate the time-course of forces necessary to perform desired movements using internal models of the body (Kawato, 1999). This implies the prediction of interaction torques and their explicit compensation through anticipatory adjustment of descending motor commands. The speed and accuracy of skilled ballistic movements (such as throwing), during which feedback may be too slow to mediate compensatory signals, suggests the need for such a predictive strategy. Empirical evidence in its support is provided, for example, by experiments showing a correlation between corticospinal excitability and upcoming interaction torques (Gritsenko et al., 2011), or by patients with hemiparesis whose deficits in reaching movements are consistent with a failure to account for intersegmental dynamics (Beer et al., 2000).

An alternative strategy, based on peripheral feedback, is offered by proponents of the equilibrium-point (EP) hypothesis (Feldman, 1966; Feldman and Levin, 1995; Gribble et al., 1998), which suggests that movements are controlled by simple kinematic shifts in the equilibrium position of the limb, while the required forces result from muscle dynamics and spinal circuitry. The hypothesis predicts that descending motor commands need not take into account upcoming interaction torques during multijoint movements. This would imply that intersegmental dynamics need to be accommodated implicitly through either the neural coupling of muscles acting on adjacent joints (via intersegmental spinal circuitry), or through the mechanical properties of the musculoskeletal system itself. Viscoelastic properties of muscles have been shown to counteract interaction forces in some cases (Hirashima et al., 2003). However, since both viscoelastic forces as well as active muscle forces depend on the level of muscle activation, and can therefore not be controlled independently, they are a poor choice for the precise counteraction of intersegmental loads. Indeed, subjects perform skilled movements despite these viscoelastic properties, rather than because of them (Hirashima et al., 2007; Debicki et al., 2011). If the EP hypothesis is to maintain the idea of simple descending motor commands that are "ignorant" of intersegmental dynamics, then motion about different joints needs to be coordinated appropriately through intersegmental neural coupling of spinal circuits. The question we investigate here is whether such a peripheral coordination strategy is possible.

Several observations indicate that feedback compensation of limb dynamics is plausible at least in principle. Firstly, interaction forces arising at one joint are strongly related to the muscle forces applied to another (Gribble and Ostry, 1999; Galloway and Koshland, 2002), and such muscle forces are encoded reliably in the population response of Golgi tendon organs (Mileusnic and Loeb, 2009). Secondly, Ib afferent activity carrying these force-related proprioceptive signals results in widespread modulation of motoneurons innervating muscles acting at adjacent joints (Jankowska et al., 1981). The sensitivity of Ib inhibitory interneurons can be adjusted through input from Ia afferents that carry muscle length and velocity feedback, which allows for precise force regulation throughout a wide range of movements (McCrea, 1992). Though McCrea points out that a hypothesis has yet to emerge that explains this widespread distribution of Ib modulation throughout the limb, it is clear that it would be well suited to play a role in coordinating the simultaneous motion of several joints. Thirdly, functionally deafferented patients have been shown to make systematic movement errors indicative of a failure to counteract interaction forces, demonstrating a functional role for proprioception in the compensation of internal loads (Ghez and Sainburg, 1995; Sainburg et al., 1995). In this experiment the question remains of whether proprioceptive feedback acts in long loops through the CNS, or locally through spinal reflexes. But motion-dependent feedback across spinal segments has also been shown to modulate ongoing limb dynamics in the cat (Smith and Zernicke, 1987; Koshland and Smith, 1989).

Since evidence exists for both predictive and feedback compensation of interaction torques, several authors have suggested that both mechanisms could contribute to the compensation either simultaneously or at different times throughout a movement (Sainburg et al., 1999; Gritsenko et al., 2011). One can also hypothesize that the relative contribution of centrally planned compensation is greater in fast and highly skilled movements, while spinal compensation might be significant in everyday movements such as reaching; or that the spinal contribution is greater early on in development, while being gradually replaced with more precise central corrections acquired by adaptive processes in the CNS. But regardless of the question of when or to what extent it may contribute to a class of movements, the ability and effectiveness of peripheral feedback compensation of interaction torques has yet to be demonstrated. There are, for example, no convincing models of how an EP-approach would work for the peripheral accommodation of interaction torques without recourse to centrally planned compensation.

Our objective in this paper is to fill a gap in the modeling literature and demonstrate that an EP-based approach can indeed show accommodation of interaction torques. We introduce a model of planar arm movements based on the EP hypothesis, but extended to include spinal reflex dynamics. We show that it is possible to reproduce empirically observed kinematic invariants across a range of directions and magnitudes of interaction torques. Moreover, the model achieves this by transforming simple descending motor commands, derived only from desired movement kinematics, into muscle activation patterns that vary appropriately with upcoming internal loads, independently of their magnitude or other specifics (direction, speed, amplitude) of the movement. Analysis of the model suggests that for some classes of movements the brain may control the motion of limbs as if intersegmental dynamics were absent, while lower level dynamics achieve the necessary coordination locally. For such movements, the brain would not need to rely on internal models of intersegmental dynamics, nor would it need to learn a different set of compensatory motor signals for each possible movement.

The biomechanical model employed in this simulation study is deliberately simple. This is because we do not aim primarily at elucidating how exactly, i.e., in quantitative detail, humans compensate for interaction torques. Rather, the aim is to show for the first time that in principle such compensation can be implemented solely on the level of body dynamics and peripheral feedback. We note that intersegmental dynamics, and the problem of accounting for them, occur not only in human arm movements. They have to be dealt with in movements performed by any (natural or artificial) multijoint system, independent of how it is actuated (whether, for example, by muscles or motors), as long as movements are executed in a compliant manner. We therefore model a system that structurally is similar enough to certain human arm movements to allow for qualitative comparisons, while abstracting away features that are of little relevance for the question of whether feedback through local spinal circuits is sufficient on its own for the compensation of interaction torques. Specifically, we omit from the model the tendons that connect muscles to the skeleton (which in the case of the upper arm are sufficiently short and inelastic to justify this simplification, see Section 2.2 for further details); and choose to also omit biarticular muscles. The latter choice implies that the results obtained from the model, such as certain muscle activation patterns, cannot necessarily be compared directly with those observed in humans (although we will in fact identify certain similarities). It is certainly the case that biarticular muscles, exactly because they span neighboring joints, may have a special role to play in intersegmental dynamics. However, their omission here allows us to disentangle their potential contribution to interaction torque compensation from contributions due to peripheral feedback mechanisms. 
In fact, we show here that even without such muscles, feedback between local spinal circuits alone is sufficient for such compensation. In this sense, the level of abstraction chosen in our simulations is analogous to other models that are concerned more generally with (1) the problem of interaction torques, such as Hollerbach and Flash (1982), in which the problem is investigated purely on the level of joint actuation, without any reference to the role of muscles; (2) the function of spinal circuitry in movement control, for example Raphael et al. (2010), who investigate a spinal model of comparable complexity for the control of a single wrist joint actuated by four symmetric muscles; or (3) the inference of unknown neural mechanisms or physiological properties underlying movement control, as for example Izquierdo and Beer (2013), in which properties of the neural circuit underlying *C. elegans* klinotaxis are investigated using a minimal neuroanatomical model and a methodology similar to the one employed in the experiments reported here.

While demonstrating the feasibility of spinal compensation of interaction torques in an EP-control framework for one class of movements, we do not claim that this is the mode employed for all types of movements, nor that it is the main role of spinal circuitry. If, when, or to what extent spinal dynamics contribute to the compensation of intersegmental loads is an empirical question that we do not address in the work presented here. The result should also not be counted as an argument against internal models, but rather in favor of the complex and often unintuitive control that can be achieved from the bottom-up.

#### **2. MATERIALS AND METHODS**

In the following sections we describe in detail the biomechanical model of planar arm movements employed in this study; its control via spinal reflex-like neural networks based on known physiology; the integration of spinal dynamics, proprioceptive feedback, and descending commands at α-MNs according to the equilibrium-point hypothesis; and the optimization procedure used to identify model parameters that enable kinematically realistic multijoint movements subject to varying patterns of interaction torques.

#### **2.1. ARM MODEL**

The biomechanical simulation implements the simplest model that allows for the investigation of interaction torques and their compensation, namely a planar arm consisting of two rigid segments connected by hinge joints (see **Figures 1**, **3**). We will use labels such as "shoulder" and "elbow" for the joints (as well as "flexor" and "extensor" for muscles) in analogy to human physiology, even though aspects of the simplified model may vary in details from their human counterparts. To model the dynamics of the planar arm we use here the formulation by Hollerbach and Flash (1982), which derives joint torques from the arm's kinematics, Newton-Euler equations and d'Alembert's principle. The resulting equations of motion explicitly factor in the contribution of external, inertial, Coriolis and centripetal forces:

$$
\begin{aligned}
\eta_2 &= \ddot{\theta}_1 \left( I_2 + \frac{m_2 l_2 l_1}{2} \cos \theta_2 + \frac{m_2 l_2^2}{4} \right) + \ddot{\theta}_2 \left( I_2 + \frac{m_2 l_2^2}{4} \right) \\
&\quad + \frac{m_2 l_2 l_1}{2} \dot{\theta}_1^2 \sin \theta_2 \\
\eta_1 &= \ddot{\theta}_1 \left( I_1 + I_2 + m_2 l_2 l_1 \cos \theta_2 + \frac{m_1 l_1^2 + m_2 l_2^2}{4} + m_2 l_1^2 \right) \\
&\quad + \ddot{\theta}_2 \left( I_2 + \frac{m_2 l_2^2}{4} + \frac{m_2 l_2 l_1}{2} \cos \theta_2 \right) \\
&\quad - \frac{m_2 l_2 l_1}{2} \dot{\theta}_2^2 \sin \theta_2 - m_2 l_2 l_1 \dot{\theta}_2 \dot{\theta}_1 \sin \theta_2
\end{aligned} \tag{1}
$$

Here $\eta$ are torques applied to the joints externally (e.g., by muscles), $\theta$ the joint angles, $I$ the rigid body inertias, $m$ their masses, and $l$ their lengths. Subscripts indicate the segment for parameters describing properties of the rigid bodies (1 = upper arm, 2 = lower arm). In the case of torques, angles and their derivatives, subscripts indicate the joint (1 = shoulder, 2 = elbow). We let $m_1 = 2.25$ kg, $m_2 = 1.3$ kg, $l_1 = 0.33$ m and $l_2 = 0.32$ m, and inertias $I_i = (m_i l_i^2)/12$ (Karniel and Inbar, 1997). For our analysis of the relative contribution of muscle and interaction torques to the net torques observed in a particular movement, we consider interaction torques to consist of the sum of those terms in the above equations that depend on movement in another joint; e.g., terms depending on $\dot{\theta}_1$ or $\ddot{\theta}_1$ in the equation for $\eta_2$.
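Equation (1) and the definition of interaction torques given above can be sketched numerically as follows. This is a minimal illustration rather than the simulation code used in the study: the function and variable names are hypothetical, and counting the mixed velocity term of the shoulder equation as an interaction torque is an assumption based on the stated rule that interaction torques are the terms depending on motion about the other joint.

```python
import math

# Segment parameters quoted in the text (Karniel and Inbar, 1997).
M1, M2 = 2.25, 1.3        # masses (kg): upper arm, lower arm
L1, L2 = 0.33, 0.32       # lengths (m): upper arm, lower arm
I1 = M1 * L1 ** 2 / 12.0  # I_i = m_i * l_i^2 / 12
I2 = M2 * L2 ** 2 / 12.0


def joint_torques(th2, dth1, dth2, ddth1, ddth2):
    """Return (eta1, eta2) of Equation (1) for a given kinematic state."""
    c2, s2 = math.cos(th2), math.sin(th2)
    h = M2 * L2 * L1 / 2.0  # coupling coefficient m2*l2*l1/2
    eta2 = (ddth1 * (I2 + h * c2 + M2 * L2 ** 2 / 4.0)
            + ddth2 * (I2 + M2 * L2 ** 2 / 4.0)
            + h * dth1 ** 2 * s2)
    eta1 = (ddth1 * (I1 + I2 + 2.0 * h * c2
                     + (M1 * L1 ** 2 + M2 * L2 ** 2) / 4.0 + M2 * L1 ** 2)
            + ddth2 * (I2 + M2 * L2 ** 2 / 4.0 + h * c2)
            - h * dth2 ** 2 * s2
            - 2.0 * h * dth2 * dth1 * s2)
    return eta1, eta2


def interaction_torques(th2, dth1, dth2, ddth1, ddth2):
    """Return (int1, int2): terms that depend on motion about the other joint."""
    c2, s2 = math.cos(th2), math.sin(th2)
    h = M2 * L2 * L1 / 2.0
    # Elbow equation: terms in ddth1 and dth1 (shoulder motion).
    int2 = ddth1 * (I2 + h * c2 + M2 * L2 ** 2 / 4.0) + h * dth1 ** 2 * s2
    # Shoulder equation: terms in ddth2 and dth2 (elbow motion).
    # Assumption: the mixed term dth2*dth1 is included because it
    # depends on elbow velocity.
    int1 = (ddth2 * (I2 + M2 * L2 ** 2 / 4.0 + h * c2)
            - h * dth2 ** 2 * s2
            - 2.0 * h * dth2 * dth1 * s2)
    return int1, int2
```

For a movement in which only the shoulder moves, `interaction_torques` returns zero at the shoulder and the full torque of Equation (1) at the elbow, i.e., the torque that must be supplied at the elbow merely to keep it stationary.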

#### **2.2. MUSCLE MODEL**

Each of the two joints is actuated by an antagonistic muscle pair (**Figure 1**). The lumped muscles are described by a Hill-type model that captures the essential non-linear relationships between muscle length, contraction velocity, and force generation (Zajac, 1989). It consists of three components: an active contractile element in parallel with both a passive elastic spring and a viscous damper. The first component describes a muscle's isometric force generating capability $\hat{F}_a$ as a function of its length and is modeled using the quadratic function

$$
\hat{F}\_a(\hat{l}^m) = 1 - \left(\frac{\hat{l}^m - 1}{0.5}\right)^2 \tag{2}
$$

where *l*<sup>*m*</sup> is muscle length and variables decorated with the "hat" symbol (ˆ) are normalized: *l̂*<sup>*m*</sup> = *l*<sup>*m*</sup>/*l*<sub>0</sub><sup>*m*</sup>, *F̂* = *F*/*F*<sub>*max*</sub>, with *l*<sub>0</sub><sup>*m*</sup> the length at which active muscle force reaches its isometric maximum *F*<sub>*max*</sub>. The passive elastic element is described by a quadratic dependence of force *F̂*<sub>*p*</sub> on muscle extension beyond a given threshold *l̂*<sub>*p*</sub><sup>*m*</sup> (Kistemaker et al., 2007):

$$
\hat{F}\_p(\hat{l}^m) = \begin{cases}
k\_p(\hat{l}^m - \hat{l}\_p^m)^2 & \text{if} \quad \hat{l}^m > \hat{l}\_p^m, \\
0 & \text{if} \quad \hat{l}^m \le \hat{l}\_p^m
\end{cases} \tag{3}
$$

where *k*<sub>*p*</sub> is scaled such that *F*<sub>*p*</sub> = 0.5 *F*<sub>*max*</sub> (i.e., *F̂*<sub>*p*</sub> = 0.5) at the muscle length where *F̂*<sub>*a*</sub> drops to 0 (Kistemaker et al., 2007) and *l̂*<sub>*p*</sub><sup>*m*</sup> = 1 (Zajac, 1989). Both components are shown in **Figure 2A**.

**FIGURE 1 | Model of two-joint planar arm actuated by antagonistic muscles under control of spinal interneurons.** Shown are two spinal circuits, one for each pair of antagonistic muscles. Connections are drawn between interneurons regulating muscles acting on the same joint, as well as those coupling adjacent joints (only one direction is shown for simplicity; the structure of Ib connections between segments is symmetric in the model). Ia pathways are shown in red, Ib pathways in orange, and Renshaw cells in gray. Flexor related circuitry is drawn as solid and extensor as dashed lines. Excitatory synapses are displayed as triangles and inhibitory synapses as disks. Those that during optimization can be of either type are drawn as squares. Three types of signals descend from higher centers (blue). These are: the stretch reflex threshold λ<sup>*d*</sup> (implying appropriate coordination of α and γ fusimotor drives, see section on threshold control); a coactivation signal λ<sup>*co*</sup> to the α-MNs, and a GO-signal distributed to all spinal neurons (each receiving this signal via its own weighted connection, not shown here). The topology of the circuits is symmetric, but synaptic strengths can be assigned asymmetrically. The topology is also identical for the two joints, though for clarity some connections of the shoulder joint are omitted in the figure. Muscles wrap around joint capsules of radius *r*<sub>1,2</sub> and insert into arm segments of lengths *l*<sub>1,2</sub>.

Hill's equation of muscle contraction dynamics is given in a form that describes normalized muscle force *F̂*<sub>*v*</sub> = *F*<sub>*v*</sub>/*F*<sub>*max*</sub> as a hyperbolic function of normalized (lengthening) velocity *v̂* = *v*/*v*<sub>*max*</sub>. In the case of contraction (*v* < 0):

$$
\hat{F}\_v(\hat{v}) = \frac{1 + \hat{v}}{1 - \hat{v}/k\_{sh}} \tag{4}
$$

with *k*<sub>*sh*</sub> regulating the curvature of the function (McMahon, 1984). For lengthening muscle (*v* > 0) an analogous but inverted hyperbola is often used, which is parameterized by *k*<sub>*max*</sub>, *k*<sub>*le*</sub>, and *k*<sub>*m*</sub>, which describe respectively the asymptotic value lim<sub>*v*→∞</sub> *F̂*<sub>*v*</sub>, the curvature, and the slope at *v* = 0 as a multiple of the corresponding slope in the case of contraction (Kistemaker et al., 2006). Such a hyperbola is given by

$$\hat{F}\_v(\hat{v}) = \frac{k\_l - k\_{\max}\hat{v}}{k\_l - \hat{v}}, \quad k\_l = \frac{k\_{le}(1 - k\_{\max})}{k\_m(1 + k\_{le})} \tag{5}$$

The resulting function for concentric as well as eccentric contraction is shown in **Figure 2B**.
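The three normalized muscle characteristics of Equations 2 to 5 can be sketched as plain functions. The example below is illustrative only: the numerical values of `k_sh`, `k_max`, `k_le`, and `k_m` are placeholders chosen for the example (the optimized values are in Table 1, which is not reproduced here), and clipping negative forces to zero is an added assumption. The passive gain follows from the scaling rule in the text: Equation 2 reaches zero at a normalized length of 1.5, so with a threshold of 1 the gain is 0.5 / 0.5² = 2.

```python
L_P = 1.0                          # passive threshold length (Zajac, 1989)
L_ZERO = 1.5                       # upper length where Equation 2 reaches zero
K_P = 0.5 / (L_ZERO - L_P) ** 2    # passive gain, so that F_p(1.5) = 0.5


def active_force_length(l_hat):
    """Equation 2, clipped at zero (the clipping is an assumption)."""
    return max(0.0, 1.0 - ((l_hat - 1.0) / 0.5) ** 2)


def passive_force(l_hat):
    """Equation 3: quadratic passive force beyond the threshold length."""
    return K_P * (l_hat - L_P) ** 2 if l_hat > L_P else 0.0


def force_velocity(v_hat, k_sh=0.25, k_max=1.5, k_le=0.25, k_m=2.0):
    """Equations 4 and 5; all four default parameters are placeholder values."""
    if v_hat <= 0.0:
        # Shortening muscle (Equation 4), clipped at zero below v_hat = -1.
        return max(0.0, (1.0 + v_hat) / (1.0 - v_hat / k_sh))
    # Lengthening muscle (Equation 5).
    k_l = k_le * (1.0 - k_max) / (k_m * (1.0 + k_le))
    return (k_l - k_max * v_hat) / (k_l - v_hat)
```

With these placeholder values both branches equal 1 at zero velocity, and the lengthening branch rises toward `k_max` as velocity grows.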

The total force *F* produced by a muscle combines *F̂*<sub>*a*</sub>, *F̂*<sub>*p*</sub>, and *F̂*<sub>*v*</sub>: the active force-length and force-velocity factors scale multiplicatively with activation, and the passive force is added (see **Figure 2C**):

$$F = aF\_{\text{max}}\hat{F}\_a(\hat{l}^m)\hat{F}\_v(\hat{v}) + F\_{\text{max}}\hat{F}\_p(\hat{l}^m) \tag{6}$$

where *a* describes muscle activation dynamics and is implemented as a filter on neural excitation α, interpreted as firing rate in the range [0, 1], with different activation and deactivation rates β<sub>*ac*</sub> and β<sub>*de*</sub> respectively:

$$\dot{a} = \begin{cases} (\alpha - a) / \beta\_{ac} & \text{if} \quad \alpha \ge a, \\ (\alpha - a) / \beta\_{de} & \text{if} \quad \alpha < a \end{cases} \tag{7}$$

maximum activation level *a* = 1. Thick red lines highlight the force-length curve at rest (*v̂* = 0), and the force-velocity curve when the muscle is at its optimal length (*l̂*<sup>*m*</sup> = 1).
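A discrete-time version of the activation filter (Equation 7) and the total muscle force (Equation 6) might look as follows. This is a sketch under stated assumptions: the values of `beta_ac` and `beta_de` are placeholders, not the parameters used in the paper, and the function names are hypothetical.

```python
def activation_step(a, alpha, dt=0.001, beta_ac=0.01, beta_de=0.04):
    """One Euler step of Equation 7.

    beta_ac and beta_de are placeholder values. Because they divide the
    error (alpha - a), a larger beta means a slower response.
    """
    beta = beta_ac if alpha >= a else beta_de
    return a + dt * (alpha - a) / beta


def muscle_force(a, f_a, f_v, f_p, f_max):
    """Equation 6: activation-scaled active force plus passive force."""
    return a * f_max * f_a * f_v + f_max * f_p
```

For example, with these placeholder values a step input α = 1 raises *a* from 0 to 0.1 in one step, whereas removing the input lowers *a* from 1 only to 0.975, which illustrates the asymmetry between activation and deactivation.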

Muscle length *l*<sup>*m*</sup> is calculated from arm kinematics based on a geometric model of muscle paths wrapping around joint capsules as proposed by Houk et al. (2002). Given a muscle's points of origin and insertion (*p*<sub>*o*,*i*</sub>), the path depends only on the radii *r*<sub>1,2</sub> of the spherical joint capsules it may wrap. The same model also determines each muscle's changing moment arm *m*<sub>*a*</sub>, from which we ultimately compute the torque η<sup>*m*</sup> = *m*<sub>*a*</sub>*F* that is applied by a muscle at a joint.

Finally, summing the two individual muscle torques acting on the same joint we arrive at the total external torques η<sub>*i*</sub> (*i* ∈ {1, 2}). These are substituted in Equation 1, which allows us to rearrange for joint accelerations θ̈<sub>*i*</sub> and to integrate the dynamical equation using the Euler method (step size of 0.001 s).

Tendons were omitted from the model. The inclusion of tendons is important in some contexts, such as studies of the physiological mechanism underlying the disambiguation of musculotendon length (see e.g., Kistemaker et al., 2012), which we here take for granted. But the majority of muscles in the human upper arm feature short tendons, with ratios of tendon slack length to muscle fiber length on the order of 1–2 (Zajac, 1989; Garner and Pandy, 2003; Kistemaker et al., 2007), as opposed to long elastic tendons with ratios on the order of 10. Such tendons can store only small amounts of elastic energy and therefore have little effect on overall movement dynamics (Zajac, 1989; Gribble et al., 1998; Murray et al., 2000). At sub-maximal levels of muscle activation, as used here, their effect on static musculotendon properties (e.g., the force-length curve) is small too (Zajac, 1989). On that basis, and since current evidence does not suggest a special role for tendons in the creation or compensation of interaction torques, we consider the omission of (or assumption of an inelastic) tendon to be admissible for the purpose of our investigation.

All muscle parameters, such as maximum isometric forces or muscle excursion, were limited to ranges found in the human upper limb (Zajac, 1989; Lemay and Crago, 1996; Garner and Pandy, 2003), and are summarized in **Table 1**.

#### **2.3. SPINAL MODEL**

We include in our neural model of spinal circuitry only the most well-known afferents, interneurons and connections. The architecture in its basic form is similar to previous models (Bullock and Grossberg, 1991; Lan et al., 2005; McCrea and Rybak, 2008; Raphael et al., 2010) and can be described as an antagonistically organized pattern generator (also see Pierrot-Deseilligny and Burke, 2005, for an overview of the connectivity). In particular, we include in our model the following interneurons and their connections (see **Figure 1**).

The Ia pathway includes monosynaptic excitation of the (homonymous) alpha motor neuron pool (α-MN) through muscle spindles, i.e., the myotatic (stretch) reflex; reciprocal inhibition of the antagonist α-MN via Ia interneurons (IaIn); and reciprocal inhibition between IaIns. IaIn further receive descending connections, which in this model carry signals related to the desired contraction of the corresponding muscle.

On the Ib pathway, inhibitory interneurons (IbIn) mediate autogenic inhibition of the homonymous α-MN via afferents from Golgi tendon organs, and also reciprocally inhibit each other. Optional in our model are the Ib reciprocal excitation of antagonist α-MN (interneurons omitted for simplicity), and Ia projections to IbIn (Jankowska, 1992). In addition, Ib afferents project to spinal circuits regulating adjacent joints and form connections there with IbIn and α-MN.

Modeled feedback from Ib afferents is based on the observation that the ensemble activity of Golgi tendon organs provides an estimate of the total force acting on a muscle over a large range of force production (Crago et al., 1982; Mileusnic and Loeb, 2009). Therefore, modeled Ib afferents here signal the (normalized) muscle force *F̂*. The model also includes recurrent inhibition of homonymous α-MN via Renshaw cells, which reciprocally inhibit each other.

*See Materials and Methods for further detail.*

All interneurons in this circuit are modeled as leaky integrators with logistic transfer function, a model commonly used to describe the average neural firing rate as a function of stimulus (e.g., Dayan and Abbott, 2001, pp. 33, 57):

$$
\tau \dot{y}\_i = -y\_i + \sum\_{j=1}^n w\_{ji} \sigma \left( y\_j + \theta\_j \right) \tag{8}
$$

where *y*<sub>*i*</sub> is the activation of neuron *i*, τ its time constant, *w*<sub>*ji*</sub> the strength of the connection from neuron *j* to *i*, θ a bias term, and σ(*x*) = 1/(1 + *e*<sup>−*kx*</sup>) the logistic activation function with *k* specifying its steepness. All parameters describing this equation are subject to optimization.
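Equation 8 can be integrated with the same Euler scheme used for the arm. The sketch below is a minimal illustration: it covers only the recurrent part of the network (afferent and descending inputs would be added to the summed drive), the per-neuron time constants in `tau` are an assumption, and `network_step` is a hypothetical name.

```python
import math


def logistic(x, k=1.0):
    """Logistic transfer function with steepness k."""
    return 1.0 / (1.0 + math.exp(-k * x))


def network_step(y, w, theta, tau, dt=0.001, k=1.0):
    """One Euler step of Equation 8.

    y     -- list of neuron activations
    w     -- w[j][i] is the connection strength from neuron j to neuron i
    theta -- list of bias terms
    tau   -- list of time constants, one per neuron (an assumption)
    """
    n = len(y)
    out = [logistic(y[j] + theta[j], k) for j in range(n)]
    new_y = []
    for i in range(n):
        drive = sum(w[j][i] * out[j] for j in range(n))
        new_y.append(y[i] + dt * (-y[i] + drive) / tau[i])
    return new_y
```

With constant input, each neuron relaxes toward the weighted sum of its presynaptic outputs at a speed set by its time constant.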

#### **2.4. THRESHOLD CONTROL**

The threshold control formulation of the EP theory, the λ-model, assumes that descending motor signals are integrated at the α-MN membrane with afferent feedback from muscles, such that changes in central commands shift the threshold at which muscles become active (Feldman and Levin, 1995). When a muscle is stretched, the resulting afferent influence will lead to an increase in membrane potential until the muscle reaches a length at which the threshold is exceeded and the motor neuron starts firing. The resulting activation produces muscle shortening and thus tends to move it closer to the threshold length, thereby establishing an equilibrium in spatial coordinates (muscle lengths).

Models of this kind are consistent with empirically observed levels of damping, stiffness, and feedback delays (St-Onge et al., 1997; Gribble et al., 1998; Kistemaker et al., 2006; Pilon and Feldman, 2006), and have successfully been employed to address problems such as motor redundancy (Balasubramaniam and Feldman, 2004), sense of effort (Feldman and Latash, 1982), the relation between kinematics, dynamics and EMG patterns in reaching movements (Feldman et al., 1990; Gribble et al., 1998), load adaptation (Gribble and Ostry, 2000) and anticipatory grip-force modulation (Pilon et al., 2007).

Few studies have addressed the problem of interaction torques in the context of the EP theory. In Flash and Gurevich (1997) the authors propose an EP model that addresses adaptation to loads (such as those arising internally), which requires for each new load the tuning of limb stiffness and the modification of the time-course of the EP shift based on knowledge of the load's force and joint stiffness. Gribble and Ostry (2000) have shown that a simple learning mechanism can make use of only the positional error resulting from unexpected loads to learn for each movement corrections of otherwise simple motor commands. Also, Flanagan et al. (1993) have used an EP model to investigate the nature of control signals underlying two-joint arm movements, but did not systematically vary the direction of movements in a way that would have allowed them to study the effect of interaction torques. Our model differs from these in that it aims to identify a single neuromuscular system accommodating interaction torques independently of their magnitude and direction, i.e., without a different set of compensatory motor signals for each possible movement.

We assume that muscle spindles (together with the complex of static and dynamic γ-MN) can provide information about both muscle length and velocity via type Ia and II afferents. This feedback is used in the model to measure deviations from muscle threshold length λ and acts so as to minimize them (directly via the stretch reflex and indirectly via input to the interneurons). Similar to other models of threshold control (Feldman et al., 1990; Kistemaker et al., 2006), a muscle's α-MN pool activity at time *t* integrates central commands and afferent feedback according to the following equation:

$$\alpha\_t = \left[ k\_p (l\_{t-\delta} - \lambda\_t) + k\_v (\dot{l}\_{t-\delta} - \dot{\lambda}\_t)^{p\_v} + k\_d \dot{l}\_{t-\delta}^{p\_d} + N \right]\_0^1 \tag{9}$$

$$\lambda = \lambda^d - \lambda^{co} \tag{10}$$

Here *l* is muscle length, λ the commanded muscle threshold length, δ a feedback delay, *k*<sub>*p*,*v*,*d*</sub> gain parameters controlling the effect of position error, velocity error and damping respectively, the function [*x*]<sub>0</sub><sup>1</sup> clamps its argument to the interval [0, 1], and *N* summarizes the influences of all spinal interneurons with connections to α-MNs. The reference velocity component has been proposed by McIntyre and Bizzi (1993) to account for fast movements executed with low stiffness, and the exponents *p*<sub>*v*,*d*</sub> allow for modeling of non-linear viscosity effects (Gielen et al., 1984). Both these extensions of the original λ-model are optional, and we will show that they are unnecessary when spinal circuitry is taken into account, but that at least one is needed if the contribution of these circuits is omitted. The duration of the feedback delay is 0.025 s, based on the short-latency EMG response to unloading of human arm muscles (Houk and Rymer, 1981).
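The threshold-control law of Equations 9 and 10 can be sketched as follows. This is an illustrative reading, not the authors' code: applying the exponents to the magnitude while preserving the sign is an assumption (needed for non-integer exponents and negative velocities), the gains are left to the caller, and `DelayLine` is a hypothetical helper that models the 0.025 s feedback delay as 25 steps of 0.001 s.

```python
import math
from collections import deque

DELAY_STEPS = 25  # 0.025 s feedback delay at a step size of 0.001 s


class DelayLine:
    """Returns the value pushed `steps` calls earlier (hypothetical helper)."""

    def __init__(self, steps, initial):
        self.buf = deque([initial] * (steps + 1), maxlen=steps + 1)

    def push(self, value):
        self.buf.append(value)
        return self.buf[0]


def signed_power(x, p):
    """|x|**p with the sign of x (sign handling is an assumption)."""
    return math.copysign(abs(x) ** p, x)


def threshold(lam_d, lam_co):
    """Equation 10: commanded threshold length."""
    return lam_d - lam_co


def alpha_mn(l_del, dl_del, lam, dlam, n_input,
             k_p, k_v=0.0, k_d=0.0, p_v=1.0, p_d=1.0):
    """Equation 9: alpha-MN activity from delayed length and velocity feedback."""
    drive = (k_p * (l_del - lam)
             + k_v * signed_power(dl_del - dlam, p_v)
             + k_d * signed_power(dl_del, p_d)
             + n_input)
    return min(1.0, max(0.0, drive))  # clamp to [0, 1]
```

With `k_v = k_d = 0` and no interneuron input, the law reduces to a clamped stretch reflex: the motor neuron pool is silent while the muscle is shorter than its threshold length and its activity grows linearly with stretch beyond it.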

For multiple muscle systems, such as the antagonistic setup used here, threshold control proposes that the descending signal λ consists of two additive components, one of which shifts the position of the combined equilibrium of the system, while the other specifies a range of muscle coactivation around the equilibrium point (Feldman and Levin, 1995). These two components are denoted as λ<sup>*d*</sup> and λ<sup>*co*</sup> in the above equation. In principle, independence of the coactivation component from the positional component can be achieved through coordinative processes in spinal circuits (Feldman, 1993), or centrally using information about muscle and skeletal morphology (e.g., in the form of a learned mapping). For simplicity, we here let the optimization procedure select the coactivation level λ<sup>*co*</sup>.

For reasons of stability in the reciprocally organized spinal circuits, which in some configurations can be prone to undesired reverberations, all interneurons receive a movement-unspecific, descending, GO signal (Bullock et al., 1998; Raphael et al., 2010), which has the potential to gate the contribution of spinal interneurons to α-MN activity. The GO signal is set to 1 at the beginning of movement and gradually drops to 0 at the end:

$$\begin{aligned} GO(t\_0) &= 1\\ GO(t + \Delta t) &= \begin{cases} GO(t) & \text{if} \quad t \le T\\ 0.95\,GO(t) & \text{if} \quad t > T \end{cases} \end{aligned} \tag{11}$$

where *t*<sub>0</sub> is the beginning and *T* the desired duration of the movement. Because of this time course, the GO signal cannot modulate neural dynamics during the execution of the movement, but can only alter interneuronal contributions after the movement terminates (and thus potentially prevent undesired oscillations when the limb is supposed to be at rest again).
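The time course of Equation 11 is easy to generate step by step. The sketch below assumes that Δ*t* in Equation 11 equals the simulation step of 0.001 s, which the text does not state explicitly; `go_signal` is a hypothetical name.

```python
def go_signal(n_steps, t_end, dt=0.001, decay=0.95):
    """Return the GO signal of Equation 11 sampled at n_steps time steps.

    t_end is the desired movement duration T; dt is assumed to equal the
    Delta-t of Equation 11.
    """
    end_step = round(t_end / dt)   # integer comparison avoids float round-off
    go, trace = 1.0, []
    for step in range(n_steps):
        trace.append(go)
        if step > end_step:        # t > T: geometric decay
            go *= decay
    return trace
```

Under this assumption the signal falls below 1% of its initial value about 90 steps (0.09 s) after the end of the movement, since 0.95<sup>90</sup> ≈ 0.0099.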

#### **2.5. SIMULATED MOVEMENTS AND CONTROL SIGNALS**

We consider two types of arm movements (**Figure 3**): "whipping" movements, where elbow and shoulder joints move in the same direction, and "reaching" movements, where the two joints move in opposite directions. Interaction torques broadly oppose the movement in the case of whipping and assist it in the case of reaching, at least in the initial phase (Gribble and Ostry, 1999; Galloway and Koshland, 2002). We try to identify spinal circuits that can produce smooth hand trajectories in both of these conditions. We therefore simulate four distinct movements. For each of two starting poses *S*<sub>*A*</sub> and *S*<sub>*B*</sub>, specified in shoulder (θ<sub>1</sub>) and elbow (θ<sub>2</sub>) joint angle coordinates as *S*<sub>*A*</sub>: {θ<sub>1</sub> = 60°, θ<sub>2</sub> = 90°} and *S*<sub>*B*</sub>: {θ<sub>1</sub> = 80°, θ<sub>2</sub> = 110°}, we hold desired elbow kinematics constant (a 30° flexion) while changing the direction of shoulder movement (20° flexion or extension). This results in different directions of interaction torques arising at the elbow. A similar regime has been used in experiments with human participants (Gribble and Ostry, 1999). The duration of the desired movements is 0.3 s, which implies moderate average rotational velocities of 100°/s and 66.67°/s for the elbow and shoulder joint respectively. Note that as a result of the non-linear nature of arm kinematics, the four movements all differ in the distance traveled by the hand. These are *W*<sub>*A*</sub> = 0.32, *R*<sub>*A*</sub> = 0.11, *W*<sub>*B*</sub> = 0.28, and *R*<sub>*B*</sub> = 0.13 m, where *W* and *R* denote whipping and reaching movements with subscripts *A* and *B* indicating the starting position.

**and whipping movements (*W*<sub>*A*,*B*</sub>).** Purple arrows indicate the desired hand path. In blue are shown the origins for shoulder (θ<sub>1</sub>) and elbow (θ<sub>2</sub>) joint angles. Red arrows indicate the direction of muscle torque required to initiate the desired motion about one joint, and orange arrows the direction of interaction torques due to motion about the other. Generally, muscle torques applied at one joint result in interaction torques of the opposite direction at the adjacent joint. This leads to interaction torques assisting the motion in the case of reaching movements, and resisting it during whipping movements. Note that all movements involve joint rotations of the same amplitude (±20° for the shoulder and −30° for the elbow), only their direction changes.

The following scheme is used to derive central command signals λ<sup>*d*</sup> for the four movements from their corresponding initial and target postures. Based on the assumption that hand trajectories are controlled in extrinsic space (Morasso, 1981; Flash and Hogan, 1985), the two postures, given in joint coordinates, are translated into hand positions (the tip of the forearm segment in the model) using forward kinematics. From initial and final hand positions we then derive a commanded trajectory, sometimes called the "virtual equilibrium trajectory," which corresponds to the true equilibrium of the system only in the absence of load perturbations. Based on the finding in similar models that the equilibrium-point trajectory might lead the actual movement, i.e., reach the final position before the end of the movement (see e.g., Ghafouri and Feldman, 2001), we allow the commanded trajectory to be shorter. Its duration as a fraction of the desired movement is an optimized parameter. To avoid discontinuities in the control signals, which could induce undesired oscillations, the commanded trajectory changes smoothly between the initial and final hand position. Finally, using inverse kinematics, at each time step the commanded hand position is transformed into commanded muscle threshold lengths λ<sup>*d*</sup>. In other words, each muscle's λ<sup>*d*</sup> for a given commanded hand position is the length of that muscle at the corresponding position. The open-loop component λ<sup>*co*</sup> gradually increases from 0 to its selected level, remains constant throughout the movement, and then gradually relaxes back to 0 (Feldman and Levin, 1995). Example time series of the two control signals are shown in **Figure 4B**.

In summary, the continuous time-varying control signals are fully determined by three parameters, namely the initial and final hand position as well as a coactivation level, and it is assumed that both the virtual equilibrium position of the hand as well as the open-loop component change smoothly. An inverse kinematic mapping is used to translate hand positions into muscle threshold lengths.
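The kinematic part of this scheme can be sketched for a two-link planar arm. Several choices below are assumptions made for illustration and are not specified in the text: the elbow angle is measured relative to the upper arm, the commanded hand path is a straight line, and the smooth time course is a quintic ramp. The final mapping from joint angles to muscle threshold lengths depends on the muscle geometry and is omitted.

```python
import math

L1, L2 = 0.33, 0.32  # segment lengths in m (from the text)


def forward_kinematics(theta1, theta2):
    """Hand position for shoulder angle theta1 and relative elbow angle theta2."""
    x = L1 * math.cos(theta1) + L2 * math.cos(theta1 + theta2)
    y = L1 * math.sin(theta1) + L2 * math.sin(theta1 + theta2)
    return x, y


def inverse_kinematics(x, y):
    """Joint angles for a hand position, with the elbow angle in [0, pi]."""
    c2 = (x * x + y * y - L1**2 - L2**2) / (2 * L1 * L2)
    theta2 = math.acos(max(-1.0, min(1.0, c2)))
    theta1 = math.atan2(y, x) - math.atan2(L2 * math.sin(theta2),
                                           L1 + L2 * math.cos(theta2))
    return theta1, theta2


def smooth_ramp(s):
    """Quintic ramp from 0 to 1 (an assumed smoothing profile)."""
    s = max(0.0, min(1.0, s))
    return 10 * s**3 - 15 * s**4 + 6 * s**5


def commanded_angles(start, target, t, duration, fraction):
    """Commanded joint angles at time t; the shift lasts fraction * duration."""
    x0, y0 = forward_kinematics(*start)
    x1, y1 = forward_kinematics(*target)
    s = smooth_ramp(t / (fraction * duration))
    return inverse_kinematics(x0 + s * (x1 - x0), y0 + s * (y1 - y0))
```

Because `fraction` is smaller than 1, the commanded posture reaches the target before the desired movement ends, which corresponds to the shortened equilibrium shift described above.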

#### **2.6. OPTIMIZATION**

We use a genetic algorithm (GA) to search for parameters of the spinal circuits and muscles such that the combined system produces in all movement conditions smooth hand motion as described by a minimum jerk trajectory. This is done in two separate stages.

#### *2.6.1. Muscle parameters*

First, a set of minimally valid muscle parameters is identified that subsequently remains fixed (22 parameters in total, see **Table 1**). We employ only two criteria in the search for these muscle parameters. Firstly, the parameters have to fall within physiologically plausible ranges (also provided in **Table 1**). Secondly, the resulting musculoskeletal system, when driven with static and sub-maximal activations, has to exhibit stable equilibria at least over the joint range employed in subsequent simulations. The same property holds in biomechanically more realistic simulations and most probably also in humans (Kistemaker et al., 2007). The overall complexity of the musculoskeletal model is kept at a level sufficient for the problem of interaction torques to arise in the first place, and the model is not meant to be a high-fidelity reproduction of the human upper arm complex. For this reason the set of lumped muscles was determined using the minimal criteria mentioned above, rather than an average over empirical measures, which would be inappropriate to map to the setup used here. Note that the search criteria for muscle parameters do not include the minimization of interaction torques or any other related objective. In fact, no online control is used at this stage at all, and spinal networks are completely ignored. This ensures that the identification of the muscle setup is completely independent of the subsequent problem of optimizing spinal feedback control.

#### *2.6.2. Spinal network parameters*

Parameters related to neural circuits and control signals (133 in total, see **Table 1**) are optimized with the goal of producing (1) smooth hand motion, (2) low levels of co-activation and (3) insignificant muscle forces before and after movement. To this end we define an objective function that consists of the three components *f*<sub>*D*</sub>, *f*<sub>*C*</sub>, and *f*<sub>*F*</sub>, which capture each of the previous criteria respectively. The overall performance of the system for a single movement trial, and given a particular set of parameters, is then measured by simply multiplying the three individual components, which we describe next in detail.

As a criterion for smooth hand movements that reproduce empirically observed kinematic invariants we first define an ideal minimum-jerk trajectory (Flash and Hogan, 1985), *r*(*t*), between the hand's initial and final position. The performance criterion *fD* then depends on the mean squared error *D* between actual hand position *p* and reference *r* along the trajectory, at the relative time lag that minimizes the error between the two time series:

$$D = \min\_{-0.1 < d < 0.1} \sum\_{t=0}^{T} \|\vec{p}\_t - \vec{r}\_{t-d}\|^2 \tag{12}$$

$$f\_D = 1 - \sqrt{\frac{D}{T/dt}}\tag{13}$$

where *t* is time (in discrete steps of *dt* = 0.001 *s*) and T the movement duration. The latter includes periods without motion before and after the actual movement (0.1 s and 0.3 s respectively). *D* is the movement error for the best matching time lag *d*, and the final performance measure *fD* scales the error such that its maximum is 1.
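Equations 12 and 13 can be sketched as follows, together with a minimum-jerk reference between two hand positions (the quintic time course of Flash and Hogan, 1985). This is an illustrative implementation: the lag is searched in whole samples, and holding the first and last reference samples when the shifted index leaves the trial is an assumption about edge handling.

```python
import math


def minimum_jerk(p0, p1, t, duration):
    """Minimum-jerk position between points p0 and p1 at time t."""
    s = min(1.0, max(0.0, t / duration))
    blend = 10 * s**3 - 15 * s**4 + 6 * s**5
    return tuple(a + blend * (b - a) for a, b in zip(p0, p1))


def trajectory_score(actual, reference, max_lag):
    """Equations 12 and 13 for lists of (x, y) samples taken every dt.

    max_lag is the largest lag in samples, e.g. 99 for |d| < 0.1 s at
    dt = 0.001 s. Reference end points are held outside the trial
    (an assumption about edge handling).
    """
    n = len(actual)
    best = float("inf")
    for lag in range(-max_lag, max_lag + 1):
        total = 0.0
        for t in range(n):
            k = min(n - 1, max(0, t - lag))
            dx = actual[t][0] - reference[k][0]
            dy = actual[t][1] - reference[k][1]
            total += dx * dx + dy * dy
        best = min(best, total)
    return 1.0 - math.sqrt(best / n)
```

Since the summed squared error is divided by the number of samples before the square root is taken, the score is 1 minus the root-mean-square distance (in meters) at the best-matching lag.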

Though the periods of stationarity implicitly tend to suppress oscillations at the beginning and end of a movement, we found it necessary to further constrain solutions to reduce force production before and after the desired movement. To this end, we introduce a performance measure *f*<sub>*F*</sub> which decreases in proportion to average muscle forces *F̂* greater than 4% of their isometric maximum:

$$
\langle \hat{F} \rangle\_{i,t} = \frac{1}{2} \langle \hat{F}^{i}(t\_0 + t) + \hat{F}^{i}(T - t) \rangle\_{i,t} \tag{14}
$$

$$f\_F = \begin{cases} 0.04/\langle \hat{F} \rangle\_{i,t}, & \text{if} \quad \langle \hat{F} \rangle\_{i,t} > 0.04\\ 1, & \text{if} \quad \langle \hat{F} \rangle\_{i,t} \le 0.04 \end{cases} \tag{15}$$

where the notation ⟨*x*⟩<sub>*i*,*t*</sub> refers to the average of variable *x* measured over all components *i* of the variable and over the duration of time given by *t*. Here, the component ⟨*F̂*⟩<sub>*i*,*t*</sub> measures the normalized muscle force averaged over the first and last 0.1 s of a trial (0 ≤ *t* ≤ 0.1 s) and over all muscles *i* = 1,..., 4 (*t*<sub>0</sub> is the beginning of a trial and *T* its end). For average muscle forces *F̂* greater than 0.04 (4% of maximum isometric force), the corresponding component *f*<sub>*F*</sub> quickly decreases toward 0, and its maximum is 1 if average muscle forces remain below this threshold.
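A direct reading of Equations 14 and 15 is sketched below. The function name and the input layout (one list of normalized force samples per muscle) are assumptions made for the example.

```python
def force_penalty(forces, dt=0.001, window=0.1, limit=0.04):
    """Equations 14 and 15.

    forces -- one list of normalized force samples per muscle, covering
              the whole trial (hypothetical input layout)
    """
    n_win = round(window / dt)
    samples = []
    for trace in forces:
        samples.extend(trace[:n_win])    # first 0.1 s of the trial
        samples.extend(trace[-n_win:])   # last 0.1 s of the trial
    mean_force = sum(samples) / len(samples)
    return 1.0 if mean_force <= limit else limit / mean_force
```

For example, a mean normalized force of 0.08 in the two windows gives a score of 0.5, whereas any mean at or below 0.04 gives 1.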

A final performance constraint *f*<sub>*C*</sub> aims to rule out solutions that exhibit high levels of joint stiffness, which would allow muscle forces to dominate over the effect of interaction torques. It computes for each joint *j* a measure *C*<sub>*j*</sub> that penalizes muscle activation patterns in which coactivation is greater than a given threshold:

$$C\_{j} = 1 - \max\left(0, \max\_{0 \le t \le T} \left(\min\left(\alpha\_{t}^{j\_{e}}, \alpha\_{t}^{j\_{f}}\right)\right) - 0.2\right) \tag{16}$$

$$f\_C = \prod\_j C\_j \tag{17}$$

Here, *C*<sub>*j*</sub> becomes increasingly smaller than 1 when the coactivation of muscles acting at joint *j*, measured by the minimum of flexor and extensor α-MN activations α<sup>*j*<sub>*e*,*f*</sub></sup>, becomes greater than 20% of the maximum at any point throughout the trial. The final performance component *f*<sub>*C*</sub> is then calculated by multiplying the measures *C*<sub>*j*</sub> obtained for different joints *j* to ensure that the constraint is observed by all joints.

Finally, the performance for trial *i* is given by *f*<sup>*i*</sup> = *f*<sub>*D*</sub> · *f*<sub>*F*</sub> · *f*<sub>*C*</sub>, and the overall performance across all trials by the product *f* = ∏<sub>*i*</sub><sup>*N*</sup> *f*<sup>*i*</sup>, with *N* being the number of trials (movements). Individual and composite performance measures are constructed such that their maximum is 1 in the case of optimal performance and 0 in the worst case. Using the product prevents optimization of a subset of movements at the expense of others.
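The coactivation measure and the multiplicative combination of the performance components can be sketched as follows. The function names and argument layout are hypothetical; the logic follows Equations 16 and 17 and the product over trials described above.

```python
import math


def coactivation_penalty(alpha_flexor, alpha_extensor, limit=0.2):
    """Equation 16 for one joint, from two alpha-MN activation time series."""
    peak = max(min(f, e) for f, e in zip(alpha_flexor, alpha_extensor))
    return 1.0 - max(0.0, peak - limit)


def trial_performance(f_d, f_f, joint_activations):
    """Performance of one trial: f_D * f_F * f_C.

    joint_activations -- one (flexor, extensor) pair of time series per joint
    """
    f_c = math.prod(coactivation_penalty(f, e) for f, e in joint_activations)
    return f_d * f_f * f_c


def overall_performance(trial_scores):
    """Product of the per-trial performances."""
    return math.prod(trial_scores)
```

Because the scores are multiplied rather than averaged, a single failed movement (score 0) drives the overall performance to 0 regardless of how well the other movements are performed.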

All parameters are optimized by maximizing performance *f* using a version of the microbial genetic algorithm (Harvey, 2011), with mutation implemented as a random offset vector in the unit hypersphere (vector length chosen from a Gaussian distribution, and direction from a uniform distribution).
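One tournament of a microbial-style genetic algorithm with the described mutation can be sketched as below. This is a generic illustration and not the authors' configuration: the recombination probability, the mutation scale `sigma`, the encoding of genes in [0, 1], and the clipping to that range are all assumptions. A uniformly distributed direction is obtained by normalizing a vector of independent Gaussian samples.

```python
import math
import random


def random_offset(dim, sigma, rng):
    """Offset with uniformly random direction and Gaussian-distributed length."""
    direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(d * d for d in direction)) or 1.0
    length = abs(rng.gauss(0.0, sigma))
    return [length * d / norm for d in direction]


def microbial_step(population, fitness, rng, recombination=0.5, sigma=0.1):
    """One tournament: the loser is overwritten by a mutated mix with the winner.

    Genes are assumed to lie in [0, 1]; recombination and sigma are
    placeholder values.
    """
    a, b = rng.sample(range(len(population)), 2)
    if fitness(population[a]) >= fitness(population[b]):
        winner, loser = a, b
    else:
        winner, loser = b, a
    offset = random_offset(len(population[loser]), sigma, rng)
    population[loser] = [
        min(1.0, max(0.0, (w if rng.random() < recombination else l) + o))
        for w, l, o in zip(population[winner], population[loser], offset)
    ]
    return winner, loser
```

Since the winner of each tournament is left unchanged, the best fitness in the population can never decrease from one step to the next.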

The set of model parameters optimized in this stage consists of synaptic strengths, neural biases and time constants, feedback gains, as well as the duration of the commanded equilibrium shift and the cocontraction signals λ<sup>*co*</sup> (**Table 1**). Only the cocontraction components are optimized on a per-movement basis, i.e., a different set of λ<sup>*co*</sup> is identified for each target position.

Note that the optimization procedure is not meant to be a model of the developmental phase establishing the appropriate connectivity of spinal networks in humans (though potential adaptive processes underlying it are mentioned in the discussion). In particular, we do not claim that the CNS uses a minimum jerk criterion to learn how to perform reaching movements, rather than minimizing, say, energy expenditure or variance in the presence of noise. The sole purpose of the optimization is to identify whether there exists at all a spinal circuit, given the constraints of realistic network topology and the right kind of proprioceptive stimuli, that allows for the feedback compensation of interaction torques.

# **3. RESULTS**

In this section we report on the best neuromuscular system identified in 21 independent runs of the optimization procedure (a list of all fixed and optimized parameters is provided in a supplementary file accompanying this text). This model achieves 98.97% of maximum performance. Both additional performance constraints (coactivation less than 20% and average muscle force less than 4% before and after movement) are completely satisfied, hence the performance level is due solely to trajectory error. **Table 2** summarizes movement errors and kinematic indices for the four movements.

The mean Euclidean distance (MED) between actual and reference trajectory is on average 3.1 mm for whipping movements (which cover an average movement distance of 30 cm), and 1.6 mm for reaching movements (average movement distance of 12 cm). For comparison, when the model is optimized without contribution from spinal interneurons and movements are under sole control of the threshold model as described in equation (9), the MED for *WA* is 25 mm and the average across all four movements 11.8 mm (see below section on the role of IbIn for a more detailed comparison). The results show that the optimization procedure successfully identifies spinal circuits that produce minimum jerk-like hand trajectories.



*MED is the mean Euclidean distance between actual and desired hand paths; v<sup>t</sup><sub>max</sub> the maximum tangential hand velocity; and v<sup>e,s</sup><sub>max</sub> elbow and shoulder maximum rotational velocities.*

**Table 3 | Summary of muscle setup after optimization with static control signals for achieving stable equilibria at all positions required in subsequent feedback control.**


*Origin and insertion points are measured as distances from the joint axis.*

#### **3.1. MORPHOLOGY**

A summary of the lumped muscle setup is shown in **Table 3**. Note that although muscles were constrained to be symmetric, excursion varies between flexors and extensors in the model (because the range of motion is asymmetric). Also, like mono-articular human elbow muscles, extensors have a constant moment arm, while for flexors the moment arm is changing with joint angle. A salient feature of the optimized morphology, specifically muscle insertions and optimal lengths *l*<sub>0</sub>, is the fact that the excursion of all muscles is confined mostly to the ascending leg of the force-length curve, i.e., muscle lengths over the whole joint range are mostly smaller than their respective optimum length (where force production peaks). This can also be observed for the majority of human upper arm muscles (Murray et al., 2000; Garner and Pandy, 2003). Though it is not clear whether this is the reason, it increases the probability that the isometric moment-angle relationship of a joint (given by the sum of the isometric moment-angle curves of all muscles acting at the joint) exhibits a single stable equilibrium only (see e.g., Kistemaker et al., 2007).

The maximum isometric force *F*<sub>*max*</sub> for elbow muscles was constrained to be smaller than that of the shoulder muscles, since the latter need to support and transport a larger mass than the former. Also, in the literature the strongest shoulder muscles (such as the deltoid) are consistently reported to be stronger than elbow muscles (Nijhof and Kouwenhoven, 2000; Garner and Pandy, 2003; Holzbaur et al., 2005). The strength difference in our optimized model is rather large. For the purpose of our investigation, however, this would pose a problem only if optimized muscles were unrealistically strong, and if neural control exploited this strength to overpower the interaction torques at the elbow joint. But this is not the case. As we demonstrate below, the combination of muscle strength and activation levels leads to a realistic range of dynamic torques throughout the movement. For example, interaction torques can reach the same level as muscle torques, and maximum shoulder torques (on the order of 10 Nm) are a multiple of maximum elbow torques (about 2 Nm), similar to measures from human subjects (e.g., Galloway and Koshland, 2002).

Finally, maximum contraction velocities (*v*<sub>*max*</sub> ≈ 11.4 *l*<sub>0</sub>/s) fall within a physiologically plausible range. Zajac (1989), for example, assumes an average of about 10 *l*<sub>0</sub>/s; Ranatunga (1984) measured values between 7 and 13 *l*<sub>0</sub>/s; for the empirically based model used in Kistemaker et al. (2007) *v*<sub>*max*</sub> is not specified numerically, but visual inspection indicates values about or greater than 10 *l*<sub>0</sub>/s.

The described morphology is only one of a wide range discovered by the search procedure. Others were found with similar performance but great variation in selected parameter values, indicating that the task underspecifies the required musculoskeletal properties.

#### **3.2. MOVEMENT KINEMATICS**

To assess whether the optimized model reproduces empirically observed kinematic invariants (Morasso, 1981; Soechting and Lacquaniti, 1981; Atkeson and Hollerbach, 1985; Flash, 1987; Flanagan et al., 1993), in **Figure 4** we show movements performed by the best optimized spinal circuit. Trajectories are approximately straight with slight curvature, feature small hooks at the target positions, and exhibit the characteristic bell-shaped velocity profile. This is true for all four movements, i.e., independent of the direction of interaction torques or starting posture.

In panel B, we plot example joint trajectories for movements *WA* and *RA*. Consistent with earlier findings (Ghafouri and Feldman, 2001), we observe that the commanded trajectory (dashed) is significantly shorter than the movement period. The optimized duration covers 44.6% of the actual movement. Also note that the commanded (and desired) trajectories of individual joints are offset slightly in time, a result of their derivation through inverse kinematics from a hand path planned in extrinsic space.

#### **3.3. MOVEMENT DYNAMICS**

The demonstrated kinematic invariants alone are not sufficient to imply active compensation, or accommodation, of interaction torques by the spinal cord. Although we know that central commands in our model carry no anticipative corrective components—the threshold shift is always of the same monotonic form—we need to show that the spinal cord can transform these identical control signals into muscle activation patterns that differ qualitatively with the direction of interaction torques (Cooke and Virji-Babul, 1995; Latash et al., 1995; Gribble and Ostry, 1999; Galloway and Koshland, 2002; Debicki and Gribble, 2005). **Figure 5** shows torque patterns produced by our model for movements starting from initial posture *SA* (corresponding to kinematics shown in **Figure 4B**).

Firstly, we observe that the interaction torque experienced in one joint (dashed) strongly correlates with the total torque (filled) in the other. Secondly, comparing whipping and reaching movements, we find that elbow torques vary with the direction of shoulder motion even though elbow kinematics are held constant. In whipping movements interaction torques due to shoulder movement initially oppose the movement of the elbow. This effect is significant, as peak interaction torque in the elbow is slightly greater than the muscle torque. Interestingly, in this case the two torque components almost cancel out, which leads to a delay in the onset of elbow motion of exactly the duration necessary to follow the desired hand trajectory. Whipping movement *WB* (not shown) differs in this respect, in that the interaction torque here is slightly smaller than muscle torque, resulting in no such onset delay (again, as the desired hand movement requires). The shoulder, in contrast, is subject to only minimal interaction torques, and its movement is consequently dominated by muscle torques. This was also found to be the case for the second starting posture.

During reaching movements, in comparison, our model shows that interaction torques at the elbow initially assist the motion. They are equal in sign to the muscle torques and on the same order of magnitude. Though this effect is stronger in the elbow, interaction torques also assist shoulder motion, but in this case are significantly smaller than muscle torques. We thus find that both types of movement are characterized by a shoulder-centered pattern, in which initial shoulder motion is generally dominated by muscle torques, while elbow motion is produced with significant contribution from interaction torques.
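The opposing and assisting effects described above can be illustrated with the standard rigid-body equations of a planar two-link arm without gravity, in which the net elbow torque is the sum of muscle and interaction torque. The sketch below is only an illustration: the segment parameters and accelerations are hypothetical values, not those of the optimized model, and the sign convention (positive accelerations in both joints meaning rotation in the same direction) is an assumption.

```python
import math

# Hypothetical forearm parameters (not taken from the model in this paper).
M2 = 1.5   # forearm mass (kg)
L1 = 0.30  # upper arm length (m)
R2 = 0.15  # distance from elbow to forearm center of mass (m)
I2 = 0.02  # forearm moment of inertia about its center of mass (kg m^2)

def elbow_torques(theta2, dtheta1, ddtheta1, ddtheta2):
    """Return (net, interaction, muscle) torque at the elbow.

    theta2 is the elbow angle relative to the upper arm; dtheta1 and
    ddtheta1 are shoulder velocity and acceleration; ddtheta2 is elbow
    acceleration. Gravity is ignored (horizontal plane).
    """
    net = (I2 + M2 * R2 ** 2) * ddtheta2
    coupling = I2 + M2 * (R2 ** 2 + L1 * R2 * math.cos(theta2))
    centripetal = M2 * L1 * R2 * math.sin(theta2) * dtheta1 ** 2
    interaction = -(coupling * ddtheta1 + centripetal)
    muscle = net - interaction  # net = muscle + interaction
    return net, interaction, muscle

elbow_angle = math.radians(60)

# Movement onset (zero velocity), identical elbow acceleration in both cases.
same = elbow_torques(elbow_angle, 0.0, 10.0, 20.0)       # joints accelerate together
opposite = elbow_torques(elbow_angle, 0.0, -10.0, 20.0)  # joints accelerate oppositely

print("same direction:     net=%.3f interaction=%.3f muscle=%.3f" % same)
print("opposite direction: net=%.3f interaction=%.3f muscle=%.3f" % opposite)
```

For identical elbow acceleration, the interaction torque is negative (opposing) when the shoulder accelerates in the same direction and positive (assisting) when it accelerates in the opposite direction, so the required muscle torque is larger in the first case.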

The pattern of opposing and assisting effects of interaction torques also determines the overall muscle effort required. In the opposing case, elbow muscle torques need to be larger than in the assisting case to compensate for interaction torques, even though the kinematics of the motion is essentially the same (a 30° flexion) in both cases. This is particularly salient in elbow torques during reaching. Here the muscles do not contribute at all to the braking forces that terminate the movement, i.e., we observe no forces resulting from antagonist activity. The braking pulse in this case is exclusively due to interaction torques created by shoulder motion.

In short, the movement kinetics in our model confirm that the spinal circuitry successfully transforms simple descending control signals into muscle force patterns that qualitatively differ with the direction of interaction torques in such a manner as to accommodate them.

#### **3.4. NEURAL DYNAMICS**

In **Figure 6** we plot the activity of α-MNs and the individual muscle torques they provoke (taking into account muscle activation dynamics and changing moment arms) for the same movements as shown in **Figures 4**, **5** (*WA* and *RA*). Neural activity exhibits the characteristic bi- or tri-phasic burst patterns observed empirically (Ghez and Gordon, 1987), i.e., we can generally identify an accelerating agonist burst followed by a decelerating antagonist burst, and sometimes a third agonist burst arresting the motion.

Not surprisingly, given the torque patterns described above, elbow muscle activity varies with the direction of motion in the shoulder, despite virtually identical elbow kinematics. Muscle activity is generally greater when both joints move in the same direction, i.e., when interaction torques oppose the movement (whipping). For example, integrating over the corresponding area under the curve, we find that motor neuron activity associated with the first elbow extensor burst (which is the agonist in these movements) is approximately 0.033 for the whipping movement, but three times smaller (0.011) for the reaching movement. The same is true for the antagonist (here the elbow flexor). In the former case the area of its burst is 0.024, and when interaction torques assist the motion it vanishes completely.
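The burst comparison above relies on the area under a sampled activity curve. A minimal sketch of such an integration using the trapezoidal rule is given below; the sample trace is a hypothetical triangular burst, and only the two reported areas (0.033 and 0.011) are taken from the text.

```python
def burst_area(times, activity):
    """Trapezoidal estimate of the area under a sampled activity trace."""
    return sum(
        0.5 * (a0 + a1) * (t1 - t0)
        for t0, t1, a0, a1 in zip(times, times[1:], activity, activity[1:])
    )

# Hypothetical triangular burst: rises to 0.5 and returns to 0 within 0.1 s.
times = [0.00, 0.05, 0.10]
activity = [0.0, 0.5, 0.0]
area = burst_area(times, activity)

# Ratio of the reported agonist burst areas (whipping vs. reaching).
ratio = 0.033 / 0.011
print(area, ratio)
```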

In contrast with some empirical data, we observe in our model no systematic time lag between onsets of activity in agonists acting on different joints; in particular, we find no temporal organization from proximal to distal joints (Karst and Hasan, 1991; Gribble and Ostry, 1999). Even though such a lag seems to be present in movement *RA* (in **Figure 6** compare first α-MN bursts of elbow and shoulder in the reaching condition), this was not the case for all reaching movements.

For reasons of space we do not present a full analysis of the neural dynamics exhibited by the optimized spinal circuitry. We suggest, however, an explanation for the suppression of the elbow antagonist burst during reaching movements. In the optimized spinal circuit, the antagonist burst (and its suppression) can be traced back to two opposing influences on its α-MN pool. First, spindle feedback excites the α-MN in proportion to deviations from desired position and velocity. But secondly, spindle activity also drives IaIn activity, and via connections from IaIn to homonymous IbIn (Jankowska, 1992) also the latter interneurons, which inhibit α-MN activity. A second inhibiting influence in the optimized network originates in IbIn intersegmental connections from muscles acting on the other joint. The size and shape of the antagonist burst therefore is the result of a balance between spindle feedback and IbIn activity, which in turn is modulated by Ia interneurons. In whipping movements, presumably because interaction torques initially oppose the movement, position and velocity errors initially grow relatively large, resulting in spindle feedback sufficient to overcome the aforementioned inhibiting factors. In reaching movements, on the other hand, interaction torques partially "do the work," which leads to smaller deviations from the desired state, and hence spindle feedback that is more easily suppressed by the same inhibiting factors.


#### **3.5. GENERALIZATION**

The model presented above has been optimized for movements of a certain amplitude and speed (in joint space) and in two separate regions of the arm's workspace. Here we briefly address whether it generalizes to other types of movement that it has not been optimized for. We start by testing the model's performance as we shift the two starting postures in joint space by values ranging from −25° to +10° (elbow and shoulder angles are shifted by the same amount), while keeping duration and amplitude fixed at the original values. As **Figure 7A** demonstrates, the optimized controller shows specificity for the area of the workspace encountered during optimization. Performance decreases in both directions from the original postures.

**FIGURE 7 | Model generalization ability.** Performance of the optimized controller as the two starting postures are offset by a given number of degrees in joint space **(A)**; as movements vary in duration **(B)** or amplitude **(C)**; and as duration and amplitude are changed in proportion such as to maintain average velocity **(D)**. Measures on horizontal axes are relative to values used for optimization, which are indicated by vertical lines. Performance drops the more movements differ from the optimized kinematics. In red we plot performance when cocontraction signals are adapted separately for each desired movement.

Similarly, performance drops quickly as we change the duration of the movements to be shorter or longer than the original (panel B), or as the amplitudes are increased or decreased relative to those used during optimization (panel C). If we scale amplitude and duration equally, such that the average velocity remains constant, the drop in performance is less dramatic, but the overall picture is the same (panel D).

In the above tests, none of the system's control signals were re-adjusted for the varying movement kinematics. It cannot reasonably be expected, however, that the resulting movements should be well executed if, for example, the amount of cocontraction (the component λ*co*) is not tuned to the speed demands of the desired movement. It has been shown in human subjects, for example, that movement velocity correlates with muscle cocontraction (Gribble et al., 1998). Also, the balance of open-loop antagonist muscle activity specifies the equilibrium of the system in statics, implying that a wrong selection of this balance (given a specific target) could lead to spinal circuits and musculotendon system driving toward incompatible equilibria. At least these components of the feed-forward command therefore have to be chosen selectively for each particular movement. To test whether this is indeed sufficient to achieve reasonable performance, we first re-optimized all parameters of the original model after adding two new movements to the performance evaluation: the first is identical to *WA*, except that its duration is 0.4 s instead of 0.3 s; the second one is also similar to *WA*, but here both the duration and amplitude are 20% larger, such that the average velocity remains the same. The purpose of these additional evaluations is to avoid optimization of controllers that are overly specialized on the speed and amplitude demands of the four original movements. In a next step we then choose a few movements that the original system performs badly on and re-optimize only the cocontraction signals.

In **Figure 7D** we compare performance over a range of movement amplitudes (but constant average velocity) when the cocontraction signals are re-optimized (red) with those not adapted for each specific movement (black). In the former case performance remains almost constant, dropping no more than approximately 2% (from 98.9 to 97%). The performance of the non-adapted system, in contrast, drops to only 54%. We also tested a few of the other conditions presented in **Figure 7**. For example, the performance when desired joint rotations are reduced by 10% (while keeping the original duration fixed, thus leading to reduced average velocity), is 98.3%, compared to only 8% in the non-adapted system (see panel C). Equally, if the desired duration of the movement is increased by 20% while keeping the amplitude the same, performance of the adapted system is 98.7%, compared to only 67% for the non-adapted system. Performance is improved further if in addition to cocontraction signals we also tune spindle sensitivities to the desired movement, i.e., when we adapt the properties of the gamma pathways such that the strength of position and velocity feedback depends on the desired amplitude or velocity of the movement (data not shown).

To test the model's capacity for producing movements not only of different amplitudes and durations, but also in different directions, in **Figure 8** we show the performance of the spinal circuit when optimized for an increasing number of movement directions. All movements here have a duration of 0.3 s and follow desired center-out hand paths 10 cm in length. Generally, hand paths are essentially straight (panel A, MED averaged over four movements: 1.8 mm). When the number of directions increases (panels B and C), some paths remain almost perfectly so, while others begin to show slight curvature (average MED for six movements: 2.3 mm; for eight movements: 4.2 mm). Almost all paths resemble the variation of curvatures observed in human reaching movements, except for two movements in panel C (toward targets located in the directions of 90° and 270°). Their large curvature and inaccurate termination reflect biomechanical constraints due to our muscle model, as these movements could not be optimized successfully even in isolation (average MED without these movements: 1.8 mm). This is indicative of the limitations of our simplified musculoskeletal system.

movements have an amplitude of 10 cm, a duration of 0.3 s and proceed from the center (black dot) to targets spaced equally along a circle. Hand paths for most movements are approximately straight, except for those in the directions of 90° and 270° (dashed), which reflect a limitation of the biomechanical model.

We also note that all of the optimizations described in this section were performed without the velocity error term in the threshold model, and without the power transformation of the viscosity term. Further tests, which we do not present here in detail, show that a control model without spinal interneurons depends on the presence of at least one of these terms to approach the performance of the spinal circuit.

In summary, when control parameters are tuned to the kinematics of the desired movement (e.g., cocontraction is matched to desired velocity), smooth movements can be performed accommodating interaction torques. This is the case for movements of different amplitude and velocity, in different areas of the workspace, and in different directions.

#### **3.6. THE ROLE OF IBIN ACTIVITY**

As mentioned in the introduction, there is reason to speculate that intersegmental force feedback may provide a mechanism by which the nervous system compensates for interaction torques during multijoint movements. One can in fact conceive of three different roles: force-related feedback could modulate descending commands by acting in long loops through the CNS; coordinate muscles acting on different joints through intersegmental neural connections; or act indirectly through the mechanical coupling of joints and the sensed effect of internal loads in a manner akin to non-neural coordination mechanisms in the stick insect (Schmitz and Stein, 2000; Cruse et al., 2007).

We investigate here the latter two options by performing simulated ablation experiments in which we re-optimize the system's parameters after removal of different parts of the spinal circuit. **Figure 9** shows hand trajectories for the worst of the four movements in each ablation condition, which happens to be *WA* in all cases. Optimizing the system without intrasegmental Ib connections reduces the system's performance to a relatively small degree (compare panel A and B). The curvature of the trajectory becomes more pronounced and the movement is arrested less effectively. Movement error, measured by MED, increases from approximately 3.5 to 6.6 mm (and from 2.37 to 3.14 mm when averaged over all four movements). When Ib interneurons are completely removed from the network (panel C), curvature and endpoint oscillations become even more salient (MED = 12.9 mm; average MED = 6.24 mm). In addition, the time course of the trajectory deviates significantly from the reference. It initially leads the desired movement, but then fails to reach the required velocity and falls behind toward the end. For comparison we show in panel D the result when all interneurons are removed from the network, leaving only the threshold model, i.e., α-MNs and direct proprioceptive feedback. Although the trajectory in space shows no more curvature than seen in the other conditions, its time profile deviates much more strongly from the reference, which is most salient in the velocity profile.

#### **3.7. MODEL SENSITIVITY**

To assess the sensitivity of the model we evaluate its performance over a range of deviations from the solution optimized for the four original movements (*WA*, *WB*, *RA*, and *RB*). The optimization procedure encodes all parameters in a vector with component values in the range [0, 1]. For a given level of deviation μ we add random perturbations chosen uniformly from the range [−μ, μ] to all components, i.e., we select a random vector from within a hypercube of half-width μ (side length 2μ) centered on the optimized parameter vector. For every level of deviation we sample 100 such vectors and measure their mean performance. The result is shown in **Figure 10**.
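The sampling procedure just described can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used for **Figure 10**: `toy_performance` is a hypothetical stand-in for the full simulation-based performance measure, the parameter vector is invented, and clipping perturbed values to [0, 1] is an assumption that the text does not specify.

```python
import random

def perturb(params, mu, rng):
    """Add uniform noise from [-mu, mu] to every component.

    Assumption: perturbed values are clipped to the encoding range [0, 1].
    """
    return [min(1.0, max(0.0, p + rng.uniform(-mu, mu))) for p in params]

def mean_performance(params, mu, evaluate, n_samples=100, seed=0):
    """Mean performance over n_samples random vectors at deviation level mu."""
    rng = random.Random(seed)
    scores = [evaluate(perturb(params, mu, rng)) for _ in range(n_samples)]
    return sum(scores) / n_samples

def toy_performance(params):
    """Hypothetical performance measure peaking at 1.0 when all values are 0.5."""
    return 1.0 - sum((p - 0.5) ** 2 for p in params) / len(params)

optimized = [0.5] * 10  # hypothetical optimized parameter vector
curve = {
    mu: mean_performance(optimized, mu, toy_performance)
    for mu in (0.0, 0.1, 0.2, 0.3)
}
for mu, score in curve.items():
    print(mu, round(score, 4))
```

A gradual decline of mean performance with increasing μ, as in this toy example, corresponds to a smooth error surface around the optimized solution.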

It can be seen that performance drops gradually as the amount of deviation increases, indicating that the procedure has not found a "needle in a haystack." The relative smoothness of the error surface suggests that other optimization methods, for example those based on trial-and-error or gradient descent, should also be able to find good solutions. Gradient-based learning using trajectory errors has in fact been demonstrated in a similar model of movement control using spinal-like neural networks (Raphael et al., 2010; Tsianos et al., 2011). On the other hand, performance starts decreasing immediately. The absence of a plateau around the optimal set of parameters suggests that there is not a great variety of different models leading to the same performance. However, since we are only probing (an increasing) neighborhood of one optimal solution, we cannot rule out that other solutions with similar performance exist in other regions of parameter space (in which case local optima might in fact hinder the performance of gradient-based algorithms).

# **4. DISCUSSION**

The dominating view in motor control today suggests that the CNS controls the body using intricate internal models of its kinematics and dynamics in order to predict and directly control the muscle forces required to perform a desired movement. An alternative view, expressed in the equilibrium point hypothesis, proposes that the combined dynamics of spinal circuitry and musculoskeletal system provide a level of abstraction in the control hierarchy that allows the CNS to plan and control movements without requiring a representation of complex bodily dynamics. In this paper we provide evidence that this is plausible. Instead of anticipating upcoming interaction torques and adjusting central control signals accordingly, our model suggests that the CNS may in some cases—such as the type of reaching movements considered here—be ignorant of musculoskeletal dynamics and offload the coordination of muscle forces required for a particular, kinematically defined, movement goal to circuitry at the spinal level. We do not rule out that prediction or implicit anticipatory mechanisms might be involved in other cases, such as faster or more complex movements.

Several studies (e.g., Almeida et al., 1995; Gottlieb et al., 1996; Gribble and Ostry, 1999) have found that during human limb movements EMG activity in muscles acting on one joint correlates with interaction torques arising from motion in another, and is often timed such that it precedes the onset of movement. These findings have been taken to imply that central motor commands are adjusted predictively to compensate for interaction torques. But it need not be true that any such adjustment takes place on the level of central control signals, nor that any form of prediction is involved. Firstly, it is clear that EMG activity has to vary systematically with upcoming interaction torques. If it did not, we would not observe hand paths that are approximately straight. Also, some muscle activity has to precede the movement, as it is necessary to initiate it. Furthermore, since we do not yet have sufficient knowledge about the precise nature of descending control signals, which only after integration with afferent and interneuronal signals result in observed EMG activity, it is impossible to conclusively deduce from current empirical data whether central commands are already adjusted for interaction torques, or whether they are transformed at a lower level to this effect. The model presented here demonstrates that accommodation of intersegmental loads on the spinal level is possible. Also, the fact that the temporal order of muscle activity, including prior to movement onset, seems to be relatively fixed and organized such that agonists in proximal joints precede distal ones, may reflect a more general organization of the control hierarchy, rather than specific and detailed predictions of upcoming dynamics.

The muscle and reflex model developed here involves significant simplifications when compared with the problem of controlling multijoint arm movements in the human body. For example, it does not take into account the effect of biarticular muscles, tendons or gravity. Spinal interneurons are modeled as simple leaky integrators, and fusimotor drives are represented only implicitly through the λ-model. Also, the Hill-type muscle model is employed here in its simplest form, ignoring, for example, the effect of calcium sensitivity on the force-length characteristic (Kistemaker et al., 2007), or the dependence of maximum shortening velocity on activation level (Chow and Darling, 1999). While the inclusion of tendons may be important, for example, when studying the physiological mechanism underlying the estimation of joint position (Kistemaker et al., 2012), we have argued in Section 2.2 that its effect on upper arm dynamics is limited. And if biarticular muscles are particularly well suited for internal load compensation (Gritsenko et al., 2011), then their inclusion in a model such as the one presented here should be expected to make the problem of feedback compensation easier. We also note that the problem of interaction torques in multi-segment limbs is in principle independent of how the joints are actuated, as long as the mechanism of actuation is not infinitely stiff (the problem hence also arises, for example, in any type of compliantly actuated robots). We therefore believe that none of the simplifications introduced in the model interfere with the basic goal of this study, which is to demonstrate that a single spinal-like neural network can transform simple descending control signals into muscle activation patterns that differ qualitatively with the direction and magnitude of interaction torques in a manner that is appropriate for the generation of smooth and straight hand trajectories.
Moreover, the model achieves this with muscle and interaction torque patterns comparable to those observed empirically, and without the need for inverse dynamics calculations, prediction of upcoming loads, or having to learn adjustments to control signals for each individual movement. Nevertheless, we believe that further work aimed at testing the proposed mechanism in more realistic models is required to show how and whether the control scheme illustrated here is in fact employed in the case of human reaching movements and to generate detailed predictions that can be tested empirically and that go beyond the demonstration of a functional role for proprioception in intersegmental coordination (Ghez and Sainburg, 1995; Sainburg et al., 1995).
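To illustrate what modeling an interneuron as a simple leaky integrator amounts to, the sketch below integrates the generic first-order equation τ dy/dt = −y + u with the forward Euler method. It is an assumption-laden illustration: the time constant, step size, and step input are hypothetical, and any output nonlinearity or synaptic weighting used in the actual model is omitted.

```python
import math

def simulate_leaky_integrator(inputs, tau, dt, y0=0.0):
    """Forward-Euler integration of tau * dy/dt = -y + u.

    inputs is a sequence of input samples u, one per time step of length dt.
    Returns the state y after each step.
    """
    y = y0
    trace = []
    for u in inputs:
        y += dt / tau * (-y + u)
        trace.append(y)
    return trace

tau = 0.02    # hypothetical time constant (s)
dt = 0.0001   # integration step (s)
n_steps = 2000  # 0.2 s of simulated time, i.e., ten time constants

# Response to a constant (step) input of 1.0.
trace = simulate_leaky_integrator([1.0] * n_steps, tau, dt)

# After one time constant the state has covered about 63% of the step.
print(round(trace[199], 3), round(1.0 - math.exp(-1.0), 3))
```

The state follows its input with a lag set by τ, so such a unit low-pass filters the afferent and descending signals it receives.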

Despite the simplifications just mentioned, the model remains considerably complex. This is mostly due to the need to represent spinal circuits explicitly in order to do justice to the hypothesis being investigated. Though the modeled circuits are still far simpler than real spinal circuits, we do not claim to have found the simplest or most efficient model capable of interaction torque compensation, which was not our objective. In any case, the work presented here also demonstrates that the chosen methodology is indeed practical for asking questions about the potential functionality of spinal circuits, despite their complexity.

In addition to demonstrating the feasibility of feedback compensation for interaction torques in general, our model reproduces several features of reaching movements performed by humans. As demonstrated in Section 3.2, the hand trajectories generated are approximately straight and exhibit bell-shaped velocity profiles (Morasso, 1981; Flash, 1987; Flanagan et al., 1993). This is not surprising, since the movements were optimized to match minimum-jerk profiles (Flash and Hogan, 1985). Exceptions from this mathematical ideal, such as uni- or bimodal curvature and hooks at the endpoints, are also in correspondence with empirical data (Flash and Hogan, 1985).
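For reference, the minimum-jerk profile of Flash and Hogan (1985) for a point-to-point movement has a closed form, sketched below for a single coordinate. The amplitude (10 cm) and duration (0.3 s) match the center-out movements of Section 3.5; the function name and the one-dimensional treatment are illustrative choices rather than part of the model.

```python
def minimum_jerk(x0, xf, duration, t):
    """Position and velocity of a minimum-jerk movement from x0 to xf.

    Uses the fifth-order polynomial x(t) = x0 + (xf - x0) *
    (10 s^3 - 15 s^4 + 6 s^5), with normalized time s = t / duration.
    """
    s = min(max(t / duration, 0.0), 1.0)
    position = x0 + (xf - x0) * (10 * s ** 3 - 15 * s ** 4 + 6 * s ** 5)
    velocity = (xf - x0) / duration * (30 * s ** 2 - 60 * s ** 3 + 30 * s ** 4)
    return position, velocity

amplitude = 0.10  # m
duration = 0.3    # s

# Sample the bell-shaped velocity profile at 31 equally spaced instants.
profile = [minimum_jerk(0.0, amplitude, duration, duration * i / 30)[1]
           for i in range(31)]
peak_velocity = max(profile)
print(round(peak_velocity, 3))  # peak equals 1.875 * amplitude / duration
```

The velocity profile is symmetric about the movement midpoint and vanishes at both ends, which is the bell shape against which the simulated hand velocities are compared.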

Regarding muscle and interaction torque patterns, Gribble and Ostry (1999) report that muscle torque applied at the elbow varies with the direction of shoulder motion across movements in which elbow kinematics are held constant (and, conversely, shoulder muscle activity varies with the direction of elbow motion when shoulder kinematics are constant). These dependencies are found even when active movement in only one joint is required (but the other is free to move) or when one of the joints is fixed (Debicki and Gribble, 2005). The direction of this covariation depends on whether the two joints move in the same or opposite direction. When shoulder and elbow move in the same direction, interaction torques arise in each joint that initially oppose that joint's intended motion and muscle torques in each joint increase with the emerging interaction torque. In contrast, when the joints move in opposite directions, interaction torques initially assist the desired movement, and muscle torques decrease proportionally. As we have shown in Section 3.3, our model behaves in the same way. Simulated elbow torques vary with shoulder kinematics, and overall effort depends on whether interaction torques are assistive or opposing. As reported in Section 3.4, this effect can be strong enough to completely suppress the normal antagonist burst in the elbow. In this case it is the interaction torque at the elbow alone that arrests the motion. Similar effects can be observed in human subjects, where in some cases assisting interaction torques reach levels such that an initial counteracting antagonist burst is required, instead of the more usual movement-initiating agonist activity (Cooke and Virji-Babul, 1995). Our model also exhibits a shoulder-centered pattern, where elbow dynamics are the result of approximately equal muscle and interaction torque contributions, while in the shoulder muscle torques are generally greater than the passively arising loads (at least initially).
This is consistent with some empirical data (e.g., Galloway and Koshland, 2002), although it has not been reported by Gribble and Ostry (1999).

With regard to the timing of muscle activity, we find no systematic temporal organization of agonist onsets from proximal to distal joints (Karst and Hasan, 1991; Gribble and Ostry, 1999). Investigation of a larger range of movements would be necessary to identify whether the model does or could exhibit such a strategy. We do, however, observe instances of onset delays in joint rotations (Karst and Hasan, 1991; Virji-Babul and Cooke, 1995), which result from the interaction of muscle forces and internal loads, and from the planning of movements in extrinsic space.

The optimized spinal circuits produce desired hand trajectories over a range of different amplitudes, speeds and directions (Section 3.5), with the exception of movements along a single direction, which was shown to be a limitation of the biomechanical model rather than its control. Further work would be required to determine whether the addition of biarticular muscles, for example, would broaden the operating range of the system, or whether the optimization procedure simply failed to identify a more capable biomechanical setup given the model's constraints. For the model to achieve good performance in all movement conditions, we have shown that some control parameters (such as coactivation level, or muscle spindle sensitivity) need to be selected on the basis of the desired movement speed and amplitude. But crucially, no details about the dynamics of the movement need to be known. The control signals are always simple—a monotonic shift in muscle reflex threshold and a constant level of coactivation throughout the movement—and do not depend on anticipated interaction torques or other aspects of the system's dynamics. The results suggest some further questions, however. For example, what are the *minimal* changes in control signals that allow for the control of movement speed or amplitude? Can the dynamics of the spinal circuits be adjusted using non-specific central control signals according to a simple scaling rule? Or does the CNS have to learn an explicit mapping between desired kinematics and control parameters?

We have speculated in the introduction that it might be intersegmental force feedback carried by Ib afferents that provides the mechanism by which the nervous system compensates for interaction torques. Such signals reliably encode muscle force (Mileusnic and Loeb, 2009), can be adjusted in sensitivity through interaction with Ia afferents (McCrea, 1992), and result in widespread modulation of motor neurons innervating muscles acting at adjacent joints (Jankowska et al., 1981). Moreover, functionally deafferented patients in some cases make systematic movement errors indicative of a failure to counteract interaction forces, which demonstrates a functional role for proprioception in the compensation of internal loads (Ghez and Sainburg, 1995; Sainburg et al., 1995). And motion-dependent feedback across spinal segments has been shown to modulate ongoing limb dynamics in the cat (Smith and Zernicke, 1987; Koshland and Smith, 1989). Our model only allows us to confirm that force feedback can indeed play a role in the compensation of internal loads, though we have not identified precisely what that role is. It is clear from our results that without the contribution of Ib afferents the performance of the model is greatly diminished. But both intra- as well as intersegmental effects of Ib afferents seem to contribute to the appropriate modulation of spinal neurodynamics. Further work is required to separate and study these two effects in more detail.

Our model does not address the question of how spinal circuitry might come to be organized in the manner presented here. The evolutionary optimization procedure does not serve as a model of how the appropriate connectivity could be learned. Nevertheless, it is known that neural circuits are subject to activity-dependent plasticity both in the developing as well as the mature spinal cord (Changeux and Danchin, 1976; Nelson et al., 1990; Lo and Poo, 1991; Wolpaw and Carp, 1993; Schouenborg, 2003; Wolpaw, 2010; Tahayori and Koceja, 2012). Moreover, the reciprocal structure typical of spinal circuits can arise through self-organization enabled by simple Hebbian-like learning rules in initially undifferentiated networks undergoing spontaneous activity (van Heijst et al., 1998; Petersson et al., 2003; Marques et al., 2013). Together with models demonstrating the feasibility of trial-and-error learning (Raphael et al., 2010) this evidence suggests that acquisition of appropriately tuned neural circuits in the spinal cord is possible.

Further work is needed to investigate the exact role of the different feedback modalities (position, velocity, and force) and interneurons in the accommodation of interaction torques in model spinal circuits. Given that the organization of the human spinal cord in reality is significantly more complicated than modeled here, and our substantial yet incomplete knowledge regarding its structure and functionality, an interesting avenue to explore would be to study the complete ensemble of spinal models that fill in unknown data and conform with behavioral and neurophysiological data. Such a methodology has been applied, for example, to study the mechanism underlying klinotaxis in *C. elegans* and to propose experiments that can distinguish between different hypotheses regarding its neural implementation (Izquierdo and Beer, 2013).

In conclusion, the work presented here demonstrates the feasibility of equilibrium point control for multijoint reaching movements subject to varying intersegmental loads. The model shows that internal models and predictive compensation of such loads are not required for the range of movements studied here. But it does not allow us to refute the possibility that such mechanisms are indeed used by the CNS for this or other purposes. The model also indicates that EP style control of reaching movements is dependent on a mapping of desired movement kinematics to control parameters and on the appropriate self-organization of spinal circuitry. We do not propose that it is the sole, or even most central, function of spinal circuitry to implement equilibrium point control, or to coordinate different muscles for interaction torque accommodation. Indeed, while other models not incorporating spinal interneurons might also be able to accommodate interaction torques to some extent, reflex circuitry in the spinal cord in conjunction with central modulation may allow for greater flexibility in the execution of a given class of movements (such as adaptation to varying energy, speed and accuracy trade-offs), and may underlie the ability to perform different classes of mechanical action (such as control of position, force or stiffness) as and when needed.

# **FUNDING**

This work is funded by the project "eSMCs: Extending Sensorimotor Contingencies to Cognition," FP7-ICT-2009-6 no: 270212.

#### **ACKNOWLEDGMENT**

The authors are grateful to Marieke Rohde, Hugo Gravato Marques and Mike Beaton, as well as the reviewers of this manuscript, for their comments and recommendations on an earlier version of this article.

# **SUPPLEMENTARY MATERIAL**

The Supplementary Material for this article can be found online at: http://www.frontiersin.org/journal/10.3389/fncom.2014.00144/abstract

## **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 30 May 2014; accepted: 23 October 2014; published online: 11 November 2014.*

*Citation: Buhrmann T and Di Paolo EA (2014) Spinal circuits can accommodate interaction torques during multijoint limb movements. Front. Comput. Neurosci. 8:144. doi: 10.3389/fncom.2014.00144*

*This article was submitted to the journal Frontiers in Computational Neuroscience.*

*Copyright © 2014 Buhrmann and Di Paolo. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Emulated muscle spindle and spiking afferents validates VLSI neuromorphic hardware as a testbed for sensorimotor function and disease

# *Chuanxin M. Niu1, Sirish K. Nandyala2 and Terence D. Sanger 2,3,4\**

*<sup>1</sup> Department of Rehabilitation, Ruijin Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China*

*<sup>2</sup> Department of Biomedical Engineering, University of Southern California, Los Angeles, CA, USA*

*<sup>3</sup> Biokinesiology, University of Southern California, Los Angeles, CA, USA*

*<sup>4</sup> Neurology, University of Southern California, Los Angeles, CA, USA*

#### *Edited by:*

*Misha Tsodyks, Weizmann Institute of Science, Israel*

*Reviewed by: Eric Jon Perreault, Northwestern University, USA Richard Nichols, Georgia Tech, USA*

#### *\*Correspondence:*

*Terence D. Sanger, Sanger Lab, Department of Biomedical Engineering, University of Southern California, 1042 Downey Way, Los Angeles, CA 90089, USA e-mail: tsanger@usc.edu*

The lack of multi-scale empirical measurements (e.g., recording simultaneously from neurons, muscles, whole body, etc.) complicates understanding of sensorimotor function in humans. This is particularly true for the understanding of development during childhood, which requires evaluation of measurements over many years. We have developed a synthetic platform for emulating multi-scale activity of the vertebrate sensorimotor system. Our design benefits from Very Large Scale Integrated-circuit (VLSI) technology to provide considerable scalability and high-speed, as much as 365× faster than real-time. An essential component of our design is the proprioceptive sensor, or muscle spindle. Here we demonstrate an accurate and extremely fast emulation of a muscle spindle and its spiking afferents, which are computationally expensive but fundamental for reflex functions. We implemented a well-known rate-based model of the spindle (Mileusnic et al., 2006) and a simplified spiking sensory neuron model using the Izhikevich approximation to the Hodgkin–Huxley model. The resulting behavior of our afferent sensory system is qualitatively compatible with classic cat soleus recording (Crowe and Matthews, 1964b; Matthews, 1964, 1972). Our results suggest that this simplified structure of the spindle and afferent neuron is sufficient to produce physiologically-realistic behavior. The VLSI technology allows us to accelerate this behavior beyond 365× real-time. Our goal is to use this testbed for predicting years of disease progression with only a few days of emulation. This is the first hardware emulation of the spindle afferent system, and it may have application not only for emulation of human health and disease, but also for the construction of compliant neuromorphic robotic systems.

**Keywords: spindle, motor control, afferents, emulation, neuromorphic**

# **INTRODUCTION**

The multi-scale nature of the human nervous system makes it difficult to measure all relevant information about sensorimotor function. Moreover, this lack of multi-scale information also prevents us from understanding the mechanism of neurological diseases, whose expression depends on networks and interconnectivity of neurons. In the case of movement disorders in childhood, the ultimate clinical effect of cellular injury may take years or even decades to fully emerge (Sanger, 2003) due to a complex interplay among the child's injury, development, and experience. Therefore, the lack of detailed and complete sensorimotor measurements, both across scales and over long timespans, makes it extremely difficult to quantify the mechanism of disease or to predict the efficacy of long-term treatments.

Here, we attempt to understand sensorimotor functions using a synthetic approach. This requires creating sensorimotor models that are biologically realistic but emulated on nonbiological hardware. The history of synthetic approaches in neuroscience dates back to the 1940s, when scientists started creating artificial neurons and neural networks using electronic circuits (McCulloch and Pitts, 1943). Models of neuron dynamics (Hodgkin and Huxley, 1952; Rosenblatt, 1958) soon emerged to be simulated on digital computers. Since the 1980s, special-purpose hardware with Very Large Scale Integrated-circuit (VLSI) technology started to benefit from some of the key insights in neural computation, including asynchrony among neurons, spike representation of information, and self-improving mechanisms such as plasticity (Mead, 1989; Serrano-Gotarredona et al., 2009; Indiveri et al., 2011; Neftci et al., 2014). This category of designs, termed "neuromorphic" hardware, has been successful in understanding mechanisms of memory (Chicca et al., 2003), visual representation (Lichtsteiner et al., 2008), and recently cognitive function (Eliasmith et al., 2012). For sensorimotor function, the synthetic approach should describe not only neurons, but also the physiological environment (muscles, proprioceptors, peripheral nerves, skeletal system) that the neurons interact with. If the goal is to predict long-term changes of sensorimotor function, the emulation speed must exceed real-time. These requirements pose a major challenge when designing the hardware, as the running speed may decrease significantly with model complexity. Our approach for sensorimotor modeling must therefore accommodate enough detail for multi-scale simulation, but it must also maintain high speed when scaling to larger network sizes.

The Field Programmable Gate Array (FPGA), a programmable VLSI device, can parallelize neuron simulations at very high speed (Guerrero-Rivera et al., 2006; Li et al., 2010). Because FPGAs are inherently parallel, the running speed can be maintained as model complexity increases by adding units. The parallel design uses each clock cycle to update multiple neurons, access a wide range of memory, and send out neuron spikes for inter-chip communication; therefore FPGAs are able to accomplish more calculations than clock-cycled computers at the same central clock frequency. In our previous work (Niu et al., 2012), we showed the technical details of the emulation platform using FPGAs, which achieved a running speed 365× faster than real-time for a simulation of the spinal components of the monosynaptic stretch reflex. In the current study, we simulate the muscle spindle proprioceptors and their attached sensory afferent neurons. The spindle system is an essential part of the monosynaptic stretch reflex, as well as a major source of state information for feedback control and stabilization. Because of the model complexity of spindles, there has not previously been an attempt to simulate them in hardware. Therefore, our high-speed hardware simulation of the muscle spindle will provide an essential element for studying the stretch reflex as well as for understanding stabilization and feedback control of biological and neuromorphic motor systems.

Among the many studies investigating muscle spindles in mammals (Eldred et al., 1962; Lennerstrand, 1968; Boyd et al., 1977; Hulliger, 1984), the series of experiments conducted by Matthews and colleagues (Crowe and Matthews, 1964b; Matthews, 1964, 1972) characterized cat soleus spindles in considerable detail. Their representative data provide the reference for the rate-based spindle model (Mileusnic et al., 2006) chosen for our study. We expect our spike-based emulation to enrich the original rate-based spindle model: all spiking afferents will be available for analysis, and the spiking behavior should be compatible with physiological recording. The advantage of converting to a spike-based representation is that we can compare our results directly against physiologically-measured signals, we can investigate questions related to the number or firing pattern of sensory afferents, and we can investigate the effect of spike timing on plasticity. For the purpose of emulating sensorimotor function and disease, we argue that the model of spindle afferents does not have to include all anatomical details, but its outcome must satisfy at least two constraints: (1) distinguishable firing pattern between Group Ia and Group II afferents, and the difference should be compatible with physiological data; (2) distinguishable change in behavior that reflects changes in dynamic and static gamma fusimotor drive. We verified the VLSI emulation against these two constraints. Because of the physiological differences between cats and humans (Prochazka and Hulliger, 1983), we also directly compared our emulated signals with human spindle afferents (Edin and Vallbo, 1990) to see whether this model can be used to emulate human data. If successful, our results will provide a testbed with sufficient detail to represent known sensorimotor physiology in order to describe and predict the effect of neural injuries over many years.

# **MATERIALS AND METHODS**

#### **APPARATUS**

The emulated neurological system comprises a muscle spindle with three types of fibers (bag1, bag2, and chain) and two groups of afferents (Group Ia and II) responding to two types of fusimotor drive (gamma dynamic and static), as illustrated in **Figure 1A**. The mathematical descriptions of the muscle spindle are implemented on an FPGA (Xilinx Spartan-6), a programmable type of VLSI chip.

We favor FPGAs over pipelined hardware such as Graphic Processing Units (GPUs) or clustered CPUs because their inherent parallelism resembles neural circuitry. It is also argued that when networking multiple units for large-scale disease emulation, FPGAs are significantly more flexible for enabling custom-built communication protocols, including neuromorphic transmission protocols that directly transmit neuron-like spikes.

#### **FPGA HARDWARE DESIGN**

The implementation requires translating the equations of neurons and spindles into digital circuits described in Verilog Hardware Description Language (HDL). When translating all models, we enforce a combinational circuit design as opposed to a sequential clocked design; this maximizes performance because circuits do not have to wait for clock signal boundaries. A central scheduler periodically updates all emulated models and distributes the latest information to the corresponding parts. Each update generates neurological activity accounting for 1 ms in the real world. The update frequency can be easily adjusted on digital VLSI, so increasing it makes the emulation operate faster than real-time. The speed of emulation can keep increasing as long as all models finish their state update within each accelerated clock cycle. The maximal speed of emulation is constrained by how fast the electronic signals can propagate through the combinational circuits.
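
The relationship between update frequency and emulation speed described above is simple arithmetic, and a short sketch may make it concrete. The function names below are illustrative and not part of the published platform; the only values taken from the text are the 1 ms step per update and the 365× target. The 365-day year is an assumption made for the conversion.

```python
STEP_MS = 1.0  # each scheduler update advances the emulated system by 1 ms (from the text)


def speedup(update_rate_hz: float, step_ms: float = STEP_MS) -> float:
    """Emulated seconds produced per wall-clock second."""
    return update_rate_hz * step_ms / 1000.0


def wall_clock_days(emulated_years: float, update_rate_hz: float) -> float:
    """Wall-clock days needed to emulate a number of years (assuming 365-day years)."""
    return emulated_years * 365.0 / speedup(update_rate_hz)


if __name__ == "__main__":
    print(speedup(1_000))                  # 1 kHz updates correspond to real-time
    print(speedup(365_000))                # 365 kHz updates correspond to 365x real-time
    print(wall_clock_days(10, 365_000))    # days required for ten emulated years
```

At 365× real-time, one emulated year takes about one day of wall-clock time, which is the sense in which years of progression could be covered in a few days of emulation.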

**Figure 1B** shows a working setup of the hardware platform. The spiking activities of up to 80 neurons are measurable directly from the FPGA using an oscilloscope for "virtual neuron recording," which verifies the accuracy of emulation on site with hardware in the loop. Other activities are transferred to a data-logging computer for offline analysis. Note that when collecting data for offline analysis, the emulation has to be slowed below real-time due to limited bandwidth between the FPGA and the data-logging computer; the emulation is accelerated to 365× real-time only when investigating long-term changes. The FPGA communicates with the data-logging computer through OpalKelly development kits (XEM6010, OpalKelly Inc.). More technical details can be found in Niu et al. (2012).

#### *Floating-point arithmetics in combinational logic*

Spindle models are evaluated in IEEE-754 single-precision floating-point numbers. Typical floating-point arithmetic IP cores are either pipelined or based on iterative algorithms such as CORDIC, all of which require clocks to schedule the calculation. In our testbed, no clock is allowed for model evaluation, so all arithmetic needs to be executed in pure combinational logic. The implementations of the adder and multiplier are inspired by the open source project "Free Floating-Point Madness," available at http://www.pldworld.com/\_hdl/2/\_ip/www.hmc.edu/chips/fpmul.html. The modified code used in this study is available upon request.

Floating-point division is more resource-demanding than multiplication. We therefore approximated division with additions and multiplications, avoiding a direct implementation of floating-point division. Our approach is inspired by an algorithm described by Lomont (2003), which provides a good approximation of the inverse square root of any positive number *x* within one Newton-Raphson iteration:

$$Q(x) = \frac{1}{\sqrt{x}} \approx y_0 \left( 1.5 - \frac{x}{2} \cdot y_0^2 \right) \quad (x > 0) \tag{1}$$

where $y_0$ is the initial estimate of $1/\sqrt{x}$ that Lomont's method obtains from the bit pattern of *x* with an integer shift and subtraction, so that *Q*(*x*) only contains additions and multiplications. Any division with a positive divisor can be achieved if two blocks of *Q*(*x*) are concatenated:

$$\frac{a}{b} = \frac{a}{\sqrt{b} \cdot \sqrt{b}} = a \cdot Q(b) \cdot Q(b) \,\mathrm{(b > 0)}\tag{2}$$

It is trivial to adjust this algorithm for negative divisors (b < 0).
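
As a software illustration of Equations (1) and (2), the sketch below follows the fast inverse square root described by Lomont (2003): an initial estimate is formed from the single-precision bit pattern, refined by one Newton-Raphson step, and two such blocks are multiplied to obtain a quotient. This is not the authors' Verilog; the function names are hypothetical, and the refinement step here runs in Python's double precision, so only the bit-level initial guess uses the single-precision layout.

```python
import struct

LOMONT_MAGIC = 0x5F375A86  # constant proposed by Lomont (2003) for single precision


def _float_to_bits(x: float) -> int:
    """Bit pattern of x rounded to IEEE-754 single precision."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def _bits_to_float(bits: int) -> float:
    """Single-precision value encoded by a 32-bit pattern."""
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFFFFFF))[0]


def inv_sqrt(x: float) -> float:
    """Approximate 1/sqrt(x) for x > 0 using one Newton-Raphson iteration."""
    if x <= 0.0:
        raise ValueError("x must be positive")
    # Initial estimate from an integer shift and subtraction on the bit pattern.
    y0 = _bits_to_float(LOMONT_MAGIC - (_float_to_bits(x) >> 1))
    # One Newton-Raphson step: additions and multiplications only.
    return y0 * (1.5 - 0.5 * x * y0 * y0)


def divide(a: float, b: float) -> float:
    """Approximate a / b as a * Q(|b|) * Q(|b|), restoring the sign of b afterwards."""
    if b == 0.0:
        raise ZeroDivisionError("divisor must be non-zero")
    q = inv_sqrt(abs(b))
    result = a * q * q
    return -result if b < 0.0 else result


if __name__ == "__main__":
    for a, b in [(7.0, 3.0), (1.0, -4.0), (2.5, 0.01)]:
        approx = divide(a, b)
        print(a, b, approx, abs(approx - a / b) / abs(a / b))
```

Because each *Q* block slightly underestimates the true inverse square root after a single iteration, the relative error of the quotient is roughly twice that of one block.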

#### *Serialize neuron evaluation using time-shared multiplexing*

Given that the spindle model only needs to update at 365 kHz even at the highest speed (1 ms time granularity, 365× real-time), there is room for time-sharing the FPGA logic gates among neurons, thus using fewer logic gates to emulate a larger number of neurons. The maximal number of neurons that can be serialized ($N_{serial}$) is constrained by the following relationship:

$$C \times N_{serial} \times 365 \times F_{emu} \le F_{fpga} \tag{3}$$

Here $F_{fpga}$ is the fastest clock rate at which an FPGA can operate; *C* is the minimal number of clock cycles needed for updating each state variable in the on-chip memory (in our case *C* = 2, due to an optimized design for memory access); $F_{emu}$ = 1 kHz is the update rate corresponding to the time granularity of emulation (1 ms); and 365 × $F_{emu}$ represents 365× real-time. Given that Xilinx Spartan-6 FPGA devices have a maximum central clock frequency of 200 MHz, the theoretical maximum number of neurons that can be serialized is

$$N_{serial} \le \frac{200 \, \mathrm{MHz}}{2 \times 365 \times 1 \, \mathrm{kHz}} \approx 274 \tag{4}$$

In the current design we choose $N_{serial}$ = 128.
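
The bound in Equations (3) and (4) can be checked with a few lines of integer arithmetic. The helper names are illustrative; the numerical values are those given in the text. Note that the right-hand side of Equation (4) evaluates to about 273.97, so the largest whole number of neurons that satisfies Equation (3) is 273.

```python
def max_serialized_neurons(f_fpga_hz: int, cycles_per_update: int,
                           speedup: int, f_emu_hz: int) -> int:
    """Largest integer N with cycles_per_update * N * speedup * f_emu_hz <= f_fpga_hz."""
    return f_fpga_hz // (cycles_per_update * speedup * f_emu_hz)


def fits(n_serial: int, f_fpga_hz: int = 200_000_000, cycles_per_update: int = 2,
         speedup: int = 365, f_emu_hz: int = 1_000) -> bool:
    """Check the inequality of Equation (3) for a candidate N_serial."""
    return cycles_per_update * n_serial * speedup * f_emu_hz <= f_fpga_hz


if __name__ == "__main__":
    print(max_serialized_neurons(200_000_000, 2, 365, 1_000))
    print(fits(128))
```

The chosen value of 128 uses a little under half of the available update slots, which leaves headroom in the clock budget.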

#### **EMULATION OF MUSCLE SPINDLES**

As the sensory organ that provides the main source of proprioceptive information, a typical muscle spindle gives rise to Group Ia afferents (transducing muscle length and a function of velocity) and Group II afferents (primarily sensing muscle length). The activity of the spindle is modulated by two types of fusimotor drive (gamma dynamic and gamma static). At present there are no analytical equations that can accurately describe the dynamics of a spindle. However, spindle behavior can be approximated using ordinary differential equations that are solved numerically. Though few such models are available, the one presented by Mileusnic et al. (2006) showed a close fit to the firing rate recorded from cat soleus spindles. We implemented this spindle model as the first step to deduce the essential elements for realistic spindle afferents.

The chosen spindle model used differential equations to describe the relationship between afferent firing rates and the instantaneous muscle status (length and lengthening velocity). The overall structure of the spindle model can be summarized as follows:

$$\dot{R}(t) = f\left(L, \dot{L}, \gamma_{dynamic}, \gamma_{static}, t\right) \tag{5}$$

$$R(t) = \int \dot{R}(t)\,dt \tag{6}$$

where *R*(*t*) is the instantaneous firing rate of an afferent fiber at time *t*, *L* is the length of the muscle, $\dot{L}$ is its lengthening velocity, and $\gamma_{dynamic}$ and $\gamma_{static}$ are the firing rates of the gamma fusimotor drives. This model described each spindle fiber as coupled springs and damping; it characterized how three types of fiber (bag1, bag2, nuclear chain) contribute to each group of spindle afferent (Ia and II). The thixotropic property of the spindle, i.e., that the spindle output depends on the history of stretching (Hasan and Houk, 1975), was not captured in this model.

A full combinational implementation of the spindle model requires more silicon resources than are available on a single Spartan-6 FPGA chip. Therefore, we optimized the model by trimming its transcendental functions and replacing some on-chip calculations with pre-calculated values. For example, a subset of the model for the bag1 fiber is shown below:

$$\dot{x}_0 = \left(\frac{\Gamma_{dynamic}}{\Gamma_{dynamic} + \Omega^2} - x_0\right) / \tau \tag{7}$$

$$\dot{x}_1 = x_2 \tag{8}$$

$$\dot{x}_2 = \frac{1}{M} \left[ T_{SR} - T_B - T_{PR} - \Gamma_1 x_0 \right] \tag{9}$$

where

$$T_{SR} = K_{SR} \left( L - x_1 - L_{SR0} \right) \tag{10}$$

$$T_B = (B_0 + B_1 x_0) \cdot (x_1 - R) \cdot \text{CSS} \cdot |x_2|^{0.3} \tag{11}$$

$$T_{PR} = K_{PR} \left( x_1 - L_{PR0} \right) \tag{12}$$

Refer to Mileusnic et al. (2006) for the definitions of the variables. As can be seen from Equation (11), the absolute value of the lengthening velocity ($x_2$) is raised to the power of 0.3, which is trivial to program in Matlab/C++ but has no corresponding syntax in Verilog. We approximated $|x_2|^{0.3}$ as $|x_2|^{0.25}$, because the power of 0.25 is straightforward to obtain by iterating on *Q*(*x*) of Equation (1):

$$Q\left(Q\left(x\right)\right) = \left(x^{-0.5}\right)^{-0.5} = x^{0.25} \quad (x > 0) \tag{13}$$
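
To give a feel for what substituting the exponent 0.25 for 0.3 implies, the sketch below composes two inverse square roots as in Equation (13) and reports the relative deviation from the exact power of 0.3. For clarity it uses an exact inverse square root as a stand-in for the hardware block *Q*, so it isolates the effect of the changed exponent only; the function names and the sample velocities are illustrative assumptions.

```python
import math


def q(x: float) -> float:
    """Exact stand-in for the hardware block Q(x) = 1/sqrt(x), defined for x > 0."""
    if x <= 0.0:
        raise ValueError("x must be positive")
    return 1.0 / math.sqrt(x)


def quarter_power(x: float) -> float:
    """x ** 0.25 obtained by applying Q twice, as in Equation (13)."""
    return q(q(x))


def relative_error_vs_0p3(velocity: float) -> float:
    """Relative deviation of |v| ** 0.25 from |v| ** 0.3 (velocity must be non-zero)."""
    exact = abs(velocity) ** 0.3
    approx = quarter_power(abs(velocity))
    return (approx - exact) / exact


if __name__ == "__main__":
    for v in (0.01, 0.1, 1.0, 10.0, 100.0):
        print(v, quarter_power(v), relative_error_vs_0p3(v))
```

The two exponents agree exactly at a magnitude of 1; the substitution overestimates the term for smaller magnitudes and underestimates it for larger ones, so the size of the deviation depends on the units and range of $x_2$.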

In addition, most of the equations were expanded to polynomials, such that the constant parts in the equations were pre-calculated without consuming multipliers, e.g., 1/*M* in Equation (9) was pre-calculated as a single number instead of doing an on-chip division.

After optimization we are able to host an entire spindle using only 60% of a single Spartan-6 FPGA chip, with a running speed peaking at 500× real-time. Typically a single Spartan-6 chip can support one spindle connected with at least 1024 neurons at 365× faster than real-time.

#### **EMULATION OF SPIKING NEURONS**

From the perspective of information theory, the function of a neuron is to convert post-synaptic currents into a train of binary spikes with limited bandwidth (Sanger, 2011). In an emulated neurological system focusing on its functional role, a neuron can be modeled to any level of detail as long as it satisfies the protocol of "current-in, spike-out." In the current study we adopted the neuron model developed by Izhikevich (2003), which approximates the Hodgkin–Huxley model (Hodgkin and Huxley, 1952). In our emulation, we set the four parameters (a, b, c, d) required in the Izhikevich model to ensure all sensory neurons fit Hodgkin's description of Class 1 excitatory neurons (Hodgkin, 1948; Izhikevich, 2007). Class 1 excitatory neurons are one of the major types found in human spinal cord recordings (Prescott et al., 2008). Since the firing rate of Class 1 excitatory neurons is a monotonic representation of post-synaptic current over a large dynamic range, it allows straightforward conversion from the spindle output (in firing rate) to the input of the neuron (in post-synaptic current).
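
A minimal software sketch of the "current-in, spike-out" behavior is given below using the two-variable model of Izhikevich (2003). The parameter values are the textbook regular-spiking set (a = 0.02, b = 0.2, c = -65, d = 8), not the values used in the emulation, which are not listed here; the 0.5 ms forward-Euler step and the input currents are likewise assumptions chosen for illustration.

```python
def simulate_izhikevich(current: float, duration_ms: float = 1000.0, dt: float = 0.5,
                        a: float = 0.02, b: float = 0.2,
                        c: float = -65.0, d: float = 8.0) -> list:
    """Return spike times (ms) of one Izhikevich neuron driven by a constant current."""
    v = c          # membrane potential (mV)
    u = b * v      # recovery variable
    spike_times = []
    steps = int(round(duration_ms / dt))
    for k in range(steps):
        # Forward-Euler update of the two model equations.
        v += dt * (0.04 * v * v + 5.0 * v + 140.0 - u + current)
        u += dt * a * (b * v - u)
        if v >= 30.0:                      # spike detected: record and reset
            spike_times.append((k + 1) * dt)
            v = c
            u += d
    return spike_times


if __name__ == "__main__":
    for i_in in (0.0, 6.0, 10.0, 14.0):
        # Spikes counted over 1 s, i.e., the mean firing rate in Hz.
        print(i_in, len(simulate_izhikevich(i_in)))
```

With these assumed parameters the neuron stays silent without input and fires faster as the injected current grows, which is the monotonic rate-current relationship that makes the conversion from spindle firing rate to post-synaptic current straightforward.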

Although we enforce a purely combinational design to maximize the running speed, neurons still time-share physical circuits in order to maximize the population size with limited silicon surface. Pseudorandom white uniform noise (5 mV amplitude) was added to the membrane potential of each neuron to introduce variability for population firing. The noise is introduced to represent the large number of inputs that a neuron usually receives from the dendritic tree. The noise level was set to create a typical 4.8 mV fluctuation in the membrane potential (Fellous et al., 2003). Pseudorandom noise is generated using a linear-feedback shift register (George and Alfke, 2001).
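
The noise source can be illustrated with a small linear-feedback shift register. The register width, tap positions, and seed below are a commonly used 16-bit maximal-length configuration and are assumptions, as is the mapping of the register state onto a uniform value within ±5 mV; the hardware design may use a different register and scaling.

```python
def lfsr16_step(state: int) -> int:
    """Advance a 16-bit Fibonacci LFSR (taps 16, 14, 13, 11) by one bit."""
    bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
    return (state >> 1) | (bit << 15)


def state_to_mv(state: int, amplitude_mv: float = 5.0) -> float:
    """Map a non-zero 16-bit state onto a value in [-amplitude_mv, +amplitude_mv)."""
    return (state / 65536.0 * 2.0 - 1.0) * amplitude_mv


def noise_sequence(n_samples: int, seed: int = 0xACE1, amplitude_mv: float = 5.0) -> list:
    """Generate pseudorandom uniform noise samples in mV."""
    state = seed
    samples = []
    for _ in range(n_samples):
        # Shift 16 times per sample so that consecutive samples share no bits.
        for _ in range(16):
            state = lfsr16_step(state)
        samples.append(state_to_mv(state, amplitude_mv))
    return samples


if __name__ == "__main__":
    noise = noise_sequence(10_000)
    print(min(noise), max(noise), sum(noise) / len(noise))
```

In hardware the same structure costs only a shift register and a few XOR gates, which is why it suits a design in which arithmetic resources are scarce.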

#### **SPIKING RESPONSES OF SPINDLE TO VIRTUAL MUSCLE STRETCH**

The study of rapid excitatory responses of a muscle following stretch has a long history, dating back to Robert Whytt in 1751 (Pearce, 1997). Over the past centuries it has been revealed that the response is a complex muscle reaction, with multiple excitatory responses occurring at different latencies following a muscle stretch (Hammond, 1955). We emulated a series of classic muscle stretch experiments that were originally performed in cat soleus muscles (Crowe and Matthews, 1964b; Matthews, 1964, 1972). Since these datasets are matched in the original spindle model (Mileusnic et al., 2006) to produce compatible firing rates, our emulation is intended to produce compatible spike patterns under similar muscle stretch. Four types of stretch stimulus were introduced to the emulated spindle: linear stretch, tap, sinusoidal stretch and release. We focus on whether the spindle afferents show: (1) distinguishable firing pattern between Group Ia and Group II afferents, and the difference should be compatible with physiological data; (2) distinguishable change in behavior that reflects changes in dynamic and static gamma fusimotor drive.

We further analyzed the emulated spindle activity using a white-noise approach, which has been widely used in system identification for non-linear biological systems (Marmarelis, 2004). We stimulated the emulated spindle using low-pass filtered white noise. It is expected that the firing rate reconstructed from our emulation should be statistically correlated with the firing rate produced by the rate-based spindle model. In addition, the emulated spindle activity was compared to spindle afferent recordings from humans. Due to the known difference between spindles of cats and humans (Prochazka and Hulliger, 1983), we do not expect our emulation to exactly match human data but the comparison should reflect the known difference.

It is noteworthy that all virtual recordings in this study required slow emulation below real-time, otherwise the amount of data produced by FPGA per unit time would exceed the bandwidth for data logging. The mathematical correctness of emulation when accelerated to the full 365× real-time speed is ensured by in-loop fault checks.

## **RESULTS**

#### **QUALITATIVE DIFFERENCE BETWEEN Ia AND II AFFERENT SPIKES**

We first replicated the stretch-and-hold experiment (Matthews, 1972) conducted with the soleus muscle of a decerebrate cat. In that work, the cat soleus muscle was stretched by 14 mm within 200 ms and then held at the elongated length. The spiking responses of the cat were measured using intramuscular recording and are shown in **Figure 2A** (reproduced with permission). Responses were recorded both in the presence (ventral roots intact) and absence (ventral roots cut) of tonic fusimotor activity. In both cases the Primary (Ia) afferent exhibited a stronger phasic response than the Secondary (II) afferent, especially when the stretching velocity was greater than zero. In contrast, the Secondary (II) afferent produced a stronger tonic response during the entire process of stretching.

Equivalent stretch-and-hold experiments were tested in VLSI emulation. Note that the original spindle model requires all muscle lengths to be normalized to the length of the relaxed muscle (resting length, $L_0$). The spindle was stretched by 36.8% of $L_0$, corresponding to the 14 mm elongation of cat soleus muscle (Matthews, 1972) from an average rest length of 38 mm (Scott et al., 1996). **Figure 2B** shows the emulated action potentials in response to the $0.368\,L_0$ virtual stretch lasting for 200 ms. We observed a clear distinction between Ia and II afferents in the emulated results. It can also be seen that in the presence of gamma fusimotor drive (ventral roots intact), both Ia and II afferents were more active, in a similar way in the real and emulated responses. The difference between Ia and II afferents is qualitatively similar to physiological data.
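
The normalization used for the virtual stretch is a simple ratio, and the stretch-and-hold waveform can be sketched as a linear ramp followed by a plateau. In the sketch below the 38 mm rest length, 14 mm stretch, and 200 ms ramp come from the text, whereas the function names, the 1 ms sample period, and the 300 ms hold duration are illustrative assumptions.

```python
def normalized_stretch(stretch_mm: float, rest_length_mm: float = 38.0) -> float:
    """Express a stretch in millimeters as a fraction of the resting length L0."""
    return stretch_mm / rest_length_mm


def ramp_and_hold(amplitude_l0: float, ramp_ms: int = 200,
                  hold_ms: int = 300, dt_ms: int = 1) -> list:
    """Elongation (in units of L0) sampled every dt_ms: linear ramp, then hold."""
    n_ramp = ramp_ms // dt_ms
    n_hold = hold_ms // dt_ms
    ramp = [amplitude_l0 * ((k + 1) / n_ramp) for k in range(n_ramp)]
    hold = [amplitude_l0] * n_hold
    return ramp + hold


if __name__ == "__main__":
    amplitude = normalized_stretch(14.0)      # 14 mm stretch of a 38 mm muscle
    profile = ramp_and_hold(amplitude)
    print(round(amplitude, 3), len(profile), profile[0], profile[-1])
    print(round(normalized_stretch(1.0), 3))  # a 1 mm excursion in the same units
```

The same conversion applies to any excursion given in millimeters, so experimental protocols reported in absolute units can be replayed on the normalized model.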

Note that in **Figure 2A** the muscle tensions were presented to show the muscle's mechanical response to external stretch, whereas the actual waveform of stretch should be in the shape shown in **Figure 2B** (muscle length). In our emulation, the muscle tension was only implicitly calculated for the spindle as an intermediate variable, and was hence not directly measurable.

#### **QUALITATIVE DIFFERENCE BETWEEN STATIC AND DYNAMIC FUSIMOTOR DRIVE**

We also compared the biological and emulated afferents from a primary afferent neuron, when the spindle fiber is stimulated with different fusimotor drives. In particular, two fusimotor drives (gamma dynamic and static) were selectively stimulated during a sinusoidal stretch of 1 mm peak-to-peak at 3 Hz (Crowe and Matthews, 1964a). The original recording is shown in **Figure 3A**. External stimuli to the gamma fibers were activated between the 3rd and 4th cycle (horizontal line in **Figure 3A**). As expected, the effect of gamma dynamic drive was to facilitate the phasic response of cat soleus spindle, while the tonic responses were mostly facilitated by gamma static drive. Emulated spindle afferent spiking is shown in **Figure 3B**. A virtual sinusoidal stretch of $0.026\,L_0$ peak-to-peak at 3 Hz was applied to the emulated spindle. Gamma fusimotor drives were initially set to 0 Hz and activated to 80 Hz between the 3rd and 4th cycle. Again we observed distinctive spiking patterns in response to gamma static and dynamic stimuli.

#### **POPULATION FIRING IN RESPONSE TO VARIOUS MUSCLE STRETCHES**

One significant advantage of neuromorphic emulation is that the spikes of a large number of neurons can be stored and analyzed together with many other emulated activities. Here we test whether the neuron ensemble shows expected spiking behaviors in response to various types of stretching waveforms. **Figure 4A** shows four types of stretch waveforms (linear stretch, tap, sinusoidal stretch and release) along with hand-drawn schematized spike responses summarized from experimental observations (Matthews, 1964) in absence of fusimotor drive, reprinted with permission.

The emulated responses are shown in **Figure 4B**. In all cases except for muscle release, the emulated spike train from a sample neuron matches the empirical responses. In addition, both the Primary and Secondary ensembles, each comprising 128 neurons, show variable but congruent patterns compared to a single neuron. The only noticeable exception is in muscle release (**Figure 4B**, rightmost column), where the Primary afferents show a burst of firing after the release stops. This burst is caused by the fact that a sudden stop is equivalent to a momentary stretch at very high velocity. It was not schematized in **Figure 4A**, however, probably because the original figure was hand-drawn such that some details were unnecessarily omitted. According to Poliakov and Miles (1994), sudden bursts of electromyography (EMG) were indeed present at the end of masseter releasing. Such discrepancy demonstrated the ability of our platform to validate inconsistent observations.

**FIGURE 4 |** … representative neuron and the neuron ensemble when the four types of muscle stretching were applied. The raster shows 128 Primary afferent neurons and 128 Secondary afferent neurons. Changes in spike pattern are similar between experimental and emulated results, with emulated results showing irregularities in spiking due to added random noise. Note that in the case of muscle release (last column), the Primary afferents show a burst of firing after the release suddenly stopped. This burst was not schematized in **(A)**, probably because sudden stops are usually smoothed significantly by skin, ligaments, and other tissues.

In addition, there is a noticeable distinction between the regularity of spikes of **Figures 4A,B**. One explanation is that our emulation explicitly introduced synaptic noise, and therefore the individual neuron will exhibit a Poisson-like random firing pattern. This particular randomness was perhaps not the focus when **Figure 4A** was schematized.

Another feature of the muscle spindle is that its afferents usually spike at a significantly high rate at the beginning of a muscle stretch, resulting in an "initial burst." This feature was measured in rats by Haftel et al. (2004), as shown in **Figure 5A**. We used the same triangular waveform (**Figure 5B**) to stretch our emulated spindle for 25 repetitions. The instantaneous frequency averaged across the 25 repetitions (**Figure 5C**) showed initial bursts similar to those in the experimental data.

#### **CORRELATION BETWEEN RATE-BASED MODEL AND SPIKE-BASED EMULATION**

We tested the spindle afferents using a pseudo-white-noise waveform, which examines the input-output relationship of a dynamic system with rich frequency components (Marmarelis, 2004). The waveform includes a series of pseudorandom numbers low-pass filtered at 5 Hz cutoff frequency. The low-pass filtering screens out the unrealistic high-frequency stretching that is usually damped by skin and ligaments. One sample of a 4-s stretch is shown in **Figure 6A**. The corresponding Primary Ia response in firing rate (**Figure 6B**) generates stochastic spiking shown in a raster plot (**Figure 6C**, raster of 16 neurons). The firing rate produced by the original spindle model is compared to the total spike count recorded from the 256 emulated spindle afferent neurons (**Figure 6D**). The total spike count shows irregularity and variability in each run because of the stochastic spiking in the emulated neuron population.
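As a rough illustration of how such a stretch waveform could be produced, the sketch below draws pseudorandom samples and passes them through a low-pass filter with a 5 Hz cutoff. The first-order filter, the 1 kHz sampling rate, the uniform noise distribution, and the function name `lowpass_noise` are all assumptions made for this example; the text only specifies the cutoff frequency and the 4-s sample duration.

```python
import math
import random


def lowpass_noise(duration_s=4.0, fs_hz=1000.0, cutoff_hz=5.0, seed=0):
    """Return a list of low-pass-filtered pseudorandom samples.

    Assumption: a first-order (RC-style) filter; the source states only
    the 5 Hz cutoff, not the filter order or the noise distribution.
    """
    rng = random.Random(seed)
    dt = 1.0 / fs_hz
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = dt / (rc + dt)          # smoothing factor of the discrete filter
    y, out = 0.0, []
    for _ in range(int(duration_s * fs_hz)):
        x = rng.uniform(-1.0, 1.0)  # pseudo-white-noise sample
        y += alpha * (x - y)        # y[n] = y[n-1] + alpha * (x[n] - y[n-1])
        out.append(y)
    return out


# One 4-s sample of normalized stretch (arbitrary units).
stretch = lowpass_noise()
```

The output would still need to be scaled and offset to a physiological length range before driving a spindle model; that scaling is not specified here.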

We stretched the spindle for an equivalent of 160 s in real time using the low-pass filtered pseudo-white-noise waveform. The total spike count is significantly correlated with the expected firing rate for both Primary Ia (*p* < 0.0001, *r* = 0.813) and Secondary II (*p* < 0.0001, *r* = 0.810). These results verify that our emulation provides a consistent spike-based representation of muscle length and lengthening velocity; the spiking outcome is statistically compatible with the original rate-based spindle model on which it is based.
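The comparison between a rate signal and a population spike count can be sketched as follows. This is not the authors' analysis pipeline: the Bernoulli spike generator, the 50 ms bins, the sinusoidal rate signal, and the names `population_counts` and `pearson_r` are hypothetical choices that only illustrate how a Pearson correlation between binned spike counts and the underlying firing rate might be computed.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def population_counts(rate_hz, n_neurons=256, dt=0.001, bin_steps=50, seed=1):
    """Bin the summed spikes of n_neurons Bernoulli (Poisson-like) units.

    rate_hz holds one firing-rate value per time step; every unit spikes
    independently with probability rate * dt in each step (assumption).
    """
    rng = random.Random(seed)
    counts, mean_rates = [], []
    for start in range(0, len(rate_hz) - bin_steps + 1, bin_steps):
        window = rate_hz[start:start + bin_steps]
        spikes = sum(
            1
            for rate_k in window
            for _ in range(n_neurons)
            if rng.random() < rate_k * dt
        )
        counts.append(spikes)
        mean_rates.append(sum(window) / bin_steps)
    return counts, mean_rates


# Hypothetical rate signal: 50 Hz baseline with a 1 Hz modulation of 30 Hz depth.
rate = [50.0 + 30.0 * math.sin(2.0 * math.pi * 1.0 * k * 0.001) for k in range(4000)]
counts, mean_rates = population_counts(rate)
r = pearson_r(counts, mean_rates)
print(f"r = {r:.3f}")
```

In this toy setting the correlation is limited only by counting noise, so it should not be read as a prediction of the values reported above.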

#### **COMPARISON TO HUMAN SPINDLE AFFERENTS**

Although the original spindle model was developed based on cat soleus spindles, we replicated the experiments done with human spindles to compare our emulation with human data. We introduced a 0.7*L*<sub>0</sub> stretch using the waveform reported by Edin and Vallbo (1990). The human spindle recording and emulated activity are shown in **Figure 7**.

As can be seen, the overall spike rate is lower in emulation than human data; the emulated spike rates show less contrast between long and short muscle length as compared to human data, i.e., reduced static response. These differences are compatible with the finding that human spindles contain more intrafusal fibers, which are probably dominated by nuclear chain (static) fiber (reviewed in Prochazka and Hulliger, 1983). Quantitative comparison is not performed due to limited human data. In principle, the spindle model could be tuned to fit human spindles by manipulating the damping terms of the spindle fiber.

#### **DISCUSSION**

Using our recently developed technique of neuromorphic emulation on programmable digital VLSI hardware, we showed that a collection of spiking afferent fibers driven by a detailed model of muscle spindles suffices to produce biologically-realistic spindle afferents. The observed firing pattern from emulated spindle neurons matches classical intramuscular recordings from cat soleus muscle, and the emulated responses to gamma fusimotor drive show no qualitative difference from experimental recordings. All emulations can be accomplished at 365× real-time, which allows estimating long-term changes and large population behavior efficiently. The emulated spindles differ from human spindle activity, but the difference is compatible with the known difference between cat and human spindles. Our results provide a strong validation of using neuromorphic emulation as a testbed for neurophysiological studies. We can now test the roles of more complex structures, such as realistic muscles, inhibitory neurons, or supra-spinal circuitry in producing movement behavior. The multi-scale design enables us to emulate pathological conditions that are physiologically tenable, e.g., death of neurons or absence of gamma fusimotor drive. It creates an essential step toward investigating how such pathological conditions could contribute to disease progression in childhood.

Synthetic emulation using neuromorphic hardware provides three major advantages compared to empirically studying the real biological system. First, it isolates the subsystem-of-interest from the compounding factors that are very difficult to tease out *in-vivo*, including studying the spinal reflex isolated from supra-spinal influence. Second, it allows interventions that are usually difficult to introduce when studying biological systems, e.g., differentially adjusting the relative weight of gamma dynamic vs. static drive. Third, the hardware acceleration allows for predicting the system's long-term development with significantly less time. Our vision for the synthetic approach is not to replace empirical studies but rather to inspire testable hypotheses for experiments and clinical applications. Another important feature is that this platform can be used to verify the source of motor variability from different physiological origins, such as the intrinsic firing variability of neurons, motor noise associated with muscles, or the individual properties of mechanoreceptors.

The purpose of this study is to validate the neuromorphic hardware as a testbed for spindle activity; therefore, we focused on (1) implementing a selected model of spindle (Mileusnic et al., 2006), (2) adapting it to enable spiking afferents, and (3) accelerating it to 365× real-time. We did not focus on improving the model to match more experimental data than it originally could. Nevertheless, the limitations of this spindle model should be acknowledged. One major limitation is that the long-known thixotropic property of spindle activity (Hasan and Houk, 1975) was not captured; also, the model was based on cat data, so it must be re-calibrated for modeling human movement disorders. It is worth noting that other excellent models of spindle also exist (e.g., Hasan, 1983; Lin and Crago, 2002); they either used a less computationally expensive approach to model the non-linear velocity dependency (Hasan, 1983), or succeeded in unifying spindle and Golgi Tendon Organ with the same structure (Lin and Crago, 2002). Recent work also refined the original spindle model by examining the non-linearity in its components (Lan and He, 2012). Our design of the platform is open and flexible for including additional features in future improvements, or switching to different models if necessary.

The original model of spindle (Mileusnic et al., 2006) focused on the firing rates of spindle afferents instead of their spiking patterns. Such rate-based models are incompatible with our overall goal of disease emulation, where the spike timing is crucial for motor variability and neuroplasticity. In acknowledgement of the soundness of the original rate-based spindle model, our major improvement is to enable a large number of spiking neurons driven by a spindle. Moreover, the original model was developed in Matlab Simulink operating slower than real-time, while our hardware implementation permits faster-than-real-time performance. This is the first physical, portable, and realistic proprioceptor that can provide synthetic proprioception for robots, virtual neurophysiological studies, and prediction of clinical outcomes.

#### **ACKNOWLEDGMENTS**

The authors are grateful for support from the National Institute of Neurologic Disorders and Stroke (R01-NS069214), and the James S. McDonnell Foundation. The authors thank Dr. Gerald Loeb for the help with the spindle model.

#### **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 08 June 2014; accepted: 17 October 2014; published online: 04 December 2014.*

*Citation: Niu CM, Nandyala SK and Sanger TD (2014) Emulated muscle spindle and spiking afferents validates VLSI neuromorphic hardware as a testbed for sensorimotor function and disease. Front. Comput. Neurosci. 8:141. doi: 10.3389/fncom.2014.00141 This article was submitted to the journal Frontiers in Computational Neuroscience.*

*Copyright © 2014 Niu, Nandyala and Sanger. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Hybrid model of the context dependent vestibulo-ocular reflex: implications for vergence-version interactions

# *Mina Ranjbaran\* and Henrietta L. Galiana*

*Department of Biomedical Engineering, McGill University, Montreal, QC, Canada*

#### *Edited by:*

*Misha Tsodyks, Weizmann Institute of Science, Israel*

#### *Reviewed by:*

*Petia D. Koprinkova-Hristova, The Institute of Information and Communication Technologies, Bulgarian Academy of Sciences, Bulgaria Arthur Prochazka, University of Alberta, Canada*

#### *\*Correspondence:*

*Mina Ranjbaran, Department of Biomedical Engineering, McGill University, 3775 University, Montreal, QC H3A2B4, Canada e-mail: mina.ranjbaranhesarmaskan@mail.mcgill.ca*

The vestibulo-ocular reflex (VOR) is an involuntary eye movement evoked by head movements. It is also influenced by viewing distance. This paper presents a hybrid nonlinear bilateral model for the horizontal angular vestibulo-ocular reflex (AVOR) in the dark. The model is based on known interconnections between saccadic burst circuits in the brainstem and ocular premotor areas in the vestibular nuclei during fast and slow phase intervals of nystagmus. We implemented a viable switching strategy for the timing of nystagmus events to allow emulation of real nystagmus data. The performance of the hybrid model is evaluated with simulations, and results are consistent with experimental observations. The hybrid model replicates realistic AVOR nystagmus patterns during sinusoidal or step head rotations in the dark and during interactions with vergence, e.g., fixation distance. By simply assigning proper nonlinear neural computations at the premotor level, the model replicates all reported experimental observations. This work sheds light on potential underlying neural mechanisms driving the context dependent AVOR and explains contradictory results in the literature. Moreover, context-dependent behaviors in more complex motor systems could also rely on local nonlinear neural computations.

**Keywords: sensory-motor mapping, vestibulo-ocular reflex, context dependent reflex, mathematical model, disconjugate eye movement, ocular nystagmus**

# **1. INTRODUCTION**

The vestibulo-ocular reflex is an involuntary eye movement that stabilizes gaze in space during head movements for clear and blur-free vision. The rather simple neural substrate of the VOR, the so-called *three neuron arc* (de No, 1933), makes it an appropriate model to study sensory-motor behavior. Rotational and translational head movements are sensed by the vestibular system (semicircular canals and the otolith organs) in the inner ear. Vestibular afferents relay sensory information to the vestibular nuclei (VN) and prepositus hypoglossi (PH) centers in the brainstem. These centers act as the main controller and combine sensory signals with internal efference copies of the controlled plant(s), eye orientation, to drive motor-neurons appropriately. Extraocular muscles then apply torques on the eyeball that result in the eye movements.

VOR nystagmus consists of compensatory (slow phase) and reorienting (fast phase) segments. The slow phases of the VOR stabilize gaze in space by moving the eyes in the opposite direction to the head movement, while the fast phases redirect the gaze at high speeds in the direction of the head velocity. We focus on the angular VOR (AVOR), tested with passive whole-body rotation in the dark while recording conjugate (eye movements in the same direction) or monocular horizontal eye movements. **Figure 1** shows an example of the VOR during sinusoidal head rotations using electrooculography (EOG) in the dark: some slow and fast phase segments are marked. The sawtooth-like pattern of the eye movement is a characteristic of most types of eye movements and is known as ocular nystagmus. In clinical tests, the VOR is characterized by its *gain* defined as the ratio of peak eye velocity to peak head velocity during harmonic testing or short pulse perturbations.

While the head movements initiate the VOR, this reflex is also influenced by contextual factors such as viewing distance (Viirre et al., 1986; Crane and Demer, 1998). Since the eyes are not centered on the head, holding gaze on a near target requires more ocular rotation than for a relatively far target during head movements. In other words, the AVOR gain increases as a function of decreasing fixation distance, which can be described geometrically. The majority of models that attempted to explain target-distance dependent VOR responses relate this property to (i) an internal signal proportional to the inverse of target distance that scales VOR gain (Viirre et al., 1986; Chen-Huang and McCrea, 1999), (ii) cortical computations (Snyder and King, 1992), (iii) parametric changes (Green, 2000), (iv) multiplication of vestibular and eye position signals (Zhou et al., 2007) or (v) parallel linear-nonlinear pathways (Lasker et al., 1999). All these models focus only on the slow phases of VOR nystagmus.
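The geometric dependence of gain on fixation distance can be illustrated with a deliberately simplified case. The sketch below assumes a single eye on the midline, located a distance `axis_to_eye_m` in front of the rotation axis, fixating a target straight ahead, and small head rotations; under those assumptions the ideal gain reduces to 1 + r/D. The function name and the target distances are hypothetical, and the binocular geometry used in the model (interocular distance, eccentric targets) is not captured.

```python
def ideal_avor_gain(axis_to_eye_m, target_distance_m):
    """Small-angle ideal AVOR gain, 1 + r / D.

    Assumptions: one midline eye located axis_to_eye_m in front of the
    rotation axis, target straight ahead at target_distance_m from the eye.
    """
    return 1.0 + axis_to_eye_m / target_distance_m


# r = 8.8 cm is the rotation radius quoted later in the text; the
# target distances below are arbitrary illustration values.
for d in (0.2, 0.5, 2.0, 1000.0):
    print(f"target at {d:7.1f} m -> ideal gain {ideal_avor_gain(0.088, d):.3f}")
```

The required gain approaches 1 for distant targets and grows as the target approaches, which is the qualitative behavior the models listed above attempt to explain.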

In our recent work (Ranjbaran and Galiana, 2013a), we presented a nonlinear bilateral model for AVOR slow phases in the dark. The model is developed based on known realistic physiological mechanisms and anatomical connections including the semicircular canals, the VN and PH neural populations, motoneurons and eye plants (**Figure 2A**). Based on geometrical relations, we showed that combining monocular and vergence angle (eye movements in opposite directions) information is sufficient to locate a target in space relative to the eyes. By assigning properly tuned nonlinear neural computations at the VN level, this slow phase model is capable of replicating target-distance dependent VOR responses that meet geometrical requirements. Nonlinear computation in neural responses, so-called *gain modulation*, exists in many cortical and subcortical areas (Salinas and Sejnowski, 2001). Different mechanisms are proposed to explain them, such as recurrent neural networks (Salinas and Abbott, 1996), changes in the synchrony of inputs to a neuron (Salinas and Sejnowski, 2000) or varying the level of background synaptic input (Chance et al., 2002). In this slow phase model (Ranjbaran and Galiana, 2013a), it is postulated that the sensitivity of the VN cells to vestibular signals modulates nonlinearly with eye position and vergence state, enabling auto-adjustment of the VOR to the set point of both eyes, a great improvement over the initially proposed model that only used ipsilateral monocular signals (Khojasteh and Galiana, 2009b). In addition to the near ideal AVOR gain modulation with target distance, the central premotor responses in that model are also consistent with experimental observations. The model also reproduces experimental observations of the VOR responses with simulated unilateral canal plugging, an emergent property. Due to nonlinearities in the sensors and premotor circuits, the model predicted a disconjugate VOR in the dark. However, the model behavior was previously examined only during high frequency head pulses or low amplitude sinusoidal rotations to remain in the range of feasible eye rotations. In order to have more relevance to the clinical VOR, we now examine the predicted responses to low frequency sinusoidal and large head rotations. This requires the implementation of a fast phase circuit to replicate realistic nystagmus patterns in the AVOR and compare simulations to experimental data.

Classically, the two phases of the VOR are believed to be generated by independent and parallel pathways as originally suggested by Chun and Robinson (1978). Based on this approach, the two phases function independently from each other and a switching strategy implements the timing of changes from one system to the other. Such a black box approach was also employed by other researchers to study VOR slow and fast phase interactions (Winters et al., 1984). However, more recent data demonstrate that slow and fast phases of the VOR share efference copies of eye position from PH and premotor cells in VN (Fukushima and Kaneko, 1995). The first model to include shared connections between the slow and fast circuit was proposed by Galiana (1991), where distinct dynamics for slow and fast phases are generated through structural modulation. In other words, some of the projections during slow phases alter their response characteristics during a fast phase: e.g., position vestibular pause (PVP) and eye head velocity (EHV) cells in the VN pause for ipsilaterally directed fast-phases (McFarland and Fuchs, 1992) and burster cells that are only active during fast phases or saccades play an important role in facilitating response changes on premotor cells (Kitama et al., 1995). It should be noted that structural modulation does not refer to any change in anatomical connectivity, but rather to changes in the available set of active pathways (Galiana, 1991). Another comparable physiologically relevant model was also developed using realistic spiking neurons that replicated VOR nystagmus in the guinea pig with shared connections between the slow and fast circuits (Cartwright et al., 2003). However, neither of these models addresses VOR gain modulation with target distance or vergence interactions. The hybrid model developed by Khojasteh and Galiana (2009b) considered VOR gain modulation; however, they considered a nonlinear block in a feedback loop in their slow phase model to account for VOR gain modulation, which resulted in variable VOR dynamics with context and head velocity profiles. Moreover, in their model only ipsilateral monocular signals are used to modulate VOR gain; thus it is not possible to test VOR gain modulation during simultaneous vergence goals and harmonic vestibular inputs.

**FIGURE 2 | (A)** Right and left eye positions (*ER*, *EL*) for an eccentric target at a location given by (*D*, θ) during head rotation about radius *r*. *I* is the interocular distance. **(B)** Bilateral model of slow phase horizontal AVOR in the dark. **(C)** Model structure for a rightward fast phase. Inactive projections and paused cells are indicated by dashed-gray lines. Long dashed-black lines are the centers that are only active during fast phase and solid black lines are shared projections during both slow phase and rightward fast phase of the VOR. Silenced cross midline projections are not shown for simplicity. PVP cells are located on the opposite side from their physical location to better view the connections crossing the midline. See **Table 1** for projection weights that are shown on the connections. Projection weights that are not shown are assumed to be 1. Parts **(A)** and **(B)** partially adapted from Ranjbaran and Galiana (2013a).

In this paper, the fast phase circuit shares premotor centers with the bilateral nonlinear slow phase circuit previously presented (Ranjbaran and Galiana, 2013a) to form the VOR hybrid model. A novel feature is that for the first time, VOR and vergence interaction is included in a physiologically relevant hybrid model that can replicate experimental observations, i.e., modulation of the VOR gain in response to simultaneous variable vergence goals and vestibular inputs in the dark. A viable switching strategy is also implemented to trigger and stop VOR fast/slow phases, originally suggested by Galiana (1991). Simulation results are presented to evaluate the performance of this hybrid model under different rotation profiles and the results are compared with experimental observations. Such a model allows new interpretations of the underlying mechanisms in the VOR system and explains contradictory observations in experiments. Preliminary results are presented in Ranjbaran and Galiana (2013b).

The remainder of this paper is organized as follows. Materials and Methods in Sections 2.1 and 2.2 review briefly the reference coordinates and the previously developed slow phase model. Sections 2.3 and 2.4 describe the fast phase model and the nystagmus strategy. Simulation results in Section 3 are followed by discussion and concluding remarks in Section 4.

#### **2. MATERIALS AND METHODS**

In developing mathematical representations for the slow and fast phases of the VOR, we are modeling population responses of cells. Therefore, each element of the models represents the average behavior of a particular cell type rather than the response of any individual cell. Moreover, only firing modulation around a population resting rate is considered; biases due to resting rates are not included and a negative firing rate refers to a cell firing below its resting rate. Finally, we wish to represent the simplest model that can replicate general VOR characteristics, i.e., a minimalist approach. Adding more projections and loops between elements in a bilateral model will only affect the current assigned projection weights and not the general characteristics of the model.

#### **2.1. REFERENCE COORDINATES**

For each eye, zero position is defined as looking straight ahead at optical infinity; temporal deviations are considered positive and nasal deviations, negative. Conjugate and vergence eye positions are thus defined as $E_{conj} = \frac{1}{2}(E_R - E_L)$ and $E_{verg} = -(E_R + E_L)$, where $E_R$ and $E_L$ refer to the right and left eye angle, respectively (see **Figure 2A**).
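A short worked example may help with the sign convention. The sketch below applies the two definitions, E_conj = (E_R − E_L)/2 and E_verg = −(E_R + E_L), to two hypothetical eye configurations; the function name and the angle values are illustrative assumptions only.

```python
def conjugate_and_vergence(e_right_deg, e_left_deg):
    """Apply E_conj = (E_R - E_L) / 2 and E_verg = -(E_R + E_L).

    Sign convention from the text: temporal deviation is positive,
    nasal deviation is negative, for each eye separately.
    """
    e_conj = 0.5 * (e_right_deg - e_left_deg)
    e_verg = -(e_right_deg + e_left_deg)
    return e_conj, e_verg


# Both eyes rotated 10 deg to the right: the right eye moves temporally (+),
# the left eye moves nasally (-), giving a pure conjugate movement.
print(conjugate_and_vergence(10.0, -10.0))

# Both eyes rotated 5 deg nasally, giving a pure (convergent) vergence movement.
print(conjugate_and_vergence(-5.0, -5.0))
```

With this convention, rightward conjugate gaze and convergence both come out positive.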

#### **2.2. SLOW PHASE MODEL**

The original nonlinear model for slow phases of the AVOR (**Figure 2B**) is described in detail in Ranjbaran and Galiana (2013a). The input is head velocity, $sH(s)$, sensed by semicircular canals. The canals are modeled as high-pass filters of head velocity, $V(s) = \frac{sT_c}{sT_c + 1}$, followed by a static nonlinearity on sensory modulation. The nonlinear block accounts for the mechanoneural transduction process that causes asymmetric changes in the firing rate on the primary afferents (Goldberg and Fernandez, 1971). The nonlinear block has asymmetric gains around zero ($k_{negative} = 0.4$ and $k_{positive} = 0.6$) and limits the primary afferent output $V_{R,L}$ by saturation (+110 spikes/s) and cutoff levels (−90 spikes/s), appropriate for primary vestibular afferents with the 90 spikes/s resting rate. PVP and EHV cell populations in the VN are distinct in the model and receive sensory projections from the canals as well as efferent copies of eye position from PH. T-II in our model refers to type II neurons in the medial VN (Shimazu and Precht, 1966) that receive projections from the contralateral VN. We assume that contralateral VN projections arise from PVP cells and form a feedback loop between the two sides of the VN (Keller and Precht, 1979). These commissural pathways play an important role in the dynamics of the VOR system (Galiana and Outerbridge, 1984). Premotor PVP and EHV cells project to motor neurons (MN) to drive the eye plants. The eye globe and muscles smooth the motoneural signals. This is represented mathematically as a low-pass transfer function. Therefore, the eye plants as well as neural filters in PH are modeled with first order low pass dynamics as $P(s) = \frac{k_p}{sT + 1}$ and $F(s) = \frac{k_f}{sT + 1}$. Here, we assume that efference copies of the ipsilateral monocular eye position, $\hat{E}_{R,L}$, and of the vergence eye position, $\hat{E}_{verg}$, reach EHV cells and define their sensitivities (gain) to vestibular signals in a nonlinear fashion; i.e., $EHV_{R,L} = g_{R,L}\{\hat{E}_{R,L}, \hat{E}_{verg}\}\,p_2 V_{R,L}$, where $g_{R,L}\{.\}$ is the nonlinear sensitivity of EHV cells to vestibular afferents. These nonlinear computations account for the target distance related gain modulation of the VOR.
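The sensory front end described above (a high-pass canal filter followed by an asymmetric, saturating nonlinearity) can be sketched in discrete time. The forward-Euler discretization, the 1 ms step, the step input, and the function names are assumptions made for this illustration, as is feeding the filter output directly into the nonlinearity without additional scaling; the gains (0.4, 0.6), the limits (+110 and −90 spikes/s) and the 6 s canal time constant are the values quoted in the text.

```python
def canal_highpass(head_velocity, dt=0.001, tc=6.0):
    """Forward-Euler version of V(s) = s*Tc / (s*Tc + 1).

    The high-pass output equals the input minus its first-order
    low-pass state (time constant tc).
    """
    state, out = 0.0, []
    for u in head_velocity:
        out.append(u - state)
        state += dt * (u - state) / tc
    return out


def afferent_nonlinearity(x, k_pos=0.6, k_neg=0.4, upper=110.0, lower=-90.0):
    """Asymmetric gain around zero followed by saturation and cutoff.

    Values are modulations (spikes/s) around the 90 spikes/s resting rate.
    """
    y = k_pos * x if x >= 0.0 else k_neg * x
    return max(lower, min(upper, y))


# Hypothetical input: a 100 deg/s head-velocity step lasting 10 s.
step = [100.0] * 10000
afferent = [afferent_nonlinearity(v) for v in canal_highpass(step)]
print(f"onset: {afferent[0]:.1f}, after one time constant: {afferent[6000]:.1f}")
```

The step response decays with the canal time constant, which is the high-pass behavior that makes sustained rotation progressively less visible to the afferents.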

The equations for conjugate and vergence angles in the model are

$$E_{conj} = \frac{k_p(c-1)\left(g_L\{.\}p_2V_L - g_R\{.\}p_2V_R\right) - ak_pp_1(V_L - V_R)}{2\left((c-1)(Ts+1) + adk_f\right)}\tag{1a}$$

$$E_{verg} = \frac{k_p(c+1)\left(g_L\{.\}p_2V_L + g_R\{.\}p_2V_R\right) - ak_pp_1(V_L + V_R)}{(c+1)(Ts+1) - adk_f}\tag{1b}$$

Modulation of $g_R\{.\}$ and $g_L\{.\}$ changes the context gain but not the system dynamics (poles). In simpler terms, the sensory signals from the semi-circular canals are smoothed and tuned by the brainstem circuits to generate the eye movement, described by nonlinear low pass dynamics. For complete justification of the model elements and connections, see Ranjbaran and Galiana (2013a).

The slow phase model was originally designed to replicate VOR responses in the dark with no visual cue. In order to evaluate the effect of far vs. near target flashes during sinusoidal rotation in the dark, additional inputs to trigger vergence eye movements are required. It is postulated that viewing a flashed target causes signals to be relayed to the neural filters in the PH from any cortical or brainstem center coding visuomotor error commands, such as superior colliculus (SC) (Cova and Galiana, 1996; Green, 2000). We define these visual error signals ($Ve_R$ and $Ve_L$) as additional input signals to the PH (**Figure 2B**). In the absence of head velocity input, i.e., $V_{R,L} = 0$, the conjugate and vergence responses to $Ve_{R,L}$ are obtained as

$$E_{conj} = \frac{ak_pk_fdq(Ve_R - Ve_L)}{2(Ts+1)\left((c-1)(Ts+1) + adk_f\right)}\tag{2a}$$

$$E_{verg} = \frac{-ak_pk_fdq(Ve_R + Ve_L)}{(Ts+1)\left((c+1)(Ts+1) - adk_f\right)}\tag{2b}$$

Assigning identical visuomotor error commands, i.e., $Ve_R = Ve_L$, results in pure vergence with no conjugate response due to the bilateral structure of the model. It should be noted that we are not including light conditions or continuously visible targets in the dark since the VOR dynamics will change as additional visual loops are added to the circuit (Green, 2000). This is a question for future studies.

**Table 1 | Numerical values of the model parameters.**

Two sets of parameters are provided in **Table 1** that simulate two different time constants for the conjugate slow phase system obtained from Equation (1A), i.e., $T_{conj} = \frac{T(c-1)}{adk_f + c - 1}$ equals 5 s or 1.2 s and $T_{verg} = \frac{T(c+1)}{-adk_f + c + 1}$ from Equation (1B) equals 0.4 s with both parameter sets. By simply changing projection weights or filter gains, e.g., $c$ and $d$ or $k_f$, new time constants for the model can be obtained. However, this then requires retuning of the nonlinear surfaces at the EHVs. The canal time constant is set to $T_c = 6$ s. Nonlinear surfaces assigned to the EHV cells (Ranjbaran and Galiana, 2013a) are provided in Appendix 5.1. Here the interocular distance is $I = 6$ cm and the axis of rotation is $r = r_{head} = 8.8$ cm.
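The two time-constant expressions follow from writing each denominator of Equations (1a) and (1b) in the form K(τs + 1). The sketch below evaluates them for one parameter set; the numeric values are hypothetical, chosen only so the arithmetic is easy to follow, and are not the entries of Table 1. The function name is likewise an assumption.

```python
def slow_phase_time_constants(T, a, c, d, k_f):
    """Time constants read off the denominators of Equations (1a) and (1b).

    T_conj = T (c - 1) / (a d k_f + c - 1)
    T_verg = T (c + 1) / (-a d k_f + c + 1)
    """
    t_conj = T * (c - 1.0) / (a * d * k_f + c - 1.0)
    t_verg = T * (c + 1.0) / (-a * d * k_f + c + 1.0)
    return t_conj, t_verg


# Hypothetical values for illustration only; they are NOT taken from Table 1.
t_conj, t_verg = slow_phase_time_constants(T=1.0, a=1.0, c=0.5, d=1.0, k_f=0.4)
print(f"T_conj = {t_conj:.3f} s, T_verg = {t_verg:.3f} s")
```

With these illustrative numbers the conjugate time constant is several times longer than the vergence one, showing how the same weights act in opposite directions on the two modes of the bilateral loop.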

#### **2.3. FAST PHASE MODEL**

The model structure for a rightward fast phase circuit is shown in **Figure 2C**. A leftward fast phase is generated with a mirror image of this model. Similar to the slow phase circuit, only modulations in cell populations are provided. Summing junctions are linear except for the nonlinear EHV cells (Ranjbaran and Galiana, 2013a). The bilateral structure of the slow phase system with reciprocal signals across the midline switches to a unilateral structure during fast phases. This is the result of silenced VN cells such as the ipsilateral PVPs and EHVs as well as the cross-midline projections during a fast phase (gray dashed lines and circles in **Figure 2C**; cross midline projections are not shown for simplicity). Omnipause neurons (OPN) and burster-driving neurons (BDN) as well as excitatory and inhibitory burst neurons (EBN and IBN) are included in the fast phase circuit (long dashed black lines and circles in **Figure 2C**). OPNs are located near the midline of the pons and act as triggers for the initiation of fast eye movements in all directions (Scudder et al., 2002). OPNs discharge at a high firing rate and exert a tonic inhibition on premotor BNs during slow eye movements and fixation. Prior to a fast eye movement or saccade, OPNs cease firing, remain silent during the saccade, and resume firing as the saccade ends (Yoshida et al., 1999). OPNs receive projections from the SC as well as projections from cells in the medial VN (Ito et al., 1986). In our model, it is assumed that the projections from the medial VN, specifically PVPs, to the OPNs play a role in triggering and ending the fast phases (see Section 2.4). BDNs are located below the PH and are found to be excited by contralateral horizontal head rotation and they send projections to contralateral BNs (Kitama et al., 1995). We assume BDN excitation is a result of an excitatory vestibular drive that comes from contralateral vestibular-only (VO) cells. BDNs also modulate with a PH response (eye position efference copies) to close the loop and shape bursts during fast phases (Kitama et al., 1995). Therefore, the output signal from left BDNs during rightward fast phase is $BDN_L = m \times V_R - \alpha \times \hat{E}_R$, where α is the projection weight from the PH to contralateral BDN (connections are simplified in **Figure 2C**). BDNs project to contralateral EBNs and IBNs that are located in the reticular formation and have monosynaptic connections to abducens motoneurons. EBNs send excitatory projections to ipsilateral MNs while IBNs with similar firing patterns send inhibitory projections to contralateral MNs. In our model, the same projections to MNs are sent to PH neurons that produce efferent copies of eye position for VN cells (Fukushima and Kaneko, 1995).

In order to achieve faster dynamics in the fast phase circuit, it is assumed that the feedback loops including PVP and EHV cells between the VN and PH nuclei change their net sensitivity direction as originally suggested in Galiana (1991). This can be a result of competition between parallel inhibitory and excitatory projections from PVP and EHV cells whose balance is modified by EBN/IBN effects. Projections from burst neurons to VN are studied in Igusa et al. (1980) and have been used in former models and studies (Curthoys, 2002; Cartwright et al., 2003). We assume that the population response of the VN cells is a mixed combination of individual inhibitory and excitatory projections. During slow phases, the population response is dominated by the excitatory PVPs and inhibitory EHVs as the burst neurons are silenced. During a fast phase, however, as the burst neurons become dis-inhibited, along with silencing of the ipsilateral PVPs and EHVs, the inhibitory projections dominate the response of PVPs and excitatory projections dominate the response of EHVs; thus, the net projections from contralateral PVPs and EHV populations appear to change their direction of sensitivity. Similar incrementing-decrementing behavior is also observed on the BDN activity during slow-fast intervals (Ohki et al., 1988). The change in effective connectivity assumed here does not conflict with basic knowledge about the neural firing patterns. Thus, in our model during a fast phase, PVPs and EHVs contralateral to the fast phase direction change their sensitivity and their activity profiles decay.

As in the slow phase model, the eye plants and neural filters in the PH remain first-order low-pass systems, $P_f(s) = \frac{k_{pf}}{Ts + 1}$ and $F_f(s) = \frac{k_{ff}}{Ts + 1}$. EHV cells are silenced during ipsilateral fast phases and are active during contralateral fast phases with the same nonlinear sensitivity to the ipsilateral canal signal, i.e., $g(\cdot)$ (Ranjbaran and Galiana, 2013a). Due to the unilateral structure of the fast phase of the VOR, distinct monocular dynamics for eye responses are obtained during fast phases, i.e., during a rightward fast phase (see Appendix 5.2)

$$E_R = \frac{k_{pf}\left(-a p_1 V_L + m b_E V_R\right)}{Ts + 1 + k_{ff}\left(ad + b_E\alpha\right)} \tag{3a}$$

$$E_L = \frac{k_{pf}\left(\beta_1 V_L + \beta_2 V_R\right)}{(1 + Ts)\left(Ts + 1 + k_{ff}(ad + b_E\alpha)\right)} \tag{3b}$$

where

$$\begin{cases} \beta_1 = \left(1 + Ts + k_{ff}(ad + b_E\alpha)\right) g(\cdot)\, p_2 - b_I \alpha k_{ff} a p_1 \\ \beta_2 = -\left(1 + Ts + k_{ff}(ad + b_E\alpha)\right) m b_I + b_I \alpha k_{ff} m b_E \end{cases} \tag{4}$$

During a leftward fast phase, the dynamic equations for *ER* and *EL* are obtained by switching the *R* and *L* subscripts in Equations (3A,B). The above equations imply that monocular eye trajectories have different dynamics during rightward and leftward fast eye movements.

The model parameters (**Table 1**) are selected to preserve the stability of the fast phase system with a small time constant.
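To illustrate why the extra feedback term in Equation (3a) gives faster dynamics, the sketch below reads the pole of its denominator, Ts + 1 + k_ff(ad + b_E α), as an effective time constant T / (1 + k_ff(ad + b_E α)). The function name and the numerical values are hypothetical and are not the entries of **Table 1**.

```python
def fast_phase_dynamics(T, k_pf, k_ff, a, d, b_E, alpha):
    """Effective time constant and static gain scale implied by Eq. (3a).

    The denominator T*s + 1 + K, with K = k_ff*(a*d + b_E*alpha),
    has a single pole at s = -(1 + K)/T.
    """
    loop_gain = k_ff * (a * d + b_E * alpha)
    if 1.0 + loop_gain <= 0.0:
        # The pole would lie in the right half-plane (or at the origin).
        raise ValueError("parameters give an unstable fast phase system")
    effective_T = T / (1.0 + loop_gain)
    static_gain = k_pf / (1.0 + loop_gain)
    return effective_T, static_gain


# Hypothetical parameter values, chosen only for illustration.
t_eff, gain = fast_phase_dynamics(T=0.4, k_pf=1.0, k_ff=1.0,
                                  a=1.0, d=2.0, b_E=1.0, alpha=1.0)
print(f"effective time constant: {t_eff:.3f} s, static gain scale: {gain:.3f}")
```

Under this reading, any positive value of k_ff(ad + b_E α) shortens the time constant relative to T, and stability requires 1 + k_ff(ad + b_E α) > 0.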

#### **2.4. STRATEGY FOR NYSTAGMUS**

So far, the slow and fast phase models with shared connections have been described. However, an important feature of the VOR is the switching mechanism between these two phases. The linear range of the VOR is improved by nystagmus as eye excursions are kept inside a reasonable limit. Therefore, a proposed switching strategy is based on limiting eye deviations by avoiding cut-off and saturation limits in the responses of premotor neurons (Galiana, 1991). It is known that the activity of OPNs acts as a logical circuit to trigger and end a fast phase. In this model, the OPN circuit constantly monitors the output of PVPs. If the firing rate of PVPs on one side reaches a threshold (ON-Th spikes/s), a fast phase is triggered ipsilateral to the PVPs' side. During the fast phase, the ipsilateral PVPs and EHVs are silenced and the contralateral PVPs and EHVs change direction and decay as explained in the fast phase circuit structural modulation. The fast phase ends as the firing rate of the contralateral PVPs decays to a second threshold (OFF-Th spikes/s). The fast phase intervals are therefore generated in the same direction as head movement, and eye position signals are kept below their physical limits. ON-Th and OFF-Th control the frequency of fast phases and their duration. For instance, with fixed model parameters and dynamics, increasing ON-Th results in later triggering of fast phases and lowering the OFF-Th leads to longer fast intervals. We have imposed a refractory period of 20 ms in the model after switching back to a slow phase to enforce a minimum time interval before triggering a new fast phase.
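The switching rule described above can be sketched as a small state machine. This is a minimal illustration, not the authors' Simulink implementation: the class name, the mode labels, and the threshold values in the usage lines are hypothetical, and the 20 ms refractory period is counted as 20 steps of an assumed 1 ms time step.

```python
from dataclasses import dataclass


@dataclass
class NystagmusSwitch:
    on_th: float                  # ON-Th (spikes/s): triggers a fast phase
    off_th: float                 # OFF-Th (spikes/s): ends a fast phase
    refractory_steps: int = 20    # 20 ms at an assumed 1 ms step
    mode: str = "slow"            # "slow", "fast_R" or "fast_L"
    steps_since_fast: int = 10**9  # large value: no refractoriness at start

    def update(self, pvp_right: float, pvp_left: float) -> str:
        """Advance one time step given the current PVP firing rates."""
        if self.mode == "slow":
            if self.steps_since_fast < self.refractory_steps:
                # Still refractory: no new fast phase may be triggered.
                self.steps_since_fast += 1
            elif pvp_right >= self.on_th:
                self.mode = "fast_R"   # fast phase ipsilateral to right PVPs
            elif pvp_left >= self.on_th:
                self.mode = "fast_L"
        else:
            # Ipsilateral PVPs are silenced; watch the contralateral side.
            contra = pvp_left if self.mode == "fast_R" else pvp_right
            if contra <= self.off_th:
                self.mode = "slow"
                self.steps_since_fast = 0
        return self.mode


# Hypothetical thresholds, for illustration only.
switch = NystagmusSwitch(on_th=100.0, off_th=20.0)
print(switch.update(pvp_right=120.0, pvp_left=10.0))
```

In this sketch, raising `on_th` delays triggering and lowering `off_th` prolongs the fast phase, which mirrors the roles of ON-Th and OFF-Th described in the text.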

The performance of the model under different conditions is presented next. All simulations were performed using MATLAB Simulink (The MathWorks Inc., USA), with a first-order Euler approximation and a step size of 1 ms.
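As a rough illustration of a first-order Euler scheme with a 1 ms step, the sketch below integrates a single first-order low-pass element of the form k/(Ts + 1), the form used for the eye plant and PH filters. It is written in Python rather than Simulink, and the function name, gain, and time constant are hypothetical.

```python
import math


def euler_lowpass(u, k, T, dt=0.001, x0=0.0):
    """Forward-Euler integration of T*dx/dt = -x + k*u(t).

    `u` is a sequence of input samples spaced `dt` seconds apart;
    the returned list holds the state at each sample time.
    """
    x = x0
    out = []
    for u_n in u:
        out.append(x)
        x += dt * (k * u_n - x) / T
    return out


# Unit step input for 5 s; T = 0.3 s is a hypothetical time constant.
response = euler_lowpass([1.0] * 5000, k=1.0, T=0.3)
exact_at_T = 1.0 - math.exp(-1.0)  # analytic step response at t = T
print(f"Euler at t = T: {response[300]:.4f}, analytic: {exact_at_T:.4f}")
```

With a 1 ms step the Euler estimate stays close to the analytic step response as long as the time constants are much longer than the step size.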

### **3. RESULTS**

The model is designed to simulate the human AVOR responses during yaw rotations around a vertical axis centered on the head. We focus here on the global, behavioral aspects of the AVOR model rather than on individual components. PVP and EHV firing behavior was previously addressed in Ranjbaran and Galiana (2013a).

#### **3.1. RESPONSE TO SINUSOIDAL ROTATION IN DARKNESS**

**Figure 3** depicts the response of the hybrid model with *Tconj* = 5 s at two different rotation frequencies: 1/6 Hz (A,B) and 1/2 Hz (C,D) with velocity peaks of 180 degree/s. As in experimental observations, the number of fast phases *per cycle* decreases for higher frequency sinusoidal head rotations. In other words, fast phases are triggered more often during low frequency head rotations. This is due to the band pass characteristics of the central neurons in the VOR pathway. At lower frequencies the gain of central neurons is higher which increases the possibility of exceeding their firing thresholds and triggering a fast phase. Furthermore, at a given rotation frequency, fast phases are more frequent at the higher head velocity levels; consistent with experimental observations (Buettner et al., 1978).

We have also compared our model performance with *Tconj* = 1.2 s in response to a specific rotation profile (180 degree/s at 1/6 Hz) where binocular records are available in our archive; for details of the experiment see Khojasteh and Galiana (2009a). **Figure 4** depicts the recorded conjugate eye position (A) and eye velocity (B) (gray), as well as our model responses to this rotation stimulus (black). Clearly, general nystagmus characteristics in the simulation and data are similar: the amplitude of conjugate eye position and the number and timing of fast phases all suggest that the switching mechanism is plausible.

Given nonlinear canals and nonlinear premotor (EHV) computations, eye movements are disconjugate and a vergence component is now present in the response of our hybrid model to head perturbations. This vergence response shows a carrier frequency that is twice that of the stimulus, as also seen in the experimental data (**Figure 4C**). The peak-to-peak amplitude of the vergence component is greater than EOG resolution limits, suggesting that it cannot be a result of inappropriate calibration in the binocular recording. Instead, as predicted from the nonlinear model, this vergence component can be a direct result of nonlinearities at the premotor and sensory levels. It should be noted that here we did not attempt to identify a system directly from eye recordings but rather compare the general characteristics of recorded AVOR and our model response.

#### **3.2. CONTEXT DEPENDENT RESPONSE TO SINUSOIDAL ROTATION**

The experimental work of Paige et al. (1998) studied the role of fixation distance in adjusting the gain of the VOR during *sinusoidal* angular head rotation. Here, we will test the response of our model during sinusoidal rotations while fixating on a near or far flashed target. Vergence movement as a result of fixating a target in the model is obtained by assigning proper visuomotor error commands, i.e., *VeR*,*L*.

Starting with zero initial conditions (i.e., looking straight ahead at optical infinity), *ER*(0) = *EL*(0) = 0, *VeR* = *VeL* are set to replicate a flashed target in the dark appearing in the sagittal plane between the eyes. This flashed target appears 5 s after the start of head rotation, at *D* = 43 cm from the eyes requiring 8 degree of vergence (given the interocular distance of *I* = 6 cm; see Ranjbaran and Galiana, 2013a for geometrical relations). At *t* = 10 s, a new flashed target appears at *D* = 21 cm that requires 16 degree vergence. At *t* = 15 s a far flashed target appears to reset vergence to zero.

The simulation results are obtained with model parameters where *Tconj* = 1.2 s and *Tverg* = 0.4 s (see **Table 1**). In the absence of head rotation, there is no conjugate eye movement, only a vergence response (**Figure 5**). During sinusoidal head rotation at 0.5 Hz with 120 degree/s peak velocity and the same visuomotor inputs, the conjugate and vergence AVOR responses are appropriate (**Figure 6**). The gain of the VOR (peak of envelope of eye velocity/peak head velocity) is near unity during the first 5 s with no visuomotor response, as expected. During the *t* = 5 → 10 s interval, the first flashed target causes a VOR gain increase from unity to compensate for the visuomotor command and the vergence movement. As the second target appears at *t* = 10 s, a still larger vergence is required and therefore the VOR gain increases further. As the final far target appears at *t* = 15 s, the vergence position decays to zero (looking far ahead) and the AVOR gain decreases smoothly to its default of unity. This is in agreement with the experimental observations (Paige et al., 1998; Viirre et al., 1986). Note that the pure vergence response seen in **Figure 5** is combined with vergence modulations during each fast phase of nystagmus in **Figure 6**: this has been seen by Sylvestre et al. (2002) during visual disjunctive saccades, an emerging property.

We also tested the effect of head rotation frequency on the gain of the VOR while fixating central flashed targets at different depths. According to observations by Paige et al. (1998), the AVOR gain increases with rotation frequencies while fixating an *imaginary* earth fixed target in darkness. Moreover, in plots of the resulting VOR against concurrent vergence, they report that both the slope, and the intercept of this graph at 0 vergence, increase with rotation frequency (see Figure 8 in Paige et al., 1998). We emphasize the context of imaginary target, since our model does not include vision related loops in the light or constant visual input, but rather a vergence cue given by a flashed target.

**[Figure caption, truncated]** … eye position (degree) and visuomotor command (*VeR* + *VeL*, gray). Vertical dashed lines mark the time of change in the visuomotor command.

To replicate this experiment, the hybrid model with *Tconj* = 1.2 s is simulated with sinusoidal head rotation in the range of 1/6–4 Hz, with 30 degree/s peak head velocity while fixating central flashed targets at one of these distances: *D* = [2000 85.9 42.9 28.54 21.34 17.01] cm. Given the interocular distance and the rotation radius, the vergence angles for these target distances are: [0 4 8 12 16 20] degree, respectively. **Figure 7** describes the effects of rotation frequency. The AVOR gain during low frequency rotation, 0.5 Hz, is closer to the ideal gains obtained from geometrical relations (for detail see Ranjbaran and Galiana, 2013a) compared to higher frequency rotation at 4 Hz (**Figure 7A**). Moreover, the slope and intercept plots (**Figures 7B,C**) show an increase with increasing frequency. These emerging results are in agreement with experimental observations of Paige et al. (1998).
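The correspondence between target distance and required vergence quoted above can be checked with a simple geometric sketch. It assumes a target on the midline and symmetric convergence, so that the vergence angle is 2·atan(I / 2D) with I = 6 cm; this formula is an assumption made here for illustration, it ignores the rotation radius, and the full geometrical relations are those in Ranjbaran and Galiana (2013a). The function name is hypothetical.

```python
import math


def vergence_deg(distance_cm, interocular_cm=6.0):
    """Vergence angle (degree) needed to fixate a midline target.

    Assumes symmetric convergence: each eye rotates by
    atan((I/2) / D) toward the midline.
    """
    return math.degrees(2.0 * math.atan(interocular_cm / (2.0 * distance_cm)))


# Target distances (cm) listed in the text.
for d in [2000, 85.9, 42.9, 28.54, 21.34, 17.01]:
    print(f"D = {d:7.2f} cm -> vergence = {vergence_deg(d):5.2f} degree")
```

Under this assumption the listed distances give vergence angles within a few hundredths of a degree of 4, 8, 12, 16 and 20 degree, and the 2000 cm target requires less than 0.2 degree.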

#### **3.3. DO CHANGES IN VOR ANTICIPATE CHANGES IN VERGENCE ANGLE?**

The work of Snyder and King (1992) investigated the contribution of the vergence angle to VOR performance. In their experiments, they measured eye velocity in rotating monkeys while the vergence angle was required to change by flashing targets at different distances. Their results demonstrated that the VOR gain changed toward its correct value for the new target distance, before the correct vergence was acquired. They concluded that VOR gain modulation by target distance anticipates changes in vergence angle; thus, the binocular vergence angle alone, derived either from proprioception or from efference copy of motor command, is not sufficient to drive VOR modulation. They suggested that the transient discharge of gaze velocity Purkinje cells in the flocculus, associated with changes in vergence angle early enough, could drive VOR gain modulation with target distance (Snyder and King, 1992).

Here, we replicate their experiment with our nonlinear model. The visuomotor inputs, *VeR*,*L*, are applied after *t* = 100 ms such that a vergence movement is generated from 0 degree to 8 degree. This emulates a subject initially fixating on a far target straight ahead, and then fixating on a near central target at *D* = 43 cm. During this vergence movement lasting ≈ 2.5 s in our model, a vestibular input, i.e., a pulse of head velocity (accelerating at 500 degree/s<sup>2</sup> to 30 degree/s, maintained for 40 ms and then decelerated) is added to the model and the peak eye velocity response is measured. As done by Snyder and King (1992), this vestibular input is applied at different times from 0 to 2.5 s in steps of 50 ms or 100 ms during the time course of the vergence movement (one pulse in each trace). Given the brief and small head perturbation, no fast phase is triggered. **Figure 8** shows that the velocity of the eye movement evoked by VOR changed smoothly over the course of the 8 degree convergence. Both VOR peak eye velocity and vergence angle were normalized and replotted as functions of time to resemble the experimental results obtained by Snyder and King (1992) (their **Figure 3**). Our model simulations replicate their observations: VOR gain changes lead the vergence angle changes, and the change in VOR was completed before the vergence angle reached its goal.
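The head velocity pulse described above can be sketched as a trapezoid. The acceleration (500 degree/s²), peak (30 degree/s) and hold time (40 ms) are from the text; the deceleration rate is not stated, so a symmetric ramp is assumed here, and the function name is hypothetical.

```python
def head_velocity_pulse(t, onset, accel=500.0, peak=30.0, hold=0.040):
    """Head velocity (degree/s) at time t (s) for a pulse starting at `onset`.

    Ramp up at `accel` to `peak`, hold for `hold` seconds, then ramp
    down at the same rate (the symmetric ramp is an assumption).
    """
    ramp = peak / accel            # 30 / 500 = 0.06 s
    tau = t - onset
    if tau <= 0.0:
        return 0.0
    if tau < ramp:
        return accel * tau
    if tau <= ramp + hold:
        return peak
    if tau < 2.0 * ramp + hold:
        return peak - accel * (tau - ramp - hold)
    return 0.0


# One pulse per trace, with onsets from 0 to 2.5 s in 50 ms steps.
onsets = [i * 0.05 for i in range(51)]
dt = 0.001
trace = [head_velocity_pulse(n * dt, onsets[10]) for n in range(3000)]
print(f"{len(onsets)} onsets; peak of example trace = {max(trace):.1f} degree/s")
```

With these values each pulse lasts 0.16 s in total, which is brief compared with the ≈ 2.5 s vergence movement.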

Contrary to the suggestion by Snyder and King (1992), here, this observation is a result of VOR gain modulation by the *efference copies* of the vergence and monocular angles at the premotor level. In our nonlinear model, the effect of the required change in vergence appears in the efference copies $\hat{E}_{R,L}$ before $E_{R,L}$ via visuomotor projections to the PH. Consequently, the projections to the EHVs from the PH carry early vergence information to modulate the VOR gain, before the vergence movement actually begins or completes. It may appear that VOR changes anticipate vergence angle changes, but instead we suggest that the efference copies of the vergence angle modulating the VOR gain carry the vergence command before vergence responses appear at the behavioral level.

#### **3.4. RESPONSE TO STEPS IN HEAD VELOCITY**

Raphan et al. (1979) explored extensively the characteristics of VOR nystagmus during and after steps of passive head velocity, both in the dark and in the light. Eye velocity profiles in man and monkey decay to zero in the dark if the rotation interval exceeds 2–3 times the time constant of the canals. This is expected from a high-pass sensor. The simulations below focus on nystagmus in the dark, the context of our current model, using steps of head velocity of variable amplitude and duration.

#### *3.4.1. Per and post rotatory nystagmus*

The simulations in **Figure 9** replicate the main characteristics of VOR nystagmus during and after steps of head velocity. First, for long 45 s steps of rotation, it is clear that the post-rotatory nystagmus velocity appears equal in magnitude but opposite in direction to that during the rotation. In addition, the nystagmus velocity peak scales in both per- and post-rotation with the amplitude of the head velocity (**Figures 9B,C**). Second, for short duration rotations (**Figures 9D,E**), the initial post-rotatory nystagmus in the opposite direction is reduced in magnitude as the interval of rotation shortens. This is expected since the high-pass dynamics of the sensor will cause the *change* of nystagmus velocity to remain constant, if measured from the current eye velocity at the moment rotation ceases (see arrow in **Figure 9D,E**). As found in experimental data (Raphan et al., 1979), the frequency of fast phases in all cases is also modulated by the concurrent level of slow phase eye velocity.
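The argument about the high-pass sensor can be illustrated with a linear first-order high-pass element, Tc·s/(Tc·s + 1), driven by a velocity step of finite duration. This is a simplification assumed here for illustration: the canals in the hybrid model are nonlinear, and the sketch gives the sensor signal rather than the eye velocity. The 6 s time constant is the canal value quoted in Section 3.4.2; the function name is hypothetical.

```python
import math


def canal_step_response(t, amplitude, duration, Tc=6.0):
    """Output of a linear high-pass canal for a head velocity step.

    The step of size `amplitude` starts at t = 0 and ends at t = `duration`.
    """
    if t < 0.0:
        return 0.0
    y = amplitude * math.exp(-t / Tc)
    if t >= duration:
        # Stopping the rotation is a step of opposite sign.
        y -= amplitude * math.exp(-(t - duration) / Tc)
    return y


for duration in (45.0, 10.0, 3.0):
    before = canal_step_response(duration - 1e-9, 60.0, duration)
    after = canal_step_response(duration, 60.0, duration)
    print(f"{duration:4.0f} s step: before stop {before:6.2f}, "
          f"after stop {after:7.2f}, change {after - before:7.2f}")
```

In this linear sketch the change at the end of rotation always equals the step amplitude, so the post-rotatory response is nearly equal and opposite after a long rotation and smaller after a short one.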

#### *3.4.2. Dynamics of nystagmus decay*

Raphan et al. (1979) also studied the decay rate of nystagmus velocity. As commonly done in the literature, they evaluated the dynamics of the VOR slow phase velocity by removing fast phases and replacing the gaps by interpolation: the reconstructed *envelope* was deemed to represent VOR dynamics, fitted with exponentials. The main result is that nystagmus decay appears much slower than the underlying slow-phase system (≈ 15 s vs. 4–6 s canal), hence the term *velocity storage* in the VOR coined by Raphan et al. However, an envelope fit ignores the contribution of initial conditions introduced at the start of each slow phase segment, biasing estimates of the slow phase time constant. To illustrate, **Figure 10** provides the hybrid model response to a step of −250 degree/s in head velocity, with a canal time constant of 6 s and a conjugate slow-phase time constant of 1.2 s (see **Table 1**). The slow phase central time constant is intentionally low to highlight the effects, but these hold whenever there is nystagmus, especially at very low frequencies like steps. In **Figure 10A**, the *envelope* of slow phase velocities decays with a time constant of 5.55 s, despite slow phase central dynamics of 1.2 s. Such a response is often seen in unilateral vestibular patients. As discussed for sinusoidal rotations in Galiana (1991), ignoring the effect of nystagmus results in biasing the estimated conjugate VOR dynamics. There is a plateau-like response in the initial nystagmus velocity also seen by Raphan et al. (1979) at higher head speeds. Here it is caused by nonlinearities in the canal sensitivity, now exceeded by the input range. In addition, with the hybrid model, we predict the appearance of vergence nystagmus (**Figure 10B**) during step rotations in the dark. 
In order to extend *velocity storage* beyond both canal and central time constants, it is sufficient to incorporate the resting rates of sensors and central components (Galiana, 1991); at this time we only include modulations at all sites about resting rates, so only the central time constants can be *masked* during nystagmus.

## **4. DISCUSSION**

This paper introduces a hybrid nonlinear model to replicate human AVOR nystagmus in the dark. This bilateral model includes nonlinear sensors as well as nonlinear surfaces assigned to EHV cells to account for the target distance dependence of the VOR. It is shown that vergence can appear with both vestibular and visual depth stimuli. A physiologically relevant fast phase circuit and a nystagmus strategy are embedded to generate nystagmus automatically and extend the functional range of the AVOR. In our former work (Ranjbaran and Galiana, 2013a), a comprehensive study was performed on the slow phase aspects of the VOR and their characteristics under different conditions. Here, we evaluated the performance of the hybrid model through simulations in response to passive head rotations with or without flashed visual goals. For the first time, a hybrid model based on known brainstem connections replicates different VOR characteristics, consistent with experimental observations. The comparison between the hybrid model simulation responses and experimental observations is qualitative at this stage, showing the capacity to explain reported behavior. Clearly, experiments on individual subjects result in distinct numerical responses, which require retuning of the parameters in the hybrid model.

**[Figure caption, truncated]** … to constant head velocity rotation. **(A)** Conjugate position (degree) and **(B)** conjugate vel. (degree/s) for 45 s head rotation at −60 degree/s. **(C)** Same as **(B)**, but doubling the speed of head rotation to −120 … now with short intervals of 10 and 3 s, respectively. **(A)** black: conjugate position and gray: negative head vel./3. **(B–E)** black: conjugate vel., gray: negative head vel.

Simulated nystagmus patterns replicate reported experimental observations (**Figure 3**) and trajectories that resemble human data (**Figure 4**). This suggests that the switching mechanism in our model is both plausible and testable with lesions and new inputs.

### **4.1. DISCONJUGACY OF THE AVOR:**

Contrary to common belief, the AVOR is not purely conjugate in the dark; binocular recordings during sinusoidal rotations in darkness confirmed a vergence component in the AVOR (Khojasteh and Galiana, 2009a). In our model, this vergence component is a result of nonlinear sensors as well as nonlinear premotor cell responses that account for context dependent VOR responses. This suggests that local nonlinearities in the VOR circuit are the underlying mechanism for the disconjugate VOR in the dark.

### **4.2. VESTIBULAR VERGENCE INTERACTIONS:**

In addition to the vestibular input, i.e., head movement, visuomotor commands are included to enable vergence movements in response to *flashed* targets in the dark. Simulations show the effect of vergence goals during sinusoidal rotations: they confirm that the context dependency of the AVOR gain in the model is preserved with nystagmus and variable vergence goals. AVOR gain dependency on rotation frequency is also in agreement with experimental observations (Paige et al., 1998), an emerging property.

It appears experimentally that AVOR gain modulation with target distance precedes changes in vergence (Snyder and King, 1992), which questions vergence itself as the drive for AVOR gain modulation. We replicated this experiment (Snyder and King, 1992) with our model and found the same result: AVOR gain changes anticipate or precede the vergence profile (**Figure 8**). We conclude that AVOR gain modulation using efference copies of the vergence angle can support this anticipatory VOR modulation, even in the dark: we postulate that efference copies with visuomotor inputs affect EHV cells immediately, and so modulate the AVOR gain before the behavioral vergence is fully executed.

### **4.3. AVOR DYNAMICS DURING STEPS OF HEAD VELOCITY:**

Per and post rotatory nystagmus characteristics are influenced by the peak velocity and duration of the stimuli. The proposed model replicates these data patterns (**Figure 9**). Given the switching aspect of nystagmus, we demonstrate that *envelope* measures provide biased estimates of slow-phase dynamics (**Figure 10**). So an important goal is to develop algorithms that provide unbiased estimates of nystagmus dynamics.

### **4.4. TESTABLE PREDICTIONS:**

The goal of modeling a sensory-motor system is to reveal potential strategies in the brain to control motion and to gain insight for clinical applications. Since the modeling results are fully consistent with available experimental data, the model structure warrants further study. Several assumptions or predictions remain to be verified:

(i) Assumptions regarding anatomy of the VOR:


• The presence of premotor (e.g., PVP) projections to OPN cells, to enable the proposed switching strategy.

(ii) Predictions regarding VOR dynamics and behavior:


In summary, we explored AVOR online gain modulation with target distance by introducing a physiologically relevant hybrid nonlinear model. We proposed local nonlinear computations at VN levels to account for the gain modulation of the VOR with context. It is likely that this hypothesis could also support long term adaptation or lesion compensation in the VOR. Furthermore, this hybrid model, given the realistic aspect of its simulated data, can also be used to generate virtual data for validation of algorithms that classify nystagmus segments and identify reflex dynamics.

# **FUNDING**

This work has been supported by Canadian Institutes of Health Research (CIHR), Natural Sciences and Engineering Research Council of Canada (NSERC) and Fonds de recherche du Québec (FQRNT).

# **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 14 July 2014; accepted: 14 January 2015; published online: 09 February 2015. Citation: Ranjbaran M and Galiana HL (2015) Hybrid model of the context dependent vestibulo-ocular reflex: implications for vergence-version interactions. Front. Comput. Neurosci. 9:6. doi: 10.3389/fncom.2015.00006*

*This article was submitted to the journal Frontiers in Computational Neuroscience. Copyright © 2015 Ranjbaran and Galiana. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# **APPENDIX**

#### **5.1. NONLINEAR SURFACE**

The nonlinear surface assigned to the EHV cells, $g_{R,L}$, is obtained as a 3rd-order polynomial of $x = \hat{E}_R$ for the right EHV and $x = \hat{E}_L$ for the left EHV, and a 1st-order polynomial of $y = \hat{E}_{verg}$, according to the following equation:

$$g_{R,L} = m_0 + m_1 x + m_2 y + m_3 x^2 + m_4 x y + m_5 x^3 + m_6 x^2 y \tag{A1}$$

For the model with *Tconj* = 1.2 s, the coefficients in Equation (A1) are: $m_0 = 2.59$, $m_1 = -8.051 \times 10^{-5}$, $m_2 = 0.12$, $m_3 = -4.8 \times 10^{-6}$, $m_4 = 1.52 \times 10^{-5}$, $m_5 = -4.12 \times 10^{-6}$, $m_6 = -1.19 \times 10^{-7}$.

For the model with *Tconj* = 5 s, these coefficients are re-tuned to: $m_0 = 1.68$, $m_1 = -6.43 \times 10^{-5}$, $m_2 = 0.09$, $m_3 = -3.84 \times 10^{-6}$, $m_4 = -1.21 \times 10^{-5}$, $m_5 = -3.29 \times 10^{-6}$, $m_6 = 9.54 \times 10^{-8}$.
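Equation (A1) can be evaluated directly from the two coefficient sets listed above. The sketch below is only an illustration: the function and constant names are hypothetical, and the inputs are assumed to be in the same units as the efference copies used in the model.

```python
# Coefficients m0..m6 of Equation (A1), as listed in the text.
COEFFS_TCONJ_1P2 = (2.59, -8.051e-5, 0.12, -4.8e-6, 1.52e-5, -4.12e-6, -1.19e-7)
COEFFS_TCONJ_5 = (1.68, -6.43e-5, 0.09, -3.84e-6, -1.21e-5, -3.29e-6, 9.54e-8)


def ehv_gain(x, y, m):
    """Evaluate the EHV nonlinear surface g(x, y) of Equation (A1).

    x: efference copy of the ipsilateral eye position
    y: efference copy of the vergence angle
    m: sequence of the seven coefficients m0..m6
    """
    m0, m1, m2, m3, m4, m5, m6 = m
    return (m0 + m1 * x + m2 * y + m3 * x**2
            + m4 * x * y + m5 * x**3 + m6 * x**2 * y)


# Gain at the straight-ahead position for increasing vergence.
for verg in (0.0, 8.0, 16.0):
    print(f"vergence {verg:4.1f}: g = {ehv_gain(0.0, verg, COEFFS_TCONJ_1P2):.3f}")
```

At zero eye position the surface reduces to m0 + m2·y, so the dominant effect of the vergence efference copy is a linear increase in the canal sensitivity of the EHV cells.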

#### **5.2. FAST PHASE MODEL EQUATIONS**

In this section, the equations to obtain the fast phase dynamic relations in Equations (3A,B) are described. Subscripts *R* and *L* refer to the right and left side of the brainstem, respectively. PVP, EHV, BDN, EBN, and IBN here refer to the net output from these cell populations. The parameters *a*, *p*1, *p*2, *d*, *m*, *bI*, *bE*, and α define the weights of the projections between cell types or brainstem centers according to **Figure 2C**. Lower case letter *s* is the complex Laplace variable.

During a rightward fast phase, ipsilateral *PVP*s and *EHV*s and contralateral *IBN*s and *EBN*s are silenced. Signals from $BDN_L = m V_R - \alpha \hat{E}_R$ are projected to $EBN_R$ and $IBN_R$; therefore

$$\begin{cases} EBN_R = mV_R - \alpha \hat{E}_R & EBN_L = 0\\ IBN_R = mV_R - \alpha \hat{E}_R & IBN_L = 0 \end{cases} \tag{A2}$$

PVPs receive projections through linear summation of the ipsilateral canal *V*, contralateral PVPs and the PH. EHVs receive ipsilateral canal projections *V* as well as efference copies of ipsilateral eye position and vergence angle. The net output of EHVs, however, is scaled by a nonlinear gain that modulates the weight of canal projections according to the concurrent ocular angles (Ranjbaran and Galiana, 2013a); therefore,

$$\begin{cases} PVP_L = d\hat{E}_R + p_1 V_L & PVP_R = 0\\ EHV_L = g_L p_2 V_L & EHV_R = 0 \end{cases} \tag{A3}$$

Signals from VN and BN are added at MNs and then relayed to the eye plant $P_f(s) = \frac{k_{pf}}{Ts + 1}$ to generate ocular movements:

$$\begin{cases} E_R = (-a\,PVP_L + b_E\,EBN_R) \left(\frac{k_{pf}}{Ts + 1}\right) \\ E_L = (EHV_L - b_I\,IBN_R) \left(\frac{k_{pf}}{Ts + 1}\right) \end{cases} \tag{A4}$$

Eye position efference copies $\hat{E}_{R,L}$ are available through projections from PH with similar dynamics as the eye plant, $F_f(s) = \frac{k_{ff}}{Ts + 1}$; therefore,

$$\hat{E}_{R,L} = \frac{k_{ff}}{k_{pf}} E_{R,L} \tag{A5}$$

By substituting Equation (A5) into Equations (A2) and (A3) and combining with Equation (A4), one can obtain the dynamic equations provided in Equations (3A,B), to describe *ER*,*L*.
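The substitution can be written out step by step; the following is a worked sketch for the right eye, writing $\hat{E}_R$ for the efference copy of right eye position. Inserting $PVP_L$ from Equation (A3) and $EBN_R$ from Equation (A2) into the first row of Equation (A4) gives

$$
\begin{aligned}
(Ts + 1)\,E_R &= k_{pf}\left[-a\left(d\hat{E}_R + p_1 V_L\right) + b_E\left(m V_R - \alpha \hat{E}_R\right)\right] \\
&= k_{pf}\left(-a p_1 V_L + m b_E V_R\right) - k_{pf}\left(ad + b_E\alpha\right)\hat{E}_R .
\end{aligned}
$$

Using Equation (A5) in the form $k_{pf}\hat{E}_R = k_{ff}E_R$ and collecting the $E_R$ terms on the left,

$$
\left[Ts + 1 + k_{ff}\left(ad + b_E\alpha\right)\right]E_R = k_{pf}\left(-a p_1 V_L + m b_E V_R\right),
$$

which is Equation (3a). For the left eye, the second row of Equation (A4) gives

$$
(Ts + 1)\,E_L = k_{pf}\left(g_L p_2 V_L - m b_I V_R\right) + b_I \alpha k_{ff} E_R ,
$$

and replacing $E_R$ by Equation (3a) and placing both terms over the common denominator yields Equation (3b) with the coefficients $\beta_1$ and $\beta_2$ of Equation (4).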

# Major remaining gaps in models of sensorimotor systems

#### Gerald E. Loeb<sup>1</sup> \* and George A. Tsianos <sup>2</sup>

<sup>1</sup> Department of Biomedical Engineering, University of Southern California, Los Angeles, CA, USA, <sup>2</sup> L-3 Applied Technologies Inc., HEM Division, San Diego, CA, USA

Experimental descriptions of the anatomy and physiology of individual components of sensorimotor systems have revealed substantial complexity, making it difficult to intuit how complete systems might work. This has led to increasing efforts to develop and employ mathematical models to study the emergent properties of such systems. Conversely, the development of such models tends to reveal shortcomings in the experimental database upon which models must be constructed and validated. In both cases models are most useful when they point up discrepancies between what we think we know and possibilities that we may have overlooked. This overview considers those components of complete sensorimotor systems that currently appear to be potentially important but poorly understood. These are generally omitted completely from modeled systems or buried in implicit assumptions that underlie the design of the model.

#### Edited by:

Vincent C. K. Cheung, The Chinese University of Hong Kong, China

#### Reviewed by:

William Zev Rymer, Rehabilitation Institute of Chicago, USA John Kalaska, Université de Montréal, Canada

#### \*Correspondence:

Gerald E. Loeb, Department of Biomedical Engineering, University of Southern California, Denney Research Center (DRB)/Downey Way, Los Angeles, CA 90089, USA gloeb@usc.edu

> Received: 16 March 2015 Accepted: 21 May 2015 Published: 04 June 2015

#### Citation:

Loeb GE and Tsianos GA (2015) Major remaining gaps in models of sensorimotor systems. Front. Comput. Neurosci. 9:70. doi: 10.3389/fncom.2015.00070

Keywords: sensorimotor control, sensorimotor learning, sensorimotor integration, sensorimotor systems modeling, biological neural networks

# Introduction

"When you can measure what you are speaking about, and express it in numbers, you know something about it, when you cannot express it in numbers, your knowledge is of a meager and unsatisfactory kind; it may be the beginning of knowledge, but you have scarcely in your thoughts advanced to the stage of science."

—William Thomson, 1st Baron Kelvin, 1883

Quantitative methodology has gradually replaced qualitative "butterfly collecting" in biology. Lord Kelvin's comment was motivated primarily by its importance for reductionistic science, whereby hypotheses about deep and unobservable structure and function can be tested according to their often subtle effects on measurable phenomena. More recently, the challenge of science has more often been too much rather than too little quantitative data, and too many well-understood but complex mechanisms whose interactions defy intuitive understanding of how complete systems actually function. This problem combined with the rapid advance of computing power has led to rapidly increasing interest in systems modeling. For the 21st century, one might replace "measure" with "model" to update Lord Kelvin's 19th century exhortation.

Modeling is a never-ending task. It is most useful when it reveals discrepancies between what we think we know about a component or a complete system and how it actually behaves experimentally. The existence of a model (whether explicit or implicit) is an inspiration to experimentalists to identify phenomena or conditions for which the model predictions are in error—the basic scientific method of hypothesis formation and falsification. Assuming the experimental results are valid, the discrepancy can only be resolved by correcting and usually complexifying the model to reflect better the properties of the physical system being modeled. As more experimental data become available, it becomes possible, indeed necessary, to break complex systems into subsystems that can be studied and modeled in relative isolation. This leads to a proliferation of models of subsystems that must eventually be combined but that may be at very different stages of development and accuracy. This problem is amplified by the natural tendency of scientists to focus on and further refine those subsystems that have already yielded to their efforts. Put simply, we have a lot of quantitative information incorporated into accurate models of some subsystems but little or none about other subsystems that are likely to be just as important to overall function or about most of the connections between subsystems.

The nature of the modeling challenge depends on where a subsystem is located in a system that is inherently hierarchical (**Figure 1**). The subsystems in the spinal cord and musculoskeletal system have mostly been studied in isolated or highly reduced preparations. Such subsystems are amenable to "bottom up" modeling strategies in which individual elements are characterized and then combined into larger models. Shortcomings in the constituent models tend to arise because it is difficult for the experimenter to observe or create the full range of natural conditions of use. The subsystems in the brain have been studied mostly in intact, naturally behaving animals. Those subsystems are generally modeled using "top-down" strategies in which the form of the model is intuited from observable phenomena of the whole system. Shortcomings in those models tend to arise because the experimenter must make assumptions about what is or is not happening in the other subsystems that are present. This overview is not an encyclopedic review of what models already exist. Rather it attempts to prioritize those subsystems that currently appear to be both important and relatively poorly modeled and to identify opportunities to rectify this.

# Systems Model Architecture

"It can scarcely be denied that the supreme goal of all theory is to make the irreducible basic elements as simple and as few as possible without having to surrender the adequate representation of a single datum of experience." (Frequently paraphrased as "A scientific theory should be as simple as possible, but no simpler.") —Albert Einstein, 1933

Biological systems (and their models) tend to be much more inherently complex than the physical systems that Einstein had in mind. This is an inevitable consequence of their gradual evolution over hundreds of millions of years of intense competition with other species. Relatively simple mechanisms such as actin-myosin binding for force generation and a stretch reflex to stabilize posture can have their essence captured by mathematical curve-fitting (Hill's equation) and engineering metaphor (servo-control), respectively. Real organisms, however, achieve their competitive performance by huge and disorderly elaborations of those underlying mechanisms. This poses a potential conflict of interest between the modeler, who often aspires to simple and elegant models, and the experimentalist, who needs to capture realistic performance.
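The curve-fitting character of Hill's equation can be made concrete with a minimal sketch. It assumes the classical hyperbolic force-velocity relation for a shortening muscle, (F + a)(v + b) = (F0 + a)b. The function names and the normalized parameter values are hypothetical choices made for illustration and are not taken from any of the models cited here.

```python
def hill_force(v, f0=1.0, a=0.25, b=0.25):
    """Force at shortening velocity v from Hill's hyperbola.

    Solves (F + a)(v + b) = (f0 + a) * b for F.
    f0 is the isometric force; a and b are the Hill constants.
    All default values are illustrative, normalized units.
    """
    if v < 0:
        raise ValueError("this sketch covers shortening (v >= 0) only")
    return (f0 * b - a * v) / (b + v)


def hill_vmax(f0=1.0, a=0.25, b=0.25):
    """Unloaded shortening velocity, where the predicted force is zero."""
    return b * f0 / a


if __name__ == "__main__":
    for v in (0.0, 0.25, 0.5, 1.0):
        print(f"v = {v:.2f}  F = {hill_force(v):.3f}")
```

Three fitted constants reproduce the shape of the curve, but they say nothing about the cross-bridge mechanisms that generate it, which is the sense in which such a description is curve-fitting rather than mechanistic.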

The modeler must decide what performance is to be modeled, thereby defining the functional elements that must be included in the model system. This is itself a form of modeling that is fraught with opportunities for errors of omission. **Figure 1** provides one framework for the restricted set of learned, voluntary sensorimotor behaviors such as manipulating objects. Smaller and simpler models have been used with substantial success to account for preprogrammed behaviors such as locomotion, breathing, and mastication. Larger and more complex models will be required to account for multimodal behaviors such as eye-hand coordination.

The framework in **Figure 1** can be divided hierarchically and phylogenetically into brain, spinal cord and peripheral/mechanical subsystems. The first organisms to achieve motility evolved by steadily enhancing their abilities to make movements that were environmentally responsive, mechanically stable, and energetically efficient. Sensory transduction and electromechanical activation were already well on their evolutionary paths before there were recognizable nervous systems at all. As organisms became larger and more complex mechanically, coordination required centralization of sensorimotor connections in invertebrate ganglia that eventually coalesced into the vertebrate spinal cord. Systems-level modeling is often applied to behaviors that are learned by the brain but implemented by the "lower" but highly evolved subsystems, whose intrinsic properties thereby define the control problems that the brain must solve. Model systems that omit or substantially simplify components are implicitly hypothesizing that those components do not make a significant contribution to the observed behavior, even though they are known to be necessary and perhaps even sufficient for many other behaviors of the same organisms. Unfortunately, such omissions more often reflect the unavailability of computational models rather than such a plausible hypothesis.

Explicit or implicit models that have been incorrectly simplified lead to progressively more complex and implausible models as they try to account for new data. This is analogous to the problem of "the music of the spheres" in which an intuitive and simple earth-centered universe requires ever more artificial structure to account for the observed motion of the planets. Such a problem may be starting to emerge in motor learning models that assume that the motor cortex is directly responsible for converting visual targets in extrapersonal space into sequences of muscle activation that cause limbs to reach to those targets. Such models often assume that the cortex learns an "internal model" of the musculoskeletal plant and inverts that model to compute the commands required to perform a given task. As Nikolai Bernstein pointed out (Bernstein, 1967; English translation of a 1934 publication in Russian), such a computational problem is ill-posed because of redundancy. There are usually more muscles and degrees of freedom than required to perform the task, so the computation requires either an arbitrary constraint on allowable strategies (d'Avella et al., 2006) or an optimization criterion such as minimizing effort that will result in a unique solution (reviewed in Loeb, 2012). If there were a single internal model in a single place subject to a single computational strategy, then one would expect this to give rise to rather consistent patterns of sensorimotor adaptation and learning. Experiments on learning and adaptation instead reveal many different "rules" whereby subjects learn and forget how to deal with distortions in the visual space, loads on the limb, and changes in posture and muscle function and performance criteria, as well as interactions among those variables (see Shadmehr and Mussa-Ivaldi, 1994; Gandolfo et al., 1996; Krakauer et al., 2000, 2006; Baraduc and Wolpert, 2002; Mattar and Ostry, 2007, 2010; Pearson et al., 2010; Brayanov et al., 2012; Berniker et al., 2014). Substantial evidence has shown that internal representations of behavior are neither intuitive nor simple (Brayanov et al., 2012) and that sensorimotor learning generalizes poorly in many situations (Gandolfo et al., 1996; Mattar and Ostry, 2007, 2010; de Rugy et al., 2012b; Coelho et al., 2013; Berniker et al., 2014), which is inconsistent with the predictions of a single internal model. If these behaviors were actually the result of several different sensorimotor subsystems, each with its own relatively simple rules, the inferred models might actually be simpler as well as more realistic. Such a collection of interacting subsystems is known to subserve control of gaze, which consists of anatomically distinct subsystems for classes of behavior such as voluntary and involuntary saccades, smooth pursuit, and reflexive stabilization, and the coordination of eye and head movement to achieve them. The anatomical structures responsible for gaze control include the mesencephalic tectum, pontine and other brainstem nuclei, and cerebellum as well as sensory and motor cortical areas, all of which also have strong sensory and motor connections with the limbs.
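The redundancy problem raised by Bernstein can be illustrated with a deliberately small sketch: one joint, several muscles, and a single required torque. It assumes that "effort" is the sum of squared muscle forces and that muscles can only pull; both are illustrative assumptions, and the function name and moment-arm values are hypothetical rather than drawn from the studies cited above.

```python
def min_effort_forces(moment_arms, torque):
    """Muscle forces that produce `torque` with the least summed squared force.

    Minimizes sum(f_i**2) subject to sum(r_i * f_i) == torque and f_i >= 0.
    With a single torque constraint, the optimum recruits only the muscles
    whose moment arm r_i has the same sign as the torque, each in
    proportion to its moment arm.
    """
    if torque == 0:
        return [0.0] * len(moment_arms)
    denom = sum(r * r for r in moment_arms if r * torque > 0)
    if denom == 0:
        raise ValueError("no muscle can generate torque in that direction")
    return [r * torque / denom if r * torque > 0 else 0.0 for r in moment_arms]


if __name__ == "__main__":
    arms = [0.04, 0.02, -0.03]          # hypothetical moment arms (m)
    forces = min_effort_forces(arms, 2.0)
    print("min-effort forces:", forces)
    # A different pattern produces the same torque at higher effort,
    # which is what makes the inverse problem ill-posed without a criterion.
    alternative = [50.0, 0.0, 0.0]
    print("alternative torque:", sum(r * f for r, f in zip(arms, alternative)))
```

Changing the criterion (for example, to summed force rather than summed squared force) selects a different pattern for the same task, so the "solution" reflects the assumed criterion as much as the plant.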

# Gaps in Peripheral and Mechanical Models

Models of the musculoskeletal system include several different types of models that must be combined to generate their complete input/output properties.

## Musculoskeletal Dynamics

Control engineers divide any system into a plant and a controller and describe the tasks to be performed using cost functions that weight the relative importance of quantifiable state variables such as accuracy, time to completion and energy consumption. The job of the sensorimotor nervous system (controller) is to compute and implement control signals that cause the musculoskeletal system (plant) to achieve desired performance. The mechanical dynamics of the plant obviously constrain the set of useful solutions for the controller. More complex plants make it more difficult for engineers to develop mathematical models that can be used to compute solutions, but they do not necessarily make the plant inherently more difficult to control. For example, the complex intrinsic properties of muscles (e.g., dependency of force output on instantaneous muscle length and velocity) may provide rapid stabilizing effects that compensate in part for the relatively long delays inherent in signal transmission in a neural controller (Hogan, 1984; Loeb et al., 2002; but see Crevecoeur and Scott, 2014). Nevertheless, the mechanical dynamics of multilinked skeletal segments such as a limb or vertebral column are inherently complex and can result in highly unstable conditions that must be prevented by a controller with widely distributed inputs and outputs (Lackner and DiZio, 2000).
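A weighted cost function of the kind described above can be sketched in a few lines. The quadratic accuracy term, the linear time and energy terms, the weights, and the two candidate movements are all hypothetical values chosen for illustration; they are not measurements or a cost function proposed in the literature cited here.

```python
def task_cost(position_error, duration, energy,
              w_accuracy=1.0, w_time=0.1, w_energy=0.01):
    """Scalar cost combining accuracy, time to completion and energy use.

    The weights express the relative importance of each term; their
    values here are illustrative only.
    """
    return (w_accuracy * position_error ** 2
            + w_time * duration
            + w_energy * energy)


if __name__ == "__main__":
    # Hypothetical candidates: (endpoint error in m, duration in s, energy in J)
    fast = (0.02, 0.3, 12.0)
    slow = (0.005, 0.8, 5.0)
    for w_time in (0.1, 1.0):
        costs = {name: task_cost(*move, w_time=w_time)
                 for name, move in (("fast", fast), ("slow", slow))}
        print(f"w_time = {w_time}: preferred = {min(costs, key=costs.get)}")
```

The preferred movement switches when only the time weight is changed, which illustrates why the choice of weights is itself a modeling assumption that must be justified.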

Sophisticated algorithms for solving the mechanical dynamics of free-body systems with mechanical constraints have been widely applied to the biomechanics of musculoskeletal systems. These work reliably when the biological architecture lends itself to decomposition into discrete and independent entities—the inertial segments, joints, and actuators typical of most limbs. Mechanical modeling becomes difficult and less reliable when the discrete entities interact through distributed connective tissues. In the hand, this arises when tendons from multiple muscles insert onto capsular structures around finger joints rather than directly onto individual bones and when separately controlled neuromuscular compartments of multiheaded digit muscles are loosely coupled to each other (Schieber and Santello, 2004). Distributed viscoelasticity in the skin tends to dominate the mechanics of the lightweight digits at low muscle recruitment. In the neck and trunk, modeling challenges arise when the length and/or pulling direction of one muscle depends on the position or activation of other muscles (Richmond et al., 2001). In the shoulder and hip, this occurs when muscles wrap over and around other muscles (van der Helm, 1994). In theory, these complexities can be approximated by decomposition into multiple, discrete and classical entities. In practice, most of the parameters that need to be specified to define such finite element models are unavailable.

Models of the rapidly conducting proprioceptors (myelinated nerve fiber groups I and II) are well-developed but they must be driven by perhaps uncertain musculoskeletal dynamics (muscle fascicle length and velocity for spindles and active muscle force for Golgi tendon organs). Several other types of mechanoreceptors have been described in muscles, ligaments, and joint capsules (Grigg and Hoffman, 1984), but mathematical models are not available and their functional roles are uncertain.

## Non-stationarity of Muscle Physiology

Models of the contractile properties of mammalian skeletal muscle are perhaps the most developed of all components. They were developed initially to explore reductionist models of contractile mechanisms under highly limited and generally unphysiological operating conditions (e.g., isometric or isotonic twitch or tetanus). They were extended to account fairly well for the full range of kinematic conditions and contractile properties of mammalian slow-twitch and fast-twitch muscle fibers at various physiological levels of recruitment. More recently they were extended to account for energy consumption (Tsianos et al., 2012), which may be an important cost-function used by the brain to improve performance during motor learning. What is missing are models of how the properties of muscles change over time as a result of patterns of use or disuse. The control strategies learned by the brain must anticipate or at least cope well with these changes. Understanding such changes is important when modeling is used to account for performance in pathological systems (see below).

In the short term, muscles are subject to fatigue, a reduction in force output for a given set of operating conditions (Enoka and Duchateau, 2008). Such reductions may arise from changes in the many cellular processes involved in muscle activation and deactivation. Models of muscle that are composed of computational elements that correspond to the energy-consuming processes (e.g., cross-bridge turnover and calcium flux) should make it possible to account for those changes, but this remains to be modeled. The effort will require a great variety of experimental data, much but perhaps not all of which is already available in the literature. Many diseases and injuries are associated with disuse atrophy of muscles, which tends to increase greatly their susceptibility to fatigue, but the relative contributions of the underlying processes may be different from normal muscles.

Musculotendinous injuries account for the majority of emergency medical treatment and long-term disabilities, so there is a great deal of interest in how they occur, how they affect performance and how they heal. Structural engineering has benefitted greatly from finite-element analytical modeling (FEM) of complex structures. There have been a few attempts to develop such models for muscle, which is essentially a composite material consisting of contractile elements (muscle fibers) embedded in a structurally critical matrix (endomysial collagen) (Trotter, 1993).

Muscle is particularly responsive to exercise, which results in rapid and profound changes in key properties such as force output, speed of contraction and relaxation, and resistance to fatigue. Quantitative models of these changes are becoming more feasible as the cellular mechanisms responsible for signaling and managing plasticity are starting to be revealed. Data from formal physiological experimentation and the effects of athletic training tend to emphasize covariances in physiological properties; e.g., intense but brief exercise tends to develop fast-twitch muscle fibers with larger force output, faster rise and fall dynamics, higher rates of maximal shortening (Vmax) and lower fatigue resistance than slow-twitch muscle. Mathematical models that represent explicitly the mechanisms underlying these properties are a better starting point for models of plasticity because the properties do not necessarily covary. For example, muscles subject to disuse atrophy produce lower force (typical of slow-twitch fibers) but also have lower fatigue resistance (typical of fast-twitch fibers). Little systematic data are available about the rates at which the individual properties other than maximal isometric force change during normal training.

The anatomy and physiology of mature neuromusculoskeletal systems are the result of myriad mechanisms that orchestrate their development (Crawford and Horowits, 2011). Most musculoskeletal models assume that the pennation and sarcomere lengths of muscle fascicles tend to be optimized for the range of lengths that the muscles experience during normal function and that the connective tissues that support them are matched to the stresses that the contractile elements apply during those functions. The trophic factors that drive the deposition and structural properties of collagen in tendons, aponeuroses, and endomysium are starting to be understood qualitatively, but quantitative models of their dynamics are not yet available. Assumptions of covariance and optimality are likely to break down if models are applied to pathological conditions. For example, passive tensile properties of mammalian muscle are dominated by the endomysial connective tissue, so they do not necessarily covary with active tension or force-length properties that are determined by the myofilaments (Brown et al., 1996).

## Tactile Mechanics and Sensory Transduction

Musculoskeletal systems interact with the world via contact regions composed of skin and related epidermal structures (nails, claws, teeth, hair, or fur). The mechanical compliance of skin defines what happens during object manipulation, which is essentially a series of collisions between the object and various parts of the hand; the resulting deformations of the skin define what information will be available to the CNS from tactile mechanoreceptors, which are all essentially strain gauges. The mechanical properties of these interfacial regions are starting to be captured by finite element modeling (Kumar et al., 2015). Such models can be added to classical free-body models to describe accurately the mechanical events that occur during collisions and manipulation between, for example, fingertips and objects to be grasped, but the computational load is often daunting. Modeling the tactile sensory signals that will result from myriad, independent receptors in the soft tissues remains challenging. Such sensory signals are known to be essential for dexterity, which is severely impaired when skin is anesthetized while proprioception and motor commands are left intact. The histological structure and physiological responsiveness of the receptor modalities are fairly well-understood, so it should be possible, albeit computationally challenging, to integrate model populations of cutaneous mechanoreceptors into finite element models (Kumar et al., 2015).

# Gaps in Spinal Cord Models

## Fusimotor Control of Muscle Spindles

Muscle spindles have long figured prominently in theories of sensorimotor control. The size and speed of their sensory axons, their numbers, and distributions, and their highly evolved structure and fusimotor control mechanisms all suggest that they are functionally critical. All of these elements are well-described in existing computational models (Mileusnic et al., 2006). The problem is that only sparse information is available about how their sensitivity is controlled by the fusimotor system during natural behaviors (Loeb, 1984; Prochazka, 1999; Taylor et al., 2000).

The fusimotor apparatus has undergone a huge elaboration in mammals, including specializations within various parts of the musculoskeletal system (Richmond et al., 1986). Servocontrol models of sensorimotor control originally emphasized the role of spindle afferents in the clinically prominent monosynaptic stretch reflex, but this represents only a tiny fraction of their central projections. Spindle afferents contribute to a multitude of oligosynaptic circuits in spinal and brainstem interneurons where they are combined with descending command signals. They are also the dominant source of information about posture and kinesthesia (Scott and Loeb, 1994; Gandevia, 1996), which are necessary for high-level planning and evaluation of motor strategies in cerebral cortex and cerebellum. Simplistic rules for fusimotor control have been hypothesized [e.g., alpha-gamma coactivation (Vallbo, 1974), optimal transducer programming (Loeb and Marks, 1985)] but these are speculations rather than known facts.

## Connectivity of Spinal Interneurons

Signals from the brain and sensory afferents are mixed in spinal interneurons that project to the alpha motoneurons (Pierrot-Deseilligny and Burke, 2005). As a result, the influence of brain activity on muscle contractions and movement depends substantially on the connectivity of spinal interneurons. In the past century, many neural pathways from sensory receptors (cutaneous and proprioceptive) to alpha motoneurons have been identified. These pathways include the monosynaptic Ia excitation of alpha motoneurons and polysynaptic pathways involving propriospinal, Renshaw, Ia, and Ib interneurons. The pathways were identified mainly through electrophysiological techniques by perturbing sensory signals from one muscle and investigating the timing of enhancement or depression of alpha motoneuron activity of the same muscle as well as muscles that were functional antagonists, synergists, or both (for a recent review, see Loeb, 2014). Neural pathways involving cutaneous receptors are poorly defined relative to those involving proprioceptors because it is difficult to stimulate a large group of afferents from a homogeneous type of cutaneous receptor. In electrophysiological experiments, the connectivity of cutaneous pathways is typically investigated by stimulating cutaneous nerves or patches of skin, which contain afferents from many different types of cutaneous receptors (Pierrot-Deseilligny and Burke, 2005). With this method, it is not possible to distinguish the contribution of each type of receptor to the resulting alpha motoneuron activity. Information from these experiments, therefore, cannot be used to predict accurately the effects of physiological activity of the various cutaneous receptor afferents on interneuron or alpha motoneuron activity in the spinal cord.

Neural pathways from cutaneous receptors and proprioceptors to alpha motoneurons involving three synapses or more are not well-defined because current experimental techniques are limited. In electrophysiology experiments, if one of the neurons in the polysynaptic chain is hyperpolarized substantially via descending control, the whole chain will be invisible to the experimenter. Even if the effect on alpha motoneuron activity is observable, it will likely be highly variable and even differ in sign across subjects and experiments because descending control of each one of the neurons in the chain will likely depend strongly on the conditions of the experiment and physiological state of the subject. Furthermore, sensory afferent collaterals may activate several pathways in parallel with similar latencies, making them indistinguishable in the alpha motoneuron signal.
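The masking effect described above can be illustrated with a toy chain of threshold units. This is a sketch under strong simplifying assumptions (binary signals, identical synaptic weights, a fixed threshold, and a scalar "bias" standing in for descending hyperpolarization); the function name and parameter values are hypothetical and do not correspond to any identified spinal circuit.

```python
def relay_chain(afferent_input, biases, threshold=1.0, weight=1.5):
    """Pass a binary afferent signal through a chain of threshold units.

    biases[i] is the descending drive to the i-th interneuron; a negative
    value stands in for hyperpolarization. Returns the binary signal that
    reaches the end of the chain (the motoneuron input in this sketch).
    """
    signal = float(afferent_input)
    for bias in biases:
        drive = weight * signal + bias
        signal = 1.0 if drive >= threshold else 0.0
    return signal


if __name__ == "__main__":
    print("resting state:      ", relay_chain(1, [0.0, 0.0, 0.0]))
    print("second neuron gated:", relay_chain(1, [0.0, -1.0, 0.0]))
```

With the same afferent stimulus, the pathway appears present or absent at the motoneuron depending only on the descending bias to one interneuron, which is why the absence of an observed effect in a given preparation does not establish the absence of the pathway.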

## Role of Presynaptic Modulation

Neurotransmitter release by the presynaptic neuron depends not only on action potential rate, but also on the level of presynaptic inhibition/facilitation induced by interneurons forming axoaxonic connections with the presynaptic neurons. The activity of these interneurons is controlled by descending projections as well as by sensory afferents, but this connectivity is poorly understood (Rudomin and Schmidt, 1999). In theory, presynaptic control could allow selective gating of signals from different sensory modalities, muscles, and interneurons depending on the task. It is not clear, however, what portion of the presynaptic terminals the brain can control independently (Sirois et al., 2013). Furthermore, it is not clear how presynaptic input varies across tasks. There may be a set of presynaptic inputs that are useful for a wide range of tasks (see Fink et al., 2014), thereby reducing the number of control parameters the brain would have to learn to perform new movements.

# Status of Brain-level Models

Models of different parts of the brain tend to be abstract because of the lack of specific knowledge of the neural circuits that process signals from their input sources and the lack of knowledge of the specific neural circuits that their output projections influence. Spinal interneurons receive input from many parts of the brain, including cerebral cortex, brain stem, and tectum. The distribution of these inputs among spinal interneurons as well as the specific source locations are poorly understood. New tools for tracing and modulating connections based on genetic engineering of specific cell-types are just starting to reveal the functional relationships between brain and spinal cord circuits (Akay et al., 2014; Azim et al., 2014; Esposito et al., 2014).

## Cerebral Cortex

Voluntary motor control is usually assumed to reside in a small set of frontal lobe cortical areas. Damage to these areas in humans such as from stroke results in profound losses of such behaviors. Top-down models often attribute to cortex most or all of the control function involved in learning and executing these behaviors. Cortex seems to be necessary for learning new motor skills but may not be sufficient (see below) or even necessary for their execution (Kawai et al., 2015).

It has long been known that the cerebral cortex is organized in layers that are associated with specific types of neurons, input sources and output destinations. These could provide the substrate for bottom-up modeling. However, little is known about the local connections among neurons from different layers and even less about the connectivity among the thalamus and distant cortical columns that span the many cortical areas involved in sensorimotor control (Hooks et al., 2013; Kaneko, 2013). For this reason, circuit models tend to rely heavily on correlations of activity among a very small subset of cortical neurons, recorded in behaving animals and/or in response to electrical stimulation (Chadderdon et al., 2014). The apparent circuits derived from these studies apply only to the small number of neurons observed and may deviate substantially from true circuitry because there are many pathways with several interneurons between recording and stimulation sites; these interneurons may block activity through some pathways. Interneuron activity depends on experimental conditions, so it is likely that the net excitatory/inhibitory influence observed during highly constrained experiments does not apply to many sensorimotor behaviors. The specific output pathways to cerebellum and subcortical structures such as the brain stem nuclei, propriospinal interneurons, and segmental interneurons are also poorly understood.

The cerebral cortex has long been an attractive candidate for computational models because of its obvious importance for learning new tasks. The rise of digital logic in the 1940s and 1950s coincided with the development of electrophysiological methods to study the activity of individual neurons, leading to the compelling notion of neurons as logical AND gates (McCulloch and Pitts, 1943) and learning as rules for changing the weighting of the inputs (Hebb, 1949). Models of cortical function have mostly been elaborations of this basic scheme (Marr, 1970). The problem is that the principal output cells of the cortex appear to be vastly more complex in their computational functions. Each of the numerous tiny spines that extend from their dendrites appears to function as a sophisticated temporospatial signal processor whose output gain can be individually adjusted before contributing to the all-or-none output of the neuron as a whole (Polsky et al., 2004; Jadi et al., 2012). Ambitious attempts are underway to develop computational models of cerebral cortex based on exhaustive analysis of neural connectivity (Markram, 2006), but the computational algorithm for the individual cortical neurons is now less clear than was originally assumed.
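The "basic scheme" referred to above is simple enough to state in a few lines. The sketch below assumes a binary threshold unit in the spirit of McCulloch and Pitts and the commonly used textbook formalization of Hebb's proposal, in which a weight grows in proportion to the product of input and output activity. The function names, weights, threshold, and learning rate are hypothetical illustrative choices.

```python
def threshold_unit(inputs, weights, threshold):
    """Binary neuron: fires (1) when the weighted input sum reaches threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0


def and_gate(x1, x2):
    """Logical AND implemented as a threshold unit with unit weights."""
    return threshold_unit((x1, x2), (1.0, 1.0), threshold=2.0)


def hebbian_update(weights, inputs, output, rate=0.1):
    """Strengthen each weight in proportion to coincident input and output."""
    return [w + rate * x * output for w, x in zip(weights, inputs)]


if __name__ == "__main__":
    for x1 in (0, 1):
        for x2 in (0, 1):
            print(x1, x2, "->", and_gate(x1, x2))
    print(hebbian_update([0.5, 0.5], [1, 0], output=1))
```

The entire computation of the unit is a weighted sum and a comparison, which is the contrast with the dendritic processing described in the paragraph above.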

## Basal Ganglia—Thalamus

The thalamus relays and processes information from the cerebral cortex, cerebellum, brain stem, and spinal cord, and its output neurons project to various areas of the cerebral cortex. Despite its important role, the interneuron connectivity of the thalamus and associated circuitry in the basal ganglia (Bosch-Bouju et al., 2013) is poorly understood and it is commonly omitted in models of sensorimotor control. Current models of this subsystem (Nelson and Kreitzer, 2014) are still based on simplistic analogies between neurons and electronic logic gates (McCulloch and Pitts, 1943), despite data demonstrating that their circuits and even individual neurons generate complex patterns of spontaneous activity (Llinas, 1988; Nakamura et al., 2014). Computational models of individual neurons with such properties are in their infancy. Incorporating them into large-scale systems requires many more parameters and assumptions about their individual properties and connectivity patterns, as well as much greater computing power.

## Midbrain Tectum

The tectum (superior and inferior colliculus in mammals) may play a much larger role in sensorimotor interaction with external objects than is generally acknowledged. The superior colliculus has been intensively studied and modeled for its role in directing saccadic eye movements to visual stimuli (Fecteau and Munoz, 2006). The inferior colliculus has been studied mostly as the "relay" in which sound localization information from both ears is conveyed to the auditory cortex (Slee and Young, 2014). But in fish and amphibia, the analogous subsystem controls most of the purposeful motor behaviors of the organisms, which must be fast, accurate and well-coordinated. In all vertebrates, all exteroceptive senses capable of providing localization information about an external object (vision, hearing, and touch) converge on the tectum. The tectal outputs project to brainstem nuclei and spinal cord layers that control all of the muscles required to acquire such targets, whether by saccadic gaze movements of eye and head (and auricular pinnae in most species other than primates) or reaching movements of the limbs (and jaws and tongue in many species) (Saitoh et al., 2007; Kozlov et al., 2014; Philipp and Hoffmann, 2014). The direct projections from retina to superior colliculus are known to be the source of express saccades (Munoz and Wurtz, 1992), accurately directed gaze movements that can acquire targets about twice as fast as the transcortical loop (lateral geniculate to occipital cortex and frontal eye fields) that actually projects back to the superior colliculus, which appears to control the metrics of the saccade. 
There have been several reports of extremely short latency corrections of reaching movements to visual targets that shift position (Gribble et al., 2002; Wilmut et al., 2006; Perfiliev et al., 2010) and in avoidance of obstacles to locomotion (Weerdesteyn et al., 2004), which would be consistent with the tectum performing the same function in limb movements as in gaze movements. If true, this would affect profoundly the assumptions made about the computational tasks of the motor and parietal cortical regions associated with reaching in extrapersonal space, perhaps the most common task for which top-down models have been constructed.

## Cerebellum and Brainstem

The cerebellum makes an excellent poster child for the problem of modeling any one of the many subsystems that subserve sensorimotor function. Its cytoarchitecture and neurophysiology are highly distinctive, relatively simple and homogeneous, and better documented than those of any other part of the central nervous system. It is important enough to have been endowed with over half of the neurons in almost any vertebrate. Lesions to the adult cerebellum produce profound and distinctive motor deficits, yet humans born with substantial cerebellar atresia function surprisingly well (Walker, 1944). The cerebellum has attracted several detailed computational models of its function (Albus, 1975; Marr and Thach, 1991; Kawato and Gomi, 1993; Kawato and Samejima, 2007), yet it can be plausibly argued to subserve either motor (Thach, 2014) or sensory (Bower, 1997) function, to the virtual exclusion of the other.

Although the neurons of the cerebellum and their connectivity are understood better than most structures in the brain, the precise input sources and their target neurons as well as the precise destinations of the output projections are not well known. For example, the climbing fibers from the inferior olivary nucleus in the medulla provide a substantial portion of the input to the cerebellum, but little is known about the inputs to the inferior olive and how they are processed. Climbing fiber activity is known to be influenced by the reticular formation and red nucleus of the brain stem, several areas of the cerebral cortex, as well as by signals from the periphery related to proprioception and touch (Brown et al., 1977; Stecina et al., 2013). The specific sources of these signals, how they are processed in the inferior olive, and how the processed output affects cerebellar function are not known. Another major source of input to the cerebellum is the mossy fiber projections from the lateral reticular nucleus of the reticular formation. There is a large degree of convergence of afferent activity ascending from the spinal cord with activity descending from the cerebral cortex, tectum, red nucleus and other parts of the brain stem (Alstermark and Ekerot, 2013). The precise sources of these inputs as well as the interneuronal circuits that integrate them are largely unknown.

The brain stem possesses many subdivisions including the inferior olive, vestibular nuclei, reticular formation and red nucleus with distinct sets of inputs, outputs, and interneurons. Although they have an important functional role for even simple reach and grasp movements (Alstermark and Isa, 2012), the connectivity of their neural circuits as well as the precise location of their inputs and outputs are poorly understood (Kennedy, 1990; Kuchler et al., 2002). Much of the output of the cerebellum is targeted to these structures, which also receive the proprioceptive and vestibular sensory signals required to coordinate movements across the entire body. Most studies of motor behavior focus on the "prime mover" muscles most closely associated with the task, but the freely mobile, multiarticulated structures of most terrestrial animals are actually quite difficult to stabilize because of intersegmental Coriolis forces (Lackner and DiZio, 2000). For example, human subjects asked to flex and extend their elbow tend to keep the wrist still without thinking, whereas a similar weight connected by a hinge to a stick being wagged up and down would flop uncontrollably. This problem could be solved by simply cocontracting the muscles of the wrist to stiffen the joint, but that would be energetically inefficient. Instead, subjects deftly time the activation of the many different wrist muscles to cancel the rapidly changing Coriolis forces. This problem is greatly magnified when trying to maintain balance of the whole body during any rapid movement of any body part, a problem that is constantly changing as the musculoskeletal system develops and ages. Perhaps the best studied model for adaptive tuning of stabilizing reflex gains is the vestibulo-ocular reflex, which involves changes in brainstem nuclei mediated by cerebellar plasticity (Kawato and Gomi, 1992; Raymond et al., 1996; Clopath et al., 2014). The details of these circuits and their learning rules remain contentious and it is uncertain if they generalize to other sensorimotor behaviors.

# Use of Systems Models to Demonstrate Competency

Many anatomical structures of the sensorimotor system have been investigated to characterize their inputs, outputs, and intermediate processing (although as explained above, many important knowledge gaps remain). Their complexity often makes it impossible to intuit their input-output transformation. Neural networks within these structures, for example, have a large number of neurons with various non-linear properties and substantial convergent, divergent, and recurrent connections that make it difficult to infer how they process a set of incoming signals. Computational models are therefore essential for predicting these non-intuitive interactions and provide insight into the possible transformations that a neural network can apply to a range of input signal patterns. Intuition is even less useful for predicting the collective input-output transformation of a system of interconnected neural networks formed by distinct anatomical structures in the nervous system such as those shown in **Figure 1**. Understanding such complex interactions is necessary for determining the relative roles of these anatomical structures in sensorimotor control and ultimately unraveling the mechanisms involved.

The sparse knowledge of the anatomy and physiology underlying the sensorimotor system precludes a complete understanding of the mechanism of sensorimotor control. This knowledge, however, can be exploited to understand the competency of the known properties and suggest experimental investigations for furthering our understanding and updating the models. It can be tested, for example, if a set of known properties of the sensorimotor system is sufficient for generating specific behaviors. If the properties are sufficient for generating a particular behavior then this would suggest that these properties may play a significant role. It has been shown recently that the known spinal circuits described above are sufficient for generating the muscle dynamics of wrist (Raphael et al., 2010) and arm movement (Tsianos et al., 2011, 2014), which has been traditionally assumed to arise largely from commands issued by the motor cortex. These results emphasize the possibility that spinal circuits can make large contributions to voluntary movement and encourage further experimental testing. It has also been shown that descending commands to models of known spinal circuitry can be linearly interpolated to generate intermediate movements (Tsianos et al., 2014). Such simple interpolation along with the modeled musculoskeletal system and spinal circuitry properties were sufficient to reproduce the extent of learning generalization observed experimentally for similar tasks. This result suggests that much of the generalization of learning observed experimentally may arise from simple combinations of learned voluntary commands rather than through the use of internal inverse models of the sensorimotor system. This is consistent with the tendency of human subjects to adapt to changes in the musculoskeletal system by relatively simple scaling of the original motor programs rather than computation of new programs that offered superior performance (de Rugy et al., 2012a,b).
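The idea of linearly interpolating descending commands can be illustrated with a minimal sketch. It assumes that each learned command can be summarized as a vector of numbers (one entry per descending input to the spinal circuit model); the function name, the vector representation, and the numerical values below are hypothetical and are not taken from the cited studies.

```python
from typing import List, Sequence


def interpolate_commands(
    cmd_a: Sequence[float], cmd_b: Sequence[float], alpha: float
) -> List[float]:
    """Blend two learned command vectors element by element.

    alpha = 0 returns cmd_a, alpha = 1 returns cmd_b, and values in
    between give a weighted average of the two learned commands.
    """
    if len(cmd_a) != len(cmd_b):
        raise ValueError("command vectors must have the same length")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [(1.0 - alpha) * a + alpha * b for a, b in zip(cmd_a, cmd_b)]


# Hypothetical learned commands for two trained movements (arbitrary units).
cmd_small = [0.2, 0.1, 0.4]
cmd_large = [0.6, 0.3, 0.8]

# Candidate command for an untrained, intermediate movement.
cmd_mid = interpolate_commands(cmd_small, cmd_large, 0.5)
```

Whether such a blended command actually produces the intermediate movement depends on the modeled spinal circuitry and musculoskeletal dynamics that receive it; the sketch only shows the interpolation step itself.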

# Use of Systems Models to Prove Insufficiency

If modeled properties of the sensorimotor system are not sufficient to explain a particular behavior, then these properties must interact with other biological properties that are either not modeled or not known. Other known properties can be added to the model incrementally to test which configurations are sufficient for reproducing the desired behavior. For example, activation of muscles and feedback through spinal circuits contribute to stable movements due to the relatively short time delays involved; however, the contribution of spinal circuits is limited for whole body tasks that require muscle coordination between distant body parts. Spinal circuits coordinate activity among a limited number of adjacent muscles and joints because inputs to a given spinal interneuron originate from only a few adjacent spinal segments. There are many relatively fast circuits involving anatomical structures rostral to the spinal cord such as the brain stem (Esposito et al., 2014), tectum (Philipp and Hoffmann, 2014), cerebellum (Azim et al., 2014) and even sensorimotor cortex (Scott, 2004, 2012; Nashed et al., 2015) that receive afferent signals from throughout the body and may therefore contribute to the dynamics and stability of such movement. Known circuits from these anatomical structures can be incorporated in models to test their sufficiency, which would depend on their specific connectivity as well as the range of input sources and their extent of convergence. If multiple anatomical structures appear sufficient, then the modeled task could be varied systematically to search for situations where each one becomes uniquely competent. This would indicate the types of sensorimotor tasks and conditions that each anatomical structure might subserve in the biological system. If known properties of a given anatomical structure cannot account for the behavior, it could be that the model for the structure is inadequate or that other structures are required. Such negative results are particularly valuable in identifying opportunities to advance understanding of the system as a whole.
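The incremental strategy described above can be sketched as a search over model configurations. The sketch below assumes a hypothetical function, `reproduces_behavior`, that runs a simulation for a given set of modeled structures and reports whether the target behavior emerges; here it is replaced by a toy rule chosen purely for illustration, which makes no physiological claim.

```python
from itertools import combinations
from typing import Callable, FrozenSet, Iterable, List


def minimal_sufficient_sets(
    components: Iterable[str],
    reproduces_behavior: Callable[[FrozenSet[str]], bool],
) -> List[FrozenSet[str]]:
    """Return the smallest component sets that pass the sufficiency test.

    Configurations are tried in order of increasing size, and any
    configuration containing an already sufficient set is skipped.
    """
    items = sorted(components)
    found: List[FrozenSet[str]] = []
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            config = frozenset(combo)
            if any(prev <= config for prev in found):
                continue  # a smaller sufficient subset is already known
            if reproduces_behavior(config):
                found.append(config)
    return found


# Toy stand-in for a simulation (assumption, for illustration only):
# the behavior appears when "spinal" is combined with either
# "brainstem" or "cerebellum".
def toy_test(config: FrozenSet[str]) -> bool:
    return "spinal" in config and bool(config & {"brainstem", "cerebellum"})


structures = ["spinal", "brainstem", "cerebellum", "cortex"]
sufficient = minimal_sufficient_sets(structures, toy_test)
```

When more than one minimal set is returned, as in this toy case, the task itself would then be varied to look for conditions under which only one of the sets remains sufficient.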

# General Obstacles to Successful Modeling

It is always easy to end a scientific paper with a call for more experiments to provide more data to fill in the missing pieces of knowledge. This brief review has pointed out many places where basic knowledge about neural connectivity is insufficient to permit reductionist modeling. But there are also many places where the connectivity is relatively well-known and detailed models have been constructed, only to expose contentious disagreements about what role the structure actually plays in a given behavior.

As David Marr pointed out, models must start with a top-level theory of computation—a division of function into a sequence or hierarchy of tasks (Marr, 1982). Only then can one contemplate the computational algorithms that might subserve each of those tasks and, below that, the machinery that performs each algorithm. The physiology and connectivity of individual neurons provides information about the machinery level, leaving the modeler to guess about the tasks and the algorithms.

It used to be feasible for a researcher to become familiar with most of the literature about most of the CNS and to construct a systems level model of behaviors that considered both its phylogenetic and ontogenetic origins (for example, Ayres, 1975). Most researchers now spend a lifetime learning the details of and formulating hypotheses about one or two of the many subsystems that somehow contribute to sensorimotor behaviors. They tend naturally to assume that the subsystem that they are studying is primarily responsible for the behaviors that they employ in their experimental designs. Their models of "their" subsystem must nevertheless be reconciled with some current dogma about the role of the other subsystems, as promulgated by other researchers with equally narrow and parochial views.

"It was six men of Indostan To learning much inclined, Who went to see the Elephant (Though all of them were blind), That each by observation Might satisfy his mind...

And so these men of Indostan Disputed loud and long, Each in his own opinion Exceeding stiff and strong, Though each was partly in the right, And all were in the wrong!"

—From "Blind Men and the Elephant" by John Godfrey Saxe (1816–1887).


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Loeb and Tsianos. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Direct and indirect spino-cerebellar pathways: shared ideas but different functions in motor control

Juan Jiang<sup>1†</sup>, Eiman Azim<sup>2†</sup>, Carl-Fredrik Ekerot<sup>3†</sup> and Bror Alstermark<sup>1</sup>\*<sup>†</sup>

<sup>1</sup> Department of Integrative Medical Biology, Section of Physiology, Umeå University, Umeå, Sweden, <sup>2</sup> Departments of Neuroscience and Biochemistry and Molecular Biophysics, Howard Hughes Medical Institute, Kavli Institute for Brain Science, Mortimer B. Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA, <sup>3</sup> Department of Experimental Medical Science, University of Lund, Lund, Sweden

The impressive precision of mammalian limb movements relies on internal feedback pathways that convey information about ongoing motor output to cerebellar circuits. The spino-cerebellar tracts (SCT) in the cervical, thoracic and lumbar spinal cord have long been considered canonical neural substrates for the conveyance of internal feedback signals. Here we consider the distinct features of an indirect spino-cerebellar route, via the brainstem lateral reticular nucleus (LRN), and the implications of this pre-cerebellar "detour" for the execution and evolution of limb motor control. Both direct and indirect spino-cerebellar pathways signal spinal interneuronal activity to the cerebellum during movements, but evidence suggests that direct SCT neurons are mainly modulated by rhythmic activity, whereas the LRN also receives information from systems active during postural adjustment, reaching and grasping. Thus, while direct and indirect spino-cerebellar circuits can both be regarded as internal copy pathways, it seems likely that the direct system is principally dedicated to rhythmic motor acts like locomotion, while the indirect system also provides a means of pre-cerebellar integration relevant to the execution and coordination of dexterous limb movements.

#### Edited by:

Ning Lan, Shanghai Jiao Tong University, China

#### Reviewed by:

Irina N. Beloozerova, Barrow Neurological Institute, USA Vincent C. K. Cheung, The Chinese University of Hong Kong, Hong Kong

#### \*Correspondence:

Bror Alstermark, Department of Integrative Medical Biology, Section of Physiology, Umeå University, Linneus väg 9, 901 87 Umeå, Sweden bror.alstermark@umu.se

†These authors have contributed equally to this work.

> Received: 13 March 2015 Accepted: 01 June 2015 Published: 06 July 2015

#### Citation:

Jiang J, Azim E, Ekerot C-F and Alstermark B (2015) Direct and indirect spino-cerebellar pathways: shared ideas but different functions in motor control. Front. Comput. Neurosci. 9:75. doi: 10.3389/fncom.2015.00075

Keywords: lateral reticular nucleus (LRN), spino-cerebellar pathways, spino-LRN-cerebellar pathways, internal feedback, motor control

# Introduction

Cerebellar circuits are of major importance in the control of movements, providing a neural basis for pattern recognition and motor behavioral correction and adaptation (Ito, 2006). These contributions to motor control depend on specific mossy fiber and climbing fiber cerebellar inputs that convey information about both ongoing motor output and external sensory events (Ito, 1984; Dean et al., 2010). In this paper, we focus on the organization of mossy fiber systems and, more specifically, we delineate two classes of spino-cerebellar pathways: direct spino-cerebellar projections, and indirect pathways via the lateral reticular nucleus (LRN; referred to as the spino-LRN-cerebellar pathway), as illustrated schematically in **Figure 1**.

Spino-cerebellar pathways have been implicated in the transmission of information about external events from various sensory modalities (cf. review, Stecina et al., 2013), the cancellation of reafferent sensory signals during self-generated movements (Hantman and Jessell, 2010), and the conveyance of internal copies of motor commands for rapid motor prediction and correction (Lundberg, 1971; Arshavsky et al., 1972, 1978; Alstermark and Isa, 2012; Fedirchuk et al., 2013; Azim and Alstermark, 2015). However, little is known about the organizational and functional logic underlying the existence of two separate systems for conveying spinal signals to the cerebellum. By comparing the phylogeny, anatomy, genetic identities and functional organization of direct and indirect spino-cerebellar circuits, we highlight key similarities and differences between these pathways and discuss principal questions that remain.

# Phylogeny

A phylogenetic comparison of cerebellar circuits has been extensively reviewed by Ito (1984). There is evidence that direct spino-cerebellar and indirect spino-LRN-cerebellar tracts coexist in teleost fish (Szabo et al., 1990; Finger, 2000), suggesting an early evolutionary divergence of these pathways. In mammals, several direct spino-cerebellar tracts (SCT) have been identified anatomically and electrophysiologically. Two of the most studied are the dorsal (DSCT) and ventral (VSCT) spino-cerebellar tracts that originate in the thoracic and lumbar spinal cord (Jankowska et al., 2011; cf. review: Stecina et al., 2013). The corresponding direct SCT for forelimb regions are the cuneocerebellar tract (CCT; Jansen and Brodal, 1954; Ekerot and Larson, 1972) in the brainstem and the rostral spino-cerebellar tract (RSCT; Oscarsson, 1965; Hirai et al., 1976) in cervical segments, respectively (**Figure 1**).

Indirect spino-LRN-cerebellar pathways have mainly been studied in the cat (Clendenin et al., 1974a,b,c,d, 1975; Matsushita and Ikeda, 1976; Ekerot, 1990a,b,c), but comparative anatomy (Walberg, 1952) has revealed the existence of the LRN in a large number of mammals including Erinaceomorpha (hedgehog), Chiroptera (bat), Rodentia (squirrel, mouse and rat), Lagomorpha (hare), Carnivora (cat, dog and seal), Cetartiodactyla (harbor porpoise), Artiodactyla (pig, cow and roe deer) and Primates (rhesus macaque and human).

Interestingly, in teleosts there are abundant axon collaterals from direct spino-cerebellar pathways to the LRN (Szabo et al., 1990), whereas in cats the DSCT does not provide collateral excitation to LRN neurons (Ekerot and Oscarsson, 1975). These phylogenetic differences suggest that direct spino-cerebellar and indirect spino-LRN-cerebellar pathways may have originated as cooperative systems, which became progressively separated as more advanced motor repertoires evolved.

# Anatomy

As shown in **Figure 1**, direct spino-cerebellar and indirect spino-LRN-cerebellar pathways originate in cervical, thoracic and lumbar spinal segments, as well as in the brainstem (for review, cf. Alstermark and Ekerot, 2013; Pivetta et al., 2014). Within the direct and indirect classes, subpopulations with ipsilateral, contralateral and bilateral projections have been identified (cf. reviews: Alstermark and Ekerot, 2013; Stecina et al., 2013). The ultimate mossy fiber terminations of these pathways in the cerebellar cortex are found mainly in the vermal and paravermal regions of the anterior and posterior lobes, as well as in the paramedian lobe. The location of ascending axonal projections in the white matter of the spinal cord and the pattern of mossy fiber termination zones within the cerebellar cortex differ across individual systems, but broad comparison of direct and indirect pathways to each other reveals no clear differences (cf. review Ito, 1984). Thus, at least at the gross anatomical level, direct spino-cerebellar and indirect spino-LRN-cerebellar pathways target overlapping cerebellar cortical circuits.

# Genetic Identities

The genetic delineation of neuronal subtypes has complemented classical anatomical and electrophysiological characterization of spinal circuits, and has provided a means for selective manipulation and functional dissection of these pathways (Goulding, 2009). While the molecular identities of each of the direct spino-cerebellar systems are yet to be fully defined, studies in mice have revealed that a population of dorsally-derived spinal interneurons that express the transcription factor Math1 gives rise to multiple spino-cerebellar pathways (Bermingham et al., 2001). Moreover, DSCT neurons in Clarke's column have been shown to selectively express the neurotrophic factor Gdnf (Hantman and Jessell, 2010).

Indirect spino-LRN-cerebellar pathways, and the cervical propriospinal neuron (PN) system in particular, have been the subject of much recent genetic scrutiny. A prominent population of excitatory PNs involved in goal-directed reaching movements was identified within the Chx10-expressing V2a interneuron class (Azim et al., 2014); notably, cervical but not lumbar V2a interneurons project to the LRN, indicating that indirect LRN-cerebellar pathways originating in the lumbar cord have distinct genetic identities. In zebrafish, a subset of V2a spinal interneurons sends ascending projections to the hindbrain (Menelaou et al., 2014), suggesting that the V2a interneuron class establishes an evolutionarily conserved circuit for the conveyance of motor signals to supraspinal regions. Moreover, recent genetic and viral labeling studies in mice have revealed that in addition to V2a interneurons, several classes of molecularly defined excitatory and inhibitory cervical spinal interneurons project to the LRN (Pivetta et al., 2014), suggesting that other indirect spino-cerebellar pathways can be dissected genetically along similar lines.

# Functional Organization

It has been well documented that both direct spino-cerebellar and indirect spino-LRN-cerebellar pathways convey information related to ongoing rhythmic movements, including locomotion, scratching and respiration (cf. references in reviews by Ito, 1984; Alstermark and Ekerot, 2013; Stecina et al., 2013). Moreover, it has been proposed that the VSCT (Lundberg and Weight, 1971) and DSCT (Hantman and Jessell, 2010) monitor the excitability of spinal interneurons. Interestingly, whereas the VSCT (Fedirchuk et al., 2013) and DSCT (Stecina et al., 2013) signal mainly during the flexion phase, spino-LRN-cerebellar pathways are active throughout the entire cycle of flexion and extension (cf. review by Alstermark and Ekerot, 2013), suggesting that indirect pathways convey a broader range of motor signals.

Another major difference in the functional organization of direct and indirect cerebellar pathways is that the four subsystems in the indirect spino-LRN-cerebellar pathway originating in the cervical spinal cord and brainstem (**Figure 1**) may be dedicated to more than just rhythmic movements (Alstermark and Ekerot, 2013). These subsystems, by monitoring the excitability of spinal interneurons, could signal information about posture (bilateral ventral flexor reflex tract; bVFRT), reaching (C3-C4 propriospinal system; PN), grasping (ipsilateral forelimb tract; iFT) and jaw opening (dorsal funiculus-trigeminal tract; DF-Trig), and their convergence in the LRN might enable the coordination of these separate motor actions into coherent and smooth movements (Alstermark and Ekerot, 2013, 2015).

Among these indirect systems, the function of C3-C4 PNs has been investigated extensively in the cat, monkey, human and recently in the mouse (Alstermark and Isa, 2012; Azim et al., 2014). These studies have shown that PNs mediate motor commands for reaching by directly modulating the activity of forelimb-innervating motor neurons, while also conveying copies of these motor commands, via axon collaterals, to the LRN. Genetic manipulation of PNs in the mouse has revealed that this internal copy pathway recruits a cerebellar-motor feedback loop, providing a plausible neural substrate for the rapid updating and correction of ongoing forelimb motor output (Azim et al., 2014; Azim and Alstermark, 2015). The current lack of selective genetic access to other spino-LRN-cerebellar pathways has precluded similar exploration of their behavioral functions, yet evidence suggests that the cervical bVFRT, iFT and PN systems provide both discrete and convergent internal feedback signals to LRN-cerebellar circuits (Alstermark and Ekerot, 2013; Pivetta et al., 2014; Huma and Maxwell, 2015), potentially enabling the coordination of forelimb and postural motor control. A companion article discusses the extensive convergence of projections from these distinct systems in the LRN, providing a pre-cerebellar center for the integration of spinal signals and their modulation by descending motor cortical pathways (Alstermark and Ekerot, 2015).

# Open Questions and Future Directions

1. How do descending motor pathways modulate the direct and indirect spino-cerebellar tracts? Thus far, only the descending inputs to the cervical PN system have been investigated systematically (Alstermark and Lundberg, 1992; Alstermark and Isa, 2012; Azim et al., 2014). The convergence of descending pathways onto cervical PNs suggests a role for these neurons in integrating motor command signals and conveying copies of this information to LRN-cerebellar circuits. A better understanding of the descending inputs onto other direct and indirect cerebellar pathways should help to clarify their potential contributions to voluntary movements.




particular suggests that these systems may have evolved in concert with the increasing complexity of dexterous forelimb movements. The identification of unique genetic markers for each of these pathways should offer a means to access and manipulate these circuits selectively, providing the experimental resolution needed to characterize their discrete contributions to motor behavior (Azim et al., 2014; Azim and Alstermark, 2015).

6. There is growing interest in applying computational neurobiology approaches to understanding the molecular and genetic mechanisms that may contribute to spinocerebellar ataxia (cf. review by Brown et al., 2015), and models devoted to the role of internal feedback more generally have explored various neural circuits in the cortex, brainstem and spinal cord (cf. review by Azim and Alstermark, 2015). Regarding spino-cerebellar pathways, a hypothesis has been put forward on their role in the multi-dimensional integration of sensorimotor information (Spanne and Jörntell, 2013). However, in this model, direct spino-cerebellar and indirect spino-LRN-cerebellar pathways are grouped together. A new hypothesis has recently been proposed that focuses specifically on the role of indirect spino-LRN-cerebellar pathways (Alstermark and Ekerot, 2013). Future modeling approaches, informed by the experimental work described above, should provide greater insight into the discrete functions of direct and indirect spino-cerebellar systems.

# Acknowledgments

This work was supported by grants from the Swedish Research Council and Umeå University to BA and by a National Institutes of Health (NIH) K99 award (NS088193) to EA.


**Conflict of Interest Statement**: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Jiang, Azim, Ekerot and Alstermark. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution and reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Subject-specific computational modeling of DBS in the PPTg area

Laura M. Zitella<sup>1</sup>, Benjamin A. Teplitzky<sup>1</sup>, Paul Yager<sup>2</sup>, Heather M. Hudson<sup>2</sup>, Katelynn Brintz<sup>1</sup>, Yuval Duchin<sup>3</sup>, Noam Harel<sup>3</sup>, Jerrold L. Vitek<sup>2</sup>, Kenneth B. Baker<sup>2</sup> and Matthew D. Johnson<sup>1,4</sup>\*

*<sup>1</sup> Department of Biomedical Engineering, University of Minnesota, Minneapolis, MN, USA, <sup>2</sup> Department of Neurology, University of Minnesota, Minneapolis, MN, USA, <sup>3</sup> Center for Magnetic Resonance Research, University of Minnesota, Minneapolis, MN, USA, <sup>4</sup> Institute for Translational Neuroscience, University of Minnesota, Minneapolis, MN, USA*

Deep brain stimulation (DBS) in the pedunculopontine tegmental nucleus (PPTg) has been proposed to alleviate medically intractable gait difficulties associated with Parkinson's disease. Clinical trials have shown somewhat variable outcomes, stemming in part from surgical targeting variability, modulation of fiber pathways implicated in side effects, and a general lack of mechanistic understanding of DBS in this brain region. Subject-specific computational models of DBS are a promising tool to investigate the mechanisms underlying therapy and side effects. In this study, a parkinsonian rhesus macaque was implanted unilaterally with an 8-contact DBS lead in the PPTg region. Fiber tracts adjacent to PPTg, including the oculomotor nerve, central tegmental tract, and superior cerebellar peduncle, were reconstructed from a combination of pre-implant 7T MRI, post-implant CT, and post-mortem histology. These structures were populated with axon models and coupled with a finite element model simulating the voltage distribution in the surrounding neural tissue during stimulation. This study introduces two empirical approaches to evaluate model parameters. First, incremental monopolar cathodic stimulation (20 Hz, 90 µs pulse width) was evaluated for each electrode, during which a right eyelid flutter was observed at the proximal four contacts (−1.0 to −1.4 mA). These current amplitudes followed closely with model predicted activation of the oculomotor nerve when assuming an anisotropic conduction medium. Second, PET imaging was collected OFF-DBS and twice during DBS (two different contacts), which supported the model predicted activation of the central tegmental tract and superior cerebellar peduncle. Together, subject-specific models provide a framework to more precisely predict pathways modulated by DBS.

Keywords: pedunculopontine nucleus, deep brain stimulation, Parkinson's disease, non-human primate, finite element, diffusion tensor

# Introduction

Gait and balance difficulties in Parkinson's disease can be especially debilitating since they increase the risk of falling. For some patients, these symptoms are resistant to or poorly managed by levodopa treatment and typical targets of deep brain stimulation (DBS) including the subthalamic nucleus (STN) and internal segment of the globus pallidus. In contrast, low-frequency electrical stimulation delivered within or near the pedunculopontine tegmental nucleus (PPTg), a component of the mesencephalic locomotor region (MLR) of the brainstem, has shown promising results (Plaha and Gill, 2005; Schrader et al., 2013; Mazzone et al., 2014). Clinical outcomes, however, have varied from patient to patient across these studies, due in part to variation in surgical targeting, uncertainty in the therapeutic target, and the likely modulation of highly excitable, side effect inducing fiber pathways (Nowak and Bullier, 1998) outside the MLR.

#### Edited by:

*Ning Lan, Shanghai Jiao Tong University, China*

#### Reviewed by:

*Frank Rattay, Vienna University of Technology, Austria Ursula Van Rienen, University of Rostock, Germany*

#### \*Correspondence:

*Matthew D. Johnson, Department of Biomedical Engineering, University of Minnesota, 7-105 Nils Hasselmo Hall, 312 Church Street SE, Minneapolis, MN 55455, USA john5101@umn.edu*

> Received: *13 March 2015* Accepted: *02 July 2015* Published: *14 July 2015*

#### Citation:

*Zitella LM, Teplitzky BA, Yager P, Hudson HM, Brintz K, Duchin Y, Harel N, Vitek JL, Baker KB and Johnson MD (2015) Subject-specific computational modeling of DBS in the PPTg area. Front. Comput. Neurosci. 9:93. doi: 10.3389/fncom.2015.00093*

A previous computational modeling study showed that clinical outcomes of DBS within the PPTg area are likely to be highly dependent upon lead position and stimulation settings (Zitella et al., 2013). For instance, the superior cerebellar peduncle passes through the PPTg en route from the deep cerebellar nuclei to the red nucleus and cerebellar receiving area of the motor thalamus by means of a decussation just rostral to PPTg. At present, how selective activation of this pathway affects freezing of gait is unknown, though stimulation of the cerebello-thalamo-cortical circuit has been postulated to be beneficial for gait (Fournier-Gosselin et al., 2013). DBS in the PPTg area may also modulate medial fiber tracts such as the medial longitudinal fasciculus (MLF) or the oculomotor nerve (ON). Side effects from activation of either of these fiber tracts would be expected to affect the eyes and eyelids. Neuronal activation volumes that extend lateral to the PPTg may include the medial lemniscus (ML) and lateral lemniscus (LL) and lead to paresthesias (Murata et al., 2003) and changes in auditory perception (Lim et al., 2008), respectively. Further, spread of current rostral to the PPTg may modulate the central tegmental tract (CTG), which rises from the nucleus solitarius and carries gustatory input to the ventral posteromedial nucleus of the thalamus (VPM). There is some evidence that activation of this tract may result in palatal myoclonus (Matsuo and Ajax, 1979).

While previous subject-specific computational models of DBS have been developed for the STN (Miocinovic et al., 2006; Chaturvedi et al., 2010; Butson et al., 2011), tailoring models to the PPTg area has been limited by the poor image contrast within the brainstem with standard MR scanner technology. In recent years, however, advances in structural imaging have made it more feasible to visualize fiber tracts within the brainstem. In this study, we leverage susceptibility-weighted imaging (SWI) and diffusion-weighted imaging (DWI) to create subject-specific computational models of PPTg-DBS, which can predict activation of individual fiber tracts within the brainstem for any given DBS setting. For these models to be informative for clinicians, they must provide accurate predictions of neuronal activation. The challenge becomes defining behavioral or functional outcome measures to confirm or otherwise modify the selection of model parameters, including tissue conductance anisotropy and inhomogeneity, cellular morphology, axonal diameter, and ion channel kinetics, among others. Here, we propose two approaches in the context of PPTg-DBS, namely eliciting an oculomotor side effect and performing DBS within the context of positron emission tomography (PET) imaging.

To examine the pathways modulated in the PPTg area during PPTg-DBS, a subject-specific computational model was developed. In this study, the models were used to (1) investigate the effects of using tissue conductance anisotropy within the brainstem based on diffusion-weighted imaging, and (2) perform model parameter sweeps to determine PPTg-DBS model sensitivity. The models were evaluated with varying axon diameter, conductivity values, and DBS lead location, and then compared against behavioral and PET imaging results.

# Materials and Methods

## Subjects

Two female rhesus macaque monkeys (*Macaca mulatta*; Monkey L and Monkey P) were used in this study. All procedures were approved by the Institutional Animal Care and Use Committee of the University of Minnesota and complied with United States Public Health Service policy on the humane care and use of laboratory animals. The animals were housed individually with environmental enrichment, provided with water ad libitum, and given a range of food options including fresh fruit and vegetables. All efforts were made to provide good care and alleviate any discomfort for the animals during the study.

Pre-operative 7T MRI was acquired at the Center for Magnetic Resonance Research (CMRR) at the University of Minnesota using a passively shielded 7T magnet (Magnex Scientific) for both animals. During the imaging sessions, the animals were anesthetized with isoflurane (2.5%) and monitored for depth of anesthesia. Susceptibility-weighted imaging was acquired with a 3D flow-compensated gradient echo sequence at 0.4 mm isotropic resolution using a field of view (FOV) of 128 × 96 × 48 mm<sup>3</sup>. Diffusion-weighted images (b-value = 1500 s/mm<sup>2</sup>) were acquired with diffusion gradients applied along 110 uniformly distributed directions using a 128 × 84 × 99 mm<sup>3</sup> FOV (1 mm isotropic resolution). The 3D tensors were calculated as ellipsoidal functions to identify the orientation of maximum value (Barmpoutis et al., 2009; Barmpoutis and Vemuri, 2010).

In Monkey L, a cranial chamber was mounted on the head to facilitate implantation of the DBS lead, as described previously (Elder et al., 2005). The high-field images and the results from electrophysiological microelectrode mapping of the PPTg area were superimposed in Monkey Cicerone (Miocinovic et al., 2007) to define a trajectory for unilateral implantation of a scaled-down version of a human DBS lead (2F diameter, 8 annular electrode contacts: 0.5 mm height, 0.25 mm spacing) (NuMed, Hopkinton, NY) in the region of the PPTg (right hemisphere). Following lead implantation, a post-operative CT scan was performed under ketamine and Dexdomitor anesthesia to visualize the implantation trajectory and depth in Monkey Cicerone. The preoperative SWI was coregistered with the postoperative CT to determine the DBS lead location relative to nuclei and fiber tracts within the brainstem. After instrumentation with the chamber and the DBS lead, the subject was rendered parkinsonian with systemic injections of 1-methyl-4-phenyl-1,2,3,6-tetrahydropyridine (MPTP).

At the conclusion of the study, both animals were deeply anesthetized with sodium pentobarbital and perfused with a fixative solution containing 4% paraformaldehyde, consistent with the recommendations of the Panel on Euthanasia of the American Veterinary Medical Association. After fixation, the brain was removed, blocked, and cryoprotected in 15% sucrose in phosphate buffered solution. Coronal sections (50 µm) were cut using a freezing microtome and labeled for Nissl. In the case of Monkey L, the DBS lead trajectory was again reconstructed from these histological images using Mimics (Materialise, Leuven, Belgium), which confirmed the in vivo estimation of the DBS lead trajectory that had been generated from co-registration of the pre-implant SWI with the post-implant CT.

## Axonal Model Morphologies

Several imaging-based tools were used to reconstruct the three-dimensional morphologies of the PPTg area for use in the subject-specific finite element model and multi-compartment neuron model simulations (**Figures 1A,B**). The SWI volume was aligned in anterior commissure to posterior commissure (AC-PC) space with Analyze (AnalyzeDirect, Overland Park, KS), and then resliced to generate images that matched atlas plates from a rhesus macaque brain atlas (Paxinos, 2009). A nonlinear affine atlas registration algorithm based on a moving least squares fit applied to each image voxel was used to identify the borders of the PPTg, CTG, and ON. The algorithm involved the selection of analogous control points placed on each MR image and each corresponding atlas plate. Contours were traced from these warped atlas reconstructions in Rhinoceros (McNeel, Seattle, WA) and lofted into 3D surfaces using a nonuniform rational B-spline modeling approach. Results from the warping algorithm were aligned to the DWI using FLIRT, which provided anatomical context to guide the placement of seed points for probabilistic diffusion tractography calculations in FSL (Jenkinson et al., 2012). To identify the SCP tract, seed points were placed in the cerebellar outflow tract caudal to the decussation, with waypoints defined at the decussation of the SCP and the entire contralateral thalamus. Manual thresholding of the output of probtrackx was performed in Amira (FEI, Hillsboro, OR) to produce the final tract geometry.

## Finite Element Models (FEM)

A finite element model of the DBS lead and surrounding neural tissue was created in COMSOL (**Figure 1C**). The variable resolution tetrahedral mesh was constructed with a maximum element size of 0.2 mm for the electrodes and 1.6 mm for the lead, encapsulation layer, and neural tissue. The final mesh consisted of 447,280 elements with a finer resolution near the electrode-tissue interface. A point current source was modeled at the geometric center of each electrode, and the entire lead was surrounded by a 250 µm encapsulation layer with a conductivity of 0.18 S/m. A 20 mm radius sphere around the electrode represented the neural tissue. A cylinder was placed on the edge of the sphere to represent the cranial chamber, and the chamber outer surface was assigned as ground (**Figure 1C**). For the isotropic FEM, conductivity of the neural tissue was homogeneous, 0.3 S/m. For the anisotropic FEM, the neural tissue conductivity was calculated from the 6-direction DTI tensors, based on an estimated linear relationship between the conductivity (σ) and diffusion tensor eigenvalues (Tuch et al., 2001):

$$
\sigma = s \cdot D \tag{1}
$$

where s was set to 0.844 and D represents the diffusion tensor eigenvalues. These conductivity matrices (**Figures 2A,B**) were imported into COMSOL and interpolated onto the mesh using the nearest neighbors function. Voltage distributions during monopolar stimulation through each electrode were solved using the anisotropic and isotropic conductivity models in COMSOL (**Figure 1D**) (Schmidt and Van Rienen, 2012). The extracellular voltage predicted from the FEM solution was then interpolated at each nodal compartment along each multi-compartment axon model.

FIGURE 1 | … location and grounded chamber. (D) Electric potential isosurfaces for the anisotropic and isotropic models.
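The scaling in Equation 1 can be sketched in a few lines. This is an illustration rather than the authors' COMSOL pipeline: the function name, the example voxel, and the unit handling (diffusion in mm<sup>2</sup>/s, scaling factor in S·s/mm<sup>3</sup>, result converted to S/m) are assumptions made for the example. Multiplying the whole tensor by the scalar s scales each eigenvalue by s and leaves the eigenvectors unchanged.

```python
# Sketch of Equation 1 (sigma = s * D) for a single voxel.
# Assumption: D is given in mm^2/s and s in S*s/mm^3, so s * D is in S/mm.

S_SCALE = 0.844  # scaling factor used in the text (Tuch et al., 2001)


def conductivity_tensor(diffusion_mm2_per_s, s=S_SCALE):
    """Return a 3x3 conductivity tensor in S/m from a 3x3 diffusion tensor."""
    s_per_mm_to_s_per_m = 1000.0  # 1 S/mm equals 1000 S/m
    return [[s * d * s_per_mm_to_s_per_m for d in row]
            for row in diffusion_mm2_per_s]


# Hypothetical voxel: diffusion is faster along x than along y or z.
example_D = [
    [0.6e-3, 0.0, 0.0],
    [0.0, 0.3e-3, 0.0],
    [0.0, 0.0, 0.3e-3],
]
example_sigma = conductivity_tensor(example_D)
```

With these made-up diffusivities, the conductivity along x is twice the conductivity along y and z, which is the kind of directional difference the anisotropic FEM carries into the voltage solution.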

## Biophysical Modeling of DBS in the PPTg Area

FIGURE 2 | Comparison of conductivity and diffusion tensors between Monkey L (top) and Monkey P (bottom). (A) The calculated conductivity, σxx, is shown for select coronal slices. (B) The distribution of conductivity values calculated from the primary, secondary, and tertiary eigenvalues for the entire brain (top) and the segmented brainstem (bottom). (C) The fractional anisotropy for a select brainstem slice (left), compared to a corresponding T1 slice (right). The diffusion tensors are plotted as spherical functions and overlaid on the fractional anisotropy. The orientations (dorsal-caudal, anterior-posterior, medial-lateral) are represented as RGB color components (i.e., R, G, and B, respectively).

SCP, ON, and CTG were each randomly populated with multi-compartment cable models of 1000 myelinated axons, ranging in diameter from 2 to 8.7 µm (McIntyre et al., 2004a; Miocinovic et al., 2006; Johnson and McIntyre, 2008; Birdno et al., 2011). The axonal models included nodes of Ranvier, paranodal, and internodal segments as well as a myelin sheath (McIntyre et al., 2002). Nodal compartments were given biophysical mechanisms related to a nonlinear fast Na<sup>+</sup> channel, persistent sodium channel, slow K<sup>+</sup> channel, and a leakage current. The paranodal compartments were instantiated with a slow K<sup>+</sup> current. Both internodal and myelin compartments had only passive mechanisms, including a membrane capacitance (Cm = 2 µF/cm<sup>2</sup>) and axoplasmic resistivity (70 Ω·cm). Stimulus thresholds for evoking action potentials within the modeled axons were estimated in the NEURON v7.3 programming environment (Hines and Carnevale, 1997). Stimulus perturbations were inserted using the NEURON mechanism, extracellular, for each axonal compartment. Stimulus pulses (90 µs pulse width) were delivered at a rate of 20 Hz. The activation threshold was defined as the lowest amplitude that elicited one or more action potentials within 1–3 ms following each stimulus pulse for at least 8 of 10 stimulus pulses.
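The activation criterion described above (a spike within 1–3 ms of the pulse, for at least 8 of 10 pulses) can be expressed as a small search over amplitudes. The sketch below is only illustrative: `simulate` is a hypothetical callable standing in for a NEURON run, `toy_axon` is a made-up stand-in with a fixed 1.0 mA threshold, and the linear sweep over tested amplitudes is an assumption, since the text does not state how the threshold search was carried out.

```python
def pulse_evokes_spike(pulse_ms, spike_times_ms, window_ms=(1.0, 3.0)):
    """True if any spike falls 1-3 ms (inclusive) after the pulse onset."""
    lo, hi = window_ms
    return any(lo <= t - pulse_ms <= hi for t in spike_times_ms)


def is_activated(pulse_times_ms, spike_times_ms, min_hits=8):
    """Apply the 8-of-10 criterion to one axon at one stimulus amplitude."""
    hits = sum(pulse_evokes_spike(p, spike_times_ms) for p in pulse_times_ms)
    return hits >= min_hits


def find_threshold(simulate, amplitudes_ma, pulse_times_ms):
    """Return the lowest tested amplitude that activates the axon, or None.

    `simulate(amplitude_ma)` is a hypothetical callable returning spike
    times in ms; in practice it would wrap a multi-compartment simulation.
    """
    for amp in sorted(amplitudes_ma):
        if is_activated(pulse_times_ms, simulate(amp)):
            return amp
    return None


# Ten pulses at 20 Hz are spaced 50 ms apart.
PULSES_MS = [50.0 * i for i in range(10)]


def toy_axon(amplitude_ma):
    """Made-up axon: fires 1.5 ms after every pulse at or above 1.0 mA."""
    if amplitude_ma >= 1.0:
        return [p + 1.5 for p in PULSES_MS]
    return []
```

Calling `find_threshold(toy_axon, [0.5, 1.0, 1.5], PULSES_MS)` returns the lowest amplitude in the list that satisfies the criterion for the toy axon.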

## Motor Side-Effects of DBS in the PPTg Area

A monopolar review was conducted to determine the electrical stimulation amplitude thresholds for eliciting an overt motor side effect, which for this lead implantation was found to be a right eyelid flutter. In this case, stimulation was delivered through an externalized current-driven pulse generator (Precision, Boston Scientific) with a cathode applied individually to each contact and a return set to the cranial chamber. The stimuli were delivered as a 20 Hz train of 90 µs pulses in 0.1 mA increments until the eyelid flutter was observed visually by the investigators, as shown in **Table 1**.

## PET Analysis

Approximately 1 year after implantation of the DBS lead in Monkey L, PET/CT was collected using a Siemens Biograph 64-slice scanner on three different days within an 8-day period. For a full 24 h before each scan, stimulation and medication were withheld from the subject. The subject was fasted beginning at 1700 the night prior to the scan (Garraux et al., 2011), with fasting blood glucose verified the morning of the scan. Thereafter, the proximal end of the DBS lead was connected to the external pulse generator and a single 8 mCi dose of <sup>18</sup>F-FDG was administered intramuscularly. As listed in **Table 2**, immediately following this injection either 20 Hz PPTg-DBS was applied at one of the two contacts of the DBS lead or no DBS was applied (baseline scan). The subject sat still in a quiet, familiar environment without stimuli while this treatment was administered. After 45 min, general anesthesia was induced using ketamine (10 mg/kg) and diazepam (0.5 mg/kg) (Oguchi et al., 1982; Wyckhuys et al., 2014), and the subject was moved to the scanner for imaging. Reconstruction yielded a voxel size of 1.018 × 1.018 × 2 mm. The subject was then released to an isolation room until radioactivity was undetectable.

#### TABLE 1 | Motor side-effect thresholds.

#### TABLE 2 | DBS conditions during each PET scan.

The PET images were analyzed with methods similar to those described previously (Ponto et al., 2014; Wyckhuys et al., 2014). Each non-contrast CT scan was transformed by rigid, manual co-registration (Jena et al., 2014) to align with a standard MRI template, INIA19 Macaca mulatta (Rohlfing et al., 2012), and the resulting transformation was applied to the PET image from the corresponding acquisition session to bring it into the normalized space. Finally, a preoperative MRI, taken before the cephalic hardware and DBS lead had been placed, was aligned with the INIA19 MRI to verify fit. The 1,210 MRI-derived VOIs of the INIA19 template were used within PMOD software (PMOD Technologies, Zurich, Switzerland) to compile statistical measures. Mean voxel value, standard deviation, and number of voxels were collected as standard uptake values (SUV) (Garraux et al., 2011). All scans were scaled so that the left occipital white matter uptake was equivalent. Volumes of interest were then consolidated to yield larger brain structures, reducing the 1,210 volumes of interest to a manageable grouping for analysis. A two-tailed t-test with an uncorrected significance level (alpha = 0.05) was performed on the scaled images at each of the VOIs. All brain regions with a positive T score, corresponding to a relevant increase in brain activity, and p < 0.05 were analyzed further. These brain region SUVs are reported below along with respective T scores and p-values. Due to the use of a single subject, no CT-transformation-based image attenuation correction was performed on the PET scan results.
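The per-VOI comparison described above (scale each scan to a reference region, then test stimulation against baseline from the mean, standard deviation, and voxel count) can be sketched as follows. The text does not state which t-test variant was used, so Welch's unequal-variance statistic and the normal approximation to the p-value (reasonable only when voxel counts are large) are assumptions of this sketch, as are all function names.

```python
import math
from statistics import NormalDist


def scale_to_reference(voi_mean, reference_mean, target=1.0):
    """Rescale a VOI mean so that the reference region equals `target`."""
    return voi_mean * target / reference_mean


def welch_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Welch's t statistic for condition A versus condition B."""
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / standard_error


def two_tailed_p_normal(t_value):
    """Two-tailed p-value from a normal approximation (large voxel counts)."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(t_value)))


def flag_increase(t_value, alpha=0.05):
    """Mirror the selection rule in the text: positive T score and p < alpha."""
    return t_value > 0 and two_tailed_p_normal(t_value) < alpha
```

A VOI would be carried forward for further analysis only when `flag_increase` returns `True`, which matches the rule of keeping regions with a positive T score and p < 0.05.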

# Results

Accurate prediction of therapeutic outcomes by computational neuron models of DBS targeting the MLR of the brainstem will have strong clinical value for freezing of gait in Parkinson's disease patients with these implants (Zitella et al., 2013). Two major challenges remain in rendering these computational models more predictive in power: (1) making them subject specific, and (2) calibrating the model parameters. This work provides a framework to address both challenges using a combination of structural imaging at high magnetic fields, stimulus-evoked behavior using a monopolar review, and functional imaging.

## Conductivity Anisotropy

A subject-specific model was created for Monkey L, but the conductivity, fractional anisotropy, diffusion tensors, and conductivity distributions were analyzed for both Monkeys L and P. Fractional anisotropy measured the difference between the three eigenvalues of the diffusion tensor (Pierpaoli et al., 1996). If the diffusion was isotropic (all three eigenvalues equal), this value became 0. Larger values indicated greater diffusion anisotropy. These values were scaled between 0 (isotropic) and 1 (anisotropic) and displayed as black and white, respectively (**Figure 2C**). In the brainstem region of Monkey L and Monkey P, the fractional anisotropy was found to be highly variable, with values ranging from less than 0.1 to 0.7. Since the voxel size of the DTI was 1 mm isotropic, each voxel could be composed of multiple fiber tracts, explaining this variability. The highest fractional anisotropy values (∼0.5–0.7 in Monkey L, ∼0.3–0.7 in Monkey P) appeared to correspond to areas of the superior cerebellar peduncle (caudal of the decussation) as well as the medial lemniscus, two of the largest pathways in the brainstem. However, the fractional anisotropy values for the selected slices in **Figure 2C** at the decussation of SCP were small (Monkey L: 0.295, 0.224, Monkey P: 0.181, 0.278, 0.268) and did not vary much from the mean of the surrounding voxels (Monkey L: 0.2867 ± 0.0752, Monkey P: 0.2575 ± 0.0504).
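For reference, fractional anisotropy can be computed directly from the three eigenvalues using the standard definition. The sketch below assumes that definition; the software actually used to compute the maps in **Figure 2C** is not stated in the text, and the function name is illustrative.

```python
import math


def fractional_anisotropy(l1, l2, l3):
    """Fractional anisotropy of a diffusion tensor from its three eigenvalues."""
    norm_sq = l1 ** 2 + l2 ** 2 + l3 ** 2
    if norm_sq == 0.0:
        return 0.0  # no measurable diffusion; treat the voxel as isotropic
    diff_sq = (l1 - l2) ** 2 + (l2 - l3) ** 2 + (l3 - l1) ** 2
    return math.sqrt(0.5 * diff_sq / norm_sq)
```

Equal eigenvalues give 0, and a tensor with diffusion along a single axis gives 1, which matches the 0 (isotropic) to 1 (anisotropic) scale described above.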

All six parameters of the diffusion tensor are visualized as spherical functions in **Figure 2C**. In both Monkey L and Monkey P, the greatest difference in the overall tensor direction in the brainstem was seen in the area of the ML, where the tensors were primarily dorsal-caudal and were oriented at a 45° angle from the surrounding voxels. Additionally, the principal direction (V1) at the decussation of the SCP was oriented medial-lateral, with a 90° difference compared to the neighboring voxels. When comparing neighboring voxels elsewhere in the brainstem, many midline voxels displayed at least a 45° difference in the longest axis. While some variability between animals was expected, the overall anisotropy in the brainstem was comparable.

Given that the brainstem was composed of a heterogeneous set of nuclei and fiber tracts, we hypothesized that an FEM of the brainstem with anisotropic conductivity would exhibit strong asymmetries in comparison to an otherwise equivalent isotropic model. The conductivity values, σxx, derived from the subject-specific imaging are shown in **Figure 2A** for several coronal sections throughout the brain. Histograms of the conductivity values calculated from the primary, secondary, and tertiary eigenvalues were given for the entire brain, including ventricles, and for the brainstem region around the DBS lead (**Figure 2B**). This region included the pons and part of the midbrain, demarcated as posterior of the substantia nigra. The average calculated conductivity along the main axes in the brainstem was between 0.3235 and 0.4018 S/m, just above the 0.3 S/m value used for the isotropic models.

When conductivity anisotropy was incorporated into the models, the spread of current in the tissue was altered. As seen in **Figure 1D**, the isosurfaces of the electric potential in the isotropic model were spherical, while the isosurfaces in the anisotropic model were non-spherical. This is consistent with previous modeling studies that incorporated anisotropy (Miocinovic et al., 2009). Model predictions for activation threshold were much lower for anisotropic models than for the isotropic model. For example, the threshold for activating 5% of CTG axons using contact 7 was 0.5 mA for the anisotropic model and 1.1 mA for the isotropic model.
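The spherical isosurfaces of the isotropic case follow from the textbook expression for a point current source in an infinite homogeneous medium, V = I / (4πσr), in which the potential depends only on the distance r. The sketch below uses that closed form as a rough sanity check; it is not the FEM described in the Methods, since it ignores the encapsulation layer, the finite tissue sphere, and the grounded chamber, and the function name is illustrative.

```python
import math


def point_source_potential_v(current_a, distance_m, sigma_s_per_m=0.3):
    """Potential in volts at a distance from a point source: I / (4*pi*sigma*r)."""
    return current_a / (4.0 * math.pi * sigma_s_per_m * distance_m)


# Magnitude of the potential 1 mm from a 1 mA source in 0.3 S/m tissue.
v_at_1mm = point_source_potential_v(1e-3, 1e-3)
```

Because the expression contains no direction-dependent term, every point at the same distance has the same potential. An anisotropic conductivity tensor removes that symmetry, which is why the isosurfaces in the anisotropic model are non-spherical.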

The conductivity scaling factor (s) has previously been reported as s = 0.844 ± 0.0545 S·s/mm<sup>3</sup> (Tuch et al., 2001), with varying scaling factors used in other modeling studies (McIntyre et al., 2004b; Butson et al., 2006). To investigate model sensitivity to the conductivity scaling factor within the reported range, activation threshold curves were generated using three values for s (0.79, 0.844, and 0.89). These results are shown in the second column for the ON (**Figure 3**), SCP (**Figure 4**), and CTG (**Figure 5**) tracts assuming an 8 µm diameter axonal fiber for ON and a 2 µm diameter for CTG and SCP. Varying the scaling factor by ±6.5% resulted in only a minor shift (0.0981 mA) in the threshold for 5% activation of ON fibers using contact 7.

## Model Parameter Sweep

In addition to tissue conductivity, other model parameters known to affect the calculation of activation thresholds, such as axon diameter (Rattay, 1999) and precise lead location, were investigated. Axons in the brainstem region (SCP, medial lemniscus, lateral lemniscus) were previously modeled with a diameter of 2 µm (Zitella et al., 2013). While this is a conservative estimate for SCP and CTG axonal diameters, we also examined the effects on activation thresholds when using 5.7 and 8.7 µm axon diameters, which may be more representative of actual axon diameters within the SCP (Hazrati and Parent, 1992) and CTG tracts. The ON tract was also modeled with 2, 5.7, and 8.7 µm axon diameters, with the latter thought to be the most realistic. Atomic force microscopy has shown that human oculomotor nerve fibers have much larger diameters, between 10 and 15 µm (Melling et al., 2003), which presumably would be slightly smaller in the rhesus macaque. Consistent with the principles of cable models of myelinated axons, the axonal diameter had a large effect on the resultant activation thresholds for all three fiber tracts (**Figures 3**–**5**).

Model sensitivity to the precise position of the DBS lead within the brainstem was investigated by shifting the lead in four directions. Using the same implantation angle as defined by the SWI/CT co-registration process and histological reconstructions from Monkey L, the DBS lead was shifted 0.5 mm anterior, posterior, medial, and lateral of the original lead placement (chamber reference). Moving the lead medially increased ON activation, while moving the lead laterally decreased ON activation. At the threshold for contact 7 (1 mA), the model predicted a 27.5% increase in ON activation for a medial lead location. Overall, anterior and posterior deviation in lead location did not alter the model results for amplitudes near the threshold. Predicted activation from stimulation through contact 5 at 1.4 mA only decreased by 0.7% when the lead was moved in the anterior direction (**Figure 3**).

For contacts 5, 6, and 7, SCP activation increased when the lead position was shifted medial and posterior, while an anterior and lateral lead position decreased activation. SCP activation at 0.9 mA through contact 7 increased by 2.1 percentage points (from 0 to 2.1%) when the lead was moved posteriorly. Contact 4 was embedded within SCP, so lead location had minimal effect on SCP activation at lower amplitudes (below 2 mA). However, for amplitudes above 2 mA, the anterior and lateral lead placements decreased activation and the medial lead placement increased activation. For 0.9 mA stimulation through contact 4, the change in SCP activation was negligible (∼0.1%) (**Figure 4**).

Due to the anatomy of the CTG, the effect of lead location was different for each active contact. In the posterior direction, there was minimal change in activation for contacts 6 and 7. The same was true for contact 4, at stimulation amplitudes below 1.6 mA. Contact 5 stimulation produced lower activation with a posterior lead placement until stimulation amplitude increased beyond 2 mA, which resulted in a large increase in activation that exceeded the original lead placement results. For all contacts, lateral shift decreased activation and medial shift increased activation, but the magnitude of these changes in activation differed for each contact. For the CTG, moving the lead 0.5 mm medially increased tract activation by 8% when stimulating through Contact 7 at 0.9 mA and 2.6% when stimulating through Contact 4 (**Figure 5**).

## Comparison of ON Model Simulations to Stimulus-Induced Eyelid Flutter

A monopolar review was conducted across all 8 contacts by applying a 20 Hz train of 90 µs pulses in increasing amplitude at intervals of 0.1 mA until a motor side effect was observed or the amplitude of stimulation reached 3.5 mA. For the proximal four electrodes (contacts 4–7), a right eyelid flutter was observed at amplitudes at or above 1 mA (**Table 1**), with more proximal contacts requiring higher stimulation amplitudes. Stimulation at 20 Hz, the therapeutic PPTg-DBS frequency, resulted in an eyelid flutter, while stimulation at higher frequencies (e.g., 130 Hz) resulted in the eyelid remaining elevated. No other overt motor signs were observed at any of the stimulation amplitudes tested for other contacts.

The oculomotor nerve is known to project to the levator palpebrae superioris muscle of the eye, which is responsible for elevation of the upper eyelid (Porter et al., 1989). Multi-compartment axon models were developed for the oculomotor nerve to identify model parameter settings that resulted in the most consistent activation values across the empirical motor threshold amplitude values as defined in **Table 1**. We assumed that the neuron models should predict an activation of 5–15% at the experimental motor threshold, based on previous models of the corticospinal tract of internal capsule (Chaturvedi et al., 2010). Using the ON computational model (diameter = 8.7 µm, s = 0.844, original lead location), the percentage of activated axons at motor thresholds was calculated (**Table 3**). For each behavioral threshold, the percent error was calculated as the difference between the experimental threshold and the model-predicted stimulation amplitude necessary to activate 5% of the axons. There was no error in the model predictions for contacts 5, 6, and 7. The percent error for contact 4 was 6.67%.

TABLE 3 | Comparison of ON model simulations to behavior thresholds.
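The comparison above reduces to three small calculations: the amplitude at which 5% of the modeled axons are active, the percentage active at a given amplitude, and the percent error against the behavioral threshold. The sketch below assumes that the error is normalized by the experimental threshold, which the text does not state explicitly; all function names are illustrative.

```python
import math


def amplitude_for_fraction(axon_thresholds_ma, fraction=0.05):
    """Lowest amplitude at which at least `fraction` of the axons are active."""
    ordered = sorted(axon_thresholds_ma)
    # The small offset guards against floating-point overshoot before ceil.
    needed = max(1, math.ceil(fraction * len(ordered) - 1e-9))
    return ordered[needed - 1]


def percent_activated(axon_thresholds_ma, amplitude_ma):
    """Percentage of axons whose threshold is at or below the amplitude."""
    active = sum(1 for t in axon_thresholds_ma if t <= amplitude_ma)
    return 100.0 * active / len(axon_thresholds_ma)


def percent_error(experimental_ma, predicted_ma):
    """Absolute difference relative to the experimental threshold, in percent."""
    return 100.0 * abs(experimental_ma - predicted_ma) / experimental_ma
```

As a purely illustrative case (these are not the values in **Table 3**), an experimental threshold of 1.5 mA and a predicted 5% activation amplitude of 1.6 mA give `percent_error(1.5, 1.6)` of about 6.67%.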

The neuron modeling results from the anisotropic model resulted in much lower activation thresholds than were predicted from the isotropic model. Moreover, the anisotropic models, in comparison to the isotropic models, resulted in activation thresholds that were more consistent with the thresholds for inducing eyelid flutter (**Figure 3**). For the isotropic models, there was 0% activation of ON at the threshold amplitude for all contacts. Using contact 5, a 5% activation of ON fibers was achieved at 1.25 mA for the anisotropic model, while the isotropic model required 3.4 mA to reach 5% activation. Similarly for contact 6, 1.1 mA activated 8.7% of the axons in the anisotropic model and 2.7 mA was required to activate 8.7% of the axons in the isotropic model. There were equivalent results for the SCP, where the anisotropic model predicted 26% of SCP axons activated at 1.2 mA through contact 4, while the isotropic model predicted 26% of SCP axons activated at 2.5 mA.

## Comparison of Model Simulations to PET Imaging

Two FDG-PET scans, in the context of DBS, were conducted to examine the effects of DBS in the PPTg area, after 0.9 mA stimulation through Contact 7 (configuration 1) and 1.2 mA stimulation through Contact 4 (configuration 2) (**Tables 4, 5**). These were compared to a baseline scan with no stimulation. A sampling of the resultant FDG standard uptake values (FDG-SUV) is shown in **Figure 6**. For the first stimulation configuration, the ventral posteromedial nucleus of thalamus (VPM), which is innervated by CTG (Blumenfeld, 2002), showed an increased FDG-SUV (p = 0.023). Further, descending projections of CTG project to the inferior olivary nuclei (Blumenfeld, 2002), which also showed an increased FDG-SUV in configuration 1 (p = 0.033). This corresponded to an activation of 12.2% of CTG fibers (**Table 3**). For configuration 2, the models predicted no activation of CTG; the PET measured a significant increase in FDG-SUV in the VPM (p = 0.041), but no significant increase in FDG-SUV in the inferior olivary complex (p > 0.05).

#### TABLE 4 | PET Configuration 1.

Similarly, regions that are innervated by projections from the PPTg showed an increased FDG-SUV, including the centromedian nucleus of thalamus (p = 0.037) during configuration 1. Configuration 2 showed increased FDG-SUV in regions innervated by fiber pathways near PPTg, including the rostral interstitial nucleus of the MLF and the interstitial nucleus of Cajal, which are innervated by MLF. Increased FDG-SUV in downstream targets of the PPTg was also observed in configuration 2, including the globus pallidus, basal amygdala, peripeduncular nucleus, centromedian nucleus, and the STN. The PET results also showed an increase in FDG-SUV in the red nucleus for configuration 2. This was supported by the model predicted activation of the SCP tract for configuration 2, 25.10% (**Table 6**).

# Discussion

Subject-specific computational models are an important tool to better understand the mechanisms of DBS in the brainstem and guide future DBS therapies (Butson et al., 2007; Miocinovic et al., 2009; Chaturvedi et al., 2010; Keane et al., 2012; Lujan et al., 2012; Zitella et al., 2013). In order for these models to be clinically relevant they must provide accurate predictions. While other methods of validation have been applied to computational models of DBS, no models of the brainstem have yet been rendered subject specific. In this study, we evaluated the sensitivity of a subject-specific model of PPTg-DBS in a nonhuman primate to different model parameters (tissue conductance anisotropy, axonal diameter, and DBS lead location) and compared the results to behavioral and functional imaging measures to determine the most accurate tissue conductance model.

#### TABLE 5 | PET Configuration 2.

FIGURE 6 | PET imaging during PPTg-DBS. (Top) Co-registration of SWI and baseline PET/CT images. The SWI is shown in blue-green cold scale for differentiation from the gray CT. (Left) FDG-SUV during PET configuration 1 (0.9 mA stimulation through contact 7), normalized to OFF-DBS. The PET results are overlaid on SWI of Monkey L. (Right) FDG-SUV during PET configuration 2 (1.2 mA stimulation through contact 4), normalized to OFF-DBS.

#### TABLE 6 | Model comparison to behavior.

Our previous computational models assumed the DBS lead was surrounded by homogeneous, isotropic tissue with a conductivity of 0.3 S/m (Zitella et al., 2013). Based on the fractional anisotropy results from Monkey L and Monkey P in the brainstem, the mean of the image-based conductivity distribution did deviate from this isotropic conductivity assumption, but was well within an order of magnitude. Although the conductivity scaling factor did not greatly affect the model predictions, the spatial variability of the conductivity (i.e., the distribution of conductivities within the brainstem) proved to have a large effect on the potential distribution around the DBS lead. This high anisotropy near the lead resulted in lower stimulation amplitudes required to activate nearby axons despite the slightly higher average conductivity in the brainstem. Based on these results, it seems that anisotropy, in conjunction with the average conductivity, plays a role in the ability to activate axons.

Several other computational models of DBS have incorporated anisotropy of tissue conductivity, including models of STN DBS, which assigned typical conductivity values based on the literature (Sotiropoulos and Steinmetz, 2007; Åström et al., 2009). Other studies have converted fractional anisotropy measures from subject DTI to conductivity, including models of DBS in the STN region and in the thalamus (McIntyre et al., 2004a; Miocinovic et al., 2009). Similar to our models, regions with high anisotropy showed greater variability in the voltage isosurfaces and in the activated volume of tissue. However, these studies showed, contrary to our results, that the addition of anisotropy to the model decreased the percentage of axons activated. This difference may relate to axonal fiber orientations relative to the stimulated electrode(s) as well as to assumptions of the neuron model parameters.

The PPTg area, similar to other typical DBS target regions, is highly anisotropic. Indeed, the PPTg is surrounded by the spinothalamic tract, CTG, medial lemniscus, lateral lemniscus, and the MLF. The fibers of the SCP are intertwined with the cells of the PPTg, which introduces challenges when attempting to stimulate one pathway over another. The present study showed that the inclusion of anisotropic conductivity is highly important for computational model predictions. This finding suggests that efforts to increase the resolution of fractional anisotropy imaging within the brainstem—through high-field, high angular resolution diffusion imaging (Lenglet et al., 2012), customized head coils (Adriany et al., 2010), and advanced computational reconstruction algorithms (Duarte-Carvajalino et al., 2014) as used in this study—could have significant merit (Novak et al., 2001; Stieltjes et al., 2001; Soria et al., 2011; Aggarwal et al., 2013; Deistung et al., 2013; Gizewski et al., 2014).

The clinical value of computational models lies in their legitimacy, making validation extremely important. Previous studies have confirmed the validity of their model parameters based upon correlations between EMG thresholds and activation of the corticospinal tract of internal capsule (during STN-DBS) (Butson et al., 2006; Chaturvedi et al., 2010), between paresthesia thresholds and activation of the somatosensory representation of thalamus (during Vim-DBS) (Kuncel et al., 2008), and between conjugate eye deviation and oculomotor nerve activation (during STN-DBS) (Butson et al., 2006). Here we extend this approach to the brainstem in the context of DBS targeting the PPTg area. The results showed much better predictions of the activation of the oculomotor nerve axons at stimulation amplitudes necessary to induce eyelid flutter for the anisotropic models. Without being able to measure the magnitude of the eyelid twitch or obtain verbal feedback on the strength of the side effect, this is a positive result. Additional modifications to the model equations, more precise anatomical geometry, and higher resolution DTI could provide more accurate results. The assumption that the conductivity is linearly related to the diffusion tensor eigenvalues may not hold for high resolution (1 mm voxels) DTI within the brainstem and could also be a source of error in the models.

In this study, PET imaging was used as a gross measure of the activation in the area during stimulation to compare the effects of stimulation through different contacts to baseline. PET is a valuable tool that has been used to examine the effects of DBS (Haslinger et al., 2003; Mayberg et al., 2005). The use of PET in the context of PPTg-DBS provided a novel approach to further evaluate the predictive capabilities of the computational neuron models. While the results were consistent, there are several limitations that should be noted. In addition to having only one subject, there was only one scan taken of each configuration (OFF DBS, C4 stimulation, C7 stimulation). Additional small spatial errors could also have been introduced when aligning the INIA19 atlas to the PET/CT. Furthermore, the PET analysis reported here did not account for the precise time from injection to time of scan. This will be incorporated in future studies for more accurate results.

Future studies will also need a larger sample size and expanded model validation methods. In studies with nonhuman primates, the addition of electrophysiology would provide more insight into the effects of stimulation. The electrophysiological activation thresholds could be compared to the model predictions by recording single-unit spike activity at multiple sites within upstream and downstream targets of fiber pathways coursing near the PPTg.

As DBS techniques continue to advance, new targets are being explored and new lead designs are being developed. There is a growing need for validated computational models to better understand the therapeutic results and titrate stimulation parameters in human patients implanted with DBS systems. This study is the first case of incorporating anisotropic conductivity into subject-specific computational models of DBS in the brainstem. Moreover, the study emphasizes how coupling behavioral metrics and functional imaging data in computational modeling studies can be critical for enhancing the predictive power of the models.

# Acknowledgments

We thank the Center for Magnetic Resonance Research (CMRR) for providing the imaging resources, specifically Bud Grossman of the University of Minnesota Medical School for the use of PMOD. We also thank the Minnesota Supercomputing Institute for providing the computational resources to complete this study. This work was supported by grants from the National Institutes of Health (R01-NS081118, R01-NS085188, P41-EB015894, P30-NS076408, Human Connectome Project U54-MH091657), the Michael J Fox Foundation, and the National Science Foundation (IGERT DGE-1069104 to LZ and GRFP 00006595 to BT).

# References


treatment of intractable proximal tremor. J. Neurosurg. 99, 708–715. doi: 10.3171/jns.2003.99.4.0708


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Zitella, Teplitzky, Yager, Hudson, Brintz, Duchin, Harel, Vitek, Baker and Johnson. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Motor planning under temporal uncertainty is suboptimal when the gain function is asymmetric

Keiji Ota<sup>1,2</sup>\*, Masahiro Shinya<sup>1</sup> and Kazutoshi Kudo<sup>1</sup>\*

*<sup>1</sup> Laboratory of Sports Sciences, Department of Life Sciences, Graduate School of Arts and Sciences, The University of Tokyo, Tokyo, Japan, <sup>2</sup> Research Fellow of Japan Society for the Promotion of Science, Tokyo, Japan*

For optimal action planning, the gain/loss associated with actions and the variability in motor output should both be considered. A number of studies make conflicting claims about the optimality of human action planning but cannot be reconciled due to their use of different movements and gain/loss functions. The disagreement is possibly because of differences in the experimental design and differences in the energetic cost of participant motor effort. We used a coincident timing task, which requires decision making with constant energetic cost, to test the optimality of participants' timing strategies under four configurations of the gain function. We compared participant strategies to an optimal timing strategy calculated from a Bayesian model that maximizes the expected gain. We found suboptimal timing strategies under two configurations of the gain function characterized by asymmetry, in which higher gain is associated with higher risk of zero gain. Participants showed a risk-seeking strategy by responding closer than optimal to the time of onset/offset of zero gain. Meanwhile, there was good agreement of the model with actual performance under two configurations of the gain function characterized by symmetry. Our findings show that human ability to make decisions that must reflect uncertainty in one's own motor output has limits that depend on the configuration of the gain function.

#### Edited by:

*Vincent C. K. Cheung, The Chinese University of Hong Kong, Hong Kong*

#### Reviewed by:

*Athena Akrami, Princeton University - Howard Hughes Medical Institute, USA Shih-Wei Wu, National Yang-Ming University, Taiwan*

#### \*Correspondence:

*Keiji Ota and Kazutoshi Kudo, Laboratory of Sports Sciences, Department of Life Sciences, Graduate School of Arts and Sciences, The University of Tokyo, Komaba 3-8-1, Meguro, Tokyo 153-8902, Japan keiji.o.22@gmail.com; kudo@idaten.c.u-tokyo.ac.jp*

> Received: *09 February 2015* Accepted: *25 June 2015* Published: *15 July 2015*

#### Citation:

*Ota K, Shinya M and Kudo K (2015) Motor planning under temporal uncertainty is suboptimal when the gain function is asymmetric. Front. Comput. Neurosci. 9:88. doi: 10.3389/fncom.2015.00088*

Keywords: decision making, risk-sensitivity, Bayesian decision model, response variance, coincident timing task

# Introduction

In highly skilled movement, especially in sports, decision making is important for superior performance. For example, a tennis player requires a spatial action plan about where in a court they should aim; a ski jumper requires a temporal action plan about when they should take off. An executed action is associated with a gain/loss. In ski jumping, the take-off jump should be as close to the edge of the ramp as possible to get the best jump length, while take-off too early or too late decreases jump length (Müller, 2009). However, in whatever action they plan, an executed action is not always equal to the planned one because of motor variability (Schmidt et al., 1979; Kudo et al., 2000; van Beers et al., 2004). Thus, both gain/loss associated with action and uncertainty in motor output should be considered for better decision making.

The mathematical method for selecting an optimal plan under conditions of limited uncertainty is known as statistical decision theory (Berger, 1985; Maloney and Zhang, 2010). In particular, Bayesian decision theory, which is a part of statistical decision theory, is a method for optimizing the expected gain/loss. The expected gain/loss is calculated by integrating the gain/loss function assigned to a certain action over a probability distribution of an executed action given a planned action. The Bayesian decision maker plans the action that optimizes the expected gain/loss for any combination of gain/loss function and motor variability (Hudson et al., 2012).

Previous motor control studies have evaluated the optimality of human motor decision making by comparing Bayesian ideal performance with actual human performance (Trommershäuser et al., 2003a,b, 2005; Wu et al., 2006; Hudson et al., 2012; O'Brien and Ahmed, 2013). Some reports have concluded that humans can plan actions that are optimal when considering their own motor variability (Trommershäuser et al., 2003a,b, 2005; Hudson et al., 2012), while other reports have concluded that humans cannot compute the movement strategy that maximizes the expected gain in the presence of such variability (Wu et al., 2006; O'Brien and Ahmed, 2013). Thus, there is inconsistency in published claims concerning the optimality of human action planning.

Two possible factors, differences in experimental design and differences in energetic cost, could account for this inconsistency. First, previous studies have reported optimal or suboptimal action plans with different movements and different configurations of the gain function. For example, Trommershäuser et al. (2003a,b, 2005) and Hudson et al. (2012) have demonstrated optimality in a pointing plan under a gain function in which the magnitude of gain/loss remains constant. In contrast, O'Brien and Ahmed (2013) have shown suboptimal reaching and whole-body movement strategies under an asymmetric gain function in which seeking higher values of gain brings participants closer to scoring zero gain ("falling over the cliff"). Therefore, we cannot directly evaluate the relationship between the optimality of the action plan and the configuration of the gain function because the experimental designs among previous studies differed.

Second, previous studies have mainly treated pointing or reaching movements as the executed action (Trommershäuser et al., 2003a,b, 2005; Wu et al., 2006; Hudson et al., 2012; O'Brien and Ahmed, 2013). In reaching and pointing movements, energetic cost becomes proportionally larger as the required movement distance increases. Because a large energetic cost demands greater effort from the participant, energetic cost could be a factor disturbing the measured optimality of action strategies. For example, Hudson et al. (2012) have reported that a discrepancy between ideal and actual performance emerged when optimal but large-cost movements (i.e., large excursions) were required during obstacle avoidance.

Here, we used a coincident timing task requiring decision making and compared the Bayesian ideal performance with actual human performance under four different configurations of the gain function, including the one used by O'Brien and Ahmed (2013). In the coincident timing task, energetic cost is constant because the participant just presses a button whatever strategy he/she selects. Thus, we can directly evaluate the relationship between the optimality of action plans and the configuration of the gain function, excluding energetic cost as a possible reason for any discrepancy found. We, in fact, found good agreement between the ideal timing strategy and the actual strategy under a symmetric configuration. However, a discrepancy was found under asymmetric configurations. We will discuss possible explanations for this discrepancy. We also observed that larger trial-by-trial compensation occurred following miss trials than after success trials even though the experienced response errors were of the same magnitude.

# Methods

# Participants

Thirty-seven healthy right-handed adults participated in the experiment. Sixteen participants (10 male, 6 female; mean age 28.1 ± 7.8 years) performed Experiment 1, twelve participants (10 male, 2 female; mean age 22.8 ± 2.8 years) performed Experiment 2, and the remaining nine participants (7 male, 2 female; mean age 21.3 ± 2.2 years) performed Experiment 3. All participants were unaware of the purpose of the experiment. This study was approved by the Ethics Committee of the Graduate School of Arts and Sciences, the University of Tokyo.

# Experimental Task

**Figure 1** shows the time sequence of our basic experimental task. First, a warning tone was presented to ready the participants for an upcoming trial. Then, a visual cue was presented on a computer screen as a starting signal (14 inches, 1600×900 pixels, refresh frequency 60 Hz; Latitude E5420, Dell, Round Rock, TX, USA). The participants were instructed to press a button after presentation of the visual cue. The response time was recorded as the button-press time relative to the onset time of the visual cue. In each trial, the participants gained points based on a "gain function," a function that translated the response time to a certain number of points. The details of the gain function are explained in the following section. The foreperiod (interval between the warning tone and the visual cue) was randomly varied between 800 and 1200 ms in steps of 100 ms. The target time was 2300 ms after the visual cue and was fixed throughout the experiment. In our experiment, the target time was associated with gaining 100 points but was not necessarily the time when the participants should respond (see below). The inter-trial interval was 2000 ms. All computerized events were controlled by a program written with LabVIEW software (National Instruments, 2011 Service Pack 1, Austin, TX, USA).

## Experimental Condition and Procedure

In each experimental condition, the participants were required to make a decision about when to press a button to maximize the total gain in 100 trials. The gain for a trial was a function of response time, termed the "gain function." Four conditions were tested, corresponding to different gain functions. The first was characterized as the No Risk condition, which employed a symmetric gain function (**Figure 2A**). In this condition, a gain for a trial (G) was determined from the following equation.

$$G(t) = \begin{cases} \frac{1}{23}t, & \text{if } t \le 2300\\ -\frac{1}{23}t + 200, & \text{if } t > 2300 \end{cases} \tag{1}$$

In the above equation, t represents a response time in milliseconds. When the participants responded earlier than the target time, they received a number of points that was a positive linear function of response interval. When the participants responded later than the target time, they received a number of points that was a negative linear function of response interval. A maximum-possible one-trial gain of 100 points could be obtained by responding exactly at the target time.

The second condition was the Step condition, which also had a symmetric gain function (**Figure 2B**). In the Step condition, a period of constant gain was on the edge of risk of zero gain both at its start and at its termination as represented by the following equation.

$$G(t) = \begin{cases} 0, & \text{if } t < 1900\\ 100, & \text{if } 1900 \le t \le 2700\\ 0, & \text{if } t > 2700 \end{cases} \tag{2}$$

The participants received 100 points if they responded within ±400 ms of the target time. However, zero points were given if they responded at less than target −400 ms or at more than target +400 ms. We termed these eventualities "miss trials" and when they occurred, the participants were cautioned by an unpleasant alarm and a flashing red lamp on the screen. The volume of the alarm was 71.5 ± 0.4 dB.

The third condition was characterized as the Riskafter condition, which employed an asymmetric gain function (**Figure 2C**). In the Riskafter condition, single-trial gain rose linearly as the target time approached, then plunged to zero after the target time and then remained zero thereafter as represented by the following equation.

$$G(t) = \begin{cases} \frac{1}{23}t, & \text{if } t \le 2300\\ 0, & \text{if } t > 2300 \end{cases} \tag{3}$$

Earlier than the target time, the gain function looks the same as that in the No Risk condition. However, zero points were given if the participants responded after the target time. If they missed, they received the same penalty as in the Step condition. This configuration has been used in O'Brien and Ahmed (2013).

The last condition was the Riskbefore condition in which the gain function of the Riskafter condition is mirror-imaged across the target time (**Figure 2D**). The gain for a trial was determined from the following equation.

$$G(t) = \begin{cases} 0, \text{ if } t < 2300 \\ -\frac{1}{23}t + 200, \text{ if } t \ge 2300 \end{cases} \tag{4}$$

In contrast to the Riskafter condition, zero points were given if the participants responded before the target time in the Riskbefore condition. Again, if they missed, they received the same penalty as in the Step condition. For the trial-by-trial compensation analysis described below, we defined miss trials as trials in which the participants received zero points and success trials as all other trials, in all four conditions. Of note, all the trials resulted in success in the No Risk condition because the range of success trials was sufficiently large (i.e., it was 4600 ms).
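The four gain functions described above can be written compactly in code. The following is a minimal Python sketch, not the authors' LabVIEW implementation; the function names and the `is_miss` helper are hypothetical, and times are assumed to be in milliseconds relative to the visual cue.

```python
TARGET_MS = 2300.0  # target time after the visual cue (ms)


def gain_no_risk(t):
    """Equation (1): symmetric gain that peaks at 100 points at the target."""
    return t / 23.0 if t <= TARGET_MS else -t / 23.0 + 200.0


def gain_step(t):
    """Equation (2): 100 points within +/-400 ms of the target, else 0."""
    return 100.0 if 1900.0 <= t <= 2700.0 else 0.0


def gain_risk_after(t):
    """Equation (3): linear rise up to the target, zero afterwards."""
    return t / 23.0 if t <= TARGET_MS else 0.0


def gain_risk_before(t):
    """Equation (4): zero before the target, linear fall afterwards."""
    return 0.0 if t < TARGET_MS else -t / 23.0 + 200.0


def is_miss(gain):
    """A miss trial is any trial that earns zero points (hypothetical helper)."""
    return gain == 0.0
```

For example, `gain_risk_after(2301.0)` returns zero points while `gain_no_risk(2301.0)` is still close to 100, which is the asymmetry the Riskafter condition is designed to create.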

In each trial, we provided the participants with feedback information consisting of the relative response time (calculated as the response time minus the target time), the gain for the trial, and the cumulative total gain. All the participants performed 10 trials for practice. This practice session was conducted to give them a feel for the length of time from the visual cue to the target time. In this session, we only provided them with the relative response time (i.e., no gain function was applied). After the practice session, all the participants performed 100 trials in the No Risk condition as a first experimental session. In the second experimental session, the participants who were assigned to Experiment 1 performed 100 trials in the Riskafter condition. The participants who were assigned to Experiment 2 performed 100 trials in the Riskbefore condition. Those who were assigned to Experiment 3 performed 100 trials in the Step condition. Each condition (No Risk and Riskafter/Riskbefore/Step) was conducted as a separate experimental session. The participants rested for several tens of seconds between sessions.

Before running each session, we explained the structure of the gain function with a figure visualizing it (see Supplementary Figure 1). Thus, the structure of the gain function was known to the participants before testing. The figure also included the following information: 100 points could be gained when the relative response time was 0 ms in the No Risk, Riskafter, and Riskbefore conditions, and when the relative response time was within ±400 ms of the target in the Step condition. However, the length of time from the visual cue to the target time was not described; thus, the participants did not know that it was 2300 ms. Also, before performing the No Risk condition, the participants did not know the gain structure that would be used in the next condition.

We instructed them to maximize total gain in each condition. Thus, they were required to make a decision about when to press a button to maximize the total gain. Actual monetary rewards/penalties were not used (see limitation related to this experimental procedure).

#### Model Assumptions

We calculated the ideal strategy that maximizes the expected gain by a Bayesian decision-theoretic approach for each participant and for all conditions (Hudson et al., 2012). Our model consisted of two sets and two functions. The two sets were: possible response strategy T (motor decision), and executed response time t (motor output). The two functions were: probability distribution of executed response P(t|T), and gain function G(t). Given a particular planned strategy, a particular response is stochastically executed. This is considered the uncertainty in motor output. In this study, we assumed that the produced response time t is distributed around the planned response time T according to a Gaussian distribution (see Supplementary Table 1) as follows.

$$P\left(t|T\right) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[-\frac{\left(t-T\right)^2}{2\sigma^2}\right] \tag{5}$$

Then, given execution of a particular response time, the gain is given according to the gain function G(t). Given both P (t|T) and G(t), the expected gain EG(T) as a function of planned response time T is calculated by the following equation.

$$EG(T) = \int\_{-\infty}^{\infty} G(t) \cdot P(t|T)dt\tag{6}$$

Once we had measured the response variance σ for each participant and condition, we could calculate the optimal mean response time T<sup>∗</sup> by maximizing Equation (6). A Bayesian decision maker chooses a response time T<sup>∗</sup> for any given gain function G(t) and response variance σ. This can be regarded as a theoretical risk-neutral optimal response.
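The maximization of Equation (6) can be illustrated numerically. The sketch below is one possible approach under stated assumptions, not the authors' procedure: it approximates the integral with a midpoint Riemann sum truncated at ±5σ around T and finds the optimum by a 1 ms grid search. The function names, the grid settings, and the example value σ = 100 ms are all hypothetical, and σ is treated as the SD of response time as in Equation (5).

```python
import math

TARGET_MS = 2300.0


def gain_no_risk(t):
    """Equation (1)."""
    return t / 23.0 if t <= TARGET_MS else -t / 23.0 + 200.0


def gain_risk_after(t):
    """Equation (3)."""
    return t / 23.0 if t <= TARGET_MS else 0.0


def expected_gain(gain, T, sigma, dt=1.0, width=5.0):
    """Approximate Equation (6) with a midpoint sum over T +/- width*sigma."""
    if sigma <= 0:
        return gain(T)  # no motor variability: the plan is executed exactly
    n = int(round(2.0 * width * sigma / dt))
    lo = T - width * sigma
    norm = 1.0 / math.sqrt(2.0 * math.pi * sigma ** 2)
    total = 0.0
    for i in range(n):
        t = lo + (i + 0.5) * dt
        density = norm * math.exp(-((t - T) ** 2) / (2.0 * sigma ** 2))  # Eq. (5)
        total += gain(t) * density * dt
    return total


def optimal_time(gain, sigma, t_min, t_max, step=1.0):
    """Grid search for the planned time T that maximizes the expected gain."""
    best_T, best_eg = t_min, float("-inf")
    n_steps = int(round((t_max - t_min) / step))
    for k in range(n_steps + 1):
        T = t_min + k * step
        eg = expected_gain(gain, T, sigma)
        if eg > best_eg:
            best_T, best_eg = T, eg
    return best_T, best_eg


# Hypothetical participant with an SD of 100 ms.
t_star_after, eg_after = optimal_time(gain_risk_after, 100.0, 1900.0, 2300.0)
t_star_norisk, eg_norisk = optimal_time(gain_no_risk, 100.0, 2200.0, 2400.0)
```

Under these assumptions the optimum for the Riskafter gain function lies well before the target time, whereas the optimum for the symmetric No Risk gain function stays at the target, which matches the qualitative behavior described in the text.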

# Estimation of 95% Confidence Interval of Optimal Response Time

Furthermore, we estimated the 95% confidence interval of the optimal mean response time T<sup>∗</sup> by bootstrapping (3000 resamples) for the Riskafter and the Riskbefore conditions. We then examined whether the actual response time was within this 95% confidence interval in the Riskafter condition and in the Riskbefore condition. For σ ranging from 0 to 0.4 in steps of 0.001, we first calculated each optimal mean response time X<sub>opt1</sub>(σ<sub>0</sub>), X<sub>opt2</sub>(σ<sub>0.001</sub>), ..., X<sub>opt400</sub>(σ<sub>0.4</sub>) by maximizing Equation (6). Taking X<sub>opt1</sub>(σ<sub>0</sub>) as an example, we simulated 100 trials of task execution with this optimal mean response time and this response variance (in this case X<sub>opt1</sub> = 2300, σ = 0) using the MATLAB randn function. We repeated this process 3000 times and obtained bootstrap samples x<sub>1</sub> = (x<sub>1,t1</sub>, x<sub>1,t2</sub>, ..., x<sub>1,t100</sub>), x<sub>2</sub> = (x<sub>2,t1</sub>, x<sub>2,t2</sub>, ..., x<sub>2,t100</sub>), ..., x<sub>3000</sub> = (x<sub>3000,t1</sub>, x<sub>3000,t2</sub>, ..., x<sub>3000,t100</sub>). For each bootstrap sample x<sub>b</sub> (b = 1, 2, ..., 3000), we calculated the average value of its samples, µ̂<sub>b</sub> = (1/100) · Σ<sub>i=1</sub><sup>100</sup> x<sub>b,ti</sub>. After sorting these averages µ̂<sub>b</sub> (b = 1, 2, ..., 3000) in ascending order, we defined the 2.5% and 97.5% points in these samples as the 95% confidence interval of the optimal mean response time T<sup>∗</sup>. By repeating the above processes from X<sub>opt1</sub>(σ<sub>0</sub>) to X<sub>opt400</sub>(σ<sub>0.4</sub>), we estimated the 95% confidence intervals of each optimal mean response time.

If the observed mean response times were within these 95% confidence intervals, we would conclude that the participant plans optimal and risk-neutral timing strategies. In the Riskafter condition, observed times longer than the confidence intervals would indicate suboptimal and risk-seeking strategies. Observed times shorter than the confidence intervals would indicate suboptimal and risk-averse strategies.
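The resampling procedure and the classification rule above can be sketched as follows. This is an illustrative Python version, not the authors' MATLAB code: it uses `random.gauss` in place of `randn`, works in milliseconds throughout, and picks the 2.5% and 97.5% points by integer index, which is an assumption about how the percentile points were defined. The function names and the example optimum of 2089 ms with σ = 100 ms are hypothetical.

```python
import random


def bootstrap_ci(opt_mean, sigma, n_trials=100, n_boot=3000, seed=0):
    """Simulate n_boot sessions of n_trials Gaussian responses around opt_mean
    and return the 2.5% and 97.5% points of the sorted session means."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_boot):
        sample = [rng.gauss(opt_mean, sigma) for _ in range(n_trials)]
        means.append(sum(sample) / n_trials)
    means.sort()
    lower = means[n_boot * 25 // 1000]        # 2.5% point (assumed index rule)
    upper = means[n_boot * 975 // 1000 - 1]   # 97.5% point (assumed index rule)
    return lower, upper


def classify_risk_after(observed_mean, ci):
    """Label a strategy in the Riskafter condition relative to the interval."""
    lower, upper = ci
    if observed_mean > upper:
        return "risk-seeking"   # closer to the zero-gain region than optimal
    if observed_mean < lower:
        return "risk-averse"    # further from the zero-gain region than optimal
    return "risk-neutral"


# Hypothetical optimum and variability, both in milliseconds.
ci = bootstrap_ci(2089.0, 100.0)
label = classify_risk_after(2250.0, ci)
```

For the Riskbefore condition the two labels would swap, because the zero-gain region lies before the target time rather than after it.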

# Optimal Response Times Calculated from the Measured Distributions

Although we had confirmed that the response distributions were Gaussian (see Supplementary Table 1), we also calculated optimal mean response time for the Riskafter and the Riskbefore conditions using the measured response distributions. In the Riskafter condition (**Figure 3A**, upper panel), once we had obtained the response distribution we simply shifted it back and forth to identify the maximal total gain for that distribution (**Figure 3A**, lower panel). In **Figure 3A**, we show the case of shifting the measured distribution back. We defined the optimal mean response time as the mean response time of the optimized distribution (gray solid line in **Figure 3A**). The estimated optimal mean response time was always earlier than the target time in the Riskafter condition. The difference between the estimated optimal response time and the target time reflected each participant's own variance in response time (i.e., the larger one's variance, the earlier the optimal response time, and vice versa). This effect is visualized in **Figure 5** as the solid curve.

In **Figure 3B**, the black arrow indicates the total gain when the measured distribution is not shifted and the gray arrow indicates the highest total gain possible for that distribution under shifting. The two-headed arrows in **Figures 3A,B** represent the optimal shift size.

Finally, we compared the estimated optimal mean response times with the observed mean response times (black solid line in **Figure 3A**). In this case, the observed time was closer to the target time than optimal, indicating risk-seeking. We also applied this approach to the Riskbefore condition. We compared mean, as opposed to median, optimal and observed times because the measured distributions were Gaussian.
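The shifting procedure can be expressed as a simple search over candidate shifts. The sketch below is a hedged illustration rather than the authors' code: the 1 ms shift grid, the ±300 ms search range, the function names, and the five example response times are all hypothetical.

```python
TARGET_MS = 2300.0


def gain_risk_after(t):
    """Equation (3)."""
    return t / 23.0 if t <= TARGET_MS else 0.0


def best_shift(response_times, gain, shifts):
    """Return the shift (ms) that maximizes the total gain of the measured
    responses, together with that total gain."""
    best_s, best_total = None, float("-inf")
    for s in shifts:
        total = sum(gain(t + s) for t in response_times)
        if total > best_total:
            best_s, best_total = s, total
    return best_s, best_total


# Hypothetical measured response times (ms) in the Riskafter condition.
measured = [2200, 2250, 2290, 2310, 2280]
observed_mean = sum(measured) / len(measured)

shift, total_gain = best_shift(measured, gain_risk_after, range(-300, 301))
optimal_mean = observed_mean + shift  # mean of the optimally shifted distribution
```

In this toy data set the best shift is negative, so the optimal mean is earlier than the observed mean, which is the pattern the text labels as risk-seeking.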

## Trial-By-Trial Compensation Strategy

In addition to determining the response time strategies based on all trials, we examined compensation against the most recent response error based on a trial-by-trial analysis. These results were then compared between/within the No Risk and the Riskafter/Riskbefore conditions. The magnitude of the response error experienced on the current trial is known to influence the response in the following trial (Thoroughman and Shadmehr, 2000; Scheidt et al., 2001). Thus, the compensation size in the following trial can be proportional to the current magnitude of response error. Additionally, it has been shown that humans adjust future motor behavior according to rewarding and nonrewarding outcomes experienced (Wrase et al., 2007). Therefore, in addition to the compensation strategy against response error, we hypothesized that the compensation size following miss trials would be larger in the Riskafter/Riskbefore conditions than the compensation size following success trials in the No Risk condition.

We defined the compensation on the current trial, trial n, by subtracting the response time on the current trial, from that on the following trial, trial n+1, as in Equation (7).

$$Compensation\_n = RT\_{n+1} - RT\_n \tag{7}$$

We supposed that the compensation occurs around the mean response time in both conditions; thus, we defined response error as the response time minus the mean response time in this analysis. The compensation size was anticipated to depend on the magnitude of response error (in other words, the magnitude of deviation between the current response time and mean response time).

To compare the compensation size following miss/success trials, we defined the absolute value of the difference between the mean response time in the Riskafter/Riskbefore condition and the target time as "M" for convenience, and sorted trials into the following four bins: −2M < error<sub>n</sub> ≤ −M (bin 1), −M < error<sub>n</sub> ≤ 0 (bin 2), 0 < error<sub>n</sub> ≤ M (bin 3), and M < error<sub>n</sub> ≤ 2M (bin 4) for Experiment 1, and −2M ≤ error<sub>n</sub> < −M (bin 1), −M ≤ error<sub>n</sub> < 0 (bin 2), 0 ≤ error<sub>n</sub> < M (bin 3), and M ≤ error<sub>n</sub> < 2M (bin 4) for Experiment 2. **Figure 4** shows the error distributions separated by the bins. With these procedures, we can evaluate the compensation sizes based on the same magnitude of response error between conditions. Scaling by "M" also allows data from different participants to be combined. The last bin in Experiment 1 and the first bin in Experiment 2 are in the areas that result in miss trials in the Riskafter/Riskbefore conditions. Errors of −2M or less and more than 2M in Experiment 1 and errors less than −2M and 2M or more in Experiment 2 were excluded from the analysis because only small numbers of trials were obtained within these ranges. We collected errors on trial n in each bin and calculated the average compensation size in each bin. Repeating this procedure for each participant and condition, we compared the average compensation size across the participants against the magnitude of response error between/within conditions for each bin. Because the distributions of the compensation sizes across the participants were Gaussian (see Supplementary Table 2), we calculated the average value.
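The binning of compensation sizes can be sketched in a few lines. The following Python example is an illustration under stated assumptions, not the authors' analysis code: the function name, the `upper_inclusive` flag (which switches between the Experiment 1 and Experiment 2 bin boundaries), and the example data are hypothetical, and the response error is taken relative to the mean of the response times passed in.

```python
def compensation_by_bin(rts, M, upper_inclusive=True):
    """Average compensation (Equation 7) per error bin.

    rts: response times of one participant in one condition (ms).
    M: |mean response time in the risk condition - target time| (ms).
    upper_inclusive: True uses lo < error <= hi (Experiment 1 convention),
                     False uses lo <= error < hi (Experiment 2 convention).
    """
    mean_rt = sum(rts) / len(rts)
    edges = [-2 * M, -M, 0, M, 2 * M]
    bins = {1: [], 2: [], 3: [], 4: []}
    for n in range(len(rts) - 1):
        error = rts[n] - mean_rt            # response error on trial n
        compensation = rts[n + 1] - rts[n]  # Equation (7)
        for b in range(4):
            lo, hi = edges[b], edges[b + 1]
            if upper_inclusive:
                in_bin = lo < error <= hi
            else:
                in_bin = lo <= error < hi
            if in_bin:
                bins[b + 1].append(compensation)
    # Errors outside (-2M, 2M] or [-2M, 2M) fall in no bin and are dropped.
    return {b: (sum(v) / len(v) if v else None) for b, v in bins.items()}


# Hypothetical response times (ms) with M = 100 ms.
rts = [2200, 2300, 2250, 2150, 2350]
exp1 = compensation_by_bin(rts, 100, upper_inclusive=True)
exp2 = compensation_by_bin(rts, 100, upper_inclusive=False)
```

A negative average in a positive-error bin means that responses following a late trial tended to be earlier, which is the direction expected for error compensation.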

# Statistical Analysis

We conducted paired t-tests to examine the significance of differences between optimal and observed values for response time and total gain in all conditions. We also conducted two-way repeated-measures ANOVA to determine differences in trial-by-trial compensation strategy between the No Risk condition and the Riskafter/Riskbefore conditions. A p < 0.05 was regarded as statistically significant. Cohen's d measure for the t-test was calculated to determine the magnitude of mean differences.

Trials with response times more than ±2.5 standard deviations from the mean were excluded from the analysis as outliers. Average number of trials excluded across the participants and two conditions was 2.4 ± 1.5 in Experiment 1, 2.3 ± 1.3 in Experiment 2, and 1.7 ± 0.7 in Experiment 3.
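The outlier rule can be illustrated with the Python standard library. This is a minimal sketch under assumptions: the text does not state whether the sample or population SD was used, so the sample SD (`statistics.stdev`) is assumed here, and the function name and example data are hypothetical.

```python
import statistics


def exclude_outliers(response_times, k=2.5):
    """Drop trials more than k standard deviations from the mean.

    Assumes the sample SD; requires at least two trials.
    """
    mean_rt = statistics.mean(response_times)
    sd_rt = statistics.stdev(response_times)
    return [t for t in response_times if abs(t - mean_rt) <= k * sd_rt]


# Hypothetical session: 20 on-target responses and one very late response (ms).
session = [2300.0] * 20 + [5000.0]
kept = exclude_outliers(session)
n_excluded = len(session) - len(kept)
```

Because the mean and SD are computed once from all trials, the rule is applied in a single pass rather than iteratively, which is also an assumption about the original procedure.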

FIGURE 5 | The participants adopt a risk-seeking strategy. (A) Results in the Riskafter condition. One dot of each color corresponds to a participant. Theoretically, the optimal mean response time must be shorter than the target time, and increasingly so as the variability of one's response time increases (the *x-*axis shows the SD of the response time in the Riskafter condition as an index of this variability). However, the observed mean response times (filled circles) are closer to the target time than optimal. Thus, a discrepancy is seen between the actual strategy and the ideal strategy. Open circles indicate optimal mean response times estimated by shifting the participant's actual response time distributions. Black curve indicates the optimal mean response time estimated by a Bayesian model (Equation 6). Gray curves indicate the 95% confidence intervals of the optimal mean response times, which were calculated by 3000 replications of a bootstrap algorithm. The optimal response times estimated from the actual distributions lie roughly within the 95% confidence intervals. (B) Results in the Riskbefore condition. The same risk-seeking behavioral tendency is observed. In this condition, the optimal mean response time must be longer than the target time, and increasingly so as response variability increases. The observed response times are again closer than optimal to the target time.

# Results

### Discrepancies with Optimal Strategy

We found discrepancies between the Bayesian ideal strategy and the actual human strategy in the Riskafter and the Riskbefore conditions. The observed mean response times and the optimal mean response times calculated from the measured distributions were plotted against the standard deviation (SD) of response time in the Riskafter condition for all 16 participants (**Figure 5A**). The optimal mean response times calculated by the Bayesian model and their 95% confidence intervals were also plotted. As shown in **Figure 5A**, the optimal mean response time moves further from the target time as response variance increases. However, for all participants, except one, the observed mean response time was closer to the target time than the optimal mean response time calculated either from the Bayesian-theoretical 95% confidence interval or from the measured distribution. This result suggests that the participants took higher-than-optimal risks given their own variance in response time, which is classified as a suboptimally risk-seeking tendency.

The participants were also suboptimal in the Riskbefore condition in the sense that they were in general faster to respond than predicted by the optimal model (**Figure 5B**). In the Riskbefore condition, the optimal time is later than the target time. We found that all 12 participants responded closer to the target time than the Bayesian-theoretical 95% confidence interval for the optimal mean response time and the optimal response time calculated from the measured distribution. Therefore, a risk-seeking strategy was shown under an asymmetric gain function regardless of the location of the penalty region.
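The 95% confidence intervals referred to above were obtained with 3000 bootstrap replications (Figure 5 caption). This section does not spell out exactly what was resampled, so the following is only a generic percentile-bootstrap sketch: it resamples a list of response times with replacement and takes the 2.5th and 97.5th percentiles of a user-supplied statistic. The function name, the seed, and the use of the mean as the example statistic are assumptions.

```python
import random
import statistics


def percentile_bootstrap_ci(samples, statistic, n_boot=3000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for `statistic` over `samples`.

    Generic sketch; not necessarily the procedure used in the study.
    """
    rng = random.Random(seed)
    n = len(samples)
    estimates = []
    for _ in range(n_boot):
        # Resample with replacement, keeping the original sample size.
        resample = [samples[rng.randrange(n)] for _ in range(n)]
        estimates.append(statistic(resample))
    estimates.sort()
    lo_idx = round((alpha / 2.0) * n_boot)
    hi_idx = round((1.0 - alpha / 2.0) * n_boot) - 1
    return estimates[lo_idx], estimates[hi_idx]


if __name__ == "__main__":
    # Hypothetical response times in ms, for illustration only.
    rts = [2210, 2255, 2190, 2301, 2240, 2275, 2180, 2230, 2265, 2220]
    print(percentile_bootstrap_ci(rts, statistics.fmean))
```

To obtain an interval for an optimal mean response time, the statistic passed in would itself be a function that estimates the optimum from each resampled set of response times.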

We also found that the SD of the response time was significantly correlated with the observed mean response time in the Riskbefore condition (r = 0.59, p < 0.05), but not in the Riskafter condition (r = −0.36, p = 0.17). Thus, in the Riskbefore condition, participants who had large response variance responded further from the target time than those who had small variance. This result raises the possibility that our participants chose response times reflecting the size of their own response variance. However, their timing strategy was not optimal from the perspective of Bayesian decision theory.

## The Effect of the Symmetry of the Gain Function

Actual human strategy agreed with the Bayesian ideal strategy under a symmetric gain function. We compared the optimal mean response time estimated by the Bayesian model with the observed mean response time in all four conditions (**Table 1**). Paired t-tests showed that across participants, the optimal mean response time was not significantly different from that observed in the No Risk [t(36) = −0.08, p = 0.94, d = −0.02] and the Step conditions [t(8) = 1.76, p = 0.12, d = 0.87]. However, the observed mean response time in the Riskafter condition was significantly longer than the optimal mean response time [t(15) = −8.00, p < 0.001, d = −2.35], and was significantly shorter in the Riskbefore condition [t(11) = 6.68, p < 0.001, d = 2.04]. Thus, the participants planned optimal timing strategies only under symmetric gain functions.
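For readers who want to see how a paired comparison of this kind is computed, here is a minimal sketch of the paired t statistic and its degrees of freedom using only the standard library. The function name and the example values are hypothetical; the p-value, which needs the t distribution, is left out.

```python
import math
import statistics


def paired_t(x, y):
    """Return (t, df) for a paired t-test on equal-length samples x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean_diff = statistics.fmean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD of the paired differences
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


if __name__ == "__main__":
    # Hypothetical optimal and observed mean response times (ms) per participant.
    optimal = [2150.0, 2120.0, 2180.0, 2100.0]
    observed = [2230.0, 2215.0, 2250.0, 2190.0]
    t, df = paired_t(optimal, observed)
    print(f"t({df}) = {t:.2f}")
```

With n participants the test has n − 1 degrees of freedom, which matches the reported t(15) for 16 participants and t(11) for 12.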

Looking at total gain, the average value of the observed total gain across participants was not significantly different from that of the optimal total gain in the Step condition [t(8) = −0.33, p = 0.75, d = −0.08]. The average observed total gain was significantly smaller than the average optimal gain in the No Risk [t(36) = 2.38, p < 0.05, d = 0.14], the Riskafter [t(15) = 7.14, p < 0.001, d = 1.61], and the Riskbefore [t(11) = 6.68, p < 0.001, d = 0.65] conditions. Although the total gain was significantly smaller than the optimal gain in the symmetric No Risk condition, its effect size was apparently small compared with the asymmetric Riskafter and Riskbefore conditions. Taken together, we confirm that an optimal strategy for maximizing expected gain could be computed under a symmetric gain function, but not under an asymmetric gain function.
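The effect sizes can be checked against the Riskbefore summary values reported in Table 1. The formula used for d is not stated in this section. The sketch below assumes Cohen's d with the root-mean-square of the two standard deviations as the denominator, and treats the ± values in the table as standard deviations; both are assumptions, although this form does reproduce the reported Riskbefore values.

```python
import math


def cohens_d_pooled(mean_a, sd_a, mean_b, sd_b):
    """Cohen's d using the root-mean-square of the two SDs (assumed convention)."""
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2.0)
    return (mean_a - mean_b) / pooled_sd


# Riskbefore row of Table 1: optimal vs. observed response time and total gain.
d_response_time = cohens_d_pooled(2611.8, 69.9, 2495.2, 40.3)
d_total_gain = cohens_d_pooled(8160.6, 442.3, 7833.8, 556.9)

print(round(d_response_time, 2), round(d_total_gain, 2))  # 2.04 0.65
```

The agreement with the reported d = 2.04 and d = 0.65 suggests, but does not prove, that this is the convention used for the other conditions as well.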

#### Learning Effects on Timing Strategy

We analyzed timing strategy on a whole block of 100 trials and showed its suboptimality in the Riskafter and the Riskbefore conditions. However, there is a possibility that the participants gradually learned the strategy through trials. To investigate this possibility, we compared the mean response time over the first 50 trials with the mean over the last 50 trials across participants. Paired t-tests showed that early and late mean response times were not significantly different in the Riskafter condition [t(15) = 1.48, p = 0.16] and the Riskbefore condition [t(11) = 0.53, p = 0.61]. Furthermore, we conducted paired t-tests in each participant excluding trials that were classified as outliers. The results showed that early and late mean response times were significantly different for only 1 out of 16 participants in the Riskafter condition [t(47) = 2.10, p < 0.05 for P13], and for 2 out of 12 participants in the Riskbefore condition [t(47) = −2.76, p < 0.01 for P18; t(47) = 2.68, p < 0.01 for P22]. Therefore, we concluded that participants did not learn timing strategy through trials. This result is consistent with previous studies claiming no evidence of learning effects on movement plans (Trommershäuser et al., 2003b, 2005; Wu et al., 2006; O'Brien and Ahmed, 2013).


*Optimal response time and optimal total gain were calculated with a Bayesian model (Equation 6). There is no significant difference between observed and optimal response times in the No Risk and Step conditions (symmetric gain functions), but there is a significant difference in the Riskafter and Riskbefore conditions (asymmetric gain functions). All data shown are averages across the participants* ± *standard deviation of the mean.* \* *indicates p* < *0.05 and* \*\*\* *indicates p* < *0.001.*

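| Condition | N | Optimal response time | Observed response time | d | Optimal total gain | Observed total gain | d |
|---|---|---|---|---|---|---|---|
| Riskbefore | 12 | 2611.8 ± 69.9 | 2495.2 ± 40.3\*\*\* | 2.04 | 8160.6 ± 442.3 | 7833.8 ± 556.9\*\*\* | 0.65 |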

### Trial-by-trial Compensation Strategy

We then compared trial-by-trial compensation strategies between the No Risk and the Riskafter/Riskbefore conditions. To this end, in **Figure 6A** we plotted the average compensation size across participants against the magnitude of response error on the previous trial in Experiment 1. We performed a two-way repeated-measures ANOVA on the compensation size. The factors were condition (2: Riskafter condition and No Risk condition) and bin (4: bins 1–4). We found a main effect of bin [F(1.63, 24.46) = 185.02, p < 0.001]. Thus, the average compensation size changed based on the experienced response error. Furthermore, we also found an interaction effect [F(2.23, 33.52) = 6.36, p < 0.01] and a simple main effect of condition in bin 4 [F(1, 15) = 10.78, p < 0.01], but not in any other bins [Fs(1, 15) < 4.16, ps > 0.05]. Of note, miss trials in the Riskafter condition are included in bin 4 (i.e., the range: M < error<sub>n</sub> ≤ 2M). Moreover, in bin 4, the average compensation size in the Riskafter condition was significantly larger than that in the No Risk condition [t(15) = −3.28, p < 0.01, d = −1.03]. These results suggest that the participants used statistically the same compensation strategy following success trials (bins 1–3 in both conditions), but compensated more strongly following miss trials (bin 4 in the Riskafter condition) compared with success trials (bin 4 in the No Risk condition), even though response errors in both bins were of the same magnitude (see Supplementary Figure 2).
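The binning described above can be sketched as follows. The definitions here are assumptions made for illustration: the error on trial n is taken as the response time minus the target time, the compensation as the change in response time from trial n to trial n + 1, and the four bins as [−2M, −M), [−M, 0), [0, M], and (M, 2M]. Only the bin 4 range is stated explicitly in the text; `M`, the target time, and the function names are hypothetical.

```python
def bin_index(error, m):
    """Map an error to bin 1-4, or None if it falls outside [-2m, 2m].

    Assumed edges: bin 1 [-2m, -m), bin 2 [-m, 0), bin 3 [0, m], bin 4 (m, 2m].
    """
    if error < -2 * m or error > 2 * m:
        return None
    if error < -m:
        return 1
    if error < 0:
        return 2
    if error <= m:
        return 3
    return 4


def mean_compensation_by_bin(response_times, target, m):
    """Average change in response time on trial n + 1, grouped by the error bin of trial n."""
    sums = {b: 0.0 for b in (1, 2, 3, 4)}
    counts = {b: 0 for b in (1, 2, 3, 4)}
    for current, following in zip(response_times, response_times[1:]):
        b = bin_index(current - target, m)
        if b is None:
            continue  # error outside the analyzed range
        sums[b] += following - current
        counts[b] += 1
    return {b: (sums[b] / counts[b] if counts[b] else None) for b in sums}


if __name__ == "__main__":
    # Hypothetical response times (ms) with an assumed target of 2300 ms and M = 100 ms.
    rts = [2450.0, 2250.0, 2330.0, 2150.0, 2300.0]
    print(mean_compensation_by_bin(rts, target=2300.0, m=100.0))
```

Comparing the bin 4 average between a risk condition and the No Risk condition then corresponds to the simple main effect test reported above.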

Larger compensation after misses was also seen in Experiment 2 (**Figure 6C**). Again, we performed a two-way repeated-measures ANOVA on the compensation size. We found a main effect of bin [F(1.59, 17.48) = 192.30, p < 0.001] and an interaction effect [F(3, 33) = 4.12, p < 0.05]. We also found a simple main effect of condition in bin 1 [F(1, 11) = 6.90, p < 0.05], but not in any other bins [Fs(1, 11) < 2.70, ps > 0.05]. Bin 1 (i.e., the range: −2M ≤ error<sub>n</sub> < M) is in the area of miss trials in the Riskbefore condition.

the Riskafter and No Risk conditions. A simple main effect test reveals that the average compensation in the Riskafter condition is to significantly shorter times than that in the No Risk condition in bin 4 (penalty region in the Riskafter condition), which indicates that participants overcompensate following miss trials compared with success trials in which the magnitude of the response error is held the same. (B) Individual data in the Riskafter condition, each colored symbol corresponds to a participant. Overcompensation also

average compensation in the Riskbefore condition is to significantly longer times than that in the No Risk condition in bin 1 (penalty region in the Riskbefore condition). (D) Individual data in the Riskbefore condition. The absolute compensation size in bin 1 was significantly larger than that in bin 4. † indicates *p* < 0.10, \* indicates *p* < 0.05, and \*\* indicates *p* < 0.01. Error bars indicate standard error of the mean. Individual data for the compensation size in all bins are shown in Supplementary Figure 3.

Similarly to Experiment 1, the average compensation size in the Riskbefore condition was larger than that in the No Risk condition in bin 1 [t(11) = 2.63, p < 0.05, d = 0.80]. Therefore, we confirmed larger compensation following miss trials as a robust result regardless of the location of the penalty region.

We also confirmed that larger compensation occurred within the same Risk conditions. A paired t-test revealed that the absolute value of the average compensation size in bin 1 was significantly larger than that in bin 4 within the Riskbefore condition [t(11) = 3.45, p < 0.01, d = 0.91; individual data are shown in **Figure 6D**]. Within the Riskafter condition, the absolute compensation size in bin 4 was marginally significantly larger than that in bin 1 [t(15) = −1.78, p = 0.096, d = −0.53; individual data are shown in **Figure 6B**]. The magnitude of response errors was the same in bins 1 and 4, but the sign of the errors differed. No significant difference between bin 1 and bin 4 was found in the No Risk condition in either Experiment 1 [t(15) = 0.75, p = 0.47, d = 0.18] or Experiment 2 [t(11) = −0.41, p = 0.69, d = −0.15].

# Discussion

# Summary of Results

We directly evaluated the relationship between the optimality of action plans and the configuration of the gain function. With the coincident timing task, we could exclude the energetic cost factor, which might disturb an optimal action plan. Compared with Bayesian optimal timing strategy, our participants planned suboptimal strategies under asymmetric configurations. They tended to respond closer than optimal to times presenting the risk of zero gain. Under symmetric configurations, good agreement between the observed and optimal strategies was found. Furthermore, larger compensation occurred following miss trials compared with success trials even though the experienced response errors were of the same magnitude.

# Suboptimal Decision Making

We investigated whether humans can calculate an optimal timing strategy that maximizes the expected gain under four configurations of the gain function. In the Step condition, a constant value of gain was on the edge of risk of zero gain. The gain function has a symmetric configuration in this condition. Most of the relevant previous studies have used this type of gain function and have reported optimal movement planning (Trommershäuser et al., 2003a,b, 2005; Hudson et al., 2012). We likewise showed that strategies were optimal in the Step condition. In the Riskafter condition, higher values of gain come with higher risk. We found a discrepancy between the ideal Bayesian model and actual strategy in the Riskafter condition, participants showing a risk-seeking strategy. Our finding is consistent with a previous study that reports risk-seeking strategy under a similar gain function during reaching and whole-body movement tasks (O'Brien and Ahmed, 2013). In addition to the Riskafter condition, we applied the Riskbefore condition in which the configuration of the Riskafter condition was inverted with respect to time. We confirmed a risk-seeking strategy also in the Riskbefore condition. Therefore, these results suggest that human action plans tend to be suboptimal under situations in which higher values of gain occur closer to zero gain regardless of the location of risk. On the other hand, action plans could be optimal under situations in which a constant value of gain was close to zero gain.

A symmetric gain configuration was applied in the No Risk and Step conditions, while an asymmetric configuration was applied in the Riskafter and Riskbefore conditions. Wu et al. (2006) investigated the endpoint of pointing movements under both symmetric and asymmetric expected gain landscapes. Theoretically, the optimal endpoint in that study lay within the target circle under a symmetric expected gain landscape but lay within the penalty circle and did not cover the target circle under an asymmetric expected gain landscape. These researchers showed that an intuitive strategy to aim within the target circle could be adopted, but a counterintuitive strategy to aim within the penalty circle could not be adopted. In our experiment as well, the participants might not easily detect when they should press a button because the optimal response time depends on response variance under an asymmetric configuration of the gain function. Therefore, our findings indicate a limitation on information processing and computational ability in decision making under uncertainty in motor output as well as in economic decision making (Simon, 1956).

# Distortion of Subjective Value

In the field of behavioral economics, prospect theory (Kahneman and Tversky, 1979) and cumulative prospect theory (Tversky and Kahneman, 1992) claim that irrational decision making is caused by a distortion of probability weighting from the actual probability and a distortion of subjective utility from the actual gain/loss. Prospect theory gives two reasons for risk-seeking behavior.

One reason would be an inappropriate estimation of the participant's own variance in response time (Wu et al., 2009; Nagengast et al., 2011; O'Brien and Ahmed, 2013). Wu et al. (2009) have shown that participants under-weighted small probabilities and over-weighted large probabilities when they made a decision whether to point to a riskier target bar or a safer target bar. O'Brien and Ahmed (2013) have also shown a similar distortion of probability weighting during a reaching task. These reports indicate that our participants might have believed that they had smaller response variability than they actually did. Such an inappropriate estimation of their own variance would have influenced them to approach the penalty zone more closely.

Before performing the Riskafter/Riskbefore condition, participants had only experienced 100 trials in the No Risk condition. Thus, they may not have had enough experience with the task to know their own response variance, but the report of Zhang et al. (2013) calls into question the idea that more practice would have helped. These researchers have shown that the distribution of a reaching endpoint was recognized as an isotropic distribution rather than the actual anisotropic distribution, and that this inaccurate estimation persisted even after extensive practice. This report indicates that an inappropriate estimation of one's own variance is not necessarily caused by lack of practice. Thus, the ability to recognize one's own variance in motor output appropriately may have limitations.

The second reason would be inappropriate evaluation of gain/loss (Lee, 2005; O'Brien and Ahmed, 2013). Risk-seeking in decision making arises when the subjective utility of gain is overvalued against the objective value (Lee, 2005). Here, higher values of gain came with a higher risk of zero gain in the Riskafter/Riskbefore conditions. O'Brien and Ahmed (2013) showed that most participants overvalued point reward and undervalued point penalty under this type of gain function. This inappropriate evaluation of gain/loss would also influence our participants to respond closer to the point where gains of zero began. However, we could not distinguish which distortion most affected risk-seeking behavior using the above analyses. Thus, a remaining issue is to separate the two distortions using other experimental paradigms.

# Trial-by-trial Analysis

We also investigated differences in trial-by-trial compensation strategy between/within the No Risk and the Riskafter/Riskbefore conditions. We found larger compensations following miss trials compared with success trials between the No Risk and the Riskafter/Riskbefore conditions with the same magnitude of response errors (see Supplementary Figure 2). The sign of response errors was the same in the comparison between conditions. In the comparison within the Riskafter/Riskbefore conditions, we also found that larger compensations occurred following miss trials compared with success trials, with response errors of different sign. We assume that this is because of motivation to avoid consecutive misses.

Error feedback is necessary for motor adaptation (motor learning). Previous studies have investigated how the magnitude of error influences subsequent adaptation. These studies have reported that the size of the adaptation has a linear relationship with the magnitude of past errors (Thoroughman and Shadmehr, 2000; Scheidt et al., 2001). Linear adaptation against error is an element in minimizing future errors. However, recent studies have shown that motor adaptation does not depend simply on the magnitude of error. A nonlinear relationship has been observed when the subjective value, directional bias, statistical properties, and relevance of errors are experimentally manipulated (Fine and Thoroughman, 2007; Wei and Körding, 2009; Trent and Ahmed, 2013).

In our task, the subjective value of error (Trent and Ahmed, 2013) was different between conditions. Errors over/within the target time were cautioned in the Riskafter/Riskbefore condition, while errors of the same magnitude were not cautioned in the No Risk condition. Trent and Ahmed (2013) have shown weaker adaptation and weaker error sensitivity in response to errors further from the penalty region. This suggests that nonlinear adaptation is an effective motor control strategy for avoiding penalty. In our study, larger compensation was observed in response to errors that were recognized as misses. This suggests that the larger compensation strategy is an effective control heuristic for avoiding consecutive misses. This tendency was robust regardless of the location of the penalty region. Thus, our results support the view that compensation on the following trial is influenced not only by the magnitude of the error but also by the subjective value of the error.

# Decision Making on the Sports Field

Risk-seeking behaviors are sometimes seen in real life on the sports field. For example, professional NBA basketball players attempt consecutive three-point shots after they succeed in making a three-point shot even though the probability of further points is decreased (Neiman and Loewenstein, 2011), possibly due to enhanced self-confidence. NBA players are also unwilling to shoot during an early stage of the shot clock even though higher points per possession can be obtained by shooting more frequently (Skinner, 2012). This is possibly because of overconfidence about shot opportunities during later stages (Skinner, 2012). Therefore, suboptimal decision making would have the effect of degrading the performance of beginners as well as of experts in a variety of sports.

Both symmetric and asymmetric gain functions are seen on the sports field. Examples of the former occur in archery, Japanese archery, and shooting. In these sports, a gain is distributed symmetrically around the center of the target. Accuracy in hitting the center of the target is a crucial factor in performance. Examples of the latter gain functions occur in tennis, volleyball, golf, and ski jumping. In tennis, a ball bouncing as close as possible to the line marking the edge of the court would result in scoring a point, while a ball bouncing beyond the line would cost the player a point. In such a situation, appropriate decision making about where to aim and the accuracy of the aim are both critical factors. We have shown that humans cannot always make such appropriate decisions that consider variance in motor output. Therefore, the implication of our results for coaches and trainers, especially in sports with asymmetric gain functions, is that it is important to fashion a risk-handling strategy optimized for each player's skill level.

# Limitations

In this study, we compared, for each participant, the observed mean response time calculated over 100 trials with the optimal mean response time in the Riskafter/Riskbefore condition. Taking into account the fact that the optimal mean response time moves further from the target time as response variance increases, the observed mean response times were closer than optimal to the target time in both conditions (**Figures 5A,B**). However, a possibility remains that the observed response time would have approached optimal if the participants had been able to decrease their response variance through more practice. Therefore, a remaining issue is to investigate this possibility with a longitudinal study.

As another limitation, we instructed the participants to maximize the total gain but we did not use an experimental procedure giving them real monetary rewards in accordance with their performance. This raises the possibility that real monetary rewards would have induced risk-neutral behavior. However, it has been shown that real and virtual rewards induce similar performance in economic decision making (Bowman and Turnbull, 2003), similar autonomic response (skin conductance response) patterns resulting from monetary wins or losses (Carter and Smith Pasqualini, 2004), and similar brain activation patterns (Miyapuram et al., 2012). Therefore, we consider that the use of real monetary rewards would have a small effect on our participants' risk-seeking behavior. However, it would be interesting to investigate motor decision making in situations in which a one-trial decision wins a high-priced award, such as a game-winning shot or a tour-winning putt.

# Author Contributions

KO, MS, and KK conceived and designed the experiments. KO performed the experiments. KO and MS programmed the simulations and KO analyzed the results. KO, MS, and KK interpreted the results. KO wrote the manuscript; MS and KK commented on and revised the manuscript. KO, MS, and KK approved the final version of the manuscript.

# Acknowledgments

We thank Kimitaka Nakazawa, Shun Sasagawa, and Hiroki Obata for valuable comments on the manuscript. This research was supported by Grants-in-Aid for Scientific Research (KAKENHI) No. 24240085 and 26560344 awarded to KK.

# Supplementary Material

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fncom.2015.00088


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Ota, Shinya and Kudo. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The lateral reticular nucleus; integration of descending and ascending systems regulating voluntary forelimb movements

#### Bror Alstermark <sup>1</sup> \* and Carl-Fredrik Ekerot <sup>2</sup>

<sup>1</sup> Department of Integrative Medical Biology, Section of Physiology, Umeå University, Umeå, Sweden, <sup>2</sup> Department of Experimental Medical Science, University of Lund, Lund, Sweden

Cerebellar control of movements is dependent on mossy fiber input conveying information about sensory and premotor activity in the spinal cord. While much is known about spino-cerebellar systems, which provide the cerebellum with detailed sensory information, much less is known about systems conveying motor information. Individual motoneurones do not have projections to spino-cerebellar neurons. Instead, the fastest route is from last order spinal interneurons. In order to identify the networks that convey ascending premotor information from last order interneurons, we have focused on the lateral reticular nucleus (LRN), which provides the major mossy fiber input to cerebellum from spinal interneuronal systems. Three spinal ascending systems to the LRN have been investigated: the C3-C4 propriospinal neurones (PNs), the ipsilateral forelimb tract (iFT) and the bilateral ventral flexor reflex tract (bVFRT). Voluntary forelimb movements involve reaching and grasping together with necessary postural adjustments, and each of these three interneuronal systems likely contributes to specific aspects of forelimb motor control. It has been demonstrated that the command for reaching can be mediated via C3-C4 PNs, while the command for grasping is conveyed via segmental interneurons in the forelimb segments. Our results reveal convergence of ascending projections from all three interneuronal systems in the LRN, producing distinct combinations of excitation and inhibition. We have also identified a separate descending control of LRN neurons exerted via a subgroup of cortico-reticular neurones. The LRN projections to the deep cerebellar nuclei exert a direct excitatory effect on descending motor pathways via the reticulospinal, vestibulospinal, and other supraspinal tracts, and might play a key role in cerebellar motor control.
Our results support the hypothesis that the LRN provides the cerebellum with highly integrated information, enabling cerebellar control of complex forelimb movements.

Keywords: interneurons, propriospinal neurons, motoneurons, lateral reticular nucleus, cerebellum, motor control, efferent copy, internal feedback

#### Edited by:

Ning Lan, Shanghai Jiao Tong University, China

#### Reviewed by:

Tadashi Isa, National Institute for Physiological Sciences, Japan

Ingela Hammar, Göteborg University, Sweden

#### \*Correspondence:

Bror Alstermark, Department of Integrative Medical Biology, Section of Physiology, Umeå University, Linneus väg 9, 901 87 Umeå, Sweden bror.alstermark@umu.se

> Received: 13 March 2015 Accepted: 17 July 2015 Published: 05 August 2015

#### Citation:

Alstermark B and Ekerot C-F (2015) The lateral reticular nucleus; integration of descending and ascending systems regulating voluntary forelimb movements. Front. Comput. Neurosci. 9:102. doi: 10.3389/fncom.2015.00102

# Introduction

Cerebellar control of movements requires continuous information about descending motor commands and sensory information evoked by external events. Such information is provided by mossy fiber and climbing pathways (Ito, 1984). As to mossy fiber systems, much attention has been given to direct spino-cerebellar pathways that may provide information about internal copies of motor commands (Lundberg, 1971; Arshavsky et al., 1972, 1978; Alstermark and Isa, 2012; Fedirchuk et al., 2013; Azim and Alstermark, 2015), from receptors about external events (Stecina et al., 2013) and inhibition of self-evoked reafferent signals by movements (Hantman and Jessell, 2010). However, less focus has been given to indirect spino-cerebellar systems. In a companion Perspective article of this Frontiers Research Topic, we compare direct and indirect spino-cerebellar systems (Jiang et al., 2015). Of particular interest are systems conveying spinal information via the lateral reticular nucleus (LRN). The LRN in the caudal brain stem is a major mossy fiber input from the spinal cord, projecting to cerebellar cortex and sending collaterals to the deep cerebellar nuclei (Matsushita and Ikeda, 1976; Dietrichs and Walberg, 1979; Ito et al., 1982). Importantly, the extensive collateral projections to the deep cerebellar nuclei from the LRN exert a direct excitatory effect on post-cerebellar descending motor pathways, and thus can quickly modulate motor control.

Three major ascending systems from the spinal cord to the LRN have been described in the cat: the bilateral ventral flexor reflex tract (bVFRT; Clendenin et al., 1974b; Ekerot, 1990b), the ipsilateral forelimb tract (iFT; Clendenin et al., 1974c; Ekerot, 1990a) and the C3-C4 propriospinal neurones (C3-C4 PNs; Illert and Lundberg, 1978; Alstermark et al., 1981a). However, while LRN input from the iFT and bVFRT has been previously investigated (Clendenin et al., 1974a,b,c; Ekerot, 1990a,b), the role of C3-C4 PN input, and how signals from all three ascending systems are integrated in the LRN, remain unknown.

The C3-C4 propriospinal system is of special interest because these neurones project not only to the LRN, but also to forelimb motoneurones (MNs), conveying both ascending and last-order premotor information (Illert and Lundberg, 1978; Alstermark et al., 1981a; Isa et al., 2006). The function of the C3-C4 PNs has been investigated in behavioral experiments, revealing a role in mediating the voluntary command for visually guided forelimb reaching in the cat (Alstermark et al., 1981b; Alstermark and Kümmel, 1990) and additionally for precision grip in the macaque monkey (Sasaki et al., 2004; Alstermark et al., 2011; Kinoshita et al., 2012). The C3-C4 PNs are characterized by monosynaptic excitation from cortico-, rubro-, tecto- and reticulospinal fibers as well as from cutaneous and muscle afferents in the forelimb nerves (Illert et al., 1978). In addition, all of these converging descending and sensory inputs can mediate disynaptic inhibition of C3-C4 PNs via feed-forward and feed-back inhibitory interneurones (Alstermark et al., 1984c). These excitatory and inhibitory inputs are integrated by C3-C4 PNs, which then excite or inhibit forelimb MNs (Illert et al., 1977; Alstermark et al., 1984a,b; Alstermark and Sasaki, 1985).

In addition to targeting MNs, C3-C4 PN output bifurcates and ascends to the LRN, potentially providing the cerebellum with efference copy, now often referred to as internal feedback. Such information about the ongoing reaching movement may allow the cerebellum to quickly modify the descending motor command via the rubro- and reticulospinal systems (Alstermark et al., 1981a). This idea was recently supported by a combined electrophysiological, optogenetic and behavioral study in the

This study aims to investigate convergence of the iFT, bVFRT and C3-C4 propriospinal systems onto individual LRN neurones in the cat in order to clarify how ascending information from functionally different circuits is processed in the LRN. We find a minority of LRN neurones with input from only individual ascending systems, and a majority of neurones with convergent input from two or all three systems, suggesting that subpopulations of LRN neurones integrate ascending premotor information from the forelimb to enable cerebellar modulation of ongoing movement. A preliminary report has been presented (Alstermark and Ekerot, 2013).

# Materials and Methods

All procedures were approved by the local ethical committees (at the University of Göteborg and University of Lund) and were in accordance with Swedish regulations on animal experimentation.

# Preparation

The results were obtained from nine adult cats (body weight 2.5–3.1 kg). Initial anesthesia was ketamine/ether followed by α-chloralose (80 mg/kg). Additional doses of α-chloralose were administered during the course of the experiment. The criteria for adequate depth of anesthesia were the persistence of miotic pupils, stable blood pressure and respiratory rate and absence of the withdrawal reflex to noxious stimulation. The paralytic gallamine triethiodide (Flaxedil) was administered, and pneumothorax and artificial ventilation were performed to minimize movement artifacts. After paralysis, the criteria for adequate anesthesia were miotic pupils and stable blood pressure, even with painful stimuli. Rectal temperature was maintained at 36–38°C, and arterial blood pressure and expiratory CO<sub>2</sub> (4.0%) were monitored continuously. Blood pressure was maintained above 80 mmHg in all experiments. Laminectomy was performed to expose the spinal segments C1-C8 and Th11–13. The superficial (SR) and deep radial (DR) nerves on both sides were dissected and mounted on cuff electrodes. Craniotomy was performed to expose the caudal brain stem, the cerebellum and the cortex overlying nucleus ruber (NR). The experiments were terminated by a lethal dose of pentobarbital sodium.

# Stimulation and Recording

The experimental setup is shown in **Figure 1**. Corticofugal fibers were stimulated in the contralateral pyramid (Pyr) 3–4 mm rostral to obex at the caudal end of the 4th ventricle, rubrospinal fibers in the contralateral NR (NR; Horsley-Clarke coordinates A3, L1.5, H-2.5), and fibers in the lateral vestibulospinal tract (LVST) were stimulated in the contralateral ventral quadrant (coVQ) either in C4 or C6 using monopolar tungsten electrodes, and in Th13 using bipolar silver electrodes. To restrict the activation of the bVFRT to the cervical cord, the dorsal columns

FIGURE 1 | Schematic outline of the experimental set-up. Intracellular recordings were made from neurones in the lateral reticular nucleus (LRN) identified antidromically by electrical stimulation in the cerebellar (Cer) white matter. Possible convergence from bifurcating C3-C4 propriospinal neurones (C3-C4 PNs), which project to LRN neurones and motoneurones (MNs) in the forelimb segments C6-Th1; ipsilateral forelimb tract (iFT); and the bilateral flexor reflex tract (bVFRT) was tested in LRN neurones using electrical stimulation. C3-C4 PNs were activated from fibers in the contralateral pyramid (Pyr) and nucleus ruber (NR) after transection of these fibers in the dorsolateral funiculus (DLF) in C5. A control lesion was made in C2 to eliminate cortico- and rubrospinal input to C3-C4 PNs. C3-C4 PNs could also be activated by stimulation of cortico- and rubrospinal fibers in the DLF in the C3-C4 segments following DLF transections in both C5 and C2. iFT neurones were activated by stimulation of primary afferents in the ipsilateral superficial radial (SR) and deep radial (DR) nerves after transection of the dorsal column (DC) in C5 to eliminate afferent input to the more rostrally located C3-C4 PNs. bVFRT neurones were activated from the contralateral SR and DR nerves after a DC transection in C5 and from the lateral vestibulospinal tract (LVST) via stimulation in the contralateral ventral quadrant (cVQ) in C4 or C6. In order to prevent activation of bVFRT neurones in the lumbar segments by LVST stimulation, the contralateral spinal cord was transected at Th13. The lumbar bVFRT neurons were activated by stimulation of the LVST caudal to the lesion in Th13.

(DC) were removed and a hemisection contralateral to the recording side was performed in Th11. Intracellular recording was made with glass microelectrodes filled with 2 M potassium acetate (tip diameter 1.0–2.0 µm, impedance 2–5 MΩ). LRN neurones were identified via antidromic activation from the ipsilateral cerebellar white matter dorsal to the interpositus nucleus at a depth of 6 mm below the cerebellar surface (insertion point of the electrode: 1–2 mm caudal to the primary fissure at a laterality of 4 mm). The arrival of the incoming volley to the LRN neurons was recorded with a silver ball electrode at the surface near the patch used for the intracellular recordings. The stimulating and recording sites were verified histologically. The position of NR was also verified by recording the antidromic response following stimulation of the rubrospinal axons in Th13.

# Delineation of Spinal Systems Ascending to the LRN

In order to independently characterize the influence of each ascending system on the LRN, we took advantage of their differences in neuronal input and spinal cord location. The C3-C4 PN system is activated by stimulation of cortico- and rubrospinal fibers. It has been demonstrated that the vast majority (84%) of the C3-C4 PNs with projection to motoneurons have ascending projection to the LRN (Alstermark et al., 1981a). We identified C3-C4 PN effects in the LRN via transection of cortico- and rubrospinal fibers in the dorsolateral funiculus (DLF) in C5 (sparing the input to the C3-C4 PNs while interrupting input to the more caudal iFT system). In three experiments the DLF was transected in C2 to interrupt the input both to C3-C4 PNs and to more caudal systems. In addition, when both C5 and C2 DLF lesions were performed, the C3-C4 PNs could be synaptically activated in isolation via stimulation of cortico- and rubrospinal fibers in the DLF of the C3 segment. The isolated strip of DLF was stimulated using a monopolar tungsten electrode. Threshold for evoking the DLF volley was 10 µA. The threshold for current spread to the nearby dorsal column was checked by recording from the SR and DR nerves and was usually approximately 500 µA.

The iFT-system is characterized by strong activation from forelimb nerves (Clendenin et al., 1974b; Ekerot, 1990b). In order to restrict activation to iFT neurones in the forelimb segments (C6-Th1), the dorsal column (DC) was transected in C5.

The bVFRT neurones are monosynaptically activated from the LVST and have large, often bilateral, receptive fields (Clendenin et al., 1974c; Ekerot, 1990a). Effects in the LRN mediated via the subcomponents of the bVFRT-system were restricted by transection of the contralateral LVST in Th13 and the primary afferents in the dorsal column in C5. Cervical bVFRT neurons were activated by stimulation of the LVST in the cVQ in C4 or C6. Lumbar bVFRT neurons were activated by stimulation of the contralateral LVST in L2 caudal to transection in Th13.

# Results

Intracellular recordings were made from 113 LRN neurons identified via antidromic activation from the ipsilateral cerebellar white matter as described in the methods.

# Effects in LRN Neurons Evoked from Pyramidal and Rubrospinal Tracts

In order to examine monosynaptic excitatory effects of descending cortico- and rubro-spinal tracts on neurons in the LRN, and disynaptic excitation and inhibition mediated via C3-C4 PNs, recordings were made following DLF transections of descending tracts in C2 or C5. The rostral C2 DLF transection eliminates the inputs from cortico- and rubrospinal fibers to the C3-C4 PNs, whereas the C5 DLF transection spares this input, but eliminates it to more caudal spinal levels. Note that after the C2 DLF transection the inputs from both the pyramid and NR to the LRN are intact.

**Figure 2** shows recordings from two LRN neurones after a C2 DLF transection. The contralateral pyramid was stimulated by a train of three volleys at different strengths (**Figures 2A–C**). Small monosynaptic excitatory postsynaptic potentials (EPSPs) could be evoked with a threshold below 40 µA (**Figure 2A**). The amplitude increased when the stimulus intensity was raised to 80 and 200 µA (**Figures 2B,C**). The longer EPSP duration with 200 µA is likely due to activation of slower-conducting corticoreticular fibers. Monosynaptic pyramidal EPSPs were observed in 25% of the neurones. Monosynaptic rubral EPSPs after C2 DLF transection were observed in 10% of the neurones (not illustrated). Disynaptic cortico- or rubral EPSPs were absent in all of the 40 cells tested. The segmental latencies are shown in **Figures 2D,E**. In contrast, disynaptic pyramidal inhibitory postsynaptic potentials (IPSPs) could be elicited after C2 DLF transection, but at lower frequency (7/19 neurones) compared to before the lesion (33/66 neurones). One example of pyramidal disynaptic IPSPs is shown in **Figure 2F**. Rubral IPSPs were lacking following C2 DLF transection (0/21 neurones) and one example is shown in **Figure 2G**, which is taken from the same LRN cell as in **Figure 2F**. These data reveal monosynaptic excitation of LRN neurons by cortico- and rubro-reticular fibers, and disynaptic inhibition from the pyramid, but not from NR.

After C5 DLF transection of cortico- and rubrospinal fibers, leaving descending connections with the LRN neurons and C3-C4 PNs, but not with interneurons in the forelimb segments, monosynaptic EPSPs and disynaptic and late EPSPs and IPSPs could be evoked in the LRN from the contralateral pyramid and NR. The expected minimal disynaptic linkage of pyramidal and rubral EPSPs and IPSPs to LRN via C3-C4 PNs is 2.1 ms, and the maximal linkage is 2.9 ms (based on a conduction velocity of 60 m/s for the corticospinal fibers and 26 m/s for the ascending branch of the C3-C4 PNs; Alstermark et al., 1981a). Disynaptic pyramidal EPSPs were found in 28% (19/67 neurones) and IPSPs in 55% (36/66 neurones). Disynaptic rubral EPSPs were observed in 16% (11/69 neurones) and IPSPs in 46% (32/69 neurones). The higher frequencies of the IPSPs most likely reflect the fact that the membrane potential decreased after electrode impalement of the LRN cells, making it easier to record IPSPs than EPSPs. **Figure 3** shows spatial facilitation of disynaptic pyramidal and rubral EPSPs and IPSPs in LRN neurones following stimulation of cortico- and rubrospinal fibers with a short train of 2–3 volleys after C5 DLF transection (**Figures 3A,B**), as well as the distribution of latencies (measured from the effective stimulation pulse) in **Figures 3C,D**.
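The frequencies quoted above can be checked directly from the cell counts. The following is a minimal sketch of that arithmetic; the dictionary keys and the helper name `percent` are illustrative labels, not terms from the study, and the counts are copied from the paragraph above.

```python
# Counts as reported in the text: (cells showing the effect, cells tested).
counts = {
    "pyramidal disynaptic EPSP": (19, 67),
    "pyramidal disynaptic IPSP": (36, 66),
    "rubral disynaptic EPSP": (11, 69),
    "rubral disynaptic IPSP": (32, 69),
}


def percent(hits, tested):
    """Return the frequency as a percentage rounded to the nearest integer."""
    return round(100 * hits / tested)


# Recompute each reported percentage from its raw counts.
frequencies = {name: percent(*pair) for name, pair in counts.items()}
for name, value in frequencies.items():
    print(f"{name}: {value}%")
```

Recomputing the percentages from the raw counts in this way is a quick consistency check when the same proportions are later compared across lesion conditions.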

These findings strongly suggest that disynaptic pyramidal and rubral excitation and inhibition in LRN neurones can be mediated by excitatory and inhibitory C3-C4 PNs, respectively. The disynaptic pyramidal IPSPs that remain after the C2 DLF transection are presumably mediated via reticulospinal neurones. It is worth noting that all rubral IPSPs were mediated via spinal neurones located caudal to C2 (cf. **Figures 2** and **3**).
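The expected disynaptic window quoted earlier rests on simple conduction-time arithmetic: each leg of the pathway contributes its length divided by its conduction velocity, and each synapse adds a delay. The sketch below illustrates that arithmetic under stated assumptions. The conduction velocities (60 m/s and 26 m/s) and the 2.1–2.9 ms window are taken from the text; the path lengths and the synaptic delay are hypothetical placeholders, since the text does not report them, and the function names are illustrative only.

```python
def disynaptic_latency_ms(descending_mm, ascending_mm,
                          v_descending=60.0, v_ascending=26.0,
                          synaptic_delay_ms=0.3):
    """Estimate the latency (ms) of a two-synapse pathway.

    Distances are in mm and velocities in m/s, so distance / velocity
    is already in ms. The path lengths and the synaptic delay are
    hypothetical inputs, not values reported in the study.
    """
    conduction = descending_mm / v_descending + ascending_mm / v_ascending
    return conduction + 2 * synaptic_delay_ms


def within_window(latency_ms, lower_ms=2.1, upper_ms=2.9):
    """True if a latency falls inside the expected disynaptic window."""
    return lower_ms <= latency_ms <= upper_ms


# Hypothetical path lengths, for illustration only.
example = disynaptic_latency_ms(descending_mm=30.0, ascending_mm=20.0)
print(f"example latency: {example:.2f} ms, in window: {within_window(example)}")
```

The slower ascending branch dominates the conduction time per unit length, which is why the choice of ascending path length has the larger effect on the estimate.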

In order to further delineate the LRN effects of C3-C4 PN activation, the DLF was transected in C2 and C5. The isolated C2 to C5 strip of DLF was then stimulated electrically using a monopolar tungsten electrode. **Figure 4** shows that stimulation of cortico- and rubro-spinal fibers in the C3 DLF evoked disynaptic EPSPs and IPSPs in LRN neurones with latencies within the expected disynaptic range of 1.6–2.4 ms. These data further indicate that the disynaptic LRN effects of cortico- and rubro-spinal excitation are mediated by PNs located in the C3 to C4 segments.

After combined C2 and C5 DLF lesions, it was common to find a mixture of excitation and inhibition in many of the LRN neurones following cortico- and rubro-spinal fiber stimulation. Disynaptic EPSPs evoked by a train of stimuli in the C3 DLF are shown in **Figure 4A**. In another cell illustrated in **Figure 4B**, a mixture of EPSPs and IPSPs was evoked at the same latency when the membrane potential remained virtually unchanged, as shown in **Figures 4C,D**. It can be more easily observed in the expanded sweeps (lower records). These results strongly suggest that a subpopulation of LRN neurones receives mixed input from excitatory and inhibitory C3-C4 PNs.

## Effects in LRN Neurons Evoked from the iFT

In order to assess the influence of iFT neurones onto the LRN, we first confirmed earlier findings by Clendenin et al. (1974a,c) and Ekerot (1990a) that short latency excitation and inhibition in LRN neurones could be evoked by stimulation of ipsilateral forelimb afferents (see **Figure 1**). The effects of iFT activation are shown from two different LRN neurones in **Figures 5A,D** respectively for the cutaneous (iSR) and muscle forelimb (iDR) nerves. Latencies of EPSPs (**Figures 5B,C**) and IPSPs (**Figures 5E,F**) below 4 ms are compatible with a disynaptic transmission.

# Effect in LRN Neurons Evoked from the Cervical bVFRT

We also tested the effects of cervical bVFRT neurones on the LRN by stimulation of the lateral vestibulospinal tract in the contralateral ventral quadrant (coVQ) in C4 or C6. We confirmed earlier findings by Clendenin et al. (1974b) and by Ekerot (1990b) that disynaptic excitation and inhibition in the LRN could be evoked by cervical bVFRT stimulation, as shown in two different LRN neurones in **Figures 6A,D**. **Figure 6A** illustrates disynaptic EPSPs evoked by the second and third volley (shown at higher sweep speed in the right panel). **Figure 6D** illustrates disynaptic IPSPs evoked by the first volley alone. In the right panel, taken at higher sweep speed, a second IPSP can be observed at trisynaptic latency (arrowhead). Note the prolongation of the minimal latencies when stimulating in C6 compared to C4, as shown in **Figures 6B,C,E,F**, respectively. This is expected because the bVFRT neurons are located downstream of the stimulation site.

Recordings were also made from LRN neurons when stimulating the LVST below the transection in Th13 in order to activate bVFRT neurons located in the lumbar segments (not illustrated). We confirmed the previous findings by Clendenin et al. (1974b) and Ekerot (1990b) that LRN neurons receive input from both excitatory and inhibitory lumbar bVFRT neurons, as has also recently been demonstrated anatomically in the rat by Huma and Maxwell (2015). We provide data from LRN neurons with input from lumbar bVFRT below in **Tables 1** and **2**.

# Location of LRN Neurones with Input from the Pyramid and NR

The location of LRN neurones receiving monosynaptic pyramidal and rubral excitation and disynaptic excitation and inhibition mediated by C3-C4 PNs from the pyramid and NR is shown in **Figures 7** and **8**, respectively. The LRN was divided into 1 mm thick segments, the lower being the most caudal section. Recordings were made mainly from the caudal and middle part of the nucleus, which receives most of the input from the spinal cord. **Figures 7** and **8** show: left column, the location of cells with monosynaptic excitation; middle column, the location of cells with disynaptic excitation; and right column, the location of cells with disynaptic inhibition (filled circles). Empty circles indicate recorded cells lacking synaptic input from these systems.

are shown at a slow (upper panels) and fast (lower panels) sweep speed. Note that the lower panels show only part of the upper records, indicated by the horizontal black line. (C,D) Distribution of EPSP and IPSP latencies, respectively, measured from the incoming volley evoked by stimulation in the isolated C3 DLF segment.

The LRN cells with monosynaptic pyramidal and rubral excitation were located in the dorso-medial part of the nucleus. In contrast, LRN cells with disynaptic EPSPs, mediated via the C3-C4 PNs, were located in the dorso-medial, central and ventro-medial parts of the nucleus. Together these data indicate that the direct pyramidal and rubral modulation of LRN neurones is restricted to the dorso-medial part of the LRN, which was shown to project mainly to the ipsilateral pars intermedia and paramedian lobule V in the cerebellum with input from the iFT (Clendenin et al., 1974c; Ekerot, 1990a). Furthermore, the disynaptic effects mediated via the C3-C4 PNs reached not only this region of the LRN, but also more ventral parts of the nucleus, which were shown to project bilaterally to lobules IV and V of the anterior lobe and to the vermis of lobule VIII with input from the bVFRT (Clendenin et al., 1974b). These findings corroborate earlier observations on the termination of C3-C4 PNs in the LRN (Alstermark et al., 1981a).

# Convergence and Synaptic Integration in Subtypes of LRN Neurones

The large number of inputs to the LRN (monosynaptic input from the pyramid and NR, C3-C4 PNs, iFT, and cervical and lumbar bVFRT) suggests an elaborate process of descending and ascending synaptic integration in LRN neurons before producing mossy fiber output to the cerebellum. This integration has an additional layer of complexity since all spinal afferent systems consist of both excitatory and inhibitory neurones, while the monosynaptic input from the pyramid and NR is only excitatory. To investigate the process of integration, we identified LRN neuron subtypes receiving monosynaptic EPSPs from the contralateral pyramid (**Table 1**). Only 5 out of 61 tested LRN neurons received monosynaptic rubral EPSPs and therefore it was not meaningful to make a table of these cells. Disynaptic pyramidal and rubral EPSPs and IPSPs mediated via presumed C3-C4 PNs were commonly observed in the different subtypes of LRN neurons (**Table 2**).

# LRN Neurons with Monosynaptic Input from Pyramid

Comparison of convergence in single LRN neurones revealed two major differences. First, monosynaptic pyramidal excitation was common (in a range of 35–60%) among LRN neuron subtypes with input from the iFT, bVFRT (**Table 1**) and C3-C4 PNs (**Table 2**). In contrast, monosynaptic pyramidal excitation was rarely observed in LRN neurons with monosynaptic rubral input or with input from lumbar bVFRT neurons (**Table 1**). Interestingly, in each LRN neuron with monosynaptic pyramidal excitation, there was no overlap of excitation and inhibition for each of the converging systems shown in **Table 1**. Thus, motor cortex can select differentially among LRN neurons with excitatory or inhibitory inputs for a given spino-LRN system.

TABLE 1 | Distribution of monosynaptic excitation from pyramid among subpopulations of LRN neurons.


The frequency is given both in percentage and the number of tested cells within parenthesis.


TABLE 2 | Distribution of disynaptic EPSPs and IPSPs from pyramid and nucleus ruber mediated via C3-C4 PNs among subpopulations of LRN neurons.

The frequency is given both in percentage and the number of tested cells within parenthesis.

# LRN Neurons with Disynaptic Pyramidal and NR Inputs Mediated via C3-C4 PNs

In all of the investigated LRN neurons there was a broad convergence between the various spino-LRN systems as shown in **Table 2**. Of particular interest in this study is the fact that LRN neurons with input from presumed C3-C4 PNs evoked from the pyramid and NR, also exhibited converging excitation and/or inhibition from the pyramid, iFT and bVFRT systems. This holds true also for LRN neurons with input from cervical or lumbar bVFRT systems.

# Discussion

It is clear that the spino-LRN-cerebellar route provides information from many spinal pre-motoneuronal centers (cf. review by Alstermark and Ekerot, 2013) and, as pointed out in a companion Perspective article (Jiang et al., 2015) in this research topic, the iFT, bVFRT and C3-C4 PN systems may reflect a phylogenetic development in need of increased control of dexterous forelimb movements. We first discuss the input from the C3-C4 PN system alone, then the convergence of effects from all three systems and finally functional implications of the convergence patterns.

FIGURE 7 | Location of recorded LRN neurones with contralateral pyramidal input. Left panels show the location of cells with monosynaptic EPSPs, middle panels show cells with disynaptic EPSPs after C5 DLF transection, and right panels show cells with disynaptic IPSPs after C5 DLF transection. Filled circles are cells with an effect and open circles are cells with no effect.

### C3-C4 PN Modulation of the LRN

Our results show that disynaptic excitation and inhibition in LRN neurons from cortico- and rubrospinal fibers could be mediated by convergent input to neurons in the C3-C4 segments, as demonstrated by the persistence of disynaptic effects after C5 lesion, but their elimination after C2 lesion. In addition, stimulation of the isolated DLF in C3 (following DLF transection in both C2 and C5) evoked disynaptic EPSPs and IPSPs in LRN neurones at low threshold, which also indicates transmission via neurons in the C3-C4 segments. It was previously shown that the vast majority (>84%) of the PNs have bifurcating axons projecting to the LRN and MNs (Alstermark et al., 1981a), which is in contrast to medial segmental interneurons (no projection to MNs) recorded in the same segments as the PNs (Alstermark et al., 1984c). These authors found that about 20% of these medial segmental interneurons have ascending collaterals to the LRN, and that these interneurons have weak or no convergent excitatory input from NR and tectum (Alstermark et al., 1984c). Since we demonstrate a strong facilitation of pyramidal EPSPs and IPSPs from NR after C5 DLF transection (**Figures 3A,D**), these effects must have been mediated via excitatory and inhibitory C3-C4 PNs. Furthermore, a majority of these LRN cells receiving either disynaptic excitation or inhibition via the C3-C4 PNs also had monosynaptic excitatory inputs from the cortico-reticular neurons, as illustrated in **Figure 9**.

The projection from motor cortex to the LRN is not via collaterals of corticospinal fibers, but from a separate population of cortico-reticular neurons terminating in the brainstem (Alstermark and Lundberg, 1982; Matsuyama and Drew, 1997). Whether or not the rubro-reticular fibers are collaterals from rubrospinal neurones is not known, but we have tentatively illustrated them as a separate population in **Figure 9**. Such separate control of LRN neurons from the motor cortex and possibly from the brain stem is interesting functionally, because it indicates a need for higher centers to select among the different subpopulations of spino-LRN-cerebellar neurons. However, the monosynaptic rubro-LRN excitation was observed in only 5% in contrast to almost 60% for the pyramidal-LRN excitation in LRN neurons with disynaptic input mediated via the C3-C4 PNs, suggesting a much smaller impact for a rubro-LRN control on this subpopulation of LRN neurons.

# Convergence of Excitation and Inhibition in LRN Neurones

Our results corroborate earlier findings by Ekerot and colleagues that there is a parallel input from excitatory and inhibitory iFT and bVFRT neurons to subpopulations of LRN neurons, and in addition show a similar organization for the C3-C4 PN system as illustrated in **Figure 10A**.

subpopulations of LRN neurons signal information from excitatory and inhibitory iFT, C3-C4 PN and bVFRT systems to the cerebellum. (B), Subpopulations of

The first extensive investigation of the synaptic organization of a spino-cerebellar system was performed by Lundberg and Weight (1971) and by Lundberg (1971) on the ventral spinocerebellar tract (VSCT; see review Baldissera et al., 1981). Their results revealed a complex integration of excitation and inhibition from low threshold muscle afferents, high threshold flexion reflex afferents and descending systems on to VSCT neurones. Based on these results Lundberg (1971) proposed the hypothesis ''that some VSCT neurones monitor transmission in inhibitory pathways to motoneurones by measuring the output from last order inhibitory interneurons against the excitatory input to them''.

In higher mammals, such as the cat and monkey, it is not known if the iFT and cervical bVFRT systems also have direct projections to motoneurons as the C3-C4 PNs do, but recent findings in the mouse show that this is the case (Pivetta et al., 2014). Interestingly, whereas the VSCT neurons mainly receive input from inhibitory interneurons (Jankowska et al., 2010), the LRN neurons receive input from both excitatory and inhibitory interneurons. Another difference is that a majority of LRN neurons receive convergence, excitatory and inhibitory, from at least two of the ascending systems investigated in this study as shown in **Figure 10B**. A smaller fraction receives convergence from all three systems (iFT, bVFRT and C3-C4 PN). Apparently, there is a possibility for the cerebellum to compare the excitability level of each system alone as well as in combination. Thus, although these indirect spino-cerebellar pathways share several common properties with the direct spino-cerebellar tracts, the difference may be related to the increased demands required to control dexterous forelimb movements (Jiang et al., 2015).

# Function of Indirect Cervical Spino-LRN-Cerebellar Pathways

The goal of reaching is to approach and then grasp an object, and the timing between these two motor components is critical for a successful movement. The role of the C3-C4 PNs is to mediate the command for reaching as has been demonstrated in the cat, monkey and mouse (Alstermark and Lundberg, 1992; Alstermark and Isa, 2012; Azim et al., 2014). Importantly, in the mouse it was shown that the ascending branch from PNs to the LRN involves a cerebellar loop that can affect motoneurons and reaching behavior (Azim et al., 2014). Behavioral studies in the cat have shown that grasping is primarily controlled by interneurons within the forelimb segments (C6-Th1; Alstermark et al., 1981b; Alstermark and Kümmel, 1990). It therefore seems likely that the cerebellum would need to receive grasping information from these forelimb segmental interneurons in conjunction with information regarding reaching mediated by C3-C4 PNs. Last-order segmental interneurons within the forelimb segments have been identified that mediate disynaptic corticospinal excitation to forelimb motoneurones (Alstermark and Sasaki, 1985; Sasaki et al., 1996). One possibility is that at least a subset of the segmental interneurons involved in the control of grasping are included in the iFT system which sends information from forelimb segments to the LRN. As shown in **Table 2**, excitatory and inhibitory iFT convergence was commonly found in LRN neurones with input from presumed C3-C4 PNs.

LRN neurons signal convergent information from excitatory and inhibitory iFT, C3-C4 PN and bVFRT systems in different combinations to the cerebellum. Motor cortex (Mcx) has selective access to the various subpopulations of LRN neurons.

Another requirement for successful reaching and grasping is concomitant postural control, especially in the contralateral forelimb that supports much of the body weight (Alstermark and Wessberg, 1985). Previous experiments using activity-dependent transneuronal uptake of WGA-HRP into last-order interneurones to identify spinal circuits involved in reaching and grasping revealed not only C3-C4 PNs and segmental interneurones on the ipsilateral side of injection, but also commissural interneurones in the forelimb segments on the contralateral side (Alstermark and Kümmel, 1990). Given their location, we propose that some of these contralateral neurones belong to the cervical and lumbar bVFRT systems, providing ascending information about the coordination of the limbs. Convergence from these systems was often found in LRN neurons with input from presumed C3-C4 PNs (**Table 2**). Taken together, as shown schematically in **Figure 11**, we propose that the LRN may provide an overview of reaching, grasping and posture to the cerebellum, which could compare this activity and make fast updates and corrections using the internal feedback from the various spino-LRN-cerebellar pathways.

# Author Contributions

BA and C-FE contributed equally to the work.

# Acknowledgments

This work was supported by the Swedish Research Council. We thank Drs. Juan Jiang and Eiman Azim for valuable comments on a previous version of this article. Excellent technical assistance was given by Mrs. R. Larsson, Kersti Larsson and Susanne Rosander-Jönsson.

# References

from higher centres to motoneurones. Exp. Brain Res. 42, 299–318. doi: 10.1007/bf00237496

and monosynaptic excitatory convergence on C3-C4 propriospinal neurones. Exp. Brain Res. 33, 101–130. doi: 10.1007/bf00238798


**Conflict of Interest Statement**: The Guest Associate Editor Dr. Lan declares that, despite having collaborated with the author Dr. Alstermark, the review process was handled objectively and no conflict of interest exists. The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Alstermark and Ekerot. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution and reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Leg mechanics contribute to establishing swing phase trajectories during memory-guided stepping movements in walking cats: a computational analysis

Keir G. Pearson\*, Naik Arbabzada†, Rod Gramlich† and Masahiro Shinya†

*Department of Physiology, University of Alberta, Edmonton, AB, Canada*

### Edited by:

*Ning Lan, Shanghai Jiao Tong University, China*

#### Reviewed by:

*Simon Giszter, Drexel Med School, USA*

*Boris Prilutsky, Georgia Institute of Technology, USA*

#### \*Correspondence:

*Keir G. Pearson, Department of Physiology, University of Alberta, Medical Sciences Building, Edmonton, AB T6G 2H7, Canada kpearson@ualberta.ca*

#### †Present Address:

*Naik Arbabzada and Rod Gramlich, Katz Group Centre, Neuroscience and Mental Health Institute, University of Alberta, Edmonton, AB, Canada; Masahiro Shinya, Department of Life Sciences, Graduate School of Arts and Sciences, University of Tokyo, Tokyo, Japan*

> Received: *03 March 2015* Accepted: *07 September 2015* Published: *24 September 2015*

#### Citation:

*Pearson KG, Arbabzada N, Gramlich R and Shinya M (2015) Leg mechanics contribute to establishing swing phase trajectories during memory-guided stepping movements in walking cats: a computational analysis. Front. Comput. Neurosci. 9:116. doi: 10.3389/fncom.2015.00116*

When quadrupeds stop walking after stepping over a barrier with their forelegs, the memory of barrier height and location is retained for many minutes. This memory is subsequently used to guide hind leg movements over the barrier when walking is resumed. The upslope of the initial trajectory of hind leg paw movements is strongly dependent on the initial location of the paw relative to the barrier. In this study, we have attempted to determine whether mechanical factors contribute significantly in establishing the slope of the paw trajectories by creating a four-link biomechanical model of a cat hind leg and driving this model with a variety of joint-torque profiles, including average torques for a range of initial paw positions relative to the barrier. Torque profiles for individual steps were determined by an inverse dynamic analysis of leg movements in three normal cats. Our study demonstrates that limb mechanics can contribute to establishing the dependency of trajectory slope on the initial position of the paw relative to the barrier. However, an additional contribution of neuronal motor commands was indicated by the fact that the simulated slopes of paw trajectories were significantly less than the observed slopes. A neuronal contribution to the modification of paw trajectories was also revealed by our observations that both the magnitudes of knee flexor muscle EMG bursts and the initial knee flexion torques depended on initial paw position. Previous studies have shown that a shift in paw position prior to stepping over a barrier changes the paw trajectory to be appropriate for the new paw position. Our data indicate that both mechanical and neuronal factors contribute to this updating process, and that any shift in leg position during the delay period modifies the working memory of barrier location.

Keywords: leg mechanics, obstacle avoidance, working memory, walking, model leg movement

# Introduction

An important issue in motor physiology is the extent to which the mechanical properties of the muscular and skeletal elements of the limbs and body contribute to enhancing the neuronal command signals regulating muscle contractions. In some instances, mechanical properties are known to contribute substantially to the production of effective movements. One of the clearest examples is walking in humans. Studies with mechanical bipedal walking devices have shown that coordinated stepping can be produced without any active elements, thus indicating that the mechanical properties of the legs play an important role in the production of normal bipedal locomotion (Collins et al., 2005). Our interest in this issue comes from observations on the trajectories of paw movements in cats when they step over obstacles (McVea and Pearson, 2006, 2007; McVea et al., 2009; Lajoie et al., 2010; Pearson and Gramlich, 2010). When cats begin walking after stepping over a barrier with their forelegs, the slopes of the trajectories of the hind leg paws as they move toward the barrier are highly dependent on the initial distance of the paw from the barrier (McVea and Pearson, 2006). Moreover, if the animal adjusts the position of the hind paws relative to the barrier prior to resuming walking, the trajectory slopes are appropriate for the new paw position (Pearson and Gramlich, 2010). Because these adjustments change the geometry of the hind leg prior to stepping over the barrier, the question is whether the modification of the paw trajectories might simply be due to alterations in the mechanical arrangement of the hind legs. Alternatively, a combination of altered neuronal commands and altered leg geometry could underlie the modified stepping movements.
It is well known that activity in leg flexor muscles is highly modulated when cats step over obstacles (Drew, 1993; Widajewicz et al., 1994), and that the pattern of modulation of foreleg flexors depends on the size and width of the obstacles (Drew, 1993; Krouchev and Drew, 2013). However, the extent to which hind leg flexor activity is dependent on the initial paw location relative to the same obstacle has not been examined in previous studies, so the relative contributions of neuronal and mechanical factors in establishing the slopes of the hind paw trajectories is unknown. It is also conceivable that the relative importance of mechanical and neuronal factors differs significantly during uninterrupted stepping over an obstacle (previous studies) compared to stepping over a remembered obstacle from a standing start (this study).

The importance of distinguishing between these two possibilities is that if the former is true it would have implications for our understanding of the long-lasting working memory system involved in the production of hind leg movements to clear the remembered barrier (McVea and Pearson, 2006). In particular, it would indicate that the change in the slope of the paw trajectory associated with a change in geometry of the leg before stepping (Pearson and Gramlich, 2010) does not depend on updating information representing paw position in the working memory system. Instead the same commands to hind leg flexor muscles, established by a representation in working memory of only the required height of the step to avoid the barrier, could produce different paw trajectories simply because of differences in leg mechanics associated with differences in the initial geometry of the leg.

To gain insight into the extent to which mechanical factors contribute to the dependence of paw trajectories on initial paw position, we have developed a four-link mechanical model of a cat hind leg and examined paw trajectories in response to a variety of joint-torque profiles. First, we demonstrate that the model accurately reproduces the profiles of paw trajectories when driven by hip, knee, ankle, and paw torques derived from an inverse dynamic analysis of a hind leg when it steps over an obstacle from many different initial positions. This demonstrated that the numerical integrations necessary for the forward dynamic simulations were adequate. Next, we used the profiles of average joint torques for a small range of initial paw positions to drive the model over the full range of initial paw positions observed, and compared the early slopes of the paw trajectories observed in a stepping animal with the initial slopes produced by the forward model. Our expectation was that there would be a close match of these slopes if modification of paw trajectories was entirely dependent on the initial geometry of the leg. Any systematic differences in these slopes would indicate some sort of modification of neuronal commands and hence provide evidence that the working memory system not only represents information about the initial paw position relative to the barrier, but also has the capacity to be updated when the initial paw position changes prior to stepping.

# Materials and Methods

The primary objective of this study was to use a biomechanical model of a cat hind leg to investigate the contribution of mechanical factors in establishing the initial trajectory of the paw when the hind leg steps over a barrier. Thus, we begin this section with a description of the model, and the procedure we used to estimate joint torques from an inverse dynamic analysis of leg movements in stepping cats. Next we describe how we used the estimated joint torques as input to the model (forward dynamic simulation) to investigate how the paw trajectories are influenced by the initial geometry of the leg independent of any variation in neuronal commands. To verify some of the conclusions from this computational analysis, we also recorded electromyograms (EMGs) from hind leg flexor muscles in three cats as they stepped over a barrier. The method for recording and analyzing EMGs is described at the end of this section.

All experimental protocols were approved by the University of Alberta Health Sciences Animal Policy and Welfare Committee.

# Four-link Mechanical Model of the Cat Hind Leg

The four main segments of the cat hind leg (thigh, shank, paw, and toes) were modeled as uniform rigid rods (**Figure 1A**). This is an enormous simplification of the complex mechanics of the leg, but it allows a straightforward determination of the equations of motion and provides a simple approach for estimating torques at the hip, knee, ankle, and paw joints. A similar simplification has been used to calculate joint torques in many other biomechanical studies of hind leg movements in the cat (Hoy and Zernicke, 1985; Hoy et al., 1985; Ekeberg and Pearson, 2005). From measurements on a number of animals, the values we used for the mass of each segment were: thigh, 200 g; shank, 100 g; paw, 40 g; toe, 20 g. The lengths of the thigh and shank were set at 10 and 11 cm, respectively, and the lengths of the paw and toes were measured from video recordings (approximately 5 and 2 cm, respectively). To get an estimate of joint torques from an inverse dynamic analysis, we first derived five equations of motion for each segment (see below), resulting in 20 equations of motion with 20 unknowns that included the torques at the hip, knee, ankle, and paw joints. These equations were solved using custom-written software in the Matlab programming language.

FIGURE 1 | Four-link modeling of cat hind leg. (A) The cat hind leg was modeled by four uniform rods for the thigh, shank, paw, and toes connected at the hip joint (h), knee joint (k), ankle joint (a), and paw joint (p). The model was used to calculate torques at these four joints when the animal stepped over a remembered barrier (dotted rectangle), and to simulate leg movements when driven by different torque profiles. The barrier (filled rectangle) was lowered after the forelegs had stepped over it. The slope of the toe trajectory (dotted line) was calculated over a distance of 2 cm soon after the swing phase commenced. (B) Parameters used to determine the equations of motion for a single segment (see text for details). This segment represents the first segment in a kinematic chain starting from the origin (O), which is the hip joint in the model. Note that for each segment the segment angle (θ) is measured in the anti-clockwise direction from the positive X-axis proximal to the segment. By convention, positive torques are in the anti-clockwise direction. The reaction torque and forces at the distal joint are in the opposite directions to those at the proximal joint.

The independent parameters of segment angles, angular velocity, and angular acceleration required for the inverse dynamic analysis (see Equations below) were obtained from video analysis of hind leg movements in three cats stepping over a remembered barrier. Animals first stepped over a 6.5 cm high barrier with their forelegs to establish a memory of barrier height, straddled the position of the barrier for about 30 s while the barrier was lowered, and then walked forward. We have reported in previous studies (McVea and Pearson, 2006; Pearson and Gramlich, 2010) that in this situation the hind legs step high in a manner that would avoid the barrier had it remained in place. Video images were captured at 60 frames/s using the Peak Motus data acquisition system, and the positions of reflective markers on the iliac crest, hip joint, ankle joint, paw joint, and the end of the lateral toe of the right hind leg were digitized. Triangulation was used to determine knee position because of large skin movements at the knee (hence the reason for using fixed lengths of the thigh and shank). Matlab software was used to calculate joint angles for each 16.67 ms video frame (angles measured counterclockwise from right horizontal, **Figure 1B**), then to filter these angles (2nd order Butterworth, cutoff = 10 Hz) and interpolate joint angles between video frames to yield a time resolution of 1 ms. The angular velocities and angular accelerations of the segments were calculated by taking the 1 ms time differences in segment angles and segment angular velocities, respectively.
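The processing chain described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Matlab code used in the study: the filter is applied in a single forward pass (the text does not say whether zero-phase filtering was used), the interpolation is linear (the method is not stated), and all function names are hypothetical.

```python
import math

def butter2_lowpass(x, cutoff_hz, fs_hz):
    """Second-order Butterworth low-pass (bilinear transform), one forward pass."""
    k = math.tan(math.pi * cutoff_hz / fs_hz)
    norm = 1.0 / (1.0 + math.sqrt(2.0) * k + k * k)
    b0 = k * k * norm
    b1, b2 = 2.0 * b0, b0
    a1 = 2.0 * (k * k - 1.0) * norm
    a2 = (1.0 - math.sqrt(2.0) * k + k * k) * norm
    # Start in steady state at the first sample to avoid an onset transient.
    x1 = x2 = y1 = y2 = x[0]
    y = []
    for xn in x:
        yn = b0 * xn + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        y.append(yn)
        x2, x1 = x1, xn
        y2, y1 = y1, yn
    return y

def resample_linear(x, fs_in_hz, dt_out_s):
    """Linearly interpolate samples taken at fs_in_hz onto a grid with step dt_out_s."""
    duration = (len(x) - 1) / fs_in_hz
    n_out = int(math.floor(duration / dt_out_s + 1e-9)) + 1
    out = []
    for i in range(n_out):
        pos = i * dt_out_s * fs_in_hz          # fractional index into x
        j = min(int(pos), len(x) - 2)
        frac = pos - j
        out.append((1.0 - frac) * x[j] + frac * x[j + 1])
    return out

def finite_difference(x, dt_s):
    """Time difference of successive samples: (x[i+1] - x[i]) / dt."""
    return [(x[i + 1] - x[i]) / dt_s for i in range(len(x) - 1)]

# Synthetic segment angle sampled at 60 frames/s (a 2 rad/s ramp, for illustration only).
fs, dt = 60.0, 0.001
angle_60hz = [0.5 + 2.0 * (n / fs) for n in range(31)]
theta = resample_linear(butter2_lowpass(angle_60hz, 10.0, fs), fs, dt)
omega = finite_difference(theta, dt)   # angular velocity at 1 ms resolution
alpha = finite_difference(omega, dt)   # angular acceleration at 1 ms resolution
```

Because differencing amplifies high-frequency noise, the low-pass step matters most for the angular accelerations that enter the inverse dynamic analysis.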

The positions of the end of the toes were also digitized and used to determine the slope of the toe trajectory during the elevation phase of the leg (**Figure 1A**), and the initial distance of the toes from the barrier. The slopes of the toe trajectories were calculated by dividing the change in vertical position (2 cm) by the change in horizontal distance as the toe moved between the heights of 3 and 5 cm. Our previous studies have reported that the slope of the toe trajectory is highly dependent on the initial distance from the barrier, and is modified appropriately when toe position is changed prior to stepping (Pearson and Gramlich, 2010).
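As an illustration of this slope measure, the sketch below finds the horizontal positions at which a rising toe trajectory crosses 3 and 5 cm and divides the 2 cm rise by the horizontal distance between them. The linear interpolation between samples and the function names are assumptions made for the example; the text does not describe how the crossings were located.

```python
def crossing_x(xs, ys, height):
    """Horizontal position where a rising trajectory first reaches `height`."""
    for i in range(len(ys) - 1):
        y0, y1 = ys[i], ys[i + 1]
        if y1 > y0 and y0 <= height <= y1:
            frac = (height - y0) / (y1 - y0)
            return xs[i] + frac * (xs[i + 1] - xs[i])
    raise ValueError("trajectory never reaches the requested height")

def toe_slope(xs, ys, low_cm=3.0, high_cm=5.0):
    """Rise (high_cm - low_cm) divided by the horizontal distance travelled meanwhile."""
    dx = crossing_x(xs, ys, high_cm) - crossing_x(xs, ys, low_cm)
    return (high_cm - low_cm) / dx

# Synthetic straight-line trajectory (cm), for illustration only.
xs = [float(i) for i in range(13)]
ys = [0.5 * x for x in xs]
slope = toe_slope(xs, ys)
```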

# Equations of Motion for Inverse Dynamic Analysis

A set of 20 equations with 20 unknowns was used to estimate joint torques. The unknowns were joint torques (four in total), horizontal and vertical joint interaction forces at each joint (eight in total), and linear horizontal and vertical accelerations at the center of mass of each segment (eight in total). For each segment there were three equations describing the dynamics and two equations describing the kinematics. The point of reference for deriving all equations was the hip joint, and the variables are illustrated for a single segment in **Figure 1B**. The three equations for the dynamics of each segment are:

$$f_{x1} - f_{x2} = m\ddot{x}$$

$$f_{y1} - f_{y2} - mg = m\ddot{y}$$

$$\tau_1 - \tau_2 + f_{x1}a\sin(\theta) + f_{x2}a\sin(\theta) - f_{y1}a\cos(\theta) - f_{y2}a\cos(\theta) = I\ddot{\theta}$$

where $g$ is the acceleration due to gravity; $f_{x1}$, $f_{y1}$ and $f_{x2}$, $f_{y2}$ are the pairs of horizontal and vertical joint interaction forces at the proximal and distal joints, respectively (note that for the toe segment $f_{x2}$ and $f_{y2}$ are zero); $\ddot{x}$ and $\ddot{y}$ are the linear horizontal and vertical accelerations of the center of mass, respectively; $\tau_1$ and $\tau_2$ are the torques at the proximal and distal joints, respectively (note that for the toe segment $\tau_2$ is zero); $m$ is the mass of the segment; $a$ is half the length of the segment; $I$ is the moment of inertia of the segment ($I = \frac{1}{3}ma^2$); $\theta$ is the segment angle measured anticlockwise from the horizontal right of the joint; and $\ddot{\theta}$ is the angular acceleration of the segment.
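As a worked illustration (a rearrangement of the three dynamic equations, not an additional result), consider the toe segment, for which the distal forces and torque are zero. Once the accelerations of its center of mass are known, the proximal forces and the torque at the paw joint follow directly:

$$\begin{aligned} f_{x1} &= m\ddot{x} \\ f_{y1} &= m\left(\ddot{y} + g\right) \\ \tau_1 &= I\ddot{\theta} - f_{x1}a\sin(\theta) + f_{y1}a\cos(\theta) \end{aligned}$$

These values then act as the distal quantities ($f_{x2}$, $f_{y2}$, $\tau_2$) of the paw segment, so the same three equations can be applied segment by segment from the toe toward the hip.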

The two kinematic equations for each segment become increasingly complex going from the proximal to distal segments because of the contributions of proximal segments to the movements of the distal segments. For the most proximal segment (thigh) the two equations are:

$$\begin{aligned} \ddot{x} &= -a\cos(\theta)\dot{\theta}^2 - a\sin(\theta)\ddot{\theta} \\ \ddot{y} &= -a\sin(\theta)\dot{\theta}^2 + a\cos(\theta)\ddot{\theta} \end{aligned}$$

where θ˙ is the angular velocity of the segment. For the most distal segment (toe) the two equations are:

$$\begin{aligned} \ddot{x} = {} & -A\cos(\theta_1)\dot{\theta}_1^2 - A\sin(\theta_1)\ddot{\theta}_1 - B\cos(\theta_2)\dot{\theta}_2^2 - B\sin(\theta_2)\ddot{\theta}_2 \\ & - C\cos(\theta_3)\dot{\theta}_3^2 - C\sin(\theta_3)\ddot{\theta}_3 - d\cos(\theta_4)\dot{\theta}_4^2 - d\sin(\theta_4)\ddot{\theta}_4 \end{aligned} \tag{1}$$

$$\begin{aligned} \ddot{y} = {} & -A\sin(\theta_1)\dot{\theta}_1^2 + A\cos(\theta_1)\ddot{\theta}_1 - B\sin(\theta_2)\dot{\theta}_2^2 + B\cos(\theta_2)\ddot{\theta}_2 \\ & - C\sin(\theta_3)\dot{\theta}_3^2 + C\cos(\theta_3)\ddot{\theta}_3 - d\sin(\theta_4)\dot{\theta}_4^2 + d\cos(\theta_4)\ddot{\theta}_4 \end{aligned} \tag{2}$$

where $A$, $B$, and $C$ are the lengths of the thigh, shank, and paw, respectively, $d$ is half the length of the toe, and the subscripts 1, 2, 3, and 4 indicate the segment angles, angular velocities, and angular accelerations of the thigh, shank, paw, and toe, respectively. The pairs of kinematic equations for the shank and paw are the first four and the first six terms of Equations (1) and (2), respectively, with the shank length $B$ replaced by the half length $b$ in the equations for the shank, and with the paw length $C$ replaced by the half length $c$ in the equations for the paw.
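The dynamic and kinematic equations above can be combined into a short inverse dynamic routine. The Python sketch below is an illustration only, not the Matlab software used in the study. It assumes SI units, a stationary hip at the origin, and a distal-to-proximal solution order; the function names are hypothetical. Rather than assembling the full 20-equation system, it solves the same equations one segment at a time from the toe to the thigh.

```python
import math

G = 9.81  # acceleration due to gravity (m/s^2)

def com_accelerations(lengths, theta, omega, alpha):
    """Linear acceleration of each segment's center of mass, hip fixed at the origin."""
    acc = []
    jx = jy = 0.0  # acceleration of the current segment's proximal joint
    for length, th, om, al in zip(lengths, theta, omega, alpha):
        ux = -math.cos(th) * om ** 2 - math.sin(th) * al
        uy = -math.sin(th) * om ** 2 + math.cos(th) * al
        half = length / 2.0
        acc.append((jx + half * ux, jy + half * uy))
        jx += length * ux   # full segment length carries over to distal segments
        jy += length * uy
    return acc

def joint_torques(masses, lengths, theta, omega, alpha):
    """Proximal joint torque of each segment (hip, knee, ankle, paw joint)."""
    acc = com_accelerations(lengths, theta, omega, alpha)
    fx2 = fy2 = tau2 = 0.0  # nothing is attached beyond the toe
    torques = [0.0] * len(masses)
    for i in reversed(range(len(masses))):
        m, a, th = masses[i], lengths[i] / 2.0, theta[i]
        inertia = m * a * a / 3.0            # uniform rod about its center of mass
        fx1 = m * acc[i][0] + fx2
        fy1 = m * acc[i][1] + m * G + fy2
        tau1 = (inertia * alpha[i] + tau2
                - (fx1 + fx2) * a * math.sin(th)
                + (fy1 + fy2) * a * math.cos(th))
        torques[i] = tau1
        fx2, fy2, tau2 = fx1, fy1, tau1      # become distal values of the next segment
    return torques

# Segment masses (kg) and lengths (m) quoted in the text: thigh, shank, paw, toe.
masses = [0.200, 0.100, 0.040, 0.020]
lengths = [0.10, 0.11, 0.05, 0.02]
# Static check: a motionless leg held horizontally needs torques that balance gravity.
zeros = [0.0] * 4
static_torques = joint_torques(masses, lengths, zeros, zeros, zeros)
```

In the static check, the hip torque equals the summed gravitational moments of the four segments about the hip, which gives a simple way to verify the sign conventions.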

#### Forward Dynamic Simulation of Swing

The objective of using a forward dynamic simulation of the leg was to determine how a variety of torque profiles influenced the initial trajectory of the toe during swing. In particular, we wanted to examine theoretically the trajectories when the same torque profiles drove leg movements from different initial toe distances from the barrier, that is, to examine toe trajectories assuming there are no changes in neuronal commands. It is important to note that we are making the assumption that joint torques are a good proxy measure of neuronal commands, which is not necessarily accurate if the moment arms of the leg muscles vary with the initial geometry of the leg and there are non-linear relationships between neuronal commands and muscle forces. Obviously these factors might contribute significantly to the generation of joint torques, but this cannot be assessed because little is currently known about the neuro-mechanical properties of the hind leg flexor muscles that produce the swing phase. Despite not incorporating muscle properties into our simulation, we believe that using a simple four-link model of a hind leg driven by joint torques can still yield some insight into the extent to which purely mechanical factors are involved in establishing the relationship between toe trajectory slope and initial toe distance from the barrier.

Custom-written Matlab software was used for the forward simulation. The same equations of motion, segment masses, and segment lengths used in the inverse dynamic analysis (see above) were used in the forward simulation. The fourth-order Runge-Kutta algorithm was used for numerical integration with a time step of 1 ms. Starting from the initial values of joint angles, this algorithm determined the joint angular accelerations at each time step and used these accelerations to update the joint angles and velocities for the next time step. The initial values of angular velocities required for the forward simulation were chosen to match the selection of torque profiles. For example, when using torque profiles for trials with large toe-to-barrier distance, we used the average initial angular velocities for the same subset of trials.
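The integration scheme can be illustrated with a deliberately reduced example. The Python sketch below applies fourth-order Runge-Kutta steps of 1 ms to a single uniform rod hinged at a fixed hip, for which the equations of motion above reduce to $\frac{4}{3}ma^2\ddot{\theta} = \tau - mga\cos(\theta)$. It is not the four-link model used in the study; the one-segment reduction, the torque held constant within each step, and the function names are assumptions made for the example.

```python
import math

G = 9.81  # acceleration due to gravity (m/s^2)

def angular_acceleration(theta, torque, m, a):
    """Single rod hinged at the hip: (4/3) m a^2 * theta_dd = torque - m g a cos(theta)."""
    return (torque - m * G * a * math.cos(theta)) / (4.0 / 3.0 * m * a * a)

def rk4_step(theta, omega, torque, m, a, dt):
    """One fourth-order Runge-Kutta step for the state (theta, omega)."""
    def deriv(th, om):
        return om, angular_acceleration(th, torque, m, a)
    k1t, k1o = deriv(theta, omega)
    k2t, k2o = deriv(theta + 0.5 * dt * k1t, omega + 0.5 * dt * k1o)
    k3t, k3o = deriv(theta + 0.5 * dt * k2t, omega + 0.5 * dt * k2o)
    k4t, k4o = deriv(theta + dt * k3t, omega + dt * k3o)
    theta += dt * (k1t + 2.0 * k2t + 2.0 * k3t + k4t) / 6.0
    omega += dt * (k1o + 2.0 * k2o + 2.0 * k3o + k4o) / 6.0
    return theta, omega

def simulate(theta0, omega0, torque_profile, m, a, dt=0.001):
    """Drive the rod with one torque sample per time step; return the state history."""
    theta, omega = theta0, omega0
    path = [(theta, omega)]
    for tau in torque_profile:
        theta, omega = rk4_step(theta, omega, tau, m, a, dt)
        path.append((theta, omega))
    return path

# Thigh-like rod (0.2 kg, half length 0.05 m) released from horizontal with no torque.
m, a = 0.2, 0.05
free_swing = simulate(0.0, 0.0, [0.0] * 200, m, a)
```

With zero applied torque the total mechanical energy should stay constant, which offers a quick check that a 1 ms step is adequate for dynamics on this time scale.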

To demonstrate that the numerical integration was accurate, and that the forward simulation produced appropriate leg movements, we initially drove the forward model with the profiles of joint torques derived from the inverse dynamic analysis of single trials. **Figure 2A** shows sets of stick diagrams of leg geometry for a single stepping trial measured from a stepping animal (left) and generated by the forward model (right). During the early part of the swing phase the leg movements produced by the forward model closely matched the leg movements observed in the behaving animals. This was illustrated by the close correspondence of the slopes of the toe trajectories in the simulation and in the behaving animal (**Figure 2B**). Small differences in geometry occurred near the end of the step (**Figure 2A**), but these occurred well beyond the times of the events we were measuring (early in the swing phase). Similar matches were obtained for all trials we examined, regardless of the initial geometry of the leg. Thus, we are confident that the method of numerical integration (fourth-order Runge-Kutta) was appropriate for our study.

In our main analysis, we drove the forward model with the same profile of joint torques regardless of the initial geometry of the leg. The profiles we chose were averages of joint torques calculated in the inverse dynamic analysis for a small range of initial toe positions. This allowed us to examine how the rising slopes of toe trajectories were dependent on the initial geometry of the leg independent of any changes in input commands. Our analysis was restricted to examining only leading steps of the right hind leg for two reasons: the movement of the right leg (closest to the video camera) could be measured more accurately because the reflective markers were always visible, and the range of the initial toe distances from the barrier was much larger for the leading leg.

# Electromyographic (EMG) Analysis of Flexor Muscle Activity

In the three experimental animals, EMG recording electrodes were implanted in the two main knee flexor muscles, semitendinosus (ST) and medial sartorius (Sartm), and the main hip flexor, iliopsoas (IP), of the right hind leg. The procedures for implanting and recording EMGs have been fully described previously (Donelan et al., 2009). The reason for recording flexor EMGs was to determine whether the magnitude of flexor burst activity during the early swing phase was dependent on initial leg geometry. If mechanical factors were entirely responsible for modifying the profiles of toe trajectories with different initial toe positions, then we predicted that there would be little or no dependence of flexor EMG magnitudes on the initial leg geometry.

The flexor EMG bursts were digitized using the analog acquisition hardware of the Peak Motus system, thus allowing synchronization of the video and EMG recordings. Data files of EMG activity stored by Peak Motus were then used in custom written Matlab programs to determine the magnitude of EMG bursts. The EMGs were full-wave rectified, low-pass filtered (cutoff = 50 Hz) and integrated over a time window of 150 ms from the beginning of the bursts.
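The burst measure can be sketched as follows. This Python fragment is an illustration under stated assumptions rather than the analysis code used in the study: the low-pass stage is a first-order filter (the filter type and order are not stated in the text), the sampling rate in the example is arbitrary, the burst onset index is taken as given, and the function name is hypothetical.

```python
import math

def emg_burst_magnitude(emg, fs_hz, onset_index, window_s=0.150, cutoff_hz=50.0):
    """Rectify, low-pass filter, and integrate an EMG burst over a fixed window."""
    rectified = [abs(v) for v in emg]                        # full-wave rectification
    # First-order low-pass filter (assumed; the text gives only the 50 Hz cutoff).
    gain = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / fs_hz)
    smoothed = []
    y = rectified[0]
    for v in rectified:
        y += gain * (v - y)
        smoothed.append(y)
    n = int(round(window_s * fs_hz))                         # samples in the window
    seg = smoothed[onset_index:onset_index + n + 1]
    # Trapezoidal integration over the window (amplitude units x seconds).
    return sum(0.5 * (seg[i] + seg[i + 1]) for i in range(len(seg) - 1)) / fs_hz

# Synthetic alternating signal of unit amplitude sampled at 1 kHz, for illustration only.
signal = [(-1.0) ** i for i in range(2000)]
magnitude = emg_burst_magnitude(signal, fs_hz=1000.0, onset_index=100)
```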

# Results

# Forward Modeling of Toe Trajectories

To estimate the contribution of mechanical factors in increasing the slope of the trajectory of the toe when its initial position is close to the remembered barrier, we drove the four-link forward model of the hind leg with the same profile of torques for all initial positions of the hind leg. The torque profiles we chose were the averages of torque profiles for the subset of trials in which the initial toe position was at the longest distance from the barrier. Examples of torque profiles for the hip, knee, and ankle joints are shown in **Figure 5A**. Assuming that the initial mechanical arrangement of the leg does contribute to increasing the slope of the toe trajectory as the toe distance from the barrier decreases, as described previously in behaving animals (Pearson and Gramlich, 2010), the main prediction of our forward modeling analysis was that the slopes of the toe trajectories produced by the simulation should increase as the initial toe distance from the barrier decreases. This prediction was found to be correct for the simulations based on biological data from three animals (**Figure 3**). **Figure 3** shows scatter plots of the toe trajectory slopes vs. initial toe distances observed in the behaving animals (dotted lines, open squares) and produced by the simulation (solid line, filled circles). Both sets of data (biological and simulated) show the slope increasing as the toe-to-barrier distance decreases, but the rate of increase is greater for the behavioral data. The reasonably close correspondence between the observed and simulated slopes at large initial toe-to-barrier distances was expected because the simulation was driven by the average profiles of joint torques and angular velocities calculated for the subset of trials starting at large initial toe-to-barrier distances.

FIGURE 2 | [...] forward mechanical model. (A) The kinematics of the hind leg of a cat stepping over a barrier (actual kinematics) were used in an inverse dynamic analysis to first calculate the torque profiles at the four joints. These torque profiles were then used as inputs to the forward model to yield a simulated leg movement (model kinematics). Note the high degree of similarity of the stick figures for the actual and simulated leg movements. (B) Plot of the slopes of the actual and simulated toe trajectories showing a very close correspondence.

FIGURE 3 | Slopes of toe trajectories increase with shorter initial toe distances from the barrier in the behaving animal (dashed lines and open squares) and in the forward model (solid line and filled circles) driven by averaged torque profiles calculated for a subset of trials starting with toe-to-barrier distances in the range of 16–25 cm. The initial geometry of the leg in each simulated trial was the same as the initial geometry of the leg in the corresponding trial of the behaving animal. Note the divergence of the best-fitting curves as the initial toe-to-barrier distance decreases, due to the lower dependence of the modeled slopes on initial distance from the barrier. The three data sets are from three different animals.

The divergence of the slopes observed in the simulations and in the behaving animals was clearly revealed in scatter plots of model-slope vs. actual-slope (**Figure 4**). The data points for all three animals are linearly related, with the best-fitting lines (solid lines) having slopes of 0.34, 0.20, and 0.41 (the dashed lines in **Figure 4** are the identity lines with slopes of 1). This dependency was found to be significant in all three animals (R<sup>2</sup> = 0.53, p < 0.001, n = 39; R<sup>2</sup> = 0.34, p < 0.001, n = 55; R<sup>2</sup> = 0.56, p < 0.001, n = 72; T-tested using TDIST in Microsoft Excel). Because larger slopes are associated with trials with short toe-to-barrier distance (**Figure 3**), the plots in **Figure 4** show that the increase in model-slopes with shorter toe-to-barrier distance was less pronounced than the increase in actual-slopes with shorter toe-to-barrier distance. The matching of the model-slopes and actual-slopes at low slope values (close to the intersection of the solid and dashed lines) again reflects the fact that the simulation was driven by torque profiles associated with trials starting with large initial toe-to-barrier distance.
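For readers who wish to reproduce this type of statistic, the sketch below computes the slope of the best-fitting line, R<sup>2</sup>, and the t statistic for the correlation, $t = r\sqrt{(n-2)/(1-r^2)}$, which is the value that would be passed to a t-distribution function such as Excel's TDIST with n − 2 degrees of freedom. The data in the example are synthetic, the function name is hypothetical, and the p-value itself is not computed here.

```python
import math

def linear_fit_stats(x, y):
    """Least-squares slope and intercept, R^2, and t statistic of the correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / math.sqrt(sxx * syy)
    r2 = r * r
    # t statistic with n - 2 degrees of freedom; infinite for a perfect fit.
    t = math.copysign(math.inf, r) if r2 >= 1.0 else r * math.sqrt((n - 2) / (1.0 - r2))
    return slope, intercept, r2, t

# Synthetic (actual slope, model slope) pairs, for illustration only.
actual = [0.0, 1.0, 2.0, 3.0]
model = [0.0, 1.0, 1.0, 2.0]
slope, intercept, r2, t = linear_fit_stats(actual, model)
```

A fitted slope below 1 in such a plot corresponds to the pattern reported above, in which the model slopes rise less steeply than the observed slopes.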

The slope increase in the simulated trajectories with decreasing toe-to-barrier distance indicates that mechanical factors related to leg geometry can significantly contribute to enhancing the elevation of the leg when starting a step close to the remembered barrier. In addition, the divergence of the slopes of the modeled and actual toe trajectories indicates that neuronal commands to leg flexor muscles increase as the toe-to-barrier distance decreases. We predicted, therefore, that the flexion torque at one or more of the leg joints would increase as the toe-to-barrier distance decreased.

## Inverse Dynamic Analysis of Hind Leg Stepping over a Remembered Barrier

In the three adult cats we examined the relationship between the profiles of joint torques, derived from an inverse dynamic analysis of a leading hind leg stepping over the virtual barrier, and the initial toe-to-barrier distance. The joint torque profiles were similar in all three animals. **Figure 5A** shows, for one animal, the average torque profiles at the hip, knee, and ankle joints for eight trials when the initial toe-to-barrier distance was greater than 16 cm (solid lines), and five trials when this distance was less than 8 cm (dashed lines). At all three joints there was an initial torque in the flexion direction (this is plotted in the negative direction for the knee because, by convention, torques are measured in the anticlockwise direction, so the clockwise flexion torque at the knee in the leg geometry we analyzed (**Figure 1**) is negative). The most notable feature of the plots shown in **Figure 5A** is that for the hip and ankle joints the flexion torques at the start of swing are similar for the two starting positions, whereas the flexion torque at swing onset at the knee is larger for the steps starting close to the barrier than for the steps starting a long distance from the barrier. **Figure 5B** is a scatter plot of data from another animal showing a progressive increase in knee torque as the toe-to-barrier distance decreases. This dependency was found to be significant in all three animals (R<sup>2</sup> = 0.290, p < 0.001, n = 39; R<sup>2</sup> = 0.208, p < 0.001, n = 55; R<sup>2</sup> = 0.121, p = 0.003, n = 72; T-tested using TDIST in Microsoft Excel). By contrast, there was no significant dependence of the initial torques at the hip and ankle joints on the initial distance of the toe from the barrier. R<sup>2</sup> values for the hip joint were 0.009 (p = 0.82, n = 39), 0.001 (p = 0.49, n = 55), and 0.003 (p = 0.63, n = 72), and for the ankle they were 0.042 (p = 0.21, n = 55), 0.043 (p = 0.14, n = 55), and 0.002 (p = 0.71, n = 72). The dependence of knee torque on distance indicates that an increase in neural commands to knee flexors is required to fully produce the appropriate amount of knee flexion to avoid the remembered barrier when the initial toe position is close to the barrier. That is, increased motor commands to knee flexor muscles are required to produce the needed increase in the initial slope of the toe trajectory.

FIGURE 4 | [...] generally lower than the slopes for the observed toe trajectories. Each data point in the scatter plots represents the slope of toe trajectory from the simulation (simulated slope) and the slope observed (observed slope) for a single trial. The three sets of data are from trials in three different animals. Note the divergence from the identity line (dashed) as slope increases. The matching of the data points at the low slopes is because the simulations were driven by average torque profiles of a subset of trials at long toe-to-barrier distances that are associated with low slopes (see Figure 3).

#### EMG Analysis of Activity in Leg Flexor Muscles

The results from both the inverse dynamic analysis and the examination of the properties of the forward model indicated that changes in neuronal commands, in addition to changes in leg geometry, are involved in establishing the inverse relationship between the slope of toe trajectories and the initial toe-to-barrier distance (**Figure 3**). Thus, we predicted that the magnitude of bursts of activity in one or more leg flexor muscles would depend on the initial toe-to-barrier distance. We focused primarily on the activity in knee flexor muscles (ST and Sartm) because the inverse dynamic analysis showed that the knee flexion torque increased as the toe-to-barrier distance decreased (**Figure 5**). Consistent with this observation, we found that the magnitude of the initial burst activity in ST in all three animals, and in Sartm in two animals (recording electrodes in the third malfunctioned), increased the closer the initial toe position was to the barrier (**Figure 6**). This dependency was found to be significant in all cases (p < 0.001; T-tested using TDIST in Microsoft Excel). EMG recordings from the hip flexor IP muscle in two animals did not reveal any dependence on the initial toe-to-barrier distance (p > 0.1 in both animals), which was consistent with the absence of any dependency of the initial hip torque on distance (see previous section).

FIGURE 5 | [...] very similar at swing onset for both conditions. Knee flexion torques are negative for the geometry of the leg used in the inverse dynamic analysis, while hip and ankle flexion torques are positive. (B) Scatter plot of data from another animal showing the increase in the knee torque at swing onset with decreasing toe-to-barrier distance.

FIGURE 6 | [...] semitendinosus (ST) and medial sartorius (Sartm) increases with shorter initial toe distance from the barrier. The magnitudes of the EMGs were measured over the first 150 ms from burst onset. Data in (A) and (B) are from two different animals. EMG amplitudes are measured in arbitrary units (AU).

# Discussion

A primary goal of the current investigation was to determine the extent to which purely mechanical factors might contribute to the modification of the toe trajectories when cats step over a remembered barrier with their hind legs. This investigation was motivated, in part, by an earlier finding that if a hind leg changes its initial position relative to a remembered barrier then the toe trajectory is appropriate for the new position and not the position at the moment the animals stopped walking (Pearson and Gramlich, 2010). This observation was interpreted as supporting the hypothesis that the neuronal system representing the memory of the position of the barrier with respect to the body could be updated to take into account the new position of the leg. However, an alternative possibility not considered in our earlier study, is that the modification of toe trajectory is entirely the result of changes in the initial geometry of the leg. In other words, updating the memory system regulating stepping over the remembered barrier is not required to generate a change in toe trajectory when a postural adjustment is made prior to stepping over the remembered barrier.

The results of the current investigation revealed that mechanical factors alone can contribute to altering toe trajectory in the correct direction but, in addition, changes in neuronal command to knee flexor muscles are required to fully produce the appropriate initial toe trajectory. The evidence for a contribution of mechanical factors came from the analysis of toe trajectories generated by a four-link forward dynamic model of the hind leg. By keeping the torque profiles at the four joints the same for all initial geometries of the leg, we found that the slope of the toe trajectories in the simulations increased as the toe was placed closer to the barrier (**Figure 3**). This is most likely related to the fact that when stepping began with the toe close to the remembered barrier, the leg was initially more flexed and the gravity torque at the knee was lower. This would have enabled the same motor commands (represented by joint torques) to elevate the leg more effectively. However, this mechanical contribution to the enhancement of toe elevation cannot account entirely for the toe elevations observed in behaving animals. This conclusion follows from our finding that mechanical factors alone are not sufficient to produce the required increase in the slope of toe trajectories as the toe-to-barrier distance decreases (**Figures 3**, **4**). The slopes of toe trajectories seen behaviorally and those produced by the simulation diverged as the initial toe position moved closer to the barrier. This divergence indicated that increased motor commands to leg flexor muscles must also have contributed to producing the steeper toe trajectories the closer the initial toe position was to the barrier.

Evidence for an increasing contribution of motor commands to leg flexor muscles when the initial toe position was closer to the barrier came from both the inverse dynamic analysis and the EMG recordings from leg flexor muscles. As the initial toe-to-barrier distance decreased, the initial flexion torque at the knee increased (**Figure 5**), whereas initial flexion torques at the hip and ankle joints remained almost constant. This finding indicates that the primary neuronal mechanism contributing to the regulation of toe trajectory is alterations in the level of motor commands to knee flexor muscles. Consistent with this conclusion was our finding that the magnitudes of EMG bursts in both ST and Sartm were larger the closer the initial toe position was to the barrier (**Figure 6**). Because the motor commands for producing the stepping movements depend critically on information about the barrier height stored in a long-lasting working memory (Pearson and Gramlich, 2010), our finding of the modulation of magnitude of the EMG bursts in knee flexor muscles strongly indicates that information about initial toe-to-barrier distance is also stored in this working memory system. Moreover, the fact that increased knee flexor muscle activation is required the closer the toe is to the barrier, and because the kinematics of toe trajectories for steps with and without prior postural adjustments are similar (Pearson and Gramlich, 2010), it seems very likely that information about toe position stored in the working memory system can be updated if the position of the toe is changed prior to stepping.

The general conclusion of this investigation is that both mechanical and neuronal mechanisms contribute to establishing the kinematic profile of toe movement when a leading hind leg of a cat steps over a remembered barrier. It is not possible to accurately estimate the relative contribution of these two mechanisms because the computational analysis was carried out using a highly simplified model of the hind leg. For example, we used joint torques as a proxy for neural commands thus ignoring the role of muscle properties and muscle moment arms in the transformation of neural commands to joint torques. Moreover, no consideration was given to the influence of viscoelastic properties of muscle and connective tissue. Nevertheless, our model was clearly sufficient to demonstrate qualitatively that variations in the geometry of the leg can influence the kinematics of toe trajectories in a manner that is appropriate for partially explaining the variation in toe trajectories in behaving animals.

# Acknowledgments

This study was funded by a grant from the Canadian Institutes of Health Research.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Pearson, Arbabzada, Gramlich and Shinya. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# A model-based approach to predict muscle synergies using optimization: application to feedback control

Reza Sharif Razavian\*, Naser Mehrabi and John McPhee

*Department of Systems Design Engineering, University of Waterloo, Waterloo, ON, Canada*

This paper presents a new model-based method to define muscle synergies. Unlike the conventional factorization approach, which extracts synergies from electromyographic data, the proposed method employs a biomechanical model and formally defines the synergies as the solution of an optimal control problem. As a result, the number of required synergies is directly related to the dimensions of the operational space. The estimated synergies are posture-dependent and correlate well with the results of standard factorization methods. Two examples are used to showcase this method: a two-dimensional forearm model, and a three-dimensional driver arm model. We show that the synergies need to be task-specific (i.e., they are defined for the specific operational spaces: the elbow angle and the steering wheel angle in the two systems). This functional definition of synergies results in a low-dimensional control space, in which every force in the operational space is accurately created by a unique combination of synergies. As such, there is no need for extra criteria (e.g., minimizing effort) in the process of motion control. This approach is motivated by the need for fast and bio-plausible feedback control of musculoskeletal systems, and can have important implications in engineering, motor control, and biomechanics.

Keywords: muscle synergy, real-time control, model-based approach, optimization, operational space, task-specific, dynamic redundancy, unique solution

# 1. Introduction

The human musculoskeletal system has a redundant structure—there are more degrees of freedom than required to perform a certain task (kinematic redundancy), and each degree of freedom is actuated by multiple muscles (dynamic redundancy). These redundancies make the control problem challenging. Humans usually take this ability for granted, without noticing the complexities involved.

Muscle synergy has been proposed as a possible strategy to reduce the dimensions of the control space (the number of variables modulated by the nervous system) in the control of musculoskeletal systems (for a short review see Tresch and Jarc, 2009). According to this hypothesis, the central nervous system (CNS) activates a group of muscles together; within each group, the muscles are activated via fixed patterns. Therefore, instead of activating all the muscles individually, the CNS combines far fewer bundles of activation to build the required muscle forces. There are, however, important questions that need to be answered regarding the plausibility of this theory.

#### Edited by:

*Vincent C. K. Cheung, The Chinese University of Hong Kong, Hong Kong*

#### Reviewed by:

*Giovanni Martino, University of Rome Tor Vergata, Italy Jason Kutch, University of Southern California, USA Aymar De Rugy, Centre National de la Recherche Scientifique, France*

#### \*Correspondence:

*Reza Sharif Razavian, Department of Systems Design Engineering, University of Waterloo, 200 University Ave. W., Waterloo, ON N2L 3G1, Canada rsharifr@uwaterloo.ca*

Received: *28 May 2015* Accepted: *11 September 2015* Published: *06 October 2015*

#### Citation:

*Sharif Razavian R, Mehrabi N and McPhee J (2015) A model-based approach to predict muscle synergies using optimization: application to feedback control. Front. Comput. Neurosci. 9:121. doi: 10.3389/fncom.2015.00121*

## 1.1. Structure and Number of Synergies

The structure of the dimension reduction in the nervous system via muscle synergies is not properly understood. Muscle synergy is often defined as fixed relations between instantaneous activation levels of multiple muscles (Ting, 2007; McKay and Ting, 2008; Berniker et al., 2009; Roh et al., 2011; Safavynia et al., 2011; Kutch and Valero-Cuevas, 2012; Steele et al., 2013; Zelik et al., 2014). Alternatively, time-varying patterns (also called the motor primitives) are proposed as the building blocks of muscle activations (D'Avella and Tresch, 2001; Ivanenko et al., 2006; Bizzi et al., 2008; d'Avella et al., 2008; Sartori et al., 2013). A mixture of both approaches has also been investigated by Delis et al. (2014).

The identification of the synergies is an important part of the theory. Various methods have been proposed in the literature to decompose muscle activities into a number of synergies. In the majority of research articles, the goal has been to reconstruct the measured muscle activities as closely as possible, using a low-dimensional basis set (the synergies). Non-negative matrix factorization (NNMF, Lee and Seung, 2000; Sharif Shourijeh et al., 2015a) is a widely-used method in this application (d'Avella et al., 2008; McKay and Ting, 2008; Berniker et al., 2009; Kargo et al., 2010; Berger and d'Avella, 2014). This approach, however, is unable to determine whether the synergies result from neural origins (as claimed by the synergy theory), or are by-products of other processes [e.g., biomechanical constraints (Kutch et al., 2008; Kutch and Valero-Cuevas, 2012), or optimization (de Rugy et al., 2013)].

There is also uncertainty about the number of synergies. The usual practice is to examine the variance accounted for (VAF) of the experimental EMG after synergy decomposition (Lockhart and Ting, 2007; Roh et al., 2011; de Rugy et al., 2013; Moghadam et al., 2013; Sartori et al., 2013; Steele et al., 2013; Delis et al., 2014). In general, fewer synergies produce a lower VAF, and as the number of synergies increases, more variation in the experimental data can be captured. Therefore, the number of synergies beyond which no further improvement in VAF is observed is usually chosen. Unfortunately, this approach is purely statistical and does not provide significant insight into the biomechanical aspects of muscle synergy theory.
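The VAF-based selection described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of any cited study: the uncentered VAF definition (one of several in use), the synthetic data, and the names `vaf` and `vaf_curve` are ours, and a truncated SVD stands in for the synergy decomposition purely to keep the sketch short.

```python
import numpy as np

def vaf(A, A_hat):
    """Uncentered variance accounted for: 1 - SSE / total sum of squares."""
    return 1.0 - np.sum((A - A_hat) ** 2) / np.sum(A ** 2)

def vaf_curve(A, max_n):
    """VAF of the best rank-n reconstruction of A for n = 1..max_n.

    Assumption: truncated SVD is used as a stand-in for the synergy
    decomposition. It is NOT NNMF and does not enforce non-negativity.
    """
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return [vaf(A, (U[:, :n] * s[:n]) @ Vt[:n]) for n in range(1, max_n + 1)]

# Hypothetical data: 7 muscles, 200 samples, built from 2 patterns plus small noise.
rng = np.random.default_rng(0)
A = rng.random((7, 2)) @ rng.random((2, 200)) + 0.01 * rng.random((7, 200))

curve = vaf_curve(A, 5)
for n, value in enumerate(curve, start=1):
    print(f"n = {n}: VAF = {value:.4f}")
```

With data of this kind the curve rises steeply up to the true number of underlying patterns and then flattens, which is the plateau that the usual selection rule looks for.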

## 1.2. Dependency of the Synergies on the Task and Posture

The dependency of synergies on the task and posture has not been extensively investigated. Efforts have been made to find shared synergies that can reconstruct EMG data in a variety of tasks (e.g., Bizzi et al., 2008; Sartori et al., 2013; Zelik et al., 2014). In the majority of the articles, however, synergies from a single task are studied, without explicit investigation of whether the synergies vary from one task to another. de Rugy (2010) has shown that visuomotor adaptation occurs at the muscle synergy level, suggesting the necessity of task-dependent synergies. It is, therefore, reasonable to argue that the recruited set of synergies may depend on the intended action. For example, the set of synergies used during a hand-writing action is perhaps different from the set recruited during a simple grasp motion, though the same muscles are activated. Our hypothesis is that different sets of synergies are known to the CNS, and the CNS chooses the appropriate set to manipulate and perform the tasks.

Therefore, information about the intended task seems to be essential in the identification of the synergies. For this purpose, a quantifiable criterion is needed to distinguish between tasks. We hypothesize that the desired controlled variable (or the operational space) could provide such information. For example, in a point-to-point reaching task, Morasso (1981) found that the hand position and velocity follow a stereotypical trajectory (straight line motion with bell-shaped velocity profile). These findings suggest that the hand position is the actively controlled variable, and the operational space is the two-dimensional Cartesian space for hand position. In contrast, in an elbow flexion/extension task, joint angle and angular velocities follow such stereotypical trajectories, meaning that the controlled variable, rather than being hand position, is the joint angle (i.e., the operational space is the one-dimensional joint angle space, rather than a two-dimensional Cartesian space for hand position). Scholz and Schöner (1999) have presented the uncontrolled manifold theory to systematically identify the controlled variable in various tasks. According to this theory, the variability is higher in the dimensions irrelevant to the intended task than in those directly related to it. This theory aligns well with the minimal intervention theory, which states that the CNS activates muscles to control only the task-relevant variable (Valero-Cuevas et al., 2009). Both theories support our hypothesis that the synergies, if they exist, have a direct relation with the intended task.

The effects of posture on synergies also need to be studied. d'Avella et al. (2008) proposed tonic and phasic synergies for gravity balancing and acceleration, respectively. They were able to estimate the tonic synergy coefficients based on the final posture, and the phasic coefficients based on the velocity, using cosine tuning curves. We would like to expand the idea of posture-dependent synergies (by defining them based on the operational space variables), and study the usefulness of these synergies in motion control.

## 1.3. Functional Aspects of the Synergies

Most synergy analysis processes in the literature only involve EMG reconstruction. Few studies have used synergy decomposition while taking into account the reconstruction of force/torque in the operational space. de Rugy et al. (2013) and Moghadam et al. (2013) identified synergies corresponding to various directions of wrist force and shoulder torque, respectively. Nonetheless, de Rugy et al. (2013) found high levels of error in the force reconstruction if too few synergies were used.

Using synergies to control motion is another challenge. Feedback control of musculoskeletal systems that acts on task-space variables is appealing; however, the literature is limited to a few studies (Lockhart and Ting, 2007; Ting, 2007), in which the center-of-mass position was used as the feedback signal to construct a balance controller.

More articles are available regarding the application of muscle synergy in the feed-forward control of motions (for example McKay and Ting, 2008; Berniker et al., 2009; Neptune et al., 2009; Kargo et al., 2010; Allen and Neptune, 2012). However, these studies reported that the synergies need to be fine-tuned, which is inherent to feed-forward control. Furthermore, despite the reduction in the number of control inputs, the problem is still redundant, requiring an optimization routine to solve for the best combination of the synergies.

## 1.4. Relation to Optimal Control

There is a tight relation between muscle synergy theory and optimal control of motion. Essentially, an optimal pattern of muscle activities is inherently synergistic (de Rugy et al., 2013; Steele et al., 2013). We also observe that the output of the nervous system shows signs of optimality, meaning that if synergies do exist, they are optimal. We, therefore, hypothesize that muscle synergy and optimality are interrelated, and we show that synergies can be defined directly through optimization. To support this approach, we argue that the results of the optimization process (perhaps in the course of human evolution) may have been learned and stored in the nervous system as synergies (see the discussion in de Rugy et al., 2012). A high-level controller (e.g., a robust or an optimal controller in Todorov and Jordan, 2002; Todorov et al., 2005) can then employ the synergies for the control of actions.

## 1.5. Relation of the Current Work with the Existing Literature

The following work addresses some of the aforementioned issues about muscle synergy theory via mathematical modeling and optimization. The novel contribution of this paper is the introduction of a model-based approach for the identification of muscle synergies, as an alternative to the factorization methods (in which the synergies are extracted from EMG data). It has been previously mentioned that synergies may arise from a background optimization process (either on-line optimization or evolutionary adaptation, de Rugy et al., 2012, 2013); however, no formal mathematical argument has been provided in the literature.

We also propose that the number of synergies depends on the intended task (i.e., the number of dimensions of the operational space). As a result of our model-based approach, the number of synergies is determined by the requirements of the musculoskeletal system and the task, resolving the discrepancy regarding the number of synergies in the literature. Furthermore, by examining the operational space, it is possible to distinguish between seemingly similar tasks, which may require substantially different synergies. The task-specific definition of the synergies is yet another novel contribution of our work that has not been previously investigated.

Lastly, these synergies simplify the redundant force-sharing problem in the musculoskeletal system, resulting in a unique solution for the muscle activities. Uniqueness of the solution is a fundamental feature of dimension reduction in motor control that has not been properly addressed. As a result, a simple feedback control scheme can be constructed without the need to solve an on-line optimization problem. This idea resembles the hierarchical control framework in Todorov and Jordan (2002) and Todorov et al. (2005); however, their relation to muscle synergy is less explicit. Such a fast and bio-plausible control scheme has significant implications in various fields, including faster simulation of musculoskeletal systems, predictive forward analysis of motion, prosthetic and orthotic device design, rehabilitation, and Functional Electrical Stimulation of muscles.

# 2. Materials and Methods

In this section, we present the basics of our muscle synergy framework for the control of musculoskeletal systems. Previous studies show that optimization-based solutions to the muscle force-sharing problem result in realistic muscle activation patterns (for a review see Erdemir et al., 2007). Therefore, we have based our mathematical arguments on optimization results.

Muscle synergies will be defined based on the task, and for a certain operational space. For example, if the operational space (the controlled variable) is the elbow flexion angle, one flexor synergy and one extensor synergy are needed. However, in reaching actions where the operational space is the two-dimensional (2D) position of the hand, flexion/extension synergies are irrelevant, and the shoulder and elbow muscles are recruited to satisfy the 2D hand force requirements. We hypothesize that multiple sets of synergies are known to the CNS, and different sets are recruited during different tasks.

Two examples are provided to showcase our methodology. A 2D one-degree-of-freedom (one-DoF) musculoskeletal forearm model (Sharif Shourijeh and McPhee, 2013; Sharif Razavian et al., 2015) is used to explain the mathematical foundation of the method. Then, the method is generalized to a more complex three-dimensional (3D) human driver model (Mehrabi et al., 2015a,b). Although differences exist in the aforementioned tasks and recruited muscles, since the operational space in both systems is one-dimensional, our method can define two posture-dependent synergies that sufficiently control the motion.

## 2.1. Synergistic Control of a Simple Model

The simple musculoskeletal forearm model used to introduce our muscle synergy framework is shown in **Figure 1A**. This model consists of seven muscles: brachioradialis, brachialis, biceps brachii (long and short heads), and triceps brachii (long, lateral, and medial heads). The physical parameters for these muscles are taken from Sharif Shourijeh and McPhee (2013). The model has one DoF at the elbow joint (flexion/extension angle, θ), which is considered as the operational space.

Since the model contains only mono-articular muscles, it is possible to analytically solve for the optimal muscle activations, a<sub>i</sub>, that minimize the instantaneous cost function J:

$$J = \sum\_{i=1}^{m} a\_i^2 \qquad (m = \text{number of muscles}) \tag{1}$$

subject to the constraints:

$$\sum\_{i} F\_{i} r\_{i}(\theta) = T \tag{2}$$

and

$$0 \le a\_i \tag{3}$$

The cost function J in Equation (1) represents the muscular effort at each instant of time. Equation (2) is the moment-balancing constraint that requires the muscles to generate a certain torque T in the operational space. The muscle forces, F<sub>i</sub>, act at posture-dependent moment arms r<sub>i</sub>(θ), which are positive for the elbow flexors (i.e., brachioradialis, brachialis, and the two biceps brachii heads), and negative for the extensors (triceps brachii heads).

The inequality constraint (Equation 3) requires the activations to be non-negative. No explicit upper bound (i.e., a<sub>i</sub> ≤ 1) is imposed on the activations, since the cost function strictly prefers lower activations and the resulting optimal activations are assumed not to violate the upper bound. Therefore, our arguments are only valid for sub-maximal motions.

The muscle force can be estimated from the activation level using a Hill muscle model as:

$$F = a F\_{0\_{\text{max}}} f\_l(\theta) f\_v(\dot{\theta}, a) \cos(\alpha) \tag{4}$$

In the Hill muscle model, muscle activation, a, scales the maximum muscle force F<sub>0max</sub>. Additionally, f<sub>l</sub> and f<sub>v</sub> are the force-length and force-velocity relations (Thelen, 2003), which also alter the muscle force. Lastly, the muscle force in the tendon direction is affected by the pennation angle α.

We can combine the non-linear terms in Equations (2, 4) and rewrite the constraint (Equation 2) as:

$$\sum\_{i} a\_{i} h\_{i}(\theta, \dot{\theta}) = T \tag{5}$$

where h<sub>i</sub>(θ, θ̇) is the non-linear function that transforms muscle activity to the torque in the operational space (similar to a Jacobian that transforms joint torque to end-effector force); it accounts for the force-length relation f<sub>l</sub>, force-velocity relation f<sub>v</sub>, maximum force F<sub>0max</sub>, pennation angle α, and moment arm r(θ) of muscle i. Therefore, h is positive for the flexors and negative for the extensors. It should be noted that in Equation (5), we have neglected the dependency of the force-velocity term on the activation [i.e., f<sub>v</sub> = f<sub>v</sub>(θ̇)]. **Figure 2** shows the value of h for the long head of biceps brachii as a function of joint angle θ and activation for θ̇ = 2 rad/s. As can be seen, h does not depend significantly on activation.
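Reading Equations (2) and (4) together under this simplification, the transformation for muscle i can be written out explicitly. The per-muscle subscript i on f<sub>l</sub>, f<sub>v</sub>, F<sub>0max</sub>, and α is a notational assumption made here for clarity; the text above states only which factors h accounts for:

$$h\_i(\theta, \dot{\theta}) = F\_{0\_{\text{max}},i} \, f\_{l,i}(\theta) \, f\_{v,i}(\dot{\theta}) \, \cos(\alpha\_i) \, r\_i(\theta)$$

Assuming the pennation angle stays below 90°, every factor other than the moment arm is non-negative, so the sign of h<sub>i</sub> follows the sign of r<sub>i</sub>(θ). This is consistent with h being positive for the flexors and negative for the extensors.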

Solving this optimization problem (details given in the Appendix) yields the optimal activations:

$$a\_i^\* = 0\tag{6}$$

or

$$a\_i^\* = \frac{h\_i(\theta, \dot{\theta})}{\sum\_j h\_j^2(\theta, \dot{\theta})} T \tag{7}$$

The optimal solutions (Equations 6, 7) are both valid answers in different situations. When the joint torque T is positive, the solution (Equation 7) is valid for the flexor muscles, which have h(θ, θ̇) > 0. For the extensors, however, h is negative, resulting in a negative (infeasible) answer if Equation (7) is used. Therefore, the optimal extensor activations when T > 0 are given by Equation (6) (i.e., no extensor activity). The opposite argument can be made when the joint torque is negative. In this case, the optimal extensor activations are found using Equation (7), while the flexors are inactive<sup>1</sup>. Therefore, the closed-form solution of Equation (7) can be used to efficiently calculate the optimal muscle activations that generate a certain joint torque, T.
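The case distinction above can be sketched in a few lines. This is a minimal illustration, assuming the gains h<sub>i</sub> are already known at the current posture; the function name `optimal_activations` and the numerical gains are hypothetical and are not taken from the forearm model.

```python
import numpy as np

def optimal_activations(h, T):
    """Closed-form minimum-effort activations (Equations 6 and 7).

    h : per-muscle torque gains h_i at the current posture
        (positive for flexors, negative for extensors).
    T : required joint torque.
    Muscles whose gain opposes T stay at zero (Equation 6); the sum in the
    denominator of Equation (7) runs over the same-action muscles only.
    """
    h = np.asarray(h, dtype=float)
    a = np.zeros_like(h)
    active = h * T > 0          # muscles that act in the direction of T
    if np.any(active):
        a[active] = h[active] * T / np.sum(h[active] ** 2)
    return a

# Hypothetical gains for 4 flexors followed by 3 extensors.
h = np.array([4.0, 6.0, 5.0, 3.0, -5.0, -4.0, -2.0])

a_flex = optimal_activations(h, T=2.0)    # flexion torque: extensors stay silent
a_ext = optimal_activations(h, T=-1.5)    # extension torque: flexors stay silent
print(a_flex, h @ a_flex)
print(a_ext, h @ a_ext)
```

In both cases the weighted sum of the activations reproduces the requested torque, which is the constraint of Equation (5). The sketch does not enforce the upper bound on activation, in line with the sub-maximal assumption stated earlier.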

Alternatively, we can observe that the ratio of the activations for the same-action muscles is independent of the required torque; they all activate with fixed (posture-dependent) relations—the same notion as muscle synergy.

$$\frac{a\_i^\*}{a\_j^\*} = \frac{h\_i}{h\_j} = f(\theta, \dot{\theta}) \tag{8}$$

It is possible to define two synergies for this operational space: one for a positive joint torque (flexor, S<sup>f</sup>), and one for a negative one (extensor, S<sup>e</sup>). We can identify two representative muscles (i.e., a flexor and an extensor) from the full set of muscles, and calculate the synergy ratios of Equations (9, 10).

$$S\_i^f = \begin{cases} \frac{a\_i^\*}{a\_f^\*} = \frac{h\_i}{h\_f} & S\_i^f > 0\\ 0 & S\_i^f \le 0 \end{cases} \tag{9}$$

$$S\_i^e = \begin{cases} \frac{a\_i^\*}{a\_e^\*} = \frac{h\_i}{h\_e} & S\_i^e > 0\\ 0 & S\_i^e \le 0 \end{cases} \tag{10}$$

In these relations, S<sub>i</sub><sup>f</sup> and S<sub>i</sub><sup>e</sup> are the flexor and extensor synergy ratios for muscle i, respectively. We can calculate the optimal muscle activation for muscle i based on the flexor and extensor representatives (a<sub>f</sub>, a<sub>e</sub>) using:

$$a\_i = a\_f \, S\_i^f + a\_e \, S\_i^e \tag{11}$$

The representative activations themselves can be calculated either from the optimal values (Equation 7), or from any other control logic. It is important to note that, regardless of the values of (a<sub>f</sub>, a<sub>e</sub>), if the synergy ratios of Equations (9, 10) are used, the resulting torque is optimally produced.
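As a rough numerical check of Equations (9) to (11), the sketch below builds the two synergy vectors from a set of gains and expands a single representative activation into all muscle activations. The gains, the choice of representatives, and the helper names are hypothetical.

```python
import numpy as np

def synergy_ratios(h, rep):
    """Synergy ratios relative to a representative muscle (Equations 9 and 10).

    S_i = h_i / h_rep where that ratio is positive, and 0 otherwise.
    """
    h = np.asarray(h, dtype=float)
    return np.clip(h / h[rep], 0.0, None)

def activations_from_synergies(S_f, S_e, a_f, a_e):
    """Equation (11): combine the two synergies into all muscle activations."""
    return a_f * S_f + a_e * S_e

# Hypothetical gains: muscles 0-3 are flexors, muscles 4-6 are extensors.
h = np.array([4.0, 6.0, 5.0, 3.0, -5.0, -4.0, -2.0])
S_f = synergy_ratios(h, rep=0)   # representative flexor: muscle 0
S_e = synergy_ratios(h, rep=4)   # representative extensor: muscle 4

# Representative activation for a flexion torque T, taken from Equation (7).
T = 2.0
a_f = h[0] * T / np.sum(h[h > 0] ** 2)
a = activations_from_synergies(S_f, S_e, a_f, a_e=0.0)
print(a, h @ a)
```

Only one number, `a_f`, had to be computed for the flexion case; the remaining activations follow from the stored ratios and still satisfy the torque constraint.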

The calculation of synergy ratios is straightforward in this model; they are the ratio of the non-linear transformation of muscle i to that of the representative muscle. Although h is in general a function of activation, we can safely neglect such dependency and evaluate h at a = 0.5. As shown later, this approach results in near-optimal solutions. The flexor and extensor synergy ratios for the 2D forearm model are shown in **Figure 3**, where the long head of biceps and the long head of triceps are chosen as the representative flexor and extensor muscles, respectively. It should be noted that the choice of the flexor and extensor representatives is arbitrary in this model because the muscles have explicit flexor/extensor functions.

## 2.2. Synergistic Control of the 3D Arm

As a more complex example, we have considered a 3D arm model rotating a steering wheel (Mehrabi et al., 2015a,b; see **Figure 1B**). This model consists of four body segments: trunk, upper arm, forearm, and hand. The trunk is assumed to be fixed, and the upper arm is attached to the trunk using a spherical joint. The elbow is modeled as a revolute joint, and the hand is connected to the forearm via a universal joint. Since it is assumed that the hand grips the steering wheel firmly, the whole system has only one DoF. Therefore, knowing the steering wheel angle is sufficient to find the arm joint angles. This argument is not valid in general, as there is one extra DoF (supination/pronation) that is neglected for the sake of simplicity. If this extra DoF were included, more synergies would be needed to control the motion, which is beyond the scope of this paper.

The objective function is still the minimization of muscular effort (Equation 1). However, the operational space in this model is no longer the joint angle; instead, the desired operational space (i.e., the variable that is controlled) is the steering wheel angle.

For this complex system of driver/steering wheel, the mathematical arguments similar to Equations (1–7) are more difficult to make. However, since the model has only one DoF, it is possible to generalize the arguments to accommodate this 3D arm as well.

Given the complex kinematics in this model, a direct solution for h (as used in Equation 5) is challenging. An efficient method is to calculate it from the response of the musculoskeletal system, similar to the experimental procedure in Berger and d'Avella (2014). At a certain posture, activation of each muscle will produce a torque in the operational space (in this case steering rotation θ). We can define the non-linear transformation h<sub>i</sub>(θ) as:

$$h\_i(\theta) \stackrel{\Delta}{=} \frac{T\_i}{a\_i} \tag{12}$$

where a<sub>i</sub> is the activation of muscle i and T<sub>i</sub> is the resulting torque in the operational space. With h calculated from Equation (12), it is now possible to use the constraint of Equation (5), and thereby make similar arguments to calculate the optimal activations and the synergy ratios:

$$S\_i^{\text{cw}} = \begin{cases} \frac{a\_i^\*}{a\_{\text{cw}}^\*} = \frac{h\_i}{h\_{\text{cw}}} & S\_i^{\text{cw}} > 0\\ 0 & S\_i^{\text{cw}} \le 0 \end{cases} \tag{13}$$

$$S\_i^{\text{ccw}} = \begin{cases} \frac{a\_i^\*}{a\_{\text{ccw}}^\*} = \frac{h\_i}{h\_{\text{ccw}}} & S\_i^{\text{ccw}} > 0\\ 0 & S\_i^{\text{ccw}} \le 0 \end{cases} \tag{14}$$

These synergy ratios are calculated based on two representative muscle activations: a clockwise rotator and a counter-clockwise rotator, which are denoted by the subscripts cw and ccw, respectively.

In general, h is a function of both the steering angle and angular velocity. However, as previously shown in the 2D results (**Figure 3**), h and the synergy ratios, S, are not significantly affected by θ̇. Therefore, we assume that h is only a function of θ; i.e., h = h(θ). This assumption significantly reduces the complexity of the synergies and the memory required to store them. However, it comes at the expense of slight sub-optimality if the synergy ratios of Equations (13, 14) are used (we will show later that they are close to optimal). Furthermore, such an assumption aligns well with the concept of posture-dependent synergies, whereas velocity dependency has not been reported before.

<sup>1</sup>A subtle detail that needs to be considered in Equation (7) is that, depending on which group is active, the summation in the denominator has to be calculated over the same-action muscles (either flexors or extensors).

To summarize the procedure, we can activate any muscle i individually at a certain posture, measure the resulting torque, and then calculate h<sub>i</sub>(θ) using Equation (12). Repeating this procedure for all muscles and at various postures results in a set of posture-dependent h<sub>i</sub>(θ), which in turn can be used to calculate the synergy ratios from Equations (13, 14). **Figure 4** shows the synergy ratios for all 15 muscles in this model, where latissimus dorsi (Jonsson and Jonsson, 1975) and anterior deltoid (Hayama et al., 2012) are the clockwise and counter-clockwise representatives, respectively. It is interesting to note that except for three muscles (anterior deltoid, long head of triceps, and latissimus dorsi), all others change function at a certain steering wheel angle (from CCW rotator to CW rotator or vice versa). This phenomenon prevents us from choosing arbitrary muscles as the representatives. This observation also highlights the necessity of posture-dependent synergies.
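The probing procedure summarized above can be sketched as follows. The function `probe_torque` is a purely hypothetical stand-in for the 3D musculoskeletal model (a cosine-shaped torque response with made-up gains and phases), and the sign convention (positive torque meaning CCW rotation) is likewise an assumption; only the structure of the loop reflects the procedure in the text.

```python
import numpy as np

# Hypothetical stand-in for the musculoskeletal model: torque about the
# steering axis when muscle i alone is activated at wheel angle theta.
GAINS = np.array([3.0, 2.0, 4.0, 1.5])
PHASES = np.array([0.0, 1.0, np.pi, 2.5])

def probe_torque(i, theta, activation):
    return activation * GAINS[i] * np.cos(theta - PHASES[i])

def estimate_h(thetas, n_muscles, probe_activation=0.5):
    """Equation (12): h_i(theta) = T_i / a_i, one muscle and one posture at a time."""
    h = np.empty((len(thetas), n_muscles))
    for k, theta in enumerate(thetas):
        for i in range(n_muscles):
            h[k, i] = probe_torque(i, theta, probe_activation) / probe_activation
    return h

def posture_dependent_ratios(h, rep):
    """Equations (13) and (14): ratio to the representative, negatives set to 0."""
    return np.clip(h / h[:, [rep]], 0.0, None)

thetas = np.linspace(-1.0, 1.0, 5)           # wheel angles in rad
h = estimate_h(thetas, n_muscles=4)
S_ccw = posture_dependent_ratios(h, rep=0)   # muscle 0: assumed CCW representative
S_cw = posture_dependent_ratios(h, rep=2)    # muscle 2: assumed CW representative
print(np.round(S_ccw, 3))
print(np.round(S_cw, 3))
```

In this toy setup muscle 1 switches from the CW synergy to the CCW synergy as the wheel angle increases, while muscles 0 and 2 keep their sign over the whole range. That is why only muscles of the latter kind are usable as representatives: a representative whose gain crosses zero would make the ratios undefined.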

## 2.3. Comparison with Non-negative Matrix Factorization

The established method to extract the synergies usually involves the generation of a large matrix containing all the EMG data, which is then fed to a factorization algorithm. The most widely used algorithm in this context is Non-Negative Matrix Factorization (NNMF; Lee and Seung, 2000). The NNMF decomposes the original EMG data matrix, **A**, into two matrices: the non-negative synergy matrix, **S**, and the non-negative coefficient matrix, **C**, as:

$$\mathbf{A}\_{m \times l} = \mathbf{S}\_{m \times n} \mathbf{C}\_{n \times l} \tag{15}$$

where m is the number of muscles, l is the number of samples, and n is the number of synergies. Each column of the synergy matrix **S** represents a synergy, and contains the relative contributions of each muscle in that synergy. A row in the coefficient matrix, **C**, contains the activation level of the corresponding synergy for all the samples.
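The factorization of Equation (15) can be illustrated with a small NumPy sketch. It uses the multiplicative update rules for the squared-error cost that are commonly attributed to Lee and Seung; whether the studies cited here used this exact variant is not stated, so the choice of update rule, the iteration count, and the synthetic data are all assumptions.

```python
import numpy as np

def nnmf(A, n, n_iter=1000, seed=0, eps=1e-12):
    """Factor a non-negative matrix A (m x l) as S (m x n) @ C (n x l).

    Multiplicative updates for the squared-error cost; both factors stay
    non-negative because they are only ever multiplied by non-negative terms.
    """
    rng = np.random.default_rng(seed)
    m, l = A.shape
    S = rng.random((m, n)) + eps
    C = rng.random((n, l)) + eps
    for _ in range(n_iter):
        C *= (S.T @ A) / (S.T @ S @ C + eps)
        S *= (A @ C.T) / (S @ C @ C.T + eps)
    return S, C

# Hypothetical data: 7 muscles, 300 samples built from 2 non-negative patterns.
rng = np.random.default_rng(1)
A = rng.random((7, 2)) @ rng.random((2, 300))

S, C = nnmf(A, n=2)
rel_err = np.linalg.norm(A - S @ C) / np.linalg.norm(A)
print(S.shape, C.shape, rel_err)
```

Each column of `S` plays the role of one synergy and each row of `C` holds its activation level across the samples. The columns of `S` do not change from sample to sample, which is the static property discussed next.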

The samples in the data matrix may vary based on the experiment; they can be snapshots of the time-varying muscle activities, or the average of the recordings from multiple trials. Regardless, NNMF results in synergies that are essentially static, i.e., they are the same for all samples.

It has been shown in Steele et al. (2013) that one obtains similar synergies from NNMF with experimental EMG data as from NNMF with optimal activations. Therefore, the standard method of extracting synergies from the EMG data can be replaced by applying NNMF to the optimal muscle activations. Consequently, to compare our method with NNMF results, we have used the synergies extracted from optimal muscle activations as the benchmark. **Figures 3**, **4** show the synergies extracted using NNMF. To obtain these synergies, the optimal muscle activities were found such that the musculoskeletal systems followed a random motion in their operational space. These time-varying optimal muscle activities were gathered in the matrix **A**, and fed to the NNMF algorithm to find the static synergies. The calculated synergies were scaled so that the activation of the representative muscle would equal unity.

As can be seen in **Figures 3**, **4**, the NNMF synergies are close to the average of the posture-dependent synergies; however, because of the limited number of synergies (n = 2 in these cases), NNMF is unable to capture all such variations. As a result, the synergies resulting from NNMF are not suitable for control purposes (see section Results).

## 2.4. Feedback Control of Musculoskeletal Systems

The major motivation for this work is the need for a fast and bio-plausible feedback controller for the musculoskeletal system. The usual practice of optimization for the control of a musculoskeletal system is a time-consuming process, and cannot be used for real-time applications. It is also unrealistic to assume that the CNS can perform this amount of computation in real time (de Rugy et al., 2012). The definition of muscle synergy presented in this work yields a unique solution for the force-sharing problem, thereby eliminating the need for any on-line optimization and resulting in a significantly faster feedback control scheme.

With the synergy ratios calculated beforehand, we can control the musculoskeletal system in an optimal manner by calculating only the representative muscle activations. All other muscle activations can optimally be constructed using the synergy ratios:


$$\text{for the 2D model :} \qquad \mathbf{a} = \mathbf{S}^{f} \ a\_{bic} + \mathbf{S}^{e} \ a\_{tri} \tag{16}$$

$$\text{for the 3D model :} \qquad \mathbf{a} = \mathbf{S}^{\text{ccw}} \ a\_{delt} + \mathbf{S}^{\text{cw}} \ a\_{Lat} \tag{17}$$

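where **S**<sup>f</sup>, **S**<sup>e</sup>, **S**<sup>ccw</sup>, and **S**<sup>cw</sup> are vectors containing all the synergy ratios, S<sup>f</sup><sub>i</sub>, S<sup>e</sup><sub>i</sub>, S<sup>ccw</sup><sub>i</sub>, and S<sup>cw</sup><sub>i</sub>, respectively.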

The representative muscle activations can be found with various control methods such as forward static optimization (FSO), optimal control (e.g., model predictive controller, MPC, or linear quadratic regulator, LQR), or even a simple proportional-integral-derivative (PID) controller.

To show the effectiveness of the synergies for real-time control, a simple PID controller is used to control the musculoskeletal systems (**Figure 5**). The output of the controller is a signed activation. Therefore, the positive and negative portions of the signal should be separated. The positive values are interpreted as the representative muscle activation for the flexor or CCW synergies, for the 2D and 3D models, respectively. Similarly, the negative portion is interpreted as the extensor/CW representative in the two models. The representative activations can subsequently be multiplied by the corresponding synergy ratios (Equations 16, 17) to calculate all muscle activations.
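The signal flow just described (tracking error, signed PID output, sign separation, multiplication by the synergy ratios of Equation 16) can be sketched as follows. This is only an illustrative sketch for the 2D model: the gains are the 2D values listed in Table 1, but the time step, the fixed synergy-ratio vectors `S_f` and `S_e` (which in the study depend on posture and are computed offline), the omission of the derivative term on the first sample, and the saturation of activations at 1 are assumptions made here for the example.

```python
class PID:
    """Discrete-time PID controller producing a signed activation command."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = None  # no derivative term on the very first sample

    def step(self, error):
        self.integral += error * self.dt
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def split_signed(u):
    """Separate a signed command into (flexor, extensor) representative activations."""
    return max(u, 0.0), max(-u, 0.0)


def muscle_activations(u, S_f, S_e):
    """Equation (16): a = S^f * a_bic + S^e * a_tri, saturated at 1 (assumed limit)."""
    a_bic, a_tri = split_signed(u)
    return [min(1.0, sf * a_bic + se * a_tri) for sf, se in zip(S_f, S_e)]


# Hypothetical synergy ratios for three muscles at one posture (placeholders).
S_f = [1.0, 0.4, 0.0]   # ratio of the representative flexor is 1 by definition
S_e = [0.0, 0.1, 1.0]   # ratio of the representative extensor is 1 by definition

pid = PID(kp=10.0, ki=10.0, kd=2.0, dt=0.01)   # dt is an assumed value
theta_des, theta = 0.6, 0.5                     # desired and actual angle (rad)
u = pid.step(theta_des - theta)
print(muscle_activations(u, S_f, S_e))
```

In a closed-loop simulation the two synergy vectors would be looked up for the current posture at every step before being passed to `muscle_activations`; no optimization is solved on-line.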

# 3. Results

The simulation results for both the 2D and 3D models are presented here. In these simulations, the objective was to efficiently follow a desired trajectory in the operational space.

Two feedback control methods were used: an optimal controller [forward static optimization (FSO), Sharif Shourijeh et al., 2015b], and the PID controller. FSO was selected as our optimal controller because of its feedback properties and the fact that it results in optimal behavior (Anderson and Pandy, 2001). For the FSO controller, we considered a weighted sum of the muscular effort and tracking error as the objective function. The weighting factors and the PID controller parameters are provided in **Table 1**.

$$J = \left. w\_1 \sum\_{i=1}^{m} a\_i^2 + \left. w\_2(\theta - \theta\_{des}) + \left. w\_3(\dot{\theta} - \dot{\theta}\_{des}) \right| \right. \right| \tag{18}$$

Two sets of simulations were run for the 2D model, and three for the 3D model. First, the musculoskeletal systems were driven by the optimal controller, resulting in our gold standard muscle activation patterns. In these simulations, the activation levels of all the muscles were individually modulated by the FSO controller at each time step. In the second and third sets of simulations, the PID controller calculated the signed activation signal based on the tracking error (i.e., the difference between the desired and actual angle), which was used to construct the muscle activity levels using muscle synergies (Equations 16, 17). These muscle activities were used to drive the musculoskeletal systems.

**Figure 6** shows the performance of the two control methods for the 2D model. As can be seen in this figure, the performance of the two controllers is very close. The tracking error is comparable using the two controllers, and the muscle activation patterns are also very similar.

The similarity of the activations resulting from the synergistic controller (PID) and the optimal controller (FSO) suggests that the synergies defined in the previous sections result in near-optimal behavior. The numerical values of the physiological cost (Equation 19) in **Table 2** further show the closeness of the two methods. Previous reports (Erdemir et al., 2007) have shown that the optimal muscle activities (calculated by the FSO controller) estimate realistic muscle activities, which implies that our synergistic controller results in realistic activity patterns.

$$\text{effort} = \frac{1}{T\_f} \sum\_{i} \int\_0^{T\_f} a\_i^2 dt \tag{19}$$
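For sampled simulation output, the physiological cost of Equation (19) can be approximated numerically. The sketch below assumes that the activations are stored per muscle on a common time grid starting at zero and uses the trapezoidal rule; the function name `effort` and the sample data are hypothetical.

```python
def effort(t, activations):
    """Approximate Equation (19): (1/T_f) * sum_i integral_0^T_f a_i(t)^2 dt.

    t           -- increasing list of sample times, with t[0] = 0 and t[-1] = T_f
    activations -- one list of samples per muscle, each the same length as t
    """
    T_f = t[-1] - t[0]
    total = 0.0
    for a in activations:
        for k in range(len(t) - 1):
            dt = t[k + 1] - t[k]
            # Trapezoidal rule applied to the squared activation
            total += 0.5 * (a[k] ** 2 + a[k + 1] ** 2) * dt
    return total / T_f


# Two muscles held at constant activation for 2 s (made-up values).
t = [0.0, 0.5, 1.0, 1.5, 2.0]
a = [[0.5] * 5, [0.2] * 5]
print(effort(t, a))  # time-averaged sum of squared activations
```

For constant activations the result reduces to the sum of the squared activation levels, which gives a quick sanity check before applying the function to simulated trajectories.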

**Figure 7** presents the 3D model simulation results, which contain an extra set of simulations to compare the NNMF synergies with our posture-dependent synergies. Similar to the 2D model results, the optimal muscle activities are well-matched by the synergies presented in this paper. However, the static synergies from NNMF could not properly recreate the optimal (gold standard) muscle activities. This happens because NNMF essentially averages the relative muscle activities for the entire range of motion in the operational space, therefore neglecting the changing importance and function of the muscles.

TABLE 1 | Numerical values of the parameters used in the two simulations.

| Controller | Parameter | 2D model | 3D model |
| --- | --- | --- | --- |
| FSO | *w*<sub>1</sub> | 1 | 1 |
| FSO | *w*<sub>2</sub> | 3 × 10<sup>6</sup> | 1 × 10<sup>4</sup> |
| FSO | *w*<sub>3</sub> | 5 × 10<sup>2</sup> | 1 × 10<sup>2</sup> |
| PID | *K<sub>p</sub>* | 10 | 100 |
| PID | *K<sub>i</sub>* | 10 | 100 |
| PID | *K<sub>d</sub>* | 2 | 0 |

As a result, some muscles are over-activated (e.g., medial head of triceps), incorrectly activated (e.g., posterior deltoid) or even completely neglected (e.g., brachioradialis and brachialis). Our definition of synergies allows for high reconstruction accuracy with the minimum number of synergies (in these cases only two synergies). The comparison of the numerical values of the physiological cost (**Table 2**) further shows that the two NNMF synergies cannot reconstruct the optimal muscle activities as well as the posture-dependent ones (the physiological cost increases by 12% using NNMF synergies).

The synergistic controller performed similarly to the optimal controller, but was 2–3 orders of magnitude faster (**Table 2**).

TABLE 2 | Comparison of the two control methods.


† *Simulations on a 3.60 GHz quad-core Intel CPU with 16 GB of RAM.*

‡ *Total simulation time, which includes the time required for controller calculations, plus the integration time of the musculoskeletal model.*

These results show that the synergistic controller can run in real-time, which is an important requirement in many applications including real-time control of functional electrical stimulation and rehabilitation devices (see Section Discussion).

# 4. Discussion

Muscle synergy has been considered as a possible mechanism employed by the human nervous system to control movements. Previous investigations on muscle synergy usually relied on an inverse extraction method—i.e., the synergies were extracted from the measured muscle activities using a factorization algorithm (e.g., NNMF, Lee and Seung, 2000). These methods usually neglect the functional aspects of synergies and their correspondence with the task.

Unlike previous research, we proposed a model-based approach to define the synergies based on the principles of optimality. It has been argued that muscle synergies may arise from a background optimization process (perhaps during evolution) (de Rugy et al., 2012, 2013). Our method relies on these arguments, and defines the synergies by employing optimization tools. However, unlike the on-line optimization methods (e.g., Todorov and Jordan, 2002; Todorov et al., 2005), the synergies are optimally calculated and stored offline, and recalled during an on-line control process. In support of our method, it has been reported that the muscle activities estimated by optimal control approaches correlate well with experimental EMG (Erdemir et al., 2007), and that the synergies extracted from such optimal muscle activities match the ones extracted from the EMG data (Steele et al., 2013). The comparison between our results and the synergies obtained from the common factorization methods (NNMF) also shows the plausibility of the optimality arguments. Therefore, the presented method can be used as a theoretical model-based framework to study muscle synergy, a tool that was not available before.

Another distinguishing feature of our approach is the dependency of the synergies on the posture and the task. To the best of our knowledge (and perhaps because of the vast number of the required experimental trials) no explicit definition of posture-dependent synergies is available in the literature. Our approach excels as it relates the synergies to known biomechanical parameters (such as muscle strength and moment arm, which are already available in the literature, e.g., Garner and Pandy, 2001). The posture dependency, as mentioned earlier in this paper and in de Rugy et al. (2012), might be an important requirement of synergies, as muscles may change function depending on the posture [e.g., wrist muscles (Kakei et al., 1999)]. Our results comparing the posture-dependent synergies and the fixed ones from NNMF support our hypothesis that posture-dependent synergies can reduce the dimensions of the control space more effectively; fewer synergies are required to efficiently control the motion if synergies are posture-dependent.

There are two schools of thought regarding the relationship between synergies and tasks: some researchers try to find the shared synergies that can explain muscle activities in a variety of motions (e.g., Bizzi et al., 2008; Sartori et al., 2013), while others look at specific tasks [e.g., point-to-point reach (d'Avella et al., 2008), or wrist articulation (de Rugy et al., 2013)]. Task-dependent synergies have been previously mentioned (e.g., in Zelik et al., 2014), but no scientific method to distinguish the tasks and relate the synergies to the operational space has been shown. We argue that for the efficient control of a task, it is essential for the CNS to recruit the synergies related to that specific operational space.

This argument immediately raises questions about how the CNS may learn and recall these synergies for every task. One possible argument is that the synergies (especially the ones related to locomotion) are fine-tuned over the course of human evolution, and perhaps hard-coded into the spinal cord circuitries [the so-called central pattern generators (Ijspeert, 2008) can be viewed as an example]. Alternatively, and especially in the context of adaptation to new tasks, the synergies may be viewed as flexible structures, decoded by the interneurons of the spinal cord (similar to the concept of spinal-like regulators proposed by Raphael et al., 2010). It has been shown in de Rugy (2010) that the process of visuomotor adaptation likely happens at the sensory level as well as the execution (muscle synergy) level. In the light of these results, we can argue that in a novel task (e.g., a distorted operational space), the previously learned synergies may not be able to span the new space (due to the highly nonlinear transformations from the synergy space to the operational space); thus, the CNS needs to learn new synergies to effectively maneuver in the new operational space. It is likely that during the learning process, the CNS uses the previously known synergies as a starting point, and by trial and error develops a new basis set that is good enough (Loeb, 2012) to maneuver in the new task space.

## 4.1. Application to Higher Degrees of Freedom: Insights from Robotics

The dependency of the number of synergies on the operational space dimensions can be explained from a mechanical point of view. Assume an n-DoF robot (**Figure 8A**) with an n-dimensional operational space; also assume that the robot is non-redundant (i.e., there are n actuators). In a certain state of the robot, each actuator can produce a force in the operational space (denoted by the vectors V<sub>i</sub> in **Figure 8A**). The set of all n force vectors can be viewed as a basis set that spans the robot's operational space. To control the robot in its operational space, a required end-effector force can be decomposed onto the basis set, resulting in the decomposition coefficients C<sub>i</sub>. These coefficients correspond to each actuator effort (Khatib, 1987).

The human musculoskeletal system (**Figure 8B**) is different from a robot in two ways: it is actuated by muscles (they can only pull), and is also redundant (there are more actuators than the degrees of freedom).

FIGURE 8 | (A) A non-redundant robotic arm. The operational space is spanned by the basis set *V<sub>i</sub>*; an arbitrary force can be decomposed onto this basis set, resulting in the required actuator efforts. (B) The human musculoskeletal system. Since the muscles are pull-only actuators, one extra basis vector is required to satisfy the positive-decomposition constraint. The basis vectors, *S<sub>i</sub>*, in this case are synergy-produced forces, which can be used to decompose any arbitrary hand force.

The pull-only condition introduces the constraint that the end-effector force vector has to be positively-decomposed (i.e., the coefficients C<sub>i</sub> must be positive). To positively-decompose an arbitrary vector in an n-dimensional vector space, n + 1 basis vectors are needed (instead of n), meaning that n + 1 pull-only actuators are needed.
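The need for n + 1 basis vectors can be illustrated for a two-dimensional operational space. The sketch below is a toy example under stated assumptions: the basis directions, the target force, and the function names are hypothetical, and the search over pairs of basis vectors is simply one convenient way to find a non-negative decomposition in 2D, not a procedure described in the study.

```python
import math


def solve_2x2(v1, v2, f):
    """Solve c1*v1 + c2*v2 = f by Cramer's rule; return None if v1 and v2 are parallel."""
    det = v1[0] * v2[1] - v1[1] * v2[0]
    if abs(det) < 1e-12:
        return None
    c1 = (f[0] * v2[1] - f[1] * v2[0]) / det
    c2 = (v1[0] * f[1] - v1[1] * f[0]) / det
    return c1, c2


def positive_decomposition(basis, f, tol=1e-9):
    """Return non-negative coefficients (at most two non-zero) with sum_i c_i*basis_i = f.

    Returns None when no pair of basis vectors can produce f with c_i >= 0.
    """
    n = len(basis)
    for i in range(n):
        for j in range(i + 1, n):
            sol = solve_2x2(basis[i], basis[j], f)
            if sol is not None and min(sol) >= -tol:
                coeffs = [0.0] * n
                coeffs[i], coeffs[j] = max(sol[0], 0.0), max(sol[1], 0.0)
                return coeffs
    return None


def unit(angle_deg):
    """Unit vector at the given angle (degrees) in the plane."""
    a = math.radians(angle_deg)
    return (math.cos(a), math.sin(a))


two_vectors = [unit(0), unit(90)]               # n = 2 pull-only actuators
three_vectors = [unit(90), unit(210), unit(330)]  # n + 1 = 3 pull-only actuators
force = (-1.0, -0.5)                            # arbitrary required force

print(positive_decomposition(two_vectors, force))    # None: a negative pull would be needed
print(positive_decomposition(three_vectors, force))  # non-negative coefficients exist
```

With only two pull-only directions, every force outside the cone between them requires a negative coefficient; adding a third direction so that the three cones cover the whole plane removes this limitation.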

The redundancy poses the challenge of non-uniqueness of the solution: the number of muscles is usually larger than n + 1. In order to reduce the redundant system to a non-redundant one, multiple muscles have to be grouped into n + 1 synergies; this way, each synergy's pulling direction can be used as a basis vector (S<sub>i</sub>, i ∈ {1...n + 1} in **Figure 8B**) to span the operational space (this is essentially the same concept as the cosine tuning curves mentioned before, e.g., in de Rugy et al., 2013).

The CNS can therefore control the redundant musculoskeletal system by employing the best combination of muscle activities that generate such basis vectors in each posture. We argue that these best sets (or muscle synergies) are known to the CNS, and the CNS can reach a unique solution for the intensity of each synergy to generate a certain end-effector force, and consequently control the motion.

Our results show that when the operational space is one-dimensional, two posture-dependent synergies are enough to generate the motion. As the dimension of the operational space increases, more synergies will be required; for instance, to control a two-dimensional point-to-point reaching action, three synergies are required so that any arbitrary hand force is positively-decomposed onto the synergy basis set. This hypothesis is supported by the independent experimental analysis of reaching in d'Avella et al. (2008), which found that three synergies can account for most of the variation in arm muscle EMG.

One important drawback of the application of the same method to higher operational space dimensions is the possible sub-optimality due to the force decomposition mechanism mentioned above. Although each basis vector is optimally produced by a single synergy, there is no guarantee that a linear combination of two synergies (to create an arbitrary force in the operational space) will remain optimal. Our one-DoF results were indeed optimal, because the operational space was always aligned with the optimally produced basis vectors. Our informal studies of higher-dimension systems show that the sub-optimality exists (although it is not significant). A possible strategy might be to increase the number of synergies. With more synergies the basis set is more densely packed, leaving a smaller area to be spanned by two adjacent basis vectors. This strategy has been reported before in de Rugy (2010), where eight synergies were used to reconstruct six muscle activities. One immediate advantage of this reversed dimension reduction is the elimination of the need for optimization.

## 4.2. Other Implications of the Approach

The muscle synergy framework presented in this paper has important implications in different areas. Our approach to muscle synergy proposes answers to some unresolved issues in motor control studies, namely the number of synergies and their dependency on the task. In this paper we have presented ideas on the requirements of the synergies from a theoretical dynamics perspective. However, the reader should note that the results only show an initial study, and further experimental and/or theoretical investigations are necessary to make a stronger argument.

The results are even more interesting from an engineering perspective. The human musculoskeletal system is challenging to control due to the redundancy and non-linearities involved. Our muscle synergy approach introduces a way to simplify the control of such systems, which can be used in both simulations and real-life applications. Having a realistic controller that mimics the CNS behavior is a necessary component in predictive musculoskeletal simulations. Such a controller can generate/correct motions in unknown situations (e.g., in the presence of disturbances or when experimental motion is not available) without the need for computationally-intensive optimization solutions. It can also facilitate the design and control of machines interacting with humans, such as prosthetic and orthotic devices, exoskeletons, and rehabilitation robots (Ghannadi et al., 2015), by allowing fast prediction of the human behavior. Furthermore, the synergy controller can have direct application to the feedback control of real musculoskeletal systems via neurostimulation and functional electrical stimulation, where optimality and computational efficiency are absolute necessities.

# 5. Conclusion

In this paper, we presented a model-based method to define muscle synergies using mathematical modeling and optimal control theory. We showed that muscle synergies can be effectively used to control a musculoskeletal arm in real-time. Using this approach, the indeterminate force-sharing problem in musculoskeletal system dynamics reduces such that the solution is unique. Our novel definition of the posture-dependent synergies allowed us to optimally generate torques in the operational space. This lent itself to both fast and efficient feedback control for the musculoskeletal systems. Our results showed that the differences in muscle activities and tracking performance between the feedback controller and the optimal results are insignificant, while the computations are ∼1000 times faster with the former method. Further improvements can be made, however, by introducing a closed-loop control logic that takes into account predictive and learning properties of the human motor control system.

# Acknowledgments

The authors wish to thank the Natural Sciences and Engineering Research Council of Canada (NSERC) for funding this study.

# References


Morasso, P. (1981). Spatial control of arm movements. Exp. Brain Res. 42, 223–227.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Sharif Razavian, Mehrabi and McPhee. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Appendix

To solve the optimization problem, the cost function (Equation 1) can be augmented using a Lagrange multiplier, λ (Kirk, 2004).

$$\hat{J} = \sum\_{i} a\_i^2 + \lambda \left[ \sum\_{i} a\_i h\_i(\theta, \dot{\theta}) - T \right] \tag{A1}$$

The inequality constraints (Equation 3) can be rewritten using the slack variables s<sub>i</sub> in the form:

$$a\_i - s\_i^2 = 0 \tag{A2}$$

where s<sub>i</sub> are assumed to be unbounded. Substituting Equation (A2) into Equation (A1) yields:

$$\hat{J} = \sum\_{i} s\_i^4 + \lambda \left[ \sum\_{i} s\_i^2 h\_i(\theta, \dot{\theta}) - T \right] \tag{A3}$$

$$\sum\_{i} s\_i^2 h\_i(\theta, \dot{\theta}) = T \tag{A4}$$

At a local minimum, the gradient of the cost function should be zero.

$$\frac{\partial \hat{J}}{\partial s\_i} = 4s\_i^3 + 2\lambda s\_i h\_i(\theta, \dot{\theta}) = 0 \tag{A5}$$

One answer to this equation is:

$$s\_i = 0 \Rightarrow a\_i^\* = 0 \tag{A6}$$

If s<sub>i</sub> ≠ 0, we can divide Equation (A5) by s<sub>i</sub> to get:

$$2s\_i^2 + \lambda h\_i(\theta, \dot{\theta}) = 0 \tag{A7}$$

which leads to:

$$s\_i^2 = -\frac{\lambda}{2} h\_i(\theta, \dot{\theta}) \tag{A8}$$

By substituting this expression into the constraints (Equation A4), the Lagrange multiplier can be found:

$$\sum\_{i} \left[ -\frac{\lambda}{2} h\_i(\theta, \dot{\theta}) \right] h\_i(\theta, \dot{\theta}) = T \tag{A9}$$

$$\lambda = \frac{-2T}{\sum\_{i} h\_i^2(\theta, \dot{\theta})} \tag{A10}$$

Therefore, the optimal solution (in sub-maximal contractions) can be found:

$$s\_i^2 = a\_i^\* = \frac{h\_i(\theta, \dot{\theta})}{\sum\_j h\_j^2(\theta, \dot{\theta})} T \tag{A11}$$
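The closed-form result can be checked numerically. The sketch below is one reading of Equations (A6) and (A11) taken together, under the assumption that a muscle whose h<sub>i</sub> has the opposite sign to the required torque T takes the trivial solution a<sub>i</sub> = 0, so that the sum in the denominator runs over the remaining muscles only; the upper bound on activation is ignored (sub-maximal contractions). The function name and the numerical values are hypothetical.

```python
def optimal_activations(h, T):
    """Minimum sum-of-squares activations with sum_i a_i*h_i = T and a_i >= 0.

    h -- list of h_i values (as in Equation A1) at the current state
    T -- required torque
    Muscles with h_i*T <= 0 are switched off (Equation A6); the others follow
    Equation (A11), with the denominator restricted to the active muscles.
    """
    if T == 0:
        return [0.0] * len(h)
    denom = sum(hi ** 2 for hi in h if hi * T > 0)
    if denom == 0:
        raise ValueError("no muscle can produce a torque of the requested sign")
    return [hi * T / denom if hi * T > 0 else 0.0 for hi in h]


h = [2.0, 1.0, -1.5]          # made-up values: two flexors and one extensor
a = optimal_activations(h, T=1.0)
print(a)                       # activations of the two flexors; extensor is off
print(sum(ai * hi for ai, hi in zip(a, h)))  # reproduces the required torque
```

Note that the ratio of any two non-zero activations equals the ratio of the corresponding h<sub>i</sub> values and does not depend on T, which is consistent with describing all muscle activations through fixed ratios to a representative muscle at a given state.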

# Nomenclature


# Coordinated alpha and gamma control of muscles and spindles in movement and posture

Si Li<sup>1</sup>, Cheng Zhuang<sup>1</sup>, Manzhao Hao<sup>1</sup>, Xin He<sup>1</sup>, Juan C. Marquez<sup>1,2</sup>, Chuanxin M. Niu<sup>3</sup> and Ning Lan<sup>1,4</sup>\*

*<sup>1</sup> School of Biomedical Engineering, Med-X Research Institute, Shanghai Jiao Tong University, Shanghai, China, <sup>2</sup> School of Technology and Health, Royal Institute of Technology, Stockholm, Sweden, <sup>3</sup> Department of Rehabilitation, Ruijin Hospital of School of Medicine, Shanghai Jiao Tong University, Shanghai, China, <sup>4</sup> Division of Biokinesiology and Physical Therapy, University of Southern California, Los Angeles, CA, USA*

#### Edited by:

*Si Wu, Beijing Normal University, China*

#### Reviewed by:

*Da-Hui Wang, Beijing Normal University, China Danke Zhang, Hangzhou Dianzi University, China*

#### \*Correspondence:

*Ning Lan, School of Biomedical Engineering, Med-X Research Institute, Shanghai Jiao Tong University, 1954 Hua Shan Road, Shanghai 200030, China ninglan@sjtu.edu.cn*

> Received: *27 April 2015* Accepted: *14 September 2015* Published: *09 October 2015*

#### Citation:

*Li S, Zhuang C, Hao M, He X, Marquez JC, Niu CM and Lan N (2015) Coordinated alpha and gamma control of muscles and spindles in movement and posture. Front. Comput. Neurosci. 9:122. doi: 10.3389/fncom.2015.00122*

Mounting evidence suggests that both α and γ motoneurons are active during movement and posture, but how the central motor system coordinates the α-γ controls in these tasks remains unclear due to a lack of *in vivo* data. Here a computational model of α-γ control of muscles and spindles was used to investigate α-γ integration and coordination for movement and posture. The model comprised physiologically realistic spinal circuitry, muscles, proprioceptors, and skeletal biomechanics. In the model, we divided the cortical descending commands into static and dynamic sets, where static commands (α<sub>s</sub> and γ<sub>s</sub>) were for posture maintenance and dynamic commands (α<sub>d</sub> and γ<sub>d</sub>) were responsible for movement. We matched our model to human reaching movement data by straightforward adjustments of descending commands derived from either minimal-jerk trajectories or human EMGs. The matched movement showed smooth reach-to-hold trajectories qualitatively close to human behaviors, and the reproduced EMGs showed the classic tri-phasic patterns. In particular, the function of γ<sub>d</sub> was to gate the α<sub>d</sub> command at the propriospinal neurons (PN) such that antagonistic muscles can accelerate or decelerate the limb with proper timing. Independent control of joint position and stiffness could be achieved by adjusting static commands. Deefferentation in the model indicated that accurate static commands of α<sub>s</sub> and γ<sub>s</sub> are essential to achieve a stable terminal posture precisely, and that the γ<sub>d</sub> command is as important as the α<sub>d</sub> command in controlling antagonistic muscles for desired movements. Deafferentation in the model showed that losing proprioceptive afferents mainly affected the terminal position of movement, similar to the abnormal behaviors observed in humans and animals. Our results illustrated that tuning the simple forms of α-γ commands can reproduce a range of human reach-to-hold movements, and that it is necessary to coordinate the set of α-γ descending commands for accurate and stable control of movement and posture.

Keywords: α-γ motor system, propriospinal neurons, spinal circuits, muscle and spindle, computational modeling, simulation, movement and posture

# Introduction

The physiological system of human motor control is not only highly redundant (Bernstein, 1967; Martin et al., 2009), but also endowed with intricate dual α and γ sensorimotor control (Granit, 1975; Pierrot-Deseilligny and Burke, 2005; Lan and He, 2012; Prochazka and Ellaway, 2012). Our understanding of how movements are organized and how muscles are coordinated in performing different tasks remains incomplete due to the lack of *in vivo* data during behaviors. One of the remaining issues in sensorimotor control is to account for the role of the γ motor system in movement and posture. In spite of thorough elucidation of the peripheral efferent and afferent innervations of the spindle organ (Matthews, 1964; Boyd, 1980; Hulliger, 1984; Prochazka, 1999), the importance of the γ motor system in motor control is still not well understood.

A consistently observed phenomenon during both movement and posture control is α-γ co-activation (Vallbo, 1971; Taylor et al., 2004, 2006; Prochazka and Ellaway, 2012). Direct recording by Taylor et al. (2004, 2006) revealed a co-varying pattern of gamma-static and dynamic firings with joint angle during locomotion. There was ample evidence of independent control of γ-motoneurons during movement (Prochazka et al., 1985; Dimitriou and Edin, 2008). The co-activation of γ motoneurons with α motor activity was generally viewed as compensating for the unloading effects of muscle contraction on spindle sensitivity. Using a realistic virtual arm (VA) model (He et al., 2013), Lan and He (2012) suggested a plausible function of γ<sub>s</sub> fusimotor control: to convey centrally encoded joint angle information to the periphery by regulating spindle sensitivity, so that the Ia signaling from spindle afferents is kept faithfully proportional to the joint angle during movement and muscle contraction. This not only explains γ co-activation with α activity, but also supports the hypothesis that the γ<sub>s</sub> command reinforces the centrally planned kinematics of movement and posture by way of spinal circuits.

Neurophysiological studies have identified separate spinal pathways and circuits of the sensorimotor system in detail, where the α and γ commands interact with each other to produce sensory and motor outputs (Baldissera et al., 1981; Lemon et al., 2004; Pierrot-Deseilligny and Burke, 2005). In addition to the mono-synaptic cortico-motoneuronal pathway (Lawrence and Kuypers, 1968; Lemon, 1999; Quallo et al., 2012), disynaptic excitatory and inhibitory cortico-motoneuronal pathways via PNs in C3-C4 were found to exist in cats and in macaque monkeys, as well as in humans (Malmgren and Pierrot-Deseilligny, 1988; Gracies et al., 1994; Alstermark et al., 2007; Isa et al., 2007). The PN was shown to play an important role in reaching movement of the upper limb (Alstermark et al., 1981; Alstermark and Isa, 2012). In a computational analysis to emulate involuntary oscillatory movements in the human upper extremity (Hao et al., 2013), it was postulated that movement signals are transmitted via the disynaptic pathway of the PN network, where the γ-dynamic (γ<sub>d</sub>) command is integrated with the α-dynamic (α<sub>d</sub>) command to produce pre-motoneuronal outputs. The γ<sub>d</sub> command encodes kinetic information of joint acceleration (or deceleration), and gates the α<sub>d</sub> command of double frequency at the PN network to determine the timing of activation for a pair of flexor and extensor muscles during oscillatory movements.

These studies suggest that while the α motor system provides the main drive for muscles, the γ motor system executes a more subtle control for movement dynamics and the maintenance of posture. In this paper, we extend the α-γ model (Hao et al., 2013) to investigate the modular control of voluntary movement and posture, and to demonstrate coordination of a set of α-γ descending commands in the control of movement and posture. Human reach-and-hold movements and muscle EMG activities were recorded and analyzed to guide the specification of the central descending commands. The model behavior was matched to the data of human movement and posture in all subjects. A variety of modifications were introduced in model structures to assess the functional significance of α-γ coordination by "deafferenting" or "deefferenting" the model. Results of this study provided a quantitative evaluation of the relative importance that each proprioceptive afferent and descending α-γ command may have on movement and posture control. Agreement between model predictions and human movement data further corroborated the α-γ coordination as a rudimentary means of sensorimotor control. The results argue that coordinated γ control with α activation is essential for accurate and stable control of movement and posture. Preliminary analysis of this study appeared in a conference proceeding (Li et al., 2014).

# Materials and Methods

### Corticospinal Virtual Arm (CS-VA) Model

The module of posture and movement control based on α-γ dual control system was implemented in the computational CS-VA model (**Figure 1**). This model is based on realistic physiological studies (Cheng et al., 2000; Mileusnic and Loeb, 2006; Mileusnic et al., 2006; Taylor et al., 2006; Alstermark et al., 2007; Isa et al., 2007) and has been validated to capture the realistic neuromechanical properties of human upper limb in previous work (Song et al., 2008a,b; Lan and He, 2012; He et al., 2013). It consists of four parts, the primary motor and sensory cortex, PN network, spinal cord circuitry and virtual arm (VA) model. The spinal cord network with PNs innervating a pair of antagonistic

**Abbreviations:** α, γ , Alpha, gamma motor nerves and neurons; PN, Propriospinal neuron; EMG, Electromyography; GTO, Golgi Ten Organ; CS-VA, Corticospinal virtual arm; VA, Virtual arm; α<sup>s</sup> , γ<sup>s</sup> , Static alpha, gamma command; αd, γd, Dynamic alpha, gamma command; αMN, γ MN, Alpha, gamma motoneuron; Um, Muscle input in the CS-VA model; U<sup>s</sup> , Spindle input in the CS-VA model; Fm, Muscle force; Lce, Muscle fascicle length; Lmt, Musculo-tendon length; Ia, Primaryafferent from spindle; II, Secondary afferent from spindle; Ib, Afferent from GTO; PC, Pectoralis Major Clavicle; DP, Deltoid Posterior; BS, Brachialis; Tlt, Triceps Lateral; Bsh, Biceps Short Head; Tlh, Triceps Long Head; θel, Elbow angle; θsh, Shoulder angle; γ -PN, Dynamic gamma inhibition on PN; Ia(−), Ia reciprocal inhibition on αMN; Ia-PN, Ia afferent gain on PN; Ia(+), Stretch reflex; Ib(−), Ib inhibition from GTO; RC, Renshaw Cell; ErrorP, ErrorM, mean difference between simulation and experiment of terminal posture, movement; Vpeak, Peak velocity; 3SD, Three times of standard deviation of static EMG; AG1, First agonistic muscle burst phase; AG2, Second agonistic muscle burst phase; ANT, Antagonistic muscle burst phase; RS, Ratio of static EMG (Bsh/Tlh); RP, Ratio of peak EMG (ANT/AG1); ampAG<sup>1</sup> ampANT, Amplitude of AG1 and ANT phase on αd; pwAG1,pwANT, Pulse width of AG1 and ANT phase on αd; MJT, Minimum jerk trajectory.

interneuron. The subscripts "s" and "d" in α and γ motor signals stand for static and dynamic commands, respectively. *U<sup>m</sup>* is muscle input, which is equivalent to human EMG; *Us* is spindle input; *Lmt* and *Fm* represent muscle tendon length and muscle force computed form virtual arm model; *Lce* represents fascicle length; Ia and II are sensory feedback of primary and secondary afferents from muscle spindle; and Ib is the feedback of Golgi Tendon Organ (GTO). The virtual arm has two joints including shoulder and elbow in the horizontal plane, with two degrees of freedom, i.e., shoulder flexion/extension and elbow flexion/extension. Three pairs of antagonistic muscles are included in the model: shoulder flexor Pectoralis Major Clavicle (PC), extensor Deltoid Posterior (DP), elbow flexor Brachialis (BS), extensor Triceps Lateral (Tlt), and biarticular flexor Biceps Short Head (Bsh), extensor Triceps Long Head (Tlh). Simulated joint kinematics, especially elbow angle is compared to elbow angle (θ*el*) recorded in human experiment. Modified from Hao et al. (2013) with permission.

muscles (**Figure 2**) has been validated in a study of the mechanism of Parkinsonian tremor (Hao et al., 2013). The VA has two joints, shoulder and elbow, in the horizontal plane, with two degrees of freedom, namely shoulder flexion/extension and elbow flexion/extension. The six dominant muscles include two pairs of monoarticular muscles, shoulder flexor Pectoralis Major Clavicle (PC) and extensor Deltoid Posterior (DP), and elbow flexor Brachialis (BS) and extensor Triceps Lateral (Tlt), and one pair of biarticular muscles, flexor Biceps Short Head (Bsh) and extensor Triceps Long Head (Tlh). Muscle spindles and Golgi Tendon Organs (GTO) are implemented within the VA to provide proprioceptive feedback. The three parts of the CS-VA model, namely the PN network, the spinal cord circuitry, and the VA, have been integrated in the SIMULINK/MATLAB platform. In this study, the biarticular muscles Bsh and Tlh were mainly used to realize movement and posture.

Within the CS-VA model, central motor commands from the motor cortex are transmitted to spinal α and γ motoneurons (αMN, γMN) along two pathways: a mono-synaptic pathway carrying static α and γ motor commands to the corresponding motoneurons for posture control, and a multi-synaptic pathway transmitting dynamic α and γ motor commands to the MNs via the PN network for movement control. Muscles and spindles are activated by α and γ motor signals, respectively, after regulation by the PN and reflex networks within the spinal cord. The muscular dynamics computed by the SIMM model of the arm supply a real-time musculo-tendon length to the virtual muscle; at the same time, the dynamics are reflected in the Ia and II afferents from the muscle spindle and the Ib afferent from the GTO. Feedback from proprioceptive afferents participates in signal processing within the spinal cord network and the cortex simultaneously. In this framework, the central inputs of static and dynamic α and γ motor commands are specified in Section Specification of Central Descending Commands, the input of the virtual muscle (Um) is equivalent to recorded EMG, and the output joint kinematics provide a behavioral indicator to compare with the elbow angle (θel) in the human experiment.

PNs in C3-C4 receive wide descending control from cortico-, rubro-, reticulo-, and tectospinal tracts, and project inhibition or excitation to downstream αMNs (Alstermark et al., 2007); at the same time, they receive feedback from Ia and cutaneous afferents (Alstermark et al., 1984a,b,c). Based on the neurophysiological structure of the PN network, αMN, γMN, and the spinal reflexes, a model of spinal circuitry was implemented within the CS-VA model, as shown in **Figure 2** (Hao et al., 2013; Li et al., 2014). The central descending commands in the model were passed down to αMN and γMN through mono-synaptic and multi-synaptic PN pathways, respectively (Isa et al., 2007). There is experimental evidence that PNs mediate motor commands for reaching movement (Alstermark et al., 2007; Alstermark and Isa, 2012). Computational analysis (Lan and He, 2012) suggested that γ<sub>s</sub> conveys kinematic information of joint angle. Therefore, in this model, we assume that the descending commands can be divided into a static set (α<sub>s</sub> and γ<sub>s</sub>) for posture and a dynamic set (α<sub>d</sub> and γ<sub>d</sub>) for movement. This division is consistent with the evidence for separate posture and movement modules in the central motor control system (Kurtzer et al., 2005).

As illustrated in **Figure 2**, α and γ commands control muscles and spindles through a set of spinal circuits. α<sub>d</sub> is integrated at the PN with γ<sub>d</sub> in a pattern of mirror inhibition (γ<sub>d</sub> or 1 − γ<sub>d</sub>). The recording from dynamic γMN in Taylor et al. (2006) revealed a reciprocal change of γ<sub>d</sub> with joint angle. The PN also receives excitation from the autogenic Ia afferent (Ia-PN) (Alstermark et al., 1984b; Malmgren and Pierrot-Deseilligny, 1988). After the PN, the α<sub>s</sub> and α<sub>d</sub> commands converge at the αMN, and its output is further regulated by reciprocal inhibition (Ia(−)) from the antagonistic muscle, recurrent inhibition from the Renshaw Cell (RC), the autogenic stretch reflex from Ia (Ia(+)), and inhibition from Ib afferents (Ib(−)) (Eccles et al., 1957, 1960; Windhorst, 2007). The outputs of the PN (Y<sub>PN</sub>) and αMN (Y<sub>αMN</sub>) are given in Equation (1), in which the subscripts "e" and "f" represent extensor and flexor,

are discussed later.

$$Y_{PNe} = \alpha_d - d_f \cdot \gamma_d + a_e \cdot \upsilon_e' \tag{1a}$$

$$Y_{PNf} = \alpha_d - d_e \cdot (1 - \gamma_d) + a_f \cdot \upsilon_f' \tag{1b}$$

$$N_e(t) = \alpha_{se} + Y_{PNe} \tag{1c}$$

$$N_f(t) = \alpha_{sf} + Y_{PNf} \tag{1d}$$



*a, Ia afferent gain on PN (Ia-PN); d, γ dynamic inhibition gain on PNs (γ-PN); p, gain of PN to reciprocal inhibition; r, Ia reciprocal inhibition gain (Ia(−)); s, stretch reflex gain (Ia(+)); g, recurrent inhibition gain from Renshaw Cell (RC); b, Ib gain of Golgi tendon organ (Ib(−)).*

$$\frac{dC_e(t)}{dt} = -\frac{1}{\tau_{N_e}} C_e(t) + \frac{1}{\tau_{N_e}} N_e(t), \quad 0 \le N_e(t) \le 1 \tag{1e}$$

$$\frac{dC_f(t)}{dt} = -\frac{1}{\tau_{N_f}} C_f(t) + \frac{1}{\tau_{N_f}} N_f(t), \quad 0 \le N_f(t) \le 1 \tag{1f}$$

$$C_e'(t) = C_e(t) \times \sigma(t) \tag{1g}$$

$$C_f'(t) = C_f(t) \times \sigma(t) \tag{1h}$$

$$Y_{\alpha MNe} = \frac{C_e'(t)}{1 + g_e C_e'(t)} \left[ 1 + s_e \upsilon_e' - r_e \upsilon_f' - b_e \varphi_e' \right] \tag{1i}$$

$$Y_{\alpha MNf} = \frac{C_f'(t)}{1 + g_f C_f'(t)} \left[ 1 + s_f \upsilon_f' - r_f \upsilon_e' - b_f \varphi_f' \right] \tag{1j}$$

N(t) is the sum of descending commands to the αMN, C(t) represents the background activation of the αMN pools, τ is the time constant of excitation, with a value of 0.029 (sec), and σ(t) represents the signal-dependent noise, which is a Gaussian-distributed random signal (Fuglevand et al., 1993; Jones et al., 2002). C(t) was multiplied by σ(t) to yield the noise-corrupted background activations (C′(t)) in Equations (1g) and (1h). The output of the αMN (Equations 1i and 1j) combines all excitatory and inhibitory inputs; here, the values of υ′ and ϕ′ are proportional to the Ia and Ib afferent discharge frequencies, respectively, and g, s, r, b are the reflex gains of RC, Ia(+), Ia(−), and Ib(−). The values of the spinal reflex gains are listed in **Table 1**.
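The chain from descending commands to αMN output in Equation (1) can be sketched in a few lines of Python. This is only an illustrative sketch, not the SIMULINK implementation used in the study: the function names, the forward-Euler integration, the clamping of N(t) to 0..1, and the mean and standard deviation of the noise term are assumptions introduced here.

```python
import random

TAU = 0.029  # time constant of excitation in seconds (value given in the text)


def clamp01(x):
    """Limit a drive to 0..1 (assumed reading of the bound in Equations 1e-1f)."""
    return min(1.0, max(0.0, x))


def pn_output(alpha_d, gamma_term, d, a, ia):
    """Equations (1a)-(1b) for one PN.

    gamma_term is gamma_d for the extensor PN and (1 - gamma_d) for the flexor PN.
    """
    return alpha_d - d * gamma_term + a * ia


def step_activation(c, n, dt, tau=TAU):
    """One forward-Euler step of dC/dt = (N - C) / tau (Equations 1e-1f)."""
    return c + dt * (clamp01(n) - c) / tau


def noisy_activation(c, rng, sd=0.05):
    """Equations (1g)-(1h): C'(t) = C(t) * sigma(t).

    ASSUMPTION: sigma(t) is Gaussian with mean 1 and standard deviation sd;
    the text does not state these values.
    """
    return c * rng.gauss(1.0, sd)


def alpha_mn_output(c_noisy, g, s, r, b, ia_own, ia_antagonist, ib_own):
    """Equations (1i)-(1j): recurrent inhibition times the reflex bracket."""
    recurrent = c_noisy / (1.0 + g * c_noisy)
    return recurrent * (1.0 + s * ia_own - r * ia_antagonist - b * ib_own)
```

A simulation loop would call `pn_output`, add the static command to obtain N(t), advance C(t) with `step_activation`, and pass the noise-corrupted value to `alpha_mn_output` for each muscle at every time step.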

# Human Reach and Hold Experiment

### Subjects and Experiments

Seven healthy adults participated in this elbow reach-and-hold experiment. The human subject study was approved by the Institutional Review Board (IRB) of the University of Southern California (USC). All subjects received a brief explanation of the study before the experiment and signed informed consent.

During the experiment, the subject sat comfortably with the upper arm maintained perpendicular to the trunk in the horizontal plane (**Figure 3A**); the hand and forearm were secured on the manipulandum to make single-joint movements around the elbow. All subjects initially held their elbow at 90°, and performed successive reach and hold movements triggered by the visual cue of an LED light, in three blocks. In Block 1, the elbow was extended with a range of 30° per movement, and after three successive extensions, the elbow stopped at 0°; the reverse flexion with the same movement range and holding postures was performed after a 5 s holding period at 0°. In Block 2, the range of extension and flexion was 45°, thus one holding period at the mid-posture of 45° was performed, and in Block 3 the elbow made direct extension and flexion movements between 90° and 0°. During the experiments, the subjects moved as fast as they could between postures, and a 5 s hold was required at each posture. Each block was repeated for five trials, and between blocks, the subjects had more than 10 min to rest. During the movement, the elbow angle was recorded, and EMGs of biceps short head (Bsh) and triceps long head (Tlh) were collected using bipolar surface electrodes; the EMG signals were pre-amplified at a gain of 1000, band-pass filtered with cut-off frequencies of 5 Hz and 1000 Hz, and then sampled at 2000 Hz for post-processing.

FIGURE 3 | Human reach and hold experiments, and analysis of kinematic and EMG data during posture and movement. (A) Experimental setup: the subject is seated at the table to make fast reaching movements of the elbow joint in the horizontal plane. The subject's arm and hand were secured on a manipulandum to keep the upper arm stable and ensure smooth extension and flexion of the forearm. Elbow angle (θ*el*) is defined as the included angle between the forearm and the extension line of the upper arm. Movements with a range of 30, 45, and 90 degrees are performed in Block 1, 2, and 3, respectively. During extension, the elbow angle (θ*el*) changes from the initial posture of 90° to the terminal posture of 0°; the filled dot indicates the holding posture between reaching movements. The reverse flexion from 0° to 90° has the same movement range and holding postures. The shoulder angle (θ*sh*) was kept at 90° during all experiments. (B) Analysis of kinematic and EMG data during posture and movement. During movement, the elbow moves from the initial posture to the terminal posture, and the movement onset (*t*<sub>0</sub>) and offset (*t*<sub>1</sub>) are defined as the times at which the velocity change (increase or decrease) is 10% of the peak velocity (*Vpeak*). The initial posture period starts at time (*t*<sub>0</sub> − 1.5) and the terminal posture is defined from time (*t*<sub>0</sub> + 4) after the angle stabilizes. The time window is 1 s. The tri-phasic EMG pattern of extensor Triceps long head (Tlh) and flexor Biceps short head (Bsh, with inverse value) is shown at the bottom (low-pass filtered at a cut-off frequency of 50 Hz with a digital Butterworth filter).

### Analysis of Kinematics and EMG Data

Recorded joint trajectories and EMG were analyzed to provide guidance for simulation. Since Subject 3 could not follow the experimental protocol, that subject's data were excluded from the analysis. The joint angles were low-pass filtered with a cut-off frequency of 10 Hz to remove high-frequency noise, and differentiated to obtain velocity. The collected EMG of biceps and triceps was band-pass filtered between 20 and 500 Hz to remove motion artifacts and high-frequency noise, rectified, and then low-pass filtered at a cut-off frequency of 50 Hz for further analysis. As shown in **Figure 3**, the times of movement onset (t<sub>0</sub>) and offset (t<sub>1</sub>) were defined as the times at which velocity increased to and decreased to 10% of peak velocity (Vpeak), respectively (Atkeson and Hollerbach, 1985). The static angles and EMG during initial posture were then defined as the averaged values from time (t<sub>0</sub> − 1.5) (sec) to time (t<sub>0</sub> − 0.5) (sec) before movement onset. The terminal posture phase started from time (t<sub>1</sub> + 4) (sec), and the steady state posture angle was obtained in a window of 1 (sec). The onset of muscle firing was calculated as the time at which EMG amplitude exceeded the static EMG by three times its standard deviation (3SD) and remained above that level for at least 25 ms (Hodges and Bui, 1996; Takatoku and Fujiwara, 2010); the offset of muscle firing was calculated as the time at which EMG decreased to lower than 3SD below static EMG. According to this threshold criterion, the duration of muscle firing was obtained, and EMG low-pass filtered at a cut-off frequency of 6 Hz (Steele, 1994) was used to detect the amplitude of EMG. Off-line digital Butterworth filters were applied with forward and reverse passes to avoid phase shift. The static and dynamic features of movements and EMGs were used to tune the parameters of descending commands for the corresponding posture and movement.
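The two threshold rules above (10% of peak velocity for movement timing, and 3SD above static EMG held for at least 25 ms for burst onset) can be sketched as follows. This is a minimal illustration on already filtered signals, using only the Python standard library; the function names and the choice to search outward from the velocity peak are assumptions, and the filtering steps themselves are not reproduced.

```python
from statistics import mean, stdev


def movement_onset_offset(velocity, dt, fraction=0.10):
    """Return (t0, t1): first and last times, contiguous with the velocity
    peak, at which speed is at least `fraction` of peak speed."""
    speed = [abs(v) for v in velocity]
    peak_idx = max(range(len(speed)), key=speed.__getitem__)
    threshold = fraction * speed[peak_idx]
    i0 = peak_idx
    while i0 > 0 and speed[i0 - 1] >= threshold:  # walk back toward onset
        i0 -= 1
    i1 = peak_idx
    while i1 < len(speed) - 1 and speed[i1 + 1] >= threshold:  # walk forward
        i1 += 1
    return i0 * dt, i1 * dt


def emg_onset(emg, dt, baseline, min_duration=0.025, n_sd=3.0):
    """Return the time at which rectified EMG first exceeds
    mean(baseline) + n_sd * SD(baseline) and stays above it for at least
    min_duration seconds; return None if no such burst exists."""
    threshold = mean(baseline) + n_sd * stdev(baseline)
    needed = max(1, round(min_duration / dt))  # samples required above threshold
    run = 0
    for i, x in enumerate(emg):
        run = run + 1 if x > threshold else 0
        if run >= needed:
            return (i - needed + 1) * dt
    return None
```

Here `baseline` would hold the static EMG samples from the initial posture window, and `dt` would be 1/2000 s for the sampling rate reported above.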

## Specification of Central Descending Commands

To fit the behavior of the model in Section Corticospinal Virtual Arm (CS-VA) Model to the experimental data in Section Human Reach and Hold Experiment, the parameters of the model are fixed throughout the simulation. Only the parameters of the central descending commands are adjusted to capture the realistic features of human movements and postures, as described in the following sub-sections.

#### Determining Static Command Set (α<sub>s</sub>, γ<sub>s</sub>) for Posture

Posture was realized by adjusting the static α and γ commands in this model. Earlier results (Lestienne et al., 1981) indicated that terminal posture could be coded as the ratio of activation levels between the antagonistic muscles of a joint; thus α<sub>s</sub> of Bsh and Tlh in the simulation was set according to the relationship between the ratio of static EMG (Bsh/Tlh) (RS) and the static elbow angle (θel) in the human experiment (Equation 2). The initial and terminal α<sub>s</sub> of Bsh and Tlh were then determined by the corresponding RS at angle θel from the experiment. The value of α<sub>s</sub> was adjusted in the model to keep a low activation of the muscles during postures. The transition of α<sub>s</sub> between initial and terminal postures was assumed to follow a ramp. To ensure a single-joint movement in the simulation, the shoulder was fixed by adding a constant high co-activation level to α<sub>s</sub> of PC and DP.

$$\frac{\alpha_s\left(\theta_{el}, Bsh\right)}{\alpha_s\left(\theta_{el}, Tlh\right)} = RS_{\left(\theta_{el}\right)} \tag{2}$$
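Equation (2) fixes only the ratio of the two static commands, so one further choice is needed to obtain individual values. The sketch below assumes, for illustration only, that the Tlh level is chosen freely (kept low, as the text describes) and that the Bsh level then follows from RS; the ramp helper mirrors the assumed ramped transition between postures. All names and the numeric values in the usage note are hypothetical.

```python
def static_alpha(rs, alpha_tlh):
    """Return (alpha_s_bsh, alpha_s_tlh) with alpha_s_bsh / alpha_s_tlh = RS.

    ASSUMPTION: alpha_tlh is the free parameter that sets the overall
    co-activation level; commands are normalized to 0..1.
    """
    alpha_bsh = rs * alpha_tlh
    if not (0.0 <= alpha_bsh <= 1.0 and 0.0 <= alpha_tlh <= 1.0):
        raise ValueError("static commands must stay within 0..1")
    return alpha_bsh, alpha_tlh


def ramp(start, end, t, t_start, t_end):
    """Linear transition of a static command from start to end."""
    if t <= t_start:
        return start
    if t >= t_end:
        return end
    return start + (end - start) * (t - t_start) / (t_end - t_start)
```

For example, with a hypothetical RS of 0.5 at the target angle and a Tlh level of 0.2, `static_alpha(0.5, 0.2)` gives a Bsh level of 0.1, and `ramp` can interpolate each command between its initial and terminal values over the movement.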

γ<sub>s</sub> is related to centrally planned kinematics. In this study, we adopt an experimentally supported criterion for the centrally planned angle trajectory, the minimum-jerk criterion (Hogan, 1984; Flash and Hogan, 1985). The objective function is given in Equation (3a), where the integral is taken between t<sub>0</sub> and t<sub>1</sub>, the times of movement onset and end. The resulting minimum-jerk trajectory (MJT) is a smooth angle trajectory, a fifth-order polynomial in time, given in Equation (3b).

$$Z = \frac{1}{2} \int_{t_0}^{t_1} \left( \frac{d^3 \theta}{dt^3} \right)^2 dt \tag{3a}$$

$$\theta_{el}(t) = \sum_{i=0}^{5} a_i t^i \tag{3b}$$

The γ<sub>s</sub> input for muscle spindles was calculated from the MJT during posture and movement, according to the quadratic relationship between γ<sub>s</sub> and joint angle obtained in Lan and He (2012). The γ<sub>s</sub> inputs to Bsh and Tlh were determined by Equations (4a) and (4b):

$$\gamma_s(Bsh) = 2 \times 10^{-5} (\theta_{sh} + \theta_{el})^2 - 0.0008\,(\theta_{sh} + \theta_{el}) + 0.3698 \tag{4a}$$

$$\gamma_s(Tlh) = 8 \times 10^{-6} (\theta_{sh} + \theta_{el})^2 - 0.0044\,(\theta_{sh} + \theta_{el}) + 0.9292 \tag{4b}$$

where the shoulder angle (θsh) was fixed, and the elbow angle trajectory (θel) was the MJT from Equation (3b). The joint angles in Equation (4) are in degrees.
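As an illustration of Equations (3) and (4), the sketch below evaluates a minimum-jerk elbow trajectory and the corresponding γ<sub>s</sub> commands. The closed-form polynomial used here is the standard rest-to-rest solution (zero velocity and acceleration at both ends); the text does not list the coefficients a<sub>i</sub>, so that boundary condition is an assumption, as are the function names.

```python
def minimum_jerk(theta0, theta1, t, t0, t1):
    """Minimum-jerk angle at time t for a movement from theta0 to theta1.

    ASSUMPTION: rest-to-rest boundary conditions, which give the
    fifth-order polynomial 10 s^3 - 15 s^4 + 6 s^5 in normalized time s.
    """
    if t <= t0:
        return theta0
    if t >= t1:
        return theta1
    s = (t - t0) / (t1 - t0)
    return theta0 + (theta1 - theta0) * (10 * s**3 - 15 * s**4 + 6 * s**5)


def gamma_s_bsh(theta_sh, theta_el):
    """Equation (4a); angles in degrees."""
    x = theta_sh + theta_el
    return 2e-5 * x**2 - 0.0008 * x + 0.3698


def gamma_s_tlh(theta_sh, theta_el):
    """Equation (4b); angles in degrees."""
    x = theta_sh + theta_el
    return 8e-6 * x**2 - 0.0044 * x + 0.9292
```

Sampling `minimum_jerk` over the simulation time and passing each angle, together with the fixed shoulder angle, to `gamma_s_bsh` and `gamma_s_tlh` yields the time course of the static γ commands for a planned movement.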

#### Determining Dynamic Command Set (α<sub>d</sub>, γ<sub>d</sub>) for Movement

According to the recordings of Taylor et al. (2000, 2004, 2006), the γ<sub>d</sub> efferent had a constant baseline firing rate during both posture and movement, and its firing rate changed sensitively with the onset of muscle lengthening, thus presenting a phase lead before movement. In the previous analysis of involuntary oscillatory movements (Hao et al., 2013), it was hypothesized that γ<sub>d</sub> represented joint acceleration and was used to steer the activations of antagonistic muscles for joint acceleration and deceleration. This reproduced all features of involuntary oscillatory movements such as Parkinsonian tremor. Thus in the present study, we also adopted this hypothesis, and determined the γ<sub>d</sub> command from the acceleration of the MJT as follows:

$$\gamma_d = 0.5 + k \frac{d^2 \theta_{el}(t)}{dt^2} \tag{5}$$

in which θel(t) was taken from Equation (3b). The constant bias was set at 0.5 (normalized) to give a balanced (or mirror) inhibition on antagonistic PNs (see **Figure 2**). Parameter k was chosen so that the value of γ<sub>d</sub> remained between 0 and 1.
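One way to satisfy the constraint on k in Equation (5) is to scale the acceleration by its peak magnitude, so that γ<sub>d</sub> just reaches 0 or 1 at the extremes. The sketch below does this for a rest-to-rest minimum-jerk profile; the scaling rule, the positive sign of k, and the function names are assumptions for illustration, since the text only states that k keeps γ<sub>d</sub> within 0 and 1.

```python
def mjt_acceleration(theta0, theta1, t, t0, t1):
    """Second time derivative of the rest-to-rest minimum-jerk trajectory."""
    if t <= t0 or t >= t1:
        return 0.0
    duration = t1 - t0
    s = (t - t0) / duration
    return (theta1 - theta0) / duration**2 * (60 * s - 180 * s**2 + 120 * s**3)


def gamma_d_profile(theta0, theta1, t0, t1, times, k=None):
    """Equation (5): gamma_d = 0.5 + k * acceleration, sampled at `times`.

    ASSUMPTION: if k is not given, it is set to 0.5 / max|acceleration|,
    the largest positive value that keeps gamma_d inside 0..1.
    """
    acc = [mjt_acceleration(theta0, theta1, t, t0, t1) for t in times]
    peak = max(abs(a) for a in acc)
    if k is None:
        k = 0.5 / peak if peak > 0 else 0.0
    return [0.5 + k * a for a in acc]
```

Outside the movement interval the acceleration is zero, so the command sits at the 0.5 bias, which corresponds to the balanced inhibition on the antagonistic PNs described above.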

Movement was performed by integrating the dynamic α and γ commands at the PN network. It has been proposed that the motor system modulates the pulse amplitude and duration of muscle activations to produce movements with different velocities and ranges (Corcos et al., 1989; Gottlieb et al., 1989), and this pulse strategy has successfully generated scaled movements (Lan, 1997; Lan et al., 2005). Thus, a pair of pulses was applied to α<sub>d</sub> to act as the agonistic acceleration (AG1) and antagonistic deceleration (ANT) phases, respectively. The pulse waveform is given in Equations (6a) and (6b):

$$\alpha_d = \begin{cases} amp_{AG1}, & t \in \left( t_{AG1},\, t_{AG1} + pw_{AG1} \right) \\ amp_{ANT}, & t \in \left( t_{ANT},\, t_{ANT} + pw_{ANT} \right) \\ 0, & t \in \left( 0,\, t_{AG1} \right) \cup \left( t_{AG1} + pw_{AG1},\, t_{ANT} \right) \cup \left( t_{ANT} + pw_{ANT},\, t_{sim} \right) \end{cases} \tag{6a}$$
 
$$\frac{amp_{ANT}}{amp_{AG1}} = RP \tag{6b}$$

where amp<sub>AG1</sub> and amp<sub>ANT</sub> were the pulse amplitudes of AG1 and ANT (for extension movement, AG1 is on Tlh and ANT is on Bsh), and their values were based on the ratio of peak EMG (ANT/AG1) (RP) obtained in human movement. pw<sub>AG1</sub> and pw<sub>ANT</sub> were the pulse widths of AG1 and ANT, which were set to the durations of the experimental EMG bursts according to the 3SD threshold criterion in Section Analysis of Kinematics and EMG Data. t<sub>AG1</sub> and t<sub>ANT</sub> were the starting times of the AG1 and ANT phases, and t<sub>sim</sub> was the terminal time of the simulation. Therefore, by adjusting the amplitude and width of the bi-phasic pulses, movements with different ranges and velocities could be achieved by the model.
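The bi-phasic pulse command can be sketched as two rectangular pulses, one on the agonist and one on the antagonist. The sketch assumes that the ANT amplitude is derived from the AG1 amplitude through RP, taken as the ratio ANT/AG1 as defined in the text; the function names and the timing values in the usage note are hypothetical.

```python
def pulse(t, t_start, width, amplitude):
    """Rectangular pulse: amplitude inside (t_start, t_start + width), else 0."""
    return amplitude if t_start < t < t_start + width else 0.0


def alpha_d_pair(t, amp_ag1, pw_ag1, t_ag1, rp, pw_ant, t_ant):
    """Return (agonist alpha_d, antagonist alpha_d) at time t.

    ASSUMPTION: RP is the ratio ANT/AG1, so amp_ANT = RP * amp_AG1.
    For an extension movement the agonist is Tlh and the antagonist is Bsh.
    """
    amp_ant = rp * amp_ag1
    return pulse(t, t_ag1, pw_ag1, amp_ag1), pulse(t, t_ant, pw_ant, amp_ant)
```

For instance, `alpha_d_pair(0.1, 0.69, 0.28, 0.0, 0.5, 0.20, 0.3)` returns the AG1 amplitude on the agonist and zero on the antagonist, because at that time only the first pulse is active; the pulse onset times used here are illustrative and not taken from the study.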

#### Determining the Stabilization Pulse

A third pulse with a ramped decreasing form was necessary to stabilize the joint after movement. It was implemented in α<sub>s</sub> of the agonist muscle (second agonistic muscle burst, AG2), since a tri-phasic EMG pattern is a common feature of fast reaching movements (Ghez and Martin, 1982; Flanders et al., 1994; Berardelli et al., 1996), and AG2 has been widely accepted as a phase that helps stabilize the elbow at the terminal posture after movement (Hannaford and Stark, 1985; Takatoku and Fujiwara, 2010). The amplitude and duration of AG2 were adjusted in the simulation to stabilize the joint after movement.

# Simulation with Intact Feedforward and Feedback Controls

The normal condition with intact feedforward and feedback control was simulated to match typical trials in the human experiment. For each subject, different postures were simulated and compared to the experimental relationship between RS and steady state angle. Stiffness control by varying the co-activation level of antagonist muscles at different postures was demonstrated. To determine joint stiffness, a force perturbation was applied directly to the single-joint muscle BS, and the stiffness was calculated as the ratio of the change in elbow torque to the change in joint angle (He et al., 2013). For each subject, the extension movement from 60° to 30° (Movement 1) was specifically simulated and matched to experimental data. More simulations were performed to fit experimental movements of larger range and in the opposite direction for Subject 1. A simulation of 11 s was usually performed. After the system had converged to a steady initial state, a random, signal-dependent noise (Jones et al., 2002) was added to Um, and the elbow was driven at time 6 (sec). Reflex gains used in this study were set within the range that kept the system stable (**Table 1**) (He et al., 2013). All α and γ commands used in simulation were normalized to values between 0 and 1. Nominal parameter values of the spinal circuits were adopted from Hao et al. (2013), and are listed in **Table 1**.

# Simulation with Abnormal Feedforward and Feedback Controls

In this study, we examined the effects of altering the model structure on movement and posture by deafferenting and deefferenting the model. Deafferented conditions were modeled by setting the relevant gains of Ia and Ib to zero. Deefferented conditions were obtained by setting one of the descending commands to zero at a time, while keeping the others intact. The effect of an abnormal ratio (Bsh/Tlh) of α<sub>s</sub> on terminal postures was also examined by assigning a static ratio outside the normal range found in the experiments. Note that γ<sub>d</sub> was always maintained at a bias of 0.5 in the deefferentation analysis. The abnormal conditions were applied to the model 0.1 (sec) before movement initiation in the simulations.
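The deafferentation procedure amounts to zeroing selected reflex gains and switching to the altered gain set shortly before movement onset. A minimal sketch of that bookkeeping is given below; the class and function names are hypothetical, and the numeric gains in the usage note are placeholders rather than the values of Table 1.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ReflexGains:
    a: float  # Ia afferent gain on PN (Ia-PN)
    s: float  # stretch reflex gain (Ia(+))
    r: float  # Ia reciprocal inhibition gain (Ia(-))
    b: float  # Ib gain of Golgi tendon organ (Ib(-))
    g: float  # recurrent inhibition gain from Renshaw Cell (RC)


def deafferent(gains, *names):
    """Return a copy of `gains` with the named afferent gains set to zero."""
    return replace(gains, **{name: 0.0 for name in names})


def gains_at(t, t_onset, nominal, abnormal, lead=0.1):
    """Use the abnormal gain set from `lead` seconds before movement onset."""
    return abnormal if t >= t_onset - lead else nominal
```

With a placeholder set such as `ReflexGains(a=0.1, s=0.2, r=0.3, b=0.1, g=0.5)`, calling `deafferent(nominal, "r")` removes Ia(−) only, while `deafferent(nominal, "a", "s", "r")` removes all Ia pathways; `gains_at` then selects which set the simulation uses at each time step.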

The difference between experimental and simulated movements was used to quantify the effects of abnormal control of movement and posture. Angular errors for movements and terminal steady state posture were evaluated as in Equations (7a) and (7b):

$$Error_P = \frac{1}{W} \sum_{j=1}^{W} \left( \theta_{el}\left(P_{sim}(j)\right) - \theta_{el}\left(P_{exp}(j)\right) \right) \tag{7a}$$

$$Error_M = \frac{1}{V} \sum_{j=1}^{V} \left( \theta_{el}\left(M_{sim}(j)\right) - \theta_{el}\left(M_{exp}(j)\right) \right) \tag{7b}$$

where Error<sub>P</sub> and Error<sub>M</sub> gave the averaged differences between simulation and experiment in terminal steady state posture and movement, respectively (**Figure 3**), and W and V were the numbers of resampled data points during the terminal steady state posture and movement phases.
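Both error measures in Equation (7) are the mean signed difference between simulated and experimental elbow angles over a set of resampled points, so a single helper covers them. The function name below is an assumption introduced for illustration.

```python
def mean_signed_error(sim, exp):
    """Equation (7): mean of (simulated - experimental) elbow angle.

    Pass the terminal-posture samples to obtain Error_P, or the
    movement-phase samples to obtain Error_M. Both sequences must be
    resampled to the same length beforehand.
    """
    if len(sim) != len(exp) or not sim:
        raise ValueError("sim and exp must be non-empty and of equal length")
    return sum(s - e for s, e in zip(sim, exp)) / len(sim)
```

Because the differences are signed, positive and negative deviations can cancel; a value near zero therefore indicates no systematic offset rather than a point-by-point match.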

# Results

# Features in Kinematics and EMGs of Human Posture and Movement

All subjects presented a common pattern in movement and posture at the elbow joint. The joint angle stabilized after a slight overshoot, and the velocity displayed a bell-shaped profile. Tri-phasic EMG was observed in fast movements, with AG1 firing during the acceleration phase and ANT during the deceleration phase. The AG2 burst of the agonist muscle appeared around movement offset and lasted until the joint was stabilized at the terminal posture. When the elbow was stabilized after movement, the EMG of Tlh and Bsh were maintained at different steady state levels, which varied with elbow angle.

An example of the relations between EMG ratio and joint angle during posture and movement is presented in **Figure 4**. The EMG of Tlh and Bsh were found to vary reciprocally with the elbow angle from 0° to 110°. The ratio of static EMG (Bsh/Tlh) (RS) was well fitted by a linear function of the elbow angle (**Figure 4A**, P < 0.01). During movement, although the peak velocity increased with movement range and speed, the relationship between peak velocity and the ratio of peak EMG (ANT/AG1) (RP) was almost flat (P > 0.05), as shown in **Figure 4B**, and the ratio of peak EMG had an average value of 0.67 (±0.29).

FIGURE 4 | Experimental kinematic and EMG data during posture and movement of Subject 1. (A) Relationship between the postural EMG ratio of Biceps and Triceps and elbow angle; a linear regression is fitted with *P* < 0.01, *R*<sup>2</sup> = 0.8715. (B) Relationship between the ratio of peak EMG (ANT/AG1, Bsh/Tlh for extension, Tlh/Bsh for flexion) and peak velocity for fast reaching. The linear regression is rejected (intercept *P* > 0.05).

TABLE 2 | Linear relationship between ratio of static EMG (RS) and elbow angle (θel) (Experiment (Exp.) vs. Simulation (Sim.)).

These features were present in all subjects. A summary of the experimental relation between the ratio of static EMGs and elbow angle for all 6 subjects is given in **Table 2**. This indicates that the central module for posture control varies the static levels of antagonistic muscle activation in this linear manner to compensate for the change in muscle moment arms at different joint angles. This experimental linear relationship is sufficient to guide the specification of α<sub>s</sub> descending commands in the model for posture maintenance.
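The linear relation between RS and elbow angle can be obtained with an ordinary least-squares fit. The sketch below is a plain-Python version that returns slope, intercept, and R²; it is not the statistical procedure used in the study, which also reports P values, and the function name is an assumption.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot if ss_tot > 0 else 1.0
    return slope, intercept, r_squared
```

Here `x` would hold the steady state elbow angles and `y` the corresponding static EMG ratios (Bsh/Tlh) of one subject; the fitted line can then be evaluated at any target angle to obtain the RS used for the static commands.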

# Simulated Posture Control

Posture control was then simulated by tuning the α<sub>s</sub> inputs to Bsh and Tlh for each subject, and by setting γ<sub>s</sub> according to the static angle θel. Using the linear relation to guide the specification of α<sub>s</sub> commands for the Bsh and Tlh muscles in the model, a comparable relationship was obtained in simulated postures for all subjects. The results in **Table 2** show that the simulated linear equations for each subject closely matched the experimental relationship (P < 0.05). Overall, an average linear relationship existed in both the experimental and simulated groups for all six subjects (P < 0.05), with similar slopes and intercepts. This indicates that posture control can be achieved by tuning α<sub>s</sub>, together with setting the γ<sub>s</sub> inputs to the relevant antagonistic muscles.

Thus, the posture module of the motor system can control joint stiffness while maintaining the same joint posture. **Figure 5** demonstrates such a strategy by increasing the co-activation levels of the antagonistic muscles Bsh and Tlh, while keeping the γ<sub>s</sub> commands at the values corresponding to elbow angles of 30°, 50°, and 65°, respectively. **Figure 5A** shows the simulated posture angles against the experimental linear relation. Since multiple activation levels of muscles can achieve the same elbow posture, joint stiffness can be controlled to desired levels for different motor tasks. We calculated joint stiffness by applying force perturbations to BS and measuring the ratio between the changes in joint torque and angle. **Figure 5B** shows that joint stiffness increased with muscle activation, while the same posture was maintained. The results indicate that the central motor system can control joint stiffness and posture independently by tuning the levels of α<sub>s</sub> commands with programmed γ<sub>s</sub> control for postures.

FIGURE 5 | Simulated posture control. (A) Muscle activation ratio of simulated postures in comparison with the experimental linear regression line. Different postures were maintained while keeping the ratio of muscle activation of Bsh and Tlh within the range of experimental values. (B) Postures maintained with increasing joint stiffness. The elbow angles were maintained at 30°, 50°, and 65° while increasing muscle activation of Bsh but keeping the ratio of Bsh and Tlh within the experimental range; independent control of joint stiffness was thus realized while maintaining the same posture.

# Simulated Movement Control

The extension movement from 60° to 30° (Movement 1) was simulated for six subjects individually. An extension movement from 90° to 45° (Movement 2) and a flexion movement from 30° to 60° (Movement 3) were simulated for Subject 1. **Figures 6A–C** depict the descending commands tuned for Movements 1, 2, and 3, respectively. During command assignment, the γ<sub>s</sub> and γ<sub>d</sub> commands were calculated from the MJT of the target trajectory (Equations 4 and 5), and the initial and terminal postures were maintained by setting the ratio of α<sub>s</sub> of Bsh and Tlh to the experimental values. Thus, movements between postures were simulated by tuning the pulse height and width of the α<sub>d</sub> command. The amplitudes and pulse widths were adopted from experimental EMG (Equation 6). The third pulse in the α<sub>s</sub> of Tlh helped stabilize the elbow after fast movement. The posture commands were also specified for the other four muscles in the model to keep the simulation running properly.

0.18 (sec) respectively. (B,C) Present the descending commands for Movements 2 and 3, respectively. The amplitudes and pulse widths of α<sub>d</sub> for Movement 2 were *amp<sub>AG1</sub>* (0.69), *pw<sub>AG1</sub>* (0.28 (sec)), *amp<sub>ANT</sub>* (0.30), *pw<sub>ANT</sub>* (0.20 (sec)); in Movement 3, they were *amp<sub>AG1</sub>* (0.34), *pw<sub>AG1</sub>* (0.26 (sec)), *amp<sub>ANT</sub>* (0.26), *pw<sub>ANT</sub>* (0.19 (sec)).

The simulated angle, velocity, and muscle activation of Movements 1, 2, and 3 were compared with the experimental movements (**Figures 7A–C**). The initial and terminal angles closely matched the experimental postures, and the three movements displayed the signature bell-shaped velocity profile and tri-phasic firing pattern of Bsh and Tlh. The results showed that the elbow could be maintained close to the terminal postures, and the errors during movements were relatively small compared to the range of movements.

Furthermore, all simulated movements matched the experimental trajectories and EMGs well in all subjects when the α<sub>d</sub> pulse command was tuned. The errors of trajectory fitting for Movement 1 were calculated according to Equation (7) and are listed in **Table 3**. This demonstrates that the movement module of the central motor system could control movements over a range of angles and velocities by tuning the α<sub>d</sub> pulse command while coordinating posture commands.

# Simulation with Abnormal Afferent and Efferent Controls

Movements under abnormal afferent and efferent controls for Movement 1 were simulated to assess the individual contribution of each descending command to movement and posture. **Figure 8** illustrates the various effects for Subject 1. **Figure 8A** shows that, when Ia(−) was removed, the joint overshot the movement target and settled at a smaller steady state angle, whereas removing Ia(+) or Ia-PN caused the joint to undershoot the target. Note that removing Ia(+) resulted in a larger steady state angle, while removing Ia-PN had no effect on final posture. When all Ia afferents to the αMN were removed in the deafferented state, the joint undershot the movement target and stabilized gradually at a slightly larger steady state posture.

The results of abnormal efferent commands are shown in **Figure 8B**. When the ratio of α<sub>s</sub> was doubled, the movement and steady state posture were slightly affected compared to those in the normal condition. When γ<sub>s</sub> was removed, the joint undershot the movement target, similar to the deafferented case. When α<sub>d</sub> was removed or γ<sub>d</sub> kept a constant inhibition on the PNs, the elbow could not make the fast movement seen in the normal condition. The slow approach to the target posture was brought about by the action of spinal reflexes. Thus, feedforward control by the dynamic descending commands (α<sub>d</sub> and γ<sub>d</sub>) is essential for fast movements.

profile were low-pass filtered with a cut-off frequency of 10 Hz, and the simulated and experimental muscle activations were low-pass filtered with a cut-off frequency of 50 Hz (Bsh with inverse value). The *Error<sub>P</sub>* of Movements 1, 2, and 3 were 0.19 (deg), −0.04 (deg), and −0.25 (deg), and their *Error<sub>M</sub>* were −3.99 (deg), −5.66 (deg), and −1.44 (deg).

TABLE 3 | Fitting Experimental Data of Posture and Movement.

*Error<sub>P</sub> and Error<sub>M</sub>: the mean difference of θel between simulated and experimental trials (Equation 7), during terminal posture and movement, respectively.*

This general pattern of behavior in all subjects is summarized in the phase diagram of errors in **Figure 9**. As illustrated in **Figure 9A**, removing Ia(+), Ia(−), or all Ia afferents generally affected the terminal postures; these results are consistent with the behaviors observed in humans and animals (Polit and Bizzi, 1979; Gentilucci et al., 1994; Gordon et al., 1995). Removing Ia-PN, however, seemed to reduce errors in both movement and posture. Static and dynamic descending commands showed distinct impacts on posture and movement (**Figure 9B**). Deviating the ratios of α<sub>s</sub> for Bsh and Tlh from normal values yielded a wider departure from the target posture, which differed among individual subjects. The absence of γ<sub>s</sub> resulted in an error primarily in posture. The importance of dynamic commands for movement was clearly seen from the large Error<sub>M</sub> in **Figure 9B**. This is consistent with the observation in cats after the PNs innervating the upper extremity muscles were removed (Alstermark et al., 1981; Alstermark and Isa, 2012).

# Discussion

An important issue in motor control has been to differentiate the role of the γ motor system from that of the α motor system (Stein, 1974; Houk and Rymer, 1981; Bizzi et al., 1991). Although increasing evidence has revealed α-γ co-activation during movement and posture control (Vallbo, 1971; Taylor et al., 2006; Prochazka and Ellaway, 2012), the dominant effects of the α motor system have tended to diminish the functional significance of the γ motor system. Using a computational virtual arm (VA) model (He et al., 2013), Lan and He (2012) re-interpreted a set of experimental data from Stein et al. (2004), Cordo et al. (2002), and Taylor et al. (2006), and suggested that γ<sup>s</sup> may encode centrally planned information of joint angle and reinforce the planned joint angle through regulating spindle sensitivity. By coupling the VA model with a corticospinal (CS) network of PN in analyzing involuntary oscillatory movements (Hao et al., 2013), we further hypothesized that γ<sup>d</sup> represents centrally planned joint acceleration. In this paper, the combined CS-VA model was used to examine the necessity of α-γ coordination in movement and posture control. Only the set of α-γ descending commands was adjusted to fit human movement data. The close match of model behaviors to those observed in human experiments, as demonstrated in the results (**Figure 7**, **Table 3**), established the validity of the model, thereby providing a neurophysiologically realistic, multi-scale computational model to evaluate the contribution of various components of descending control in sensorimotor functions. In particular, this model will also be valuable for understanding sensorimotor dysfunctions (Hao et al., 2013) and for designing novel rehabilitation strategies for motor relearning (Zhuang et al., 2015).

A distinct feature of this computational model is the division of sensorimotor control into a movement module and a posture module by the dynamic and static descending command sets (Bizzi et al., 2008; Diedrichsen and Classen, 2012). Results of this study indicate that it is possible to coordinate the two sets of descending α-γ commands to achieve accurate control of movement dynamics and stable maintenance of final posture. The methods adopted in this study to determine descending commands are novel, in that these descending commands are specified according to proven rules of central planning for kinematics (Hogan, 1984; Flash and Hogan, 1985) and EMG data collected in human subjects performing reach-and-hold tasks. Assumptions are also adopted from previous analytical studies regarding γ<sup>s</sup> encoding of joint angle (Lan and He, 2012), and γ<sup>d</sup> representation of joint acceleration of the planned movement trajectory (Taylor et al., 2006; Hao et al., 2013). Results in **Figure 5** and **Table 2** indicate that α<sup>s</sup> and α<sup>d</sup> commands can be tuned based on EMG signals of human data at steady state and during movement. **Figure 5** shows that tuning α<sup>s</sup> based on the linear ratio of antagonistic muscles in **Figure 4** is necessary to achieve independent control of joint posture and stiffness, which is an important aspect of regulation of motor functions by the central motor system (Mussa-Ivaldi et al., 1985). For movement control, the α<sup>d</sup> command is integrated with the γ<sup>d</sup> command at the PN to distribute the activation properly to the flexor or extensor acting at the joint, and its pulse amplitude and width can be tuned according to the speed of movement and the duration of EMG bursts. We illustrated that adjusting these descending commands can fit reach-and-hold movements for a range of amplitudes and directions in different subjects. The ability of the model to fit experimental movements suggests that the computational model captures the neural mechanism of corticospinal computation, as well as the modular nature of organization and coordination of descending α-γ commands by the central motor control system (Ghez et al., 2007; Scheidt and Ghez, 2007; Scheidt et al., 2011; Poston et al., 2013).

FIGURE 9 | Phase diagram of the effect of abnormal afferent feedback and efferent commands on posture and movement for six subjects. All simulations were based on Movement 1, shown in Table 3. (A) The errors of terminal posture and movement compared with experimental trials, without Ia(+), Ia(−), Ia-PN, and all of them, respectively. *Error<sup>P</sup>* and *Error<sup>M</sup>* are the mean difference of θ<sub>el</sub> between simulated and experimental trials, during terminal posture and movement, respectively. The dashed lines represent no posture or movement error. (B) The errors of terminal posture and movement compared with experimental trials, with abnormal α<sup>s</sup>, blocked γ<sup>s</sup>, blocked α<sup>d</sup>, and constant γ<sup>d</sup>, respectively. The average *Error<sup>P</sup>* and *Error<sup>M</sup>* were 0.21 (deg) and 14.75 (deg) for the condition of α<sup>d</sup> removed, and 0.20 (deg) and 14.92 (deg) for the condition of γ<sup>d</sup> kept constant.

The model predicted the contribution of descending α-γ commands to movement and posture by deefferenting the CS-VA model in simulation. With abnormal α<sup>s</sup> command out of the range of experimental ratio (**Figures 8B**, **9B**), the terminal angle deviated from its targeted position. The large variation shown at the terminal position suggests that accurate γ <sup>s</sup> command is essential for posture maintenance. The tardy movements under abnormal α<sup>d</sup> and γ <sup>d</sup> (**Figure 9B**) indicated that fast reaching movement must be performed with proper coordination of α<sup>d</sup> and γ <sup>d</sup> commands. This is consistent with the observation that cats were unable to carry out skilled reaching movement without PN (Alstermark et al., 1981; Alstermark and Isa, 2012).

Proprioceptive afferents from muscle spindles are important for motor learning (Jeannerod, 1988; Schmidt and Lee, 2011), but have not been found indispensable for the control of movements, since deafferentation in human patients and animals did not entirely disable their movement execution. Early studies in deafferented patients indicated obvious motor dysfunction only in the form of larger errors in terminal position acquisition (Polit and Bizzi, 1979; Gentilucci et al., 1994; Gordon et al., 1995) and lower joint stiffness during posture maintenance (Bizzi et al., 1984), whereas changes in movement kinematics and muscle firing patterns were not obvious (Taub et al., 1966, 1975; Vaughan et al., 1970; Rothwell et al., 1982; Bizzi et al., 1984; Gordon et al., 1995). This implied that proprioceptive afferents played an important role in posture maintenance and contributed to fine control of movement.

This functional role of proprioceptive afferents is reiterated by deafferenting the CS-VA model (**Figures 8A**, **9A**). When Ia(+) and Ia-PN were removed separately, the movement slowed down and the terminal posture shifted; after Ia(−) was removed, the movement sped up and the terminal posture settled at a lower angle. This confirmed the positive feedback of Ia(+) and Ia-PN, and the inhibitory role of Ia(−). However, despite the difference in peak velocity, these movements presented similar bell-shaped velocity profiles and only settled at different terminal angles. The deafferented simulation showed slowed movement and increased errors in terminal angle. This was similar to the behaviors observed in deafferented primates (Polit and Bizzi, 1979), which showed that the deafferented primate after intensive training was able to control pointing movement with normal-like kinematics and muscle activation, but inaccurate positions.

# Conclusion

A corticospinal computational model based on the modular organization of movement and posture was validated in this study by fitting the model to experimental human data. Analysis of simulated movement and posture with intact and altered model structures demonstrated that it is necessary to coordinate the set of α-γ descending commands in order to achieve effective control of accurate movement dynamics and stable postures. Results suggest that the central commands of the posture module are mediated via a mono-synaptic corticospinal pathway, while those of the movement module are transmitted to spinal motoneurons through a multi-synaptic corticospinal pathway involving the propriospinal neurons (PN). The PN network plays a pivotal role in integrating the α<sup>d</sup> and γ<sup>d</sup> commands for movement generation. The model is able to capture many essential aspects of motor behaviors, such as independent regulation of joint angle and stiffness and the signature temporal pattern of EMGs, by simply tuning the α<sup>s</sup> and α<sup>d</sup> commands. This study suggests a plausible neural computational mechanism for the central motor system to control movement and posture. The model will be useful as a complementary tool to understand neural control of movements, as well as a valuable platform to aid the design of novel rehabilitation strategies for motor disabilities.

# Author Contributions

SL performed model simulation, analyzed experimental data, prepared figures and tables, and drafted the manuscript; XH and MH contributed to setting up the computational model; JM and CZ contributed to the analysis of human experiment data. CZ also carried out part of the simulation work. CN offered intellectual suggestions and edited the manuscript. NL conceived the computational approach, designed the human subject experiment, proposed the analytical method, and edited the final version of the manuscript.

# Acknowledgments

This work is supported in part by grants from the Natural Science Foundation of China (No. BC0820041 and No. BC0820010), and the National Basic Research Program of Project 973 of the Ministry of Science and Technology of China (2011CB013304).


**Conflict of Interest Statement:** The Chief Editor Prof Wu declares that, despite sharing an affiliation with the Reviewer Dr. Wang, the review process was handled objectively and no conflict of interest exists. The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Li, Zhuang, Hao, He, Marquez, Niu and Lan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# A Computational Model for Aperture Control in Reach-to-Grasp Movement Based on Predictive Variability

#### Naohiro Takemura\* † , Takao Fukui † and Toshio Inui †

*Department of Intelligence Science and Technology, Graduate School of Informatics, Kyoto University, Yoshida-honmachi, Kyoto, Japan*

#### Edited by:

*Vincent C. K. Cheung, The Chinese University of Hong Kong, Hong Kong*

#### Reviewed by:

*Robert Van Beers, VU University Amsterdam, Netherlands Yutaka Sakaguchi, University of Electro-Communications, Japan*

#### \*Correspondence:

*Naohiro Takemura naohiro.takemura@nict.go.jp*

#### †Present Address:

*Naohiro Takemura, Center for Information and Neural Networks, National Institute of Information and Communications Technology, Suita City, Japan; Takao Fukui, Department of Rehabilitation for Brain Functions, Research Institute of National Rehabilitation Center for Persons with Disabilities, Tokorozawa, Japan; Toshio Inui, Department of Psychology, Otemon Gakuin University, Ibaraki, Japan*

> Received: *31 March 2015* Accepted: *09 November 2015* Published: *10 December 2015*

#### Citation:

*Takemura N, Fukui T and Inui T (2015) A Computational Model for Aperture Control in Reach-to-Grasp Movement Based on Predictive Variability. Front. Comput. Neurosci. 9:143. doi: 10.3389/fncom.2015.00143*

In human reach-to-grasp movement, visual occlusion of a target object leads to a larger peak grip aperture compared to conditions where online vision is available. However, no previous computational and neural network models for reach-to-grasp movement explain the mechanism of this effect. We simulated the effect of online vision on the reach-to-grasp movement by proposing a computational control model based on the hypothesis that the grip aperture is controlled to compensate for both motor variability and sensory uncertainty. In this model, the aperture is formed to achieve a target aperture size that is sufficiently large to accommodate the actual target; it also includes a margin to ensure proper grasping despite sensory and motor variability. To this end, the model considers: (i) the variability of the grip aperture, which is predicted by the Kalman filter, and (ii) the uncertainty of the object size, which is affected by visual noise. Using this model, we simulated experiments in which the effect of the duration of visual occlusion was investigated. The simulation replicated the experimental result wherein the peak grip aperture increased when the target object was occluded, especially in the early phase of the movement. Both predicted motor variability and sensory uncertainty play important roles in the online visuomotor process responsible for grip aperture control.

Keywords: motor control, reach-to-grasp movement, online vision, computational model, Kalman filter

# INTRODUCTION

One of the noted behavioral features of primates is their ability to use their hands to interact with objects in various situations. Motor outputs that control the hands are generated based on environmental information collected through sensory inputs. The central nervous system (CNS) computes a suitable transformation of these sensory inputs to motor outputs that allows motor performance required in daily life. One of these performances is the reach-to-grasp movement, in which the primate extends the arm toward an object placed in front of it, and then grasps the object with its fingers.

Jeannerod (1981, 1984) has investigated behavioral properties associated with these prehension movements. In his experiments, human participants were instructed to perform natural reach-to-grasp movements toward visually presented target objects of several sizes placed at several distances. He found some basic features: the velocity profile of the hand exhibits one peak in the first half of the movement duration, while the grip aperture demonstrates one peak in the second half. The peak grip aperture (PGA) is scaled to the size of the target object (e.g., Marteniuk et al., 1990). Appropriate transformation from the visual perception of the target object into the generation of the PGA is required for successful prehension movement.

How the CNS computes this transformation from sensory input of the visual object size to motor output in the form of a PGA is not yet clear. Previous studies have reported that the PGA is larger when vision is not available during the movement (e.g., Wing et al., 1986; Jakobson and Goodale, 1991; Fukui and Inui, 2006) or when the movement speed is particularly rapid (Wing et al., 1986). One interpretation of these observations is that the grip aperture is controlled to prevent inappropriate collisions of the fingers with the target and that the PGA becomes larger when visual uncertainty and/or motor variability are increased because of visual occlusion and/or faster movement (Wing et al., 1986). Actually, if the target object is presented in an eccentric view and its actual position is uncertain, the PGA will increase linearly with the eccentricity of the view (Schlicht and Schrater, 2007), which indicates that visual uncertainty is influential in the mechanism for grip aperture control. Elucidating the underlying mechanism for this grip aperture control would be possible with the use of appropriate modeling studies. Although previous models have described the reach-to-grasp movement (Hoff and Arbib, 1993; Haggard and Wing, 1995; Zaal et al., 1998; Smeets and Brenner, 1999; Ulloa and Bullock, 2003; Simmons and Demiris, 2006), no model has yet satisfactorily explained the effect of visual uncertainty on the generation of the PGA.

Traditionally, one of the major problems in modeling motor control is determining one trajectory of movement in the presence of motor redundancy. A prevailing idea is that the CNS solves this problem by minimizing costs to generate these movement trajectories, and several costs have been proposed (e.g., Flash and Hogan, 1985; Uno et al., 1989; Harris and Wolpert, 1998; Todorov and Jordan, 2002). One index of the cost minimizations that successfully explains human reaching trajectory is movement smoothness (Flash and Hogan, 1985; Uno et al., 1989). However, why the CNS adopts such costs is not clear. Harris and Wolpert (1998) focused on the importance of the cost of the final variance, because it is clearly related to the achievement of the task. If the final body state is less variable, the movement is more likely to be successful. Motor control also needs to optimize the possibility of task achievement in the presence of motor variability (Miyamoto et al., 2004; Todorov, 2004; Trommershäuser et al., 2005). Thus, recent computational models of motor control have emphasized the importance of both sensory and motor variability (e.g., van Beers et al., 2002, 2004; Guigon et al., 2008).

In a reach-to-grasp movement, the compensation for motor variability would result in adjustment of the PGA. A minimal variance model proposed by Simmons and Demiris (2006) indeed explains the emergence of PGA in reach-to-grasp movement. Specifically, their model showed that a higher movement speed in a reach-to-grasp movement creates a larger motor variability, and the grip aperture increases in a compensatory manner to prevent undesirable collisions of the hand with the target. However, since visual occlusion of the target and/or of the hand would affect sensory uncertainty, not motor variability, it remains to be shown how sensory feedback contributes to variability-based motor control.

A framework that addresses this question is the computational model for motor control proposed by Todorov and Jordan (2002), which is based on stochastic control theory. In their model, called optimal feedback control, a copy of the motor command and the noisy sensory feedback are used for internal estimation of the current state of the body. The optimal motor command that minimizes the task cost is then computed from this state estimation. The body state is estimated based on both motor and sensory variability, and the motor command is generated by taking into account the uncertainty of the state estimation. The concept underlying this computational model could explain the effect of online vision during reach-to-grasp movement, as it gives sensory feedback the role of compensating for motor variability and improving the estimation of the current body state. Specifically, when visual feedback is absent, the motor variability would directly cause uncertainty in the estimated body state due to the loss of the compensation by sensory feedback. Our hypothesis is that a larger PGA would appear as a result of compensating for the increased uncertainty of state estimation.

In the present paper, we propose a stochastic control model for grasping that can simulate the effects of online vision. Although Todorov and Jordan (2002) verified their own model by simulating arm reaching (pointing) movement, applying their model directly to reach-to-grasp movement is difficult because the movement is too complicated to be described by the task cost that is available in their model. Instead, we avoid optimal control and consider "compensation for sensory and motor variability" for calculating the motor command, thereby preserving the concept of stochastic control.

An argument could be made that the effect of visual occlusion during grasping might be explained without invoking an online control mechanism that includes motor and sensory variability; i.e., some strategic effect against removing vision (e.g., the visual feedback schedule) might be causing a larger PGA (Jakobson and Goodale, 1991). However, Fukui and Inui (2006) investigated the temporal and spatial effects of visual occlusion on the grip configuration when several occlusion conditions were randomly presented (see also Whitwell et al., 2008) and found that removing the target vision in the early phase of the movement enlarged the PGA, which indicated that the PGA was modified by the availability of online vision, not only by the motor strategy. Here, we verify that the effect of visual occlusion on the PGA is due to an online control mechanism that includes motor and sensory variability by simulating the experiments of Fukui and Inui (2006) using the following model.

# MATERIALS AND METHODS

# Overview of the Aperture Control Model

Reach-to-grasp movement is traditionally thought to have two control components: a transportation (reaching) component and an aperture (manipulation) component (Jeannerod, 1981). Although the kinematics of trajectories of each finger are different, which may result from detailed control of each finger (Smeets and Brenner, 1999), the grip aperture size that represents the spatial relationship of these fingers shows significant correlation with the object size (Marteniuk et al., 1990).

Schlicht and Schrater (2007) suggested, based on principal components analysis, that only one principal component, which can be represented by the aperture size, would explain the difference in the trajectory of the fingers during reach-to-grasp movements where the target is placed at different visual eccentricities. Therefore, we assume that modeling the grip aperture between thumb and index finger, and not the kinematics of each finger, is sufficient for revealing the grip control mechanism influenced by online vision (see also Vilaplana and Coronado, 2006).

An outline of the model is shown in **Figure 1**. The output motor command that controls grip aperture is transmitted to the end effector (hand) with delay and noise, and this changes the grip aperture. At the same time, the efference copy of the motor command is transmitted to the State Estimator, with no delay, and the body state (grip aperture) is estimated (or predicted) by the forward model. The estimated aperture is compared with the sensory feedback derived from vision and proprioception of the actual grip aperture, and the estimation error is then used to correct the body state estimation. The next motor command is generated based on the estimated (predicted) body state and the uncertainty of the estimation. The size and uncertainty of the target object that is observed by vision are then used to calculate the motor command. Further details are described below.

# Aperture Size to Compensate for Sensorimotor Variability

In the model setting, the hand travels from the start position to the object position along the y-axis, as indicated in **Figure 2A**. This hand transportation is not modeled here. The transportation component (the profile of hand position along the y-axis) is realized by the data from the experiment by Fukui and Inui (2006). The finger aperture opens along the x-axis because the target object is a simple cylinder and the grip orientation has little effect on the aperture size. When the transportation component properly controls the hand position so that the center point between the thumb and finger matches the center of the target object, the problem becomes how the aperture size is controlled as the hand approaches the object.

FIGURE 1 | Outline of the model. The output motor command that controls grip aperture is transmitted with delay and noise. At the same time, an efference copy of the motor command is transmitted to the State Estimator, and the body state (grip aperture) is estimated (or predicted) by the forward model. The sensory prediction is computed from the estimated aperture, and is compared with the sensory feedback (vision and proprioception) of the actual grip aperture. The Kalman filter corrects the estimated body state based on the sensory prediction error. The next motor command is generated based on the estimated (predicted) body state and the variability (variance) of estimation. The size and variability (variance) of the target object that are observed by vision with noise and delay are then used to calculate the motor command. Although the transport component was assumed to contribute to the aperture control, the result of parameter fitting showed that it was not effective.

To avoid an undesirable collision during a reach-to-grasp movement, the grip aperture has to include a margin of safety that accounts for uncertainty of the hand and the object. For example, consider the situation shown in **Figure 2B** to represent the beginning of the finger closure phase of the movement. If the position on the x-axis and the diameter of the object are denoted by x<sub>o</sub> and s<sub>o</sub>, respectively, the position of the right edge of the object is x<sub>o</sub> + s<sub>o</sub>/2. Similarly, the position of the finger at the right side of the object (i.e., the index finger) is x<sub>h</sub> + a/2, if the hand position (the center point between the thumb and the finger) and the aperture size are denoted by x<sub>h</sub> and a, respectively. To avoid collision of the finger with the object, the finger position has to be to the right of the object edge, i.e., x<sub>o</sub> + s<sub>o</sub>/2 < x<sub>h</sub> + a/2. We assume that x<sub>o</sub> − x<sub>h</sub> = 0 because the hand position should be controlled to match the object position by the transportation component. However, the motor variability and sensory uncertainty of hand position relative to the object position still exist, and here, the standard deviation (SD) of the variability (uncertainty) is denoted by σ<sub>h</sub>. The variability (uncertainty) of the aperture size and the object size (whose SDs are denoted by σ<sub>a</sub> and σ<sub>o</sub>) also take effect. The situation before successful grasping has to satisfy a − s<sub>o</sub> > 0 under the condition where the variability of a − s<sub>o</sub> is characterized by the variance σ<sup>2</sup> = σ<sub>a</sub><sup>2</sup> + σ<sub>h</sub><sup>2</sup> + σ<sub>o</sub><sup>2</sup>, if Gaussian distributions and independence of these variabilities are assumed. The same situation takes place at the left side of the object. Here, the probability Φ(a, s<sub>o</sub>, σ) that the target object is inside the grip aperture is formulated by:

$$\Phi\left(a, s\_o, \sigma\right) = \frac{1}{\sqrt{2\pi\sigma^2}} \int\_0^\infty \exp\frac{-\left(a' - (a - s\_o)\right)^2}{2\sigma^2} da' \tag{1}$$

This is the cumulative distribution function that describes the probability that the grip aperture a is larger than the target object size s<sub>o</sub> in the presence of Gaussian noise.

In order to execute a grasping movement when uncertainty is involved, the CNS has to take the risk of collision into account. If we consider a particular probability φ, which the CNS adopts as a success ratio, the grip aperture to achieve is then calculated by a<sup>∗</sup> = Φ<sup>−1</sup>(φ, s<sub>o</sub>, σ). Note that φ is the probability that the CNS predicts before executing the movement with the uncertainty and variability (i.e., σ<sup>2</sup>) that the CNS estimates, and that φ does not represent the actual success rate of the grasping. Here, we call the aperture a<sup>∗</sup> the target aperture. Once the target aperture is determined, the controller generates a motor command that moves the current grip aperture toward the target aperture.
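Because Equation (1) is the upper-tail probability of a Gaussian with mean a − s<sub>o</sub> and SD σ, it equals the standard normal cumulative distribution evaluated at (a − s<sub>o</sub>)/σ, so the target aperture can be written as the object size plus σ times the standard normal quantile of φ. The following Python sketch illustrates this relation. The function names and all numerical values (object size, SDs, and φ) are hypothetical and are not taken from the fitted model; a larger object-size SD is used here only as a stand-in for visual occlusion.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # zero mean, unit SD


def grasp_probability(a, s_o, sigma):
    """Equation (1): probability that a - s_o exceeds 0 under Gaussian noise."""
    return STD_NORMAL.cdf((a - s_o) / sigma)


def target_aperture(phi, s_o, sigma):
    """Inverse of Equation (1) in a: the aperture whose success probability is phi."""
    return s_o + sigma * STD_NORMAL.inv_cdf(phi)


def combined_sd(sigma_a, sigma_h, sigma_o):
    """SD of a - s_o, assuming independent Gaussian variabilities."""
    return sqrt(sigma_a**2 + sigma_h**2 + sigma_o**2)


# Hypothetical values in mm (assumptions for illustration only).
s_o, phi = 60.0, 0.99
sigma_vision = combined_sd(3.0, 2.0, 1.0)
sigma_occluded = combined_sd(3.0, 2.0, 4.0)  # larger object-size uncertainty
a_vision = target_aperture(phi, s_o, sigma_vision)
a_occluded = target_aperture(phi, s_o, sigma_occluded)
print(f"target aperture with vision:    {a_vision:.1f} mm")
print(f"target aperture with occlusion: {a_occluded:.1f} mm")
```

In this sketch the safety margin a<sup>∗</sup> − s<sub>o</sub> grows in proportion to σ for a fixed φ, which is the qualitative behavior the model relies on.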

Among infinite possibilities of motor commands that make the current grip aperture approach the target aperture, one motor command should be determined to generate an actual movement. In human arm control, the movement trajectory is determined by minimizing a certain criterion (Flash and Hogan, 1985; Uno et al., 1989; Harris and Wolpert, 1998). Smoothness of movement is an important factor in determining the trajectory; the trajectory of point-to-point movement starting and ending with zero velocity and zero acceleration is well explained by the minimum jerk model (Flash and Hogan, 1985). However, the start point of the trajectory formulation required in the present model is not the movement start point but the current point during movement, and the initial velocity and acceleration of the trajectory are not zero. Minimum acceleration, which is another criterion for kinematic smoothness of movement, also well explains the reaching trajectory if boundary conditions (i.e., the initial and final velocity and acceleration) are specified (Ben-Itzhak and Karniel, 2008). Here, we adopted the minimum acceleration criterion for generating a motor command (Note that the model only generates the next motor command from the current grip aperture, not the whole trajectory).

To achieve successful grasping, the thumb and fingers have to attain a suitable contact with the surface of the object. At the same time, they have to avoid undesirable collision with the object. The relative timing of these requirements is different: first, for avoiding collision, and then, for contacting the object. Winges et al. (2003) investigated finger formation for different shaped objects during reach-to-grasp movement under several visual conditions, and they concluded that visual occlusion prolonged the final low-speed phase of reaching when the precise finger formation occurred. Our model is focused on the mechanism for generating peak grip aperture, not for contacting the object; therefore, we did not implement the mechanism for closing fingers upon suitably touching the object surface. Details are mentioned in the section that covers the simulation (Section Simulation).

# Formulation of Aperture Size Control

The movement of grip aperture is modeled by a state-space representation:

$$\begin{aligned} \mathbf{x}\_{t+\Delta t} &= F\mathbf{x}\_t + D u\_t + G w\_t \\ F &= \begin{bmatrix} 1 & \Delta t \\ 0 & 1 \end{bmatrix} \quad D = \begin{bmatrix} 0 \\ \Delta t \end{bmatrix} \quad G = \begin{bmatrix} 0 \\ 1 \end{bmatrix} \end{aligned} \tag{2}$$

Here, **x**<sub>t</sub> = [a<sub>t</sub>, ȧ<sub>t</sub>]<sup>T</sup>, where a<sub>t</sub> and ȧ<sub>t</sub> are the grip aperture and its velocity at time t, respectively, and [·]<sup>T</sup> denotes transposition of a vector or matrix. Motor command u<sub>t</sub> is the acceleration of the grip aperture and w<sub>t</sub> is the motor noise generated from N(0, Q<sub>t</sub>), while ∆t is a discrete time step in the simulation. The controller generates motor commands to achieve the target aperture a<sup>∗</sup> at the movement end time T. As a requirement in the experiment, the movement duration is predefined to be 1 second (Fukui and Inui, 2006). In order to simulate a smooth movement that achieves target aperture a<sup>∗</sup> and target aperture velocity ȧ<sup>∗</sup> = 0 at the movement end, we adopted the minimum acceleration criterion. The velocity at time τ (t ≤ τ ≤ T) in the minimum acceleration profile is described by:

$$\dot{a}\_{\tau} = \dot{a}\_{t} + b\_{1} \left(\tau - t\right) + b\_{2} (\tau - t)^{2} \tag{3}$$

$$b\_1 = \frac{6(a^\*-a\_t)}{(T-t)^2} - \frac{4\dot{a}\_t}{T-t} \tag{4}$$

$$b\_2 = -\frac{6\left(a^\*-a\_t\right)}{\left(T-t\right)^3} - \frac{3\dot{a}\_t}{\left(T-t\right)^2} \tag{5}$$

(See Appendix A). What the controller determines here is not the whole trajectory, but the instant acceleration at time t. Based on Equation (3), the aperture velocity at time t + ∆t is:

$$
\dot{a}\_{t+\Delta t} = \dot{a}\_t + b\_1 \Delta t + b\_2 \Delta t^2 \tag{6}
$$

Since ȧ<sub>t+∆t</sub> = ȧ<sub>t</sub> + u<sub>t</sub>∆t follows from Equation (2), the motor command at time t in the minimum acceleration criterion is:

$$
\mu\_t = b\_1 + b\_2 \Delta t \tag{7}
$$
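The feedback use of the minimum acceleration plan can be illustrated with a short, noise-free sketch. It is not the authors' implementation: the function names `min_acc_command` and `simulate` are hypothetical, the start and target apertures are made-up values, and the coefficients b<sub>1</sub> and b<sub>2</sub> are obtained here by solving the two end-point conditions (zero final velocity, final aperture equal to the target) as a 2 × 2 linear system rather than from closed-form expressions. At every step the plan is recomputed from the current state, and only the first command u<sub>t</sub> = b<sub>1</sub> + b<sub>2</sub>∆t is applied.

```python
def min_acc_command(a, v, a_target, time_left, dt):
    """Command u_t = b1 + b2*dt of a plan that ends at a_target with zero velocity."""
    D = time_left
    delta = a_target - a
    # Planned velocity over the remaining time: v(s) = v + b1*s + b2*s**2, 0 <= s <= D.
    # End-point conditions give two linear equations in b1 and b2:
    #   v(D) = 0                      ->  b1*D      + b2*D**2   = -v
    #   integral of v(s) on [0, D]    ->  b1*D**2/2 + b2*D**3/3 = delta - v*D
    m11, m12, r1 = D, D ** 2, -v
    m21, m22, r2 = D ** 2 / 2, D ** 3 / 3, delta - v * D
    det = m11 * m22 - m12 * m21          # equals -D**4 / 6, non-zero for D > 0
    b1 = (r1 * m22 - m12 * r2) / det     # Cramer's rule
    b2 = (m11 * r2 - m21 * r1) / det
    return b1 + b2 * dt


def simulate(a0, v0, a_target, T=1.0, dt=0.01):
    """Noise-free rollout of x <- F x + D u with re-planning at every step."""
    n_steps = round(T / dt)
    a, v = a0, v0
    for k in range(n_steps):
        u = min_acc_command(a, v, a_target, (n_steps - k) * dt, dt)
        a, v = a + dt * v, v + dt * u    # F = [[1, dt], [0, 1]], D = [0, dt]^T
    return a, v


# Illustrative values only (cm and cm/s); not taken from the experiment.
a_end, v_end = simulate(a0=3.0, v0=0.0, a_target=8.0)
```

Because the command is recomputed from the current state at each step, a change of the target aperture during the movement is simply followed from the next step onward, which is how the model can track a target that shrinks as the predicted uncertainty decreases.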

Although the motor command is calculated from the current and target apertures, the CNS does not know the true current state of the body. The information about body state has to be obtained by the sensory system. Here, grip aperture is observed by vision and proprioception:

$$\mathbf{y}\_t = H\mathbf{x}\_t + \mathbf{v}\_t \tag{8}$$

$$H = \begin{bmatrix} 1 & 0\\ 1 & 0 \end{bmatrix}$$

For simplicity, we assumed that only the grip aperture (not the aperture velocity) is used as input information. Here, the first element of **y**<sub>t</sub> is for vision and the second one is for proprioception, where **v**<sub>t</sub> is observation noise drawn from N(0, R<sub>t</sub>). The Kalman filter (Appendix B) estimates the body state from sensory observation. If the estimated body state is **x̂**<sub>t</sub> = [â<sub>t</sub>, ȧ̂<sub>t</sub>]<sup>T</sup>, then a<sub>t</sub> and ȧ<sub>t</sub> in Equations (4) and (5) are replaced by the estimated aperture â<sub>t</sub> and the estimated aperture velocity ȧ̂<sub>t</sub>, respectively:

$$
\hat{b}\_1 = \frac{6 \left( a^\* - \hat{a}\_t \right)}{\left( T - t \right)^2} - \frac{4 \hat{\dot{a}}\_t}{T - t} \tag{9}
$$

$$\hat{b}\_2 = -\frac{6\left(a^\*-\hat{a}\_t\right)}{\left(T-t\right)^3} - \frac{3\hat{\dot{a}}\_t}{\left(T-t\right)^2} \tag{10}$$

These equations generate the motor command based on the estimated body state.
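How the two observations of the same aperture are combined can be sketched with a single Kalman correction step for H = [[1, 0], [1, 0]] and a diagonal R<sub>t</sub>. This is a minimal illustration without delays, not the authors' code: the function name `fuse_update` and all numerical values are assumptions, and the 2 × 2 algebra is written out by hand so that only the Python standard library is needed.

```python
def fuse_update(x_hat, P, y, var_vision, var_proprio):
    """One Kalman correction for state [aperture, velocity] observed twice (vision, proprioception)."""
    a_hat, v_hat = x_hat
    (p00, p01), (p10, p11) = P
    # S = H P H^T + R = [[p00 + var_vision, p00], [p00, p00 + var_proprio]]
    det = p00 * (var_vision + var_proprio) + var_vision * var_proprio
    # K = P H^T S^-1; column 0 weights vision, column 1 weights proprioception
    K = [[p00 * var_proprio / det, p00 * var_vision / det],
         [p10 * var_proprio / det, p10 * var_vision / det]]
    innovation = (y[0] - a_hat, y[1] - a_hat)   # y - H x_hat
    a_new = a_hat + K[0][0] * innovation[0] + K[0][1] * innovation[1]
    v_new = v_hat + K[1][0] * innovation[0] + K[1][1] * innovation[1]
    # P_new = P - K H P, where K H = [[g0, 0], [g1, 0]]
    g0 = K[0][0] + K[0][1]
    g1 = K[1][0] + K[1][1]
    P_new = [[p00 - g0 * p00, p01 - g0 * p01],
             [p10 - g1 * p00, p11 - g1 * p01]]
    return (a_new, v_new), P_new, K


# Illustrative numbers: vision is the more reliable channel here.
x_new, P_new, K = fuse_update((5.0, 0.0), [[1.0, 0.0], [0.0, 1.0]],
                              y=(6.0, 6.0), var_vision=0.25, var_proprio=1.0)
```

With these numbers the visual channel receives the larger gain. If `var_vision` is made much larger than `var_proprio`, as assumed during occlusion, the weighting reverses and the estimate relies mainly on proprioception.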

The actual brain-body system undergoes transmission delays in the motor command and the sensory feedback pathway. When motor delay and sensory delay are d<sub>m</sub> and d<sub>s</sub>, respectively, Equations (2) and (8) are modified to:

$$\mathbf{x}\_{t+\Delta t} = F\mathbf{x}\_t + D\boldsymbol{u}\_{t-d\_m} + G\boldsymbol{w}\_t \tag{11}$$

$$\mathbf{y}\_t = H\mathbf{x}\_{t-d\_s} + \mathbf{v}\_t \tag{12}$$

The motor command generated at time t has to be based on the estimated body state at time t + d<sub>m</sub>. Here, Equations (9) and (10) are modified to:

$$\hat{b}\_1 = \frac{6\left(a^\* - \hat{a}\_{t+d\_m}\right)}{\left(T - \left(t + d\_m\right)\right)^2} - \frac{4\hat{\dot{a}}\_{t+d\_m}}{T - \left(t + d\_m\right)}\tag{13}$$

$$\hat{b}\_2 = -\frac{6\left(a^\*-\hat{a}\_{t+d\_m}\right)}{\left(T-\left(t+d\_m\right)\right)^3} - \frac{3\hat{\dot{a}}\_{t+d\_m}}{\left(T-\left(t+d\_m\right)\right)^2} \tag{14}$$

These equations imply that the "current" body representation in the brain precedes the actual body state. This function is necessary for predictive remapping (Colby and Duhamel, 1996; Melcher, 2007), where neural representations of planned saccade movements are modified to compensate for future changes of eye position. Similarly, the grip aperture also has to be predicted before the motor command reaches the effector during reach-to-grasp control.

# Prediction of Uncertainty

As mentioned in Section Aperture Size to Compensate for Sensorimotor Variability, the prediction error variance at the end of the movement is necessary for determining the target aperture. The variance of grip aperture σ<sub>a</sub><sup>2</sup> is obtained by prediction of the Kalman filter (Kalman, 1960; see Appendix B). The Kalman filter predicts the future body state following the state-space representation, Equation (11), as well as the variance of the prediction error. Although in the Kalman filter the variance is typically used to determine the filter gain for correcting an estimated body state based on sensory feedback, the proposed model also uses the calculated estimation variability for aperture control.
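The prediction of the end-point variance can be sketched by repeatedly applying the covariance propagation P ← F P F<sup>T</sup> + G Q G<sup>T</sup> of Appendix B until the movement end. The sketch below is an illustration under stated assumptions, not the authors' code: the function names, the starting covariance, and the noise variance are hypothetical, and the same starting covariance is reused for both horizons only to isolate the effect of the remaining time.

```python
import math


def predict_covariance(P, n_steps, dt, q):
    """Propagate a 2x2 covariance n_steps ahead with F = [[1, dt], [0, 1]], G = [0, 1]^T."""
    (p00, p01), (p10, p11) = P
    for _ in range(n_steps):
        p00, p01, p10, p11 = (
            p00 + dt * (p01 + p10) + dt * dt * p11,   # (F P F^T)[0][0]
            p01 + dt * p11,                           # (F P F^T)[0][1]
            p10 + dt * p11,                           # (F P F^T)[1][0]
            p11 + q,                                  # (F P F^T)[1][1] + Q
        )
    return [[p00, p01], [p10, p11]]


def predicted_aperture_sd(P, time_left, dt, q):
    """Predicted SD of the grip aperture at the movement end, time_left seconds ahead."""
    n_steps = round(time_left / dt)
    return math.sqrt(predict_covariance(P, n_steps, dt, q)[0][0])


P_now = [[0.01, 0.0], [0.0, 0.04]]                          # illustrative current covariance
sd_early = predicted_aperture_sd(P_now, 1.0, 0.01, q=0.5)   # whole movement still ahead
sd_late = predicted_aperture_sd(P_now, 0.3, 0.01, q=0.5)    # 300 ms left
```

A longer horizon accumulates more motor noise, so `sd_early` exceeds `sd_late`; in the model this is what makes the predicted end-point uncertainty, and with it the target aperture, shrink as the movement progresses.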

The predicted SD of hand position σ<sub>h</sub> is assumed to be proportional to the distance of residual hand transportation. That is:

$$
\sigma\_h = \alpha (h^\* - h\_t),
\tag{15}
$$

where h<sup>∗</sup> is the distance between the initial hand position and the target object, h<sub>t</sub> is the distance between the initial hand position and the hand position at time t, and α is the proportionality coefficient. This assumption is based on the idea that a larger distance to the target position gives rise to more uncertainty in the prediction of the final hand position. This agrees with the fact that travel distance and target width are proportional in Fitts' law (Fitts, 1954) for a fixed movement time. If target width is interpreted as the "tolerable size of variability of the final hand position," then Equation (15) follows. As the hand approaches the target position, the predicted SD decreases because the residual distance to be traveled decreases.

In a reach-to-grasp movement, the effect of the aperture component on the transportation component is small (Paulignan et al., 1991), especially before the movement end point. Consequently, the use of a fixed transportation component is reasonable for the simulation of the generation of peak grip aperture. Here, we used the wrist position profile (**Figure 3A**) that is obtained by temporally normalizing and averaging the experimental data of Fukui and Inui (2006).

Finally, because the target object is only observed visually, the variability is determined by visual noise. The variance of uncertainty of the target object size σ<sub>o</sub><sup>2</sup> is obtained from the vision element of the covariance matrix of sensory noise R<sub>t</sub> (see Section Simulation).

# Simulation

Using the proposed model, we simulated a visual occlusion experiment by Fukui and Inui (2006). In the simulation, movement end time T and a discrete time step ∆t were set to 1000 and 10 ms, respectively. Both visual and motor delays were set to 100 ms (Huettel et al., 2014). Because the movement onset in the experiment was defined by wrist release from a start position, the grip aperture and the aperture velocity at the movement onset were not zero. Therefore, in the simulation we set the initial grip aperture to the average grip aperture at the movement onset in all trials recorded in the experiment. Additionally, the initial motor command (acceleration) in the simulation was set to a value that achieved the average velocity at the movement onset in the experiment. The initial motor command in the simulation was generated taking into account the d<sub>m</sub> (motor delay) period preceding movement onset.

For the simulation of visual occlusion during movement, the variance of visual feedback noise was assumed to be larger than usual when vision was unavailable. On the other hand, proprioceptive noise was assumed to be constant. If the variances of visual noise during occlusion (no vision) and during non-occlusion (full vision) are denoted by σ<sub>NV</sub><sup>2</sup> and σ<sub>FV</sub><sup>2</sup>, respectively, and the variance of proprioceptive noise is denoted by σ<sub>P</sub><sup>2</sup>, the covariance matrix of sensory noise is:

$$\begin{aligned} R\_t &= \begin{bmatrix} \sigma\_{NV}^2 & 0\\ 0 & \sigma\_P^2 \end{bmatrix} \quad \text{(during occlusion)}\\ R\_t &= \begin{bmatrix} \sigma\_{FV}^2 & 0\\ 0 & \sigma\_P^2 \end{bmatrix} \quad \text{(during non-occlusion)} \end{aligned} \tag{16}$$

No correlation is assumed between visual noise and proprioceptive noise. From an engineering perspective, visual occlusion could be simulated by simply setting the Kalman gain to zero. That implementation would mean that the system estimates the target state only from Equation (2), but we did not adopt it in the current study. Westwood and Goodale (2003) demonstrated that the grip aperture during visually occluded grasping was affected by a size-contrast illusion, while the grip aperture in normal vision was not. We assume that this may reflect switching of the visual processing pathways from the dorsal "how" stream to the ventral "what" stream (Goodale and Milner, 1992). Therefore, the two kinds of visual noise assumed in the present model (Equation 16) represent these two different types of visual processing.

As described in Equation (2), motor noise is assumed to be one-dimensional and to affect the system in the same domain as the motor command. This implies that the motor noise originates from the motor command. However, the variance of motor noise is assumed to be constant; that is, Q<sub>t</sub> = σ<sub>M</sub><sup>2</sup>.

Here, the proposed model has six parameters: the variance of motor noise σ<sub>M</sub><sup>2</sup>, the variances of visual noise during occlusion and non-occlusion σ<sub>NV</sub><sup>2</sup> and σ<sub>FV</sub><sup>2</sup>, the variance of proprioceptive noise σ<sub>P</sub><sup>2</sup>, the proportionality coefficient α between the residual distance and the SD of hand position, and the probability φ that the object has to be inside the aperture. For successful grasping, the probability φ should be high. However, if φ is too high, the noise variances have to be very small in order to achieve the appropriate grip aperture, and small variances make it difficult to differentiate the visual conditions. Therefore, φ was fixed a priori at 0.9 based on the empirical observation that people may fail at prehension about once in 10 trials. The other parameters were determined by fitting the model outputs to the experimental data.

A probability φ of 0.9 might be rather small for successful grasping. However, the proposed model was not designed to explain precise control of the final phase of the movement, where the successful achievement of grasping is realized by the mechanism for adjusting the aperture size to the object size. Rather, this model aims to demonstrate how the generation of the PGA is influenced by online vision during the movement. We assume that the probability φ, which determines the PGA, does not have to be strictly high during the movement. In order to examine the effect of φ on the aperture profile, the simulation with various values for φ (= [0.8, 0.85, 0.9, 0.95, 0.99]) was also conducted.

# Parameter Fitting

Five parameters, σ<sub>M</sub>, σ<sub>P</sub>, σ<sub>FV</sub>, σ<sub>NV</sub>, and α, were determined by fitting the grip aperture profile obtained from the simulation to that in the experiment by Fukui and Inui (2006). In the experiment, vision was occluded during the reach-to-grasp movement under two experimental paradigms: a shutting paradigm (SP) and a reopening paradigm (RP). In the SP, vision was available during the target presentation and reaction time phases, and occlusion started immediately after movement onset (NV), or 150 ms (150S), 350 ms (350S), 500 ms (500S), or 700 ms (700S) after movement onset; or vision was never occluded during the movement (FV). In the RP, on the other hand, occlusion started at movement onset, and vision was recovered immediately after movement onset (FV), or 150 ms (150R), 350 ms (350R), 500 ms (500R), or 700 ms (700R) after movement onset; or vision was never recovered (NV). Both paradigms had conditions in which vision during the movement was constantly available (FV) or constantly unavailable (NV). Model parameters were fitted against the aperture profiles from these constant conditions. For the simulation and the comparison of simulated aperture profiles with those obtained from the experiment, the movement time of each trial of experimental data was normalized to 1 s, and the grip aperture and wrist position were resampled at 100 Hz by interpolating the experimental data. Note that the participants in the experiment were instructed to keep the movement duration of the trials at approximately 1000 ms. Then, the temporally normalized aperture profiles of all subjects and trials were averaged for each condition.
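The visual conditions described above, combined with the two-level sensory noise of Equation (16), can be expressed as a small lookup. This is a sketch under assumptions: the function names are hypothetical, the condition labels follow the text (FV, NV, 150S, 350R, and so on), the treatment of the exact switching instant (vision counted as changed at the switch time itself) is a choice made here, and the sensory delay is not included.

```python
def vision_available(condition, t_ms):
    """True if vision is available t_ms after movement onset (t_ms >= 0)."""
    if condition == "FV":
        return True
    if condition == "NV":
        return False
    switch_ms, paradigm = int(condition[:-1]), condition[-1]
    if paradigm == "S":        # shutting paradigm: vision until the switch, then occluded
        return t_ms < switch_ms
    if paradigm == "R":        # reopening paradigm: occluded until the switch, then vision
        return t_ms >= switch_ms
    raise ValueError(f"unknown condition: {condition!r}")


def sensory_noise_covariance(condition, t_ms, var_fv, var_nv, var_p):
    """Diagonal R_t: visual variance depends on the condition, proprioceptive variance is constant."""
    var_visual = var_fv if vision_available(condition, t_ms) else var_nv
    return [[var_visual, 0.0], [0.0, var_p]]


# Illustrative variances only; the fitted values are those reported in Table 1.
R_before = sensory_noise_covariance("150S", 100, var_fv=0.1, var_nv=2.0, var_p=0.5)
R_after = sensory_noise_covariance("150S", 200, var_fv=0.1, var_nv=2.0, var_p=0.5)
```

In a full simulation the change in R<sub>t</sub> would influence the state estimate only after the sensory delay and the motor output only after the additional motor delay.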

Fitting was performed by minimizing the sum of squared errors at each time point between the aperture profiles of the experiment and of the simulation for the two constant vision conditions (i.e., FV and NV conditions). Minimization was achieved by the Nelder-Mead simplex method (MATLAB function fminsearch). The fitting result is shown in **Table 1**.


*Note that the degrees of variability are standard deviation, not variance.*
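The temporal normalization and the fitting objective can be sketched as follows. This is an illustration rather than the original analysis code: the function names are hypothetical, linear interpolation is assumed because the interpolation method is not stated, and 101 samples (both end points of the normalized 1-s movement at 100 Hz) is likewise an assumption.

```python
def resample_normalized(samples, n_points=101):
    """Linearly interpolate one trial onto n_points equally spaced normalized times in [0, 1]."""
    last = len(samples) - 1          # requires at least two samples
    resampled = []
    for i in range(n_points):
        pos = i * last / (n_points - 1)      # fractional index into the original trial
        lo = min(int(pos), last - 1)
        frac = pos - lo
        resampled.append(samples[lo] * (1 - frac) + samples[lo + 1] * frac)
    return resampled


def sum_squared_error(experiment, simulation):
    """Objective: squared error summed over time points and over the fitted conditions."""
    return sum(
        (e - s) ** 2
        for condition in experiment
        for e, s in zip(experiment[condition], simulation[condition])
    )


# Toy profiles keyed by condition; real profiles would be the averaged FV and NV data.
cost = sum_squared_error({"FV": [1.0, 2.0], "NV": [3.0]},
                         {"FV": [1.0, 4.0], "NV": [2.0]})
```

A simplex search such as Nelder-Mead would then adjust the five free parameters, rerun the simulation for the FV and NV conditions, and keep the parameter set with the smallest value of this objective.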

As a result of fitting, parameter α converged to zero. This means that we do not have to assume a relationship between the transport and aperture components in the variability domain. In other words, the aperture component considers the graspability of the target object only with respect to the variability (uncertainty) of the grip aperture and the object size (i.e., σ<sub>a</sub> and σ<sub>o</sub>). This result is in line with the visuomotor channel hypothesis (Jeannerod, 1981, 1984, discussed below). The grip aperture profiles of the experiment and the simulation in the FV and NV conditions after parameter fitting are shown in **Figure 4**. The whole aperture profile is well-fitted to the experimental data.

The temporal development of the variables in the simulation of grasping toward a 4 cm target object in FV and 150S conditions is shown in **Figure 3**. As the hand approached the target object (**Figure 3A**), the predicted SD of the grip aperture decreased as the movement continued (**Figure 3B**) because the time for future prediction (i.e., the time left until the movement ends) decreased. Consequently, the target aperture decreased (**Figure 3C**) because the uncertainty of the final posture and the "safety margin" decreased. The profile of the grip aperture (**Figure 3D**) was generated by smooth following of the target aperture, exhibiting a peak at around 600 ms after movement onset. After the vision was occluded (**Figure 3**; the gray vertical line denotes 150-ms shutting), the visual noise increased (**Figure 3E**, dashed line) compared to the values obtained in FV conditions after sensory delay (i.e., 100 ms), and the predicted uncertainty of the final grip aperture (**Figure 3B**, dashed line), the target aperture (**Figure 3C**, dashed line) and the grip aperture (**Figure 3D**, dashed line) increased compared to the values obtained in FV condition.

The output aperture profiles of the model's simulation in all visual conditions, with the parameters described in **Table 1**, are shown in **Figure 5**. For all conditions in both paradigms, the peak grip aperture appeared around 550–650 ms, as it did in the experimental results. Despite the lack of a mechanism for closing fingers upon touching the object, the model first opened the fingers wider than the object size and then closed them. This is because the predicted variability at the final phase of the movement decreased as the movement progressed. This decrease in the predicted variability arose primarily from the decrease in the time remaining until the end of the movement. Since the Kalman filter calculates future aperture variability by repeatedly applying the incremental forward model (Appendix B), predictions that reach further into the future (i.e., predictions made in the earlier phase of the movement) are more variable.

Peak grip apertures in both the simulation and the experiment are shown in **Figure 6**. In the simulation, the peak grip apertures in the 350S, 500S, 700S, and FV (shutting paradigm) conditions had remarkably similar values, as did those in the 350R, 500R, 700R, and NV (re-opening paradigm) conditions. The peak grip apertures observed in the experiment in these conditions did not show a statistical difference, indicating that visual availability after 350 ms does not affect the peak grip aperture. The simulation replicated the experimental result that vision in the early phase of the movement is important for grasping. Note that the parameters in the model were determined by fitting to the aperture profiles of the NV and FV conditions. Other conditions were not considered in the fitting. Therefore, no temporal information about the change in visual availability was contained in the data used for fitting. Even then, the simulation still reproduced the effect of early phase visual availability on the PGA. This occurred because, in the model, most of the decrease in predicted variability took place within the first 350 ms, and consequently aperture control became insensitive to the visual condition after that time.

In order to verify the effect of online vision on the aperture profile, we investigated whether the simulated aperture velocity profiles between 150S (150R) and 350S (350R) conditions diverge as shown in the experiment of Fukui and Inui (2006). In the experiment, the difference between the aperture velocities for the 150S (150R) and 350S (350R) conditions was statistically significant after 400–475 ms following movement onset. The aperture velocity profile of the model's output in 150S, 150R, 350S, and 350R conditions for 4 and 6 cm target objects is shown in **Figure 7**. The time points when the aperture velocities diverged between 150S (150R) and 350S (350R) conditions were around 400–450 ms. Specifically, the motor response to the onset (recovery) of visual occlusion at 150 ms in 150S (150R) conditions was expected to start at 350 (i.e., 150 + 200) ms after movement onset. Considering that the response becomes noticeable only after the divergence is large enough, the result that the divergence time is similar in the simulation and experiment (i.e., 400–450 ms) indicates that the sensorimotor delay used in the simulation (=200 ms) is reasonable in this prehension task.

To verify the robustness of the model to the variability of sensorimotor delay, we performed the simulations with different sensorimotor delays (100 and 300 ms). We found similar effects of (non-)available duration of online vision on the PGA. Specifically, the PGA in 150S and 350S conditions increased with less sensorimotor delay (100 ms), and the PGA in 150R and 350R conditions increased with more sensorimotor delay (300 ms; **Figure 8**). The time of divergence in aperture velocity was dependent on sensorimotor delay (**Figure 9**). Among these three conditions, the simulation in the 200-ms delay condition was best fitted to the experimental result in terms of temporal divergence, and this 200-ms delay would be consistent with neural response latency suggested by Huettel et al. (2014).

These additional simulations indicated that this emerging effect of online vision on the PGA was not due to sensorimotor delay but to the control mechanisms of grip aperture with the predicted variability assumed in the present model.

The grasping probability parameter φ had a strong effect on the peak grip aperture. According to the simulation results in which various values were set to φ (= [0.8, 0.85, 0.9, 0.95, 0.99]) without changing other parameters, the grip aperture was larger when the probability requirement was larger (**Figure 10**). This is consistent with the intuition that a larger safety margin is preserved when humans grasp things more carefully.
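The dependence of the safety margin on φ can be illustrated with a deliberately simplified rule. It is an assumption made for illustration, not the target aperture equation of the model: the difference between aperture and object size is treated as Gaussian with variance σ<sub>a</sub><sup>2</sup> + σ<sub>o</sub><sup>2</sup>, and the target aperture is the smallest value for which the object fits with probability φ. The function name `target_aperture` and all numbers are hypothetical.

```python
import math
from statistics import NormalDist


def target_aperture(object_size, sd_aperture, sd_object, phi=0.9):
    """Smallest aperture for which P(aperture > object) >= phi under a Gaussian assumption."""
    z = NormalDist().inv_cdf(phi)                    # standard normal quantile for phi
    combined_sd = math.hypot(sd_aperture, sd_object) # sqrt(sd_aperture**2 + sd_object**2)
    return object_size + z * combined_sd


# Same values of phi as in the simulations described above; sizes and SDs are illustrative (cm).
targets = [target_aperture(4.0, 0.3, 0.4, phi) for phi in (0.8, 0.85, 0.9, 0.95, 0.99)]
```

Under this assumption the margin grows monotonically with φ and scales with the combined SD, so any reduction in predicted variability during the movement lowers the target aperture for a fixed φ.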

Furthermore, we performed the simulations with minimum jerk instead of minimum acceleration because the minimum jerk was generally used as a criterion of movement smoothness in previous works. We found a slower decrease of the grip aperture in the final phase of the movement and an earlier time to PGA than observed in the experiment.

# DISCUSSION

# Sensory and Motor Variability in Models for Reach-to-grasp Movement

We proposed a model for reach-to-grasp movement that controlled grip aperture to compensate for predicted sensory uncertainty and motor variability. This is the first model that includes a control principle to explain how the grip aperture is arranged to grasp an object based on sensory uncertainty and motor variability. The formation of grip aperture was modeled in several previous studies. Hoff and Arbib (1993) divided the aperture formation into preshaping and finger closure phases. In the preshaping phase, aperture is opened toward the peak grip aperture that scaled with object size. They provide no explanation for how the peak grip aperture is actually determined. Smeets and Brenner (1999) modeled grasping as pointing of two fingers to the opposite side of the object, with the trajectory formed by minimum jerk principle. The overshoot of the grip aperture to the object size was generated by the constraint that the fingers had to approach the object orthogonally to the surface of the object. No explanation was given for determination of this "approaching parameter." Ulloa and Bullock (2003) proposed a dynamic system model that included an equation to describe the influence of the transportation component on the aperture component. These models did not capture the function of sensory and motor variability in reach-to-grasp movement.

The fitted parameter α in the present model converged to zero. This means that the prediction of wrist position variability is not necessary to determine the grip aperture in reach-to-grasp movements. However, it does not mean that there is no connection between the transport and aperture components; instead, a relationship between them in the time domain is implemented in the present model. Specifically, the grip configuration progresses by predicting future state variability at the movement end, which was pre-determined following the experimental instruction of movement time (i.e., 1000 ms). The movement time in the experiment was defined as the duration from the time when the wrist was released from the table to the time when the wrist started to move the target object approximately 30 cm towards the body. In other words, the grip configuration was partially regulated by the movement time, which is defined by the transport component. This type of temporal relationship between the two components has been modeled previously (Hoff and Arbib, 1993). Although the present model allowed for a connection between these components in the variability domain (i.e., the function of parameter α), the parameter that represents this connection converged to zero. This is consistent with the visuomotor channel hypothesis (Jeannerod, 1981, 1984), which assumes an independence of each component in visuomotor transformation. Although the visuomotor channel hypothesis does not incorporate the framework of variability-based motor control, the present model suggests that this independence of each component could be applicable to the level of variability-based motor control. The assumption of the temporal coordination of these visuomotor channels (Hoff and Arbib, 1993) is also implemented in the present model.

and Inui (2006). Error bars indicate the standard errors of the values between participants.

One possibility for the result that α became zero after fitting could be the inappropriate assumption of the proportionality between the residual distance and the predicted variability of the hand position (Equation 15). However, the fact that the grip aperture profile was successfully reproduced without assuming the predicted variability of the hand position indicates that the variability-based model of reach-to-grasp movement without assuming a transport component has the capacity to parsimoniously explain the effect of online vision on the PGA shown by Fukui and Inui (2006). Nevertheless, incorporating a predicted variability of the hand position (which might not be describable by a simple function of the residual distance) into the model will contribute to a better understanding of reach-to-grasp movements. Further investigations are required to identify the relationship between the aperture and transport components in the variability domain.

As mentioned in the introduction, Simmons and Demiris (2006) have shown the importance of variability in the control of grip aperture. They modeled aperture control as a movement that goes through via-points for each finger. They found that when motor variability was dependent on the amplitude of a motor command, the via-points that resulted in the smallest variance at the end point of the movement were chosen. However, their model only focused on motor variability. To explain the effect of visual feedback in grasping, the function of sensory uncertainty also has to be modeled. Our model includes the Kalman filter for predicting future variability of the aperture, so that sensory uncertainty also plays a role in the control.

delays. The target object is 4 cm. Top: 100 ms sensorimotor delay. Middle: 200 ms sensorimotor delay. Bottom: 300 ms sensorimotor delay. Note that the figure in the middle panel is identical to that shown in Figure 7.

The present model suggests a new computational view expanding the conventional Bayesian sensorimotor framework: the online prediction of motor variability at the task end is used for feedback control. Previous studies have demonstrated that motor variability and sensory uncertainty are integrated on the basis of the Bayesian framework (or the Kalman filter) in order to enhance the task performance in the presence of sensory and motor noise during motor planning (Harris and Wolpert, 1998; van Beers et al., 2004; Trommershäuser et al., 2005), sensorimotor learning (Körding and Wolpert, 2004) and online control (Izawa and Shadmehr, 2008). In the present model, the target aperture is determined using the predicted variability of grip aperture at the movement end. This means that the online feedback controller changes the "immediate goal (target)" necessary for achieving the task depending on both the Bayesian estimation of the current state of the body and the environment and the Bayesian prediction of a future state. This feature of the present model explains how the effect of online vision during prehension varies according to the phase of the movement. In the early phase of the movement (0–350 ms), where the prediction of a far future (> 650 ms) is required, the predicted task end state is uncertain and the sensory feedback is important to reduce this uncertainty. We consider that this online variability prediction plays an important role in performing a grasping movement.

In the present model, motor noise is assumed to be a constant during the movement rather than signal-dependent noise (Harris and Wolpert, 1998). The controller used the predicted variability to determine a target aperture (a<sup>∗</sup>), and it was impossible to predict the future variability of the grip aperture with signal-dependent noise, because the noise affects the target aperture, the motor commands used to achieve this target aperture, and, consequently, the control noise itself. A cost function is necessary to fix the motor command in the presence of signal-dependent noise. The parameter φ, which was set to 0.9 in the present study, might be determined with such a cost function and signal-dependent noise in future studies.

# Neural Computation of Variability

We calculated the future variability of the aperture and updated the variability due to sensory feedback by implementing the Kalman filter in our model. During the estimation of the visual image, forward and inverse calculations similar to the Kalman filter appear to be conducted during the visual process (Kawato et al., 1993). In a reaching movement, the states of the body and environment are estimated in a manner similar to the Kalman filter, in off-line adaptation (Izawa et al., 2008) as well as in online processing (Izawa and Shadmehr, 2008). The Bayesian inference necessary for the Kalman filter can be computed by the neural population that fires in a probabilistic manner (Ma et al., 2006), and a neural implementation of the Kalman filter has been suggested (Denève et al., 2007). However, the way that the Kalman gain is computed still remains to be resolved.

The neural substrates related to the calculation necessary for the Kalman filter have been speculated upon (Ogawa et al., 2007). During performance of a tracing or tracking movement using a computer mouse with an artificial delay introduced occasionally, the right posterior parietal cortex was activated in relation to the function that detects the error between the target and the mouse cursor. The right temporo-parietal junction was involved in state estimation of self-movement during visually guided movement. Unfortunately, these studies on neural computation related to the Kalman filter only investigated reaching movements, not grasping. However, motor control of the body is clearly related to sensory and motor variability, and a reach-to-grasp movement also has to be explained in this manner. The present study demonstrated that a computational model based on mechanisms of variability prediction with the Kalman filter explains the online effect of vision during reach-to-grasp movement.

# CONCLUSIONS

We have proposed a computational model for a reach-to-grasp movement, where the state of the hand is estimated and predicted by Kalman filters, and a motor command is generated that establishes a target grip aperture that is, in a stochastic sense, sufficiently large in relation to the target object size. The simulations of the model reproduced the effect of visual occlusion on the PGA during grasping. Online control of the movement would therefore require: (i) internal prediction of future states of the body and its variability, and (ii) motor commands based on the prediction and task constraints. The model was constructed within the framework of optimal feedback control in the sense of predictive stochastic control under sensory and motor variability, although the control has not yet been fully optimized.

# ACKNOWLEDGMENTS

This research was supported by Japan Science and Technology Agency, ERATO, Asada Synergistic Intelligence Project and by MEXT Grant-in-Aid for Scientific Research on Innovative Areas "Constructive Developmental Science" (15H01580).


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Takemura, Fukui and Inui. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# APPENDIX A

# Trajectory in Minimum Acceleration Principle

Considering the trajectory from a current state x<sub>t</sub> at current time t to a target state x<sub>T</sub> at movement end time T, the cost to minimize is described by the following:

$$C = \frac{1}{2} \int\_{t}^{T} \left(\frac{d^2 x\_{t'}}{dt'^2}\right)^2 dt' \tag{A1}$$

In a manner similar to that discussed by Flash and Hogan (1985) for the minimum jerk principle, the solution is given by a third-order polynomial:

$$x\_{\tau} = a\_0 + a\_1 \left(\tau - t\right) + a\_2 \left(\tau - t\right)^2 + a\_3 \left(\tau - t\right)^3 \tag{A2}$$

where τ denotes an arbitrary time during the movement (t ≤ τ ≤ T).

The coefficients in (A2) are obtained by applying the boundary conditions, which specify the position and velocity at time t (x<sub>t</sub> and ẋ<sub>t</sub>) and at time T (x<sub>T</sub> and ẋ<sub>T</sub>):

$$\begin{aligned} a\_0 &= \boldsymbol{x}\_t \quad a\_1 = \dot{\boldsymbol{x}}\_t\\ a\_2 &= \frac{3(\boldsymbol{x}\_T - \boldsymbol{x}\_t)}{\left(T - t\right)^2} - \frac{\dot{\boldsymbol{x}}\_T + 2\dot{\boldsymbol{x}}\_t}{\left(T - t\right)}\\ a\_3 &= -\frac{2\left(\boldsymbol{x}\_T - \boldsymbol{x}\_t\right)}{\left(T - t\right)^3} - \frac{\dot{\boldsymbol{x}}\_T + \dot{\boldsymbol{x}}\_t}{\left(T - t\right)^2} \end{aligned} \tag{A3}$$

Taking the derivative of (A2),

$$\dot{x}\_{\tau} = a\_1 + 2a\_2(\tau - t) + 3a\_3(\tau - t)^2 \tag{A4}$$

The velocity profile in the minimum acceleration principle, Equations (3)–(5), is obtained from (A3) and (A4) by taking the grip aperture as x and setting ẋ<sub>T</sub> = 0, b<sub>1</sub> = 2a<sub>2</sub>, and b<sub>2</sub> = 3a<sub>3</sub>.

# APPENDIX B

# The Kalman Filter with Time Delay

The system dynamics are described by Equations (11) and (12). The estimation of the state at time t<sup>1</sup> based on the observation until time t<sup>2</sup> is denoted by xt1|t<sup>2</sup> , and the covariance of the estimation error is denoted by Pt1|t<sup>2</sup> . Because of sensory delay, the observation at time t is used to correct the estimation of state at time t − d<sup>s</sup> :

$$K = P\_{t-d\_s|t-1} H^T \left[ H P\_{t-d\_s|t-1} H^T + R\_t \right]^{-1} \tag{B1}$$

$$
\hat{\mathbf{x}}\_{t-d\_s|t} = \hat{\mathbf{x}}\_{t-d\_s|t-1} + K(\mathbf{y}\_t - H\hat{\mathbf{x}}\_{t-d\_s|t-1}) \tag{B2}
$$

$$P\_{t-d\_s|t} = P\_{t-d\_s|t-1} - KHP\_{t-d\_s|t-1} \tag{B3}$$

Here, [·]<sup>T</sup> denotes the transpose of a vector or matrix, while [·]<sup>−1</sup> denotes the matrix inverse. The estimate of the state at time t is calculated from the corrected estimate x̂<sub>t−d<sub>s</sub>|t</sub> and the motor commands that have already been output, by applying the following in the order i = −d<sub>s</sub>, ..., −1:

$$
\hat{\mathbf{x}}\_{t+i+1|t} = F\hat{\mathbf{x}}\_{t+i|t} + D\mathbf{u}\_{t+i-d\_m} \tag{B4}
$$

$$P\_{t+i+1|t} = F P\_{t+i|t} F^T + G Q\_{t+i-d\_m} G^T \tag{B5}$$

The motor command generated at time t − 1 arrives at the end effector at time t + d<sub>m</sub> − 1 because of the motor delay. Therefore, the state estimate x̂<sub>t+d<sub>m</sub>|t</sub> at time t + d<sub>m</sub> can be calculated at time t by applying (B4) and (B5) for i = 0, ..., d<sub>m</sub> − 1. As mentioned in Section Formulation of Aperture Size Control, this estimate is necessary for generation of the motor command u<sub>t</sub>. Equation (B5) is also used to predict future variability at the end of movement. As variability is predicted farther into the future, the predicted variability increases because repeated application of Equation (B5) accumulates the motor variability Q<sub>t</sub>.
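The two stages described in this appendix, correcting the delayed estimate and then propagating it forward with the commands already issued, can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the noise covariances Q and R are held constant for brevity (the text allows them to vary with time), and the caller is assumed to supply the stored motor commands in the order in which they act on the state.

```python
import numpy as np

def correct_delayed_estimate(x_prior, P_prior, y, H, R):
    """Measurement update of the delayed state, following (B1)-(B3)."""
    S = H @ P_prior @ H.T + R
    K = P_prior @ H.T @ np.linalg.inv(S)        # (B1)
    x_post = x_prior + K @ (y - H @ x_prior)    # (B2)
    P_post = P_prior - K @ H @ P_prior          # (B3)
    return x_post, P_post

def propagate(x, P, commands, F, D, G, Q):
    """Apply (B4) and (B5) once per stored motor command."""
    for u in commands:
        x = F @ x + D @ u                       # (B4)
        P = F @ P @ F.T + G @ Q @ G.T           # (B5): variance accumulates
    return x, P

# Hypothetical scalar system with a sensory delay of two time steps.
F = D = G = H = np.array([[1.0]])
Q = np.array([[0.1]])
R = np.array([[1.0]])
x_del, P_del = correct_delayed_estimate(
    np.array([0.0]), np.array([[1.0]]), np.array([2.0]), H, R)
x_now, P_now = propagate(
    x_del, P_del, [np.array([1.0]), np.array([1.0])], F, D, G, Q)
```

Calling `propagate` with further commands extends the prediction into the future, and the growth of `P` with each step illustrates the accumulation of motor variability noted above.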

# An Assessment of Six Muscle Spindle Models for Predicting Sensory Information during Human Wrist Movements

#### Puja Malik <sup>1</sup>, Nuha Jabakhanji <sup>1</sup> and Kelvin E. Jones <sup>1, 2, 3</sup> \*

*<sup>1</sup> Department of Biomedical Engineering, University of Alberta, Edmonton, AB, Canada, <sup>2</sup> Faculty of Physical Education and Recreation, University of Alberta, Edmonton, AB, Canada, <sup>3</sup> Neuroscience and Mental Health Institute, University of Alberta, Edmonton, AB, Canada*

Background: The muscle spindle is an important sensory organ for proprioceptive *information*, yet there have been few attempts to use Shannon information theory to quantify the capacity of human muscle spindles to encode sensory input.

Methods: Computer simulations linked kinematics, to biomechanics, to six muscle spindle models that generated predictions of firing rate. The predicted firing rates were compared to firing rates of human muscle spindles recorded during a step-tracking (center-out) task to validate their use. The models were then used to predict firing rates during random movements with statistical properties matched to the ergonomics of human wrist movements. The data were analyzed for entropy and mutual information.

#### Edited by:

*Ning Lan, Shanghai Jiao Tong University, China*

#### Reviewed by:

*Robert A. Gaunt, University of Pittsburgh, USA Hong Qiao, Institute of Automation, Chinese Academy of Sciences, China*

> \*Correspondence: *Kelvin E. Jones kejones@ualberta.ca*

Received: *27 April 2015* Accepted: *21 December 2015* Published: *14 January 2016*

#### Citation:

*Malik P, Jabakhanji N and Jones KE (2016) An Assessment of Six Muscle Spindle Models for Predicting Sensory Information during Human Wrist Movements. Front. Comput. Neurosci. 9:154. doi: 10.3389/fncom.2015.00154*

Results: Three of the six models produced predictions that approximated the firing rate of human spindles during the step-tracking task. For simulated random movements these models predicted mean rates of 16.0 ± 4.1 imp/s (mean ± *SD*), peak firing rates <50 imp/s and zero firing rate during an average of 25% of the movement. The average entropy of the neural response was 4.1 ± 0.3 bits and is an estimate of the maximum information that could be carried by muscle spindles during ecologically valid movements. The information about tendon displacement preserved in the neural response was 0.10 ± 0.05 bits per symbol; whereas 1.25 ± 0.30 bits per symbol of velocity input were preserved in the neural response of the spindle models.

Conclusions: Muscle spindle models, originally based on cat experiments, have predictive value for modeling responses of human muscle spindles with minimal parameter optimization. These models predict that more than 10-fold more information is encoded about velocity than about length during ecologically valid movements. These results establish theoretical parameters for developing neuroprostheses for proprioceptive function.

Keywords: proprioception, Ia afferent, sensorimotor control, spike train, entropy

# BACKGROUND

Restoring movement using brain-machine interfaces has begun, but a great deal more work is needed to refine and develop this technology (Wessberg et al., 2000; Nicolelis, 2003; Donoghue et al., 2004; Schwartz, 2004). An area of current development is the incorporation of sensory feedback from proprioceptors, or artificial proprioceptor-like sensors (Clark et al., 2014; Fisher et al., 2014, 2015; McGee et al., 2014; Niu et al., 2014). One of the afferents involved in proprioception is the primary (Ia) afferent innervating the specialized sensory structures in skeletal muscle: muscle spindles (Gandevia, 1996; Prochazka, 1996). It is our belief that studying the sensory coding schemes that have evolved in muscle spindles will be important for developing biomimetic prosthetics.

Information theory has been a useful tool to estimate the capacity of communication channels in engineering and to design codes that take advantage of that capacity (Shannon, 1948). Engineering closed-loop neuroprosthetics with proprioceptive sensory encoding would benefit from measuring the information about movement encoded in the firing rate of the muscle spindle afferents. These data can be used to estimate the bandwidth of proprioceptive feedback in the intact sensorimotor system that is being replaced with a neuroprosthetic device. Information theory also provides an estimate of the precision of the neural code, informing the choice of resolution of synthetic sensors (McGee et al., 2014) or stimulus parameters for activating intact sensory systems (Clark et al., 2014; Fisher et al., 2014). To characterize the potential information content it is important to capture the full entropy of the sensory source that would be encountered in ecological conditions. In the visual system, investigations of neural coding moved from synthetic input, e.g., oriented bars and sinusoidal gratings, to natural stimuli, which improved the decoding algorithms (Simoncelli and Olshausen, 2001; Olshausen and Reinagel, 2003). We propose that the same approach should be used to define the information content of the sensory systems underlying proprioception. Prior to collecting experimental data, our intention was to predict the sensory information encoded in muscle spindle firing during movements of the wrist joint in humans. To make these predictions, a plausible model of human muscle spindle firing in response to movement needed to be determined.

Quantitative models based on cat muscle spindle physiology started in the late 1960s (Matthews and Stein, 1969) and continued slowly over the next three decades (Houk et al., 1981; Hasan, 1983; Scott and Loeb, 1994; Prochazka and Gorassini, 1998b; Mileusnic et al., 2006; Niu et al., 2014). To date, the mathematical models of muscle spindle primary afferents have only been tested with data sets acquired in acute and chronic cat experiments. It is imperative that these models be tested against responses from human muscle spindles, especially given suspected differences between species (Prochazka, 1999). Thus a secondary aim of this study is to determine whether mathematical models based on the cat have predictive value for human muscle spindle responses. We chose to follow the approach of Prochazka and Gorassini (1998b) who compared the performance of six models to muscle spindle data from cat hamstring muscles during locomotion: a behavioral task with natural sensory statistics. In that study there was only a modest improvement from adding an EMG signal to mimic coactivation of gamma and beta motor neurons. Therefore, we have excluded this parameter in our initial study of these models. The six models were assessed in relation to the ensemble firing rate profile of eight human extensor carpi radialis (ECR) muscle spindles (presumed group Ia) recorded during a step-tracking, center-out task. Our emphasis is on a first order assessment of the utility of these models for capturing basic features of the step-tracking task and then using the models to predict the response during a more complex task: random two-dimensional tracking (Paninski et al., 2004).

# METHODS

# Overview

The simulations and models developed for this study are schematically illustrated in **Figure 1**. The first step was generation of hypothetical wrist movements for two different tasks: center-out and random movements. The kinematics of the wrist movements were simplified to rotations about two orthogonal and intersecting axes. The movements were measured in degrees with respect to a neutral position defined as the origin of a co-ordinate system with flexion/extension movements along the x-axis (extension being positive) and radial/ulnar deviation movements along the y-axis (radial deviation being positive). This Cartesian system was transformed into a polar co-ordinate system. In the polar system pure extension movements pointed toward the right at 0°, pure flexion movements to the left at 180°, radial deviation movements pointed upwards at 90°, and ulnar deviation movements pointed downwards at 270° (**Figure 1**, left).

The resulting rotations of the wrist were input to a model of tendon displacement for the extensor carpi radialis brevis (ECRb) muscle (Loren et al., 1996). The tendon displacement output was then used as input to six different muscle spindle models to generate predicted Ia afferent firing rate. These models have been evaluated in relation to the natural movements during cat locomotion and exhibited good fits to firing rate profiles of ensembles of cat Ia afferents (Prochazka and Gorassini, 1998a,b). For our goal of predicting sensory feedback from Ia afferents during natural human wrist movements, these models seemed a clear choice. A more detailed and comprehensive model was not included in our assessment (Mileusnic et al., 2006). A major goal of this model is the accurate modeling of fusimotor effects, which we did not investigate in this initial assessment. In addition, the length input to this model is fascicle length (in units of optimal muscle fascicle length). Converting from our approximation of tendon displacement to fascicle length was dubious given the sparse experimental data from the human ECR muscle.

All simulations and analysis were performed in Matlab ver 7.0.4 and Simulink ver 6.2 with a time step of 1 ms.

# Center-Out Movement Simulations

The center-out task is a staple of sensorimotor neurophysiology and has been used extensively for 2D and 3D reaching experiments in monkeys and humans (Georgopoulos et al., 1982; Gordon et al., 1994). This task, also referred to as step-tracking, has been intensively studied during wrist movements by Hoffman and Strick in both man and monkey (Hoffman and Strick, 1986a,b, 1990, 1993, 1999; Kakei et al., 1999). Our simulations of afferent responses during the center-out task build on the widespread use of these movements for studying sensorimotor systems.

The kinematics of the center-out movements were calculated to eight equally spaced targets around a circle in wrist joint space (**Figure 1**). Minimum jerk trajectories were used as they are descriptive of wrist movements (Stein et al., 1988) and similar to the kinematics predicted by a more accurate causative optimal control model for the wrist (Haruno and Wolpert, 2005). The amplitude of the movements was 7° of joint rotation to match with a subset of data extracted from a previous human muscle spindle study (Jones et al., 2001). The simulations included a short period of 500 ms in the neutral wrist position followed by a movement phase of 650 ms that followed the minimum jerk trajectory (Flash and Hogan, 1985). The simulations continued for a period of 500 ms after reaching the target. The average velocity of movement to the targets was 10.8 deg/s with a peak velocity equal to 20.7 deg/s. These movement velocities correspond to those previously reported for the human data: average 9–10 deg/s with peak velocities of 20–30 deg/s (Jones et al., 2001).
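The trial structure described above (500 ms hold, 650 ms minimum jerk movement of 7°, 500 ms hold, 1 ms time step) can be sketched in Python using the standard fifth-order minimum jerk polynomial of Flash and Hogan (1985). The function names are hypothetical, and the assignment of target angles in 45° steps starting at 0° is an assumption, since the target numbering is not restated here.

```python
import math

def min_jerk_trial(amplitude_deg=7.0, move_s=0.650, hold_s=0.500, dt=0.001):
    """Radial position (deg) and velocity (deg/s) for one center-out trial."""
    n_hold = round(hold_s / dt)
    n_move = round(move_s / dt)
    pos, vel = [], []
    for k in range(2 * n_hold + n_move + 1):
        # Normalized movement time, clamped to 0 before and 1 after the movement.
        s = min(max((k - n_hold) / n_move, 0.0), 1.0)
        pos.append(amplitude_deg * (10 * s**3 - 15 * s**4 + 6 * s**5))
        vel.append(amplitude_deg / move_s * (30 * s**2 - 60 * s**3 + 30 * s**4))
    return pos, vel

def target_components(pos, angle_deg):
    """Project radial displacement onto the flexion/extension (x) and
    radial/ulnar deviation (y) axes for a target at angle_deg."""
    a = math.radians(angle_deg)
    return [p * math.cos(a) for p in pos], [p * math.sin(a) for p in pos]

pos, vel = min_jerk_trial()
targets = [45 * i for i in range(8)]  # assumed target angles in degrees
```

With the default arguments the mean velocity over the movement phase is 7°/0.65 s, or about 10.8 deg/s, in line with the average velocity quoted above.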

# Center-Out Movement—Human Data

Eight ECR muscle spindle afferents were selected from a larger sample of data to compare with the predicted firing rate of the spindle model. This subset of data was selected because: the data were from putative Ia afferents in the ECR muscle; the responses were obtained during a center-out task with kinematics that approximate minimum jerk trajectories; and responses were obtained during movements to all eight targets. In a previous study of the ensemble response of spindle primary afferents in the cat hamstring, the authors reported that after five or six had been averaged to estimate a population response, the addition of more units did not change the population response significantly (Prochazka and Gorassini, 1998b). On this precedent, we felt confident that our data sample was sufficient to estimate general ensemble firing rate behavior of human ECR spindle primary afferents. The spike trains were aligned to movement onset and the average movement duration was 634 ± 19 ms (mean ± SD). To calculate the mean firing rate and 95% confidence intervals for the ensemble of ECR afferents, the individual single-trial spike trains were convolved with a Gaussian kernel then averaged. The width of the kernel function (60 ms) was chosen iteratively considering the phasic response and background discharge rate while comparing the ensemble to the predicted firing rates from the six muscle spindle models.

#### TABLE 1 | Wrist movement statistics.

*Typist data was given as mean* ± *within-subject SD (*Tables 2*,* 3*.) and the probability distribution of position was assumed to be Gaussian. The non-zero mean joint position was not a parameter for the spindle models so it was zeroed in the simulations and the standard deviations were matched. Velocity was rectified prior to calculating statistics: mean* ± *SD for data; range of median velocities for simulations.*
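The ensemble rate estimate described in this section (Gaussian-kernel smoothing of aligned spike trains followed by averaging) can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, and the 60 ms kernel width is treated here as the standard deviation of the Gaussian, which is an assumption because the text does not say how the width was defined. The confidence interval calculation is omitted.

```python
import numpy as np

def gaussian_rate(spike_times_s, duration_s, sigma_s=0.060, dt=0.001):
    """Continuous firing rate (imp/s) from spike times aligned to trial start."""
    n = int(round(duration_s / dt))
    train = np.zeros(n)
    idx = np.round(np.asarray(spike_times_s, dtype=float) / dt).astype(int)
    idx = idx[(idx >= 0) & (idx < n)]
    np.add.at(train, idx, 1.0)                   # one count per spike
    half = int(round(4 * sigma_s / dt))          # kernel truncated at 4 sigma
    k_t = np.arange(-half, half + 1) * dt
    kernel = np.exp(-0.5 * (k_t / sigma_s) ** 2)
    kernel /= kernel.sum() * dt                  # unit area, so output is in imp/s
    return np.convolve(train, kernel, mode="same")

def ensemble_rate(trials, duration_s, **kwargs):
    """Average the smoothed rates of several aligned spike trains."""
    return np.mean([gaussian_rate(t, duration_s, **kwargs) for t in trials], axis=0)
```

A wider kernel suppresses the ripple caused by individual spikes at low background rates but also blurs the phasic response, which is the trade-off the authors describe when choosing the width.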

# Random Movement Simulations

While the center-out task has been widely used in sensorimotor neurophysiology to investigate neural coding, there are a number of shortcomings that have been noted (Paninski et al., 2004). First, these movements, as with all point-to-point movements, are relatively straight and exhibit bell-shaped velocity profiles where the peak velocity is proportional to the target distance. These invariant features of movement result in the coupling of position and velocity, two variables that can have independent effects on responses of muscle spindles and proprioception. Another difficulty with the center-out task is that the data are statistically non-stationary, i.e., the mean and variance change over time during the task. The non-stationarity precludes the use of any analysis methods that require a stationary stochastic time series; information-theoretic analysis for example. In addition to the inevitable coupling of position and velocity and the non-stationary data, the center-out task results in movements that occupy a limited amount of the available joint space, and while natural, do not approximate the statistics of ergonomic movements of the wrist. For these reasons, we decided to simulate a random pursuit-tracking task introduced to sensorimotor neural coding studies (Paninski et al., 1999, 2004).

The goal of the random movement simulations was to reproduce the statistics of human wrist movements during typing (Serina et al., 1999). The random movements were generated by a Gaussian random number generator so that the standard deviation of position matched the ergonomic data (**Table 1**). In order to match the statistics of wrist velocity during typing, a 6th-order low-pass Butterworth filter with cut-off frequency of 1.5 Hz was applied.
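A minimal Python sketch of this procedure is given below, using SciPy in place of the Matlab/Simulink tools of the original study. The function name, the random seeds, and the target standard deviations are placeholders (the values from Table 1 are not reproduced here), and rescaling the filtered signal to the target standard deviation is an assumption about how the position statistics were matched.

```python
import numpy as np
from scipy.signal import butter, sosfilt

def random_wrist_axis(duration_s, target_sd_deg, cutoff_hz=1.5, fs=1000.0, seed=0):
    """Low-pass filtered Gaussian noise rescaled to a target position SD (deg)."""
    rng = np.random.default_rng(seed)
    n = int(round(duration_s * fs))
    white = rng.standard_normal(n)
    # 6th-order low-pass Butterworth filter, applied causally.
    sos = butter(6, cutoff_hz, btype="low", fs=fs, output="sos")
    smooth = sosfilt(sos, white)
    smooth -= smooth.mean()                       # zero mean joint position
    return smooth * (target_sd_deg / smooth.std())

# Placeholder SDs for the two wrist axes (not the Table 1 values).
fe = random_wrist_axis(60.0, target_sd_deg=5.0, seed=1)  # flexion/extension
ru = random_wrist_axis(60.0, target_sd_deg=3.0, seed=2)  # radial/ulnar deviation
fe_velocity = np.gradient(fe, 1.0 / 1000.0)              # deg/s
```

Lowering `cutoff_hz` slows the movement without changing the position distribution, which is how a range of velocity statistics can be explored at a fixed position SD.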

Given the importance of the simulations of the random movements, we have illustrated the key features in **Figure 2**. Gaussian curves overlying the histograms (**Figures 2A,B**) illustrate that a constant position distribution was conserved in the final random wrist movement signals. The curves have the same parameters for a given axis at fast and slow speeds. Power spectral analysis was carried out for position and velocity along each wrist axis to check for a flat spectrum below eleven cut-off frequencies: 0.5–1.5 Hz in 0.1 Hz increments. The flat nature of the power spectral density plots indicated the signals were approximately white within the pass band of interest and were therefore considered random (**Figures 2E,F**).

# Tendon Displacement Model

The rotations about the two axes of the wrist were used to calculate the tendon displacement of the ECRb muscle resulting from the movements. These calculations were based on a human cadaveric forearm study that reported equations for instantaneous moment arms with respect to joint rotation (Loren et al., 1996). The equation for ECRb, in the neutral wrist posture, was integrated to generate an equation for instantaneous tendon displacement (in mm) with respect to joint rotation. Tendon displacement and velocity, computed with the transfer function 500 s/(s+500) where s is the Laplace operator, were then used as input to the six muscle spindle models (**Figure 1**). Corrections for compliance or muscle fiber pennation were not features of the spindle models tested, so were ignored for the present study.
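The velocity transfer function 500 s/(s+500) acts as a differentiator with a first-order roll-off at 500 rad/s. A minimal Python sketch of one way to realize it at a 1 ms step is shown below; the forward-Euler discretization and the function name are assumptions, and the moment-arm equation that produces the displacement input is not reproduced here.

```python
def tendon_velocity(displacement_mm, dt=0.001, pole=500.0):
    """Forward-Euler realization of 500 s / (s + 500) applied to displacement.

    Writing the filter as 500 * (1 - 500 / (s + 500)), the state z is the
    low-pass filtered displacement and the output is 500 * (x - z).
    """
    z = displacement_mm[0]            # start the state at the first sample
    out = []
    for x in displacement_mm:
        out.append(pole * (x - z))    # velocity estimate in mm/s
        z += dt * pole * (x - z)      # dz/dt = 500 * (x - z)
    return out
```

For a steady ramp the output settles to the ramp slope, and for a constant displacement it is zero, as expected of a velocity signal.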

wrist movements. (C,D) Histograms of tendon displacement and rectified velocity. At the faster movement, the tendon velocity distribution spreads out to higher values. (E,F) Polar plots of tendon displacement and rectified velocity (below) for 1.5 Hz (E) and 0.5 Hz (F). Positive displacement or velocity is black, negative is gray.

The tendon displacement and velocity during the random wrist movements are illustrated in **Figure 3** for the fastest and slowest speeds. The key feature of this figure is that it illustrates the range and distribution of the two inputs to muscle spindles during statistically natural human wrist movements. These data allow comparison of the statistics of tendon movements in humans with those in other species, e.g., cats during locomotion. In the middle column the velocity distribution has been rectified prior to calculating the median value at each of the 11 different filter rates. A Kruskal-Wallis test showed a significant difference in the median tendon velocities as a function of filter rate (p < 0.05). The last column illustrates polar plots where the angle is determined by the position in wrist joint space and the distance to a point is the value of tendon displacement or velocity. The polar plots of displacement illustrate that the ECRb tendon is stretched (positive displacement, black) for wrist positions combining flexion and ulnar deviation and shortened for wrist positions in the opposite direction. Tendon velocity is independent of wrist position and can have positive or negative values anywhere in wrist joint space. This illustrates a key objective of the random movement simulations: uncoupling wrist joint position and tendon velocity, which are unavoidably linked in the center-out task.

# Predicted Ia Afferent Firing Rates

Six muscle spindle models were used to generate predicted firing rate time series resulting from the simulated movements (Matthews and Stein, 1969; Chen and Poppele, 1978; Houk et al., 1981; Hasan, 1983; Prochazka and Gorassini, 1998b). All of these models were shown to have some predictive value for estimating ensemble Ia afferent firing rate during normal cat locomotion, though when compared with slow ramp-&-hold stretches one emerged as the most general and accurate. The muscle spindle models were implemented in Simulink, using a previously published block diagram (Figure 7 in Prochazka and Gorassini, 1998b). As the models were no longer available from the cited internet site, correct implementation was verified by comparing model behavior to the published responses during cat locomotion (Figure 3 in Prochazka and Gorassini, 1998b). The solver used in our simulations was the Euler algorithm with a step-size of 1 ms. For comparison to the human data set, the baseline firing rate of the models was set to 10 imp/s from its original 82 imp/s. This decision was made after averaging the mean firing rate for each of the eight human muscle spindles while holding in the central starting position: 10.2 (SD 1.4) imp/s. Our goal was to make the minimal adjustment to the models rather than leaving parameters free and running a traditional optimization algorithm. This was done to evaluate if the models had predictive value with minimal change, to test the hypothesis that there is little difference between cat and human muscle spindle responses.

# Data Analysis

# Center-Out

The most widespread approach to analysis of neurophysiological data during the center-out, or step-tracking, task is the "mean vector" analysis method by which the directional tuning is determined (Georgopoulos et al., 1982; Gribble and Scott, 2002). This approach has been extensively used for analysis of muscle spindle coding during passive movements of the ankle (Bergenheim et al., 2000; Roll et al., 2000, 2004; Ribot-Ciscar et al., 2002, 2003) as well as our previous study of active and passive movements of the wrist (Jones et al., 2001). The length of the mean vector was normalized to values between 0 and 1. To test for significance of directional tuning, a bootstrap test was used in which the mean rate and target angle were resampled from the original data (4000 resampling trials) and the resulting resampled vector length was compared to the original. If fewer than 200 resampled vectors were longer than the original, the directional tuning was considered significant with 95% confidence. Note that resampling was not done if the original mean vector length was <0.001, as this already indicates a non-directional distribution in circular statistics (Batschelet, 1981).
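The mean vector calculation and the bootstrap test can be sketched in Python as follows. The function names are hypothetical, and the resampling scheme is an assumption: rates and target angles are drawn independently with replacement from the per-trial data, which breaks their pairing. The original analysis may have resampled differently.

```python
import math
import random

def mean_vector(rates, angles_deg):
    """Return (normalized length, angle in degrees) of the rate-weighted mean vector."""
    x = sum(r * math.cos(math.radians(a)) for r, a in zip(rates, angles_deg))
    y = sum(r * math.sin(math.radians(a)) for r, a in zip(rates, angles_deg))
    total = sum(rates)
    if total == 0:
        return 0.0, 0.0
    return math.hypot(x, y) / total, math.degrees(math.atan2(y, x)) % 360.0

def is_tuned(rates, angles_deg, n_resamples=4000, max_longer=200, seed=0):
    """Bootstrap test: tuned if fewer than max_longer resampled vectors are longer."""
    length, _ = mean_vector(rates, angles_deg)
    if length < 0.001:               # already non-directional; skip resampling
        return False
    rng = random.Random(seed)
    n = len(rates)
    longer = 0
    for _ in range(n_resamples):
        # Assumption: independent draws with replacement break the
        # association between firing rate and target direction.
        r = [rng.choice(rates) for _ in range(n)]
        a = [rng.choice(angles_deg) for _ in range(n)]
        if mean_vector(r, a)[0] > length:
            longer += 1
    return longer < max_longer
```

The threshold of 200 out of 4000 resamples corresponds to the 5% level stated in the text.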

# Quantitative Comparison of Human Data and Model

Both qualitative and quantitative comparisons were done to evaluate the ability of the six muscle spindle models to capture the dynamics of the human experimental data during the center-out task. Root mean square (RMS) error was calculated between each model and the ensemble data on the following measures: mean firing rate, directional tuning vector length, static index, dynamic index, and temporal similarity. For the first two measures, an RMS error was calculated during two phases of the task: (1) movement to the target, and (2) holding on the target. The RMS error for a particular measure was normalized by dividing by the median RMS error for the six models. The normalized errors were summed over all the measures to give a total error score. The error score has a lower limit of zero, which corresponds to perfect prediction, and an unbounded upper limit. The consolidated error score was used to rank order the six models in their ability to predict the human ensemble data. Most of these measures are self-explanatory with full details given in Malik (2006). The measure of temporal similarity evaluated the difference between the ensemble instantaneous firing rates and the firing rates predicted by the models over intervals of 100 ms. The squared difference in firing rate at each interval was summed over the duration of a trial to each of the eight targets. The final RMS error for each model was computed by dividing the sum of squared differences by the number of 100 ms bins and number of targets, then taking the square root.
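The scoring scheme described in this section (RMS error per measure, normalization by the median across models, and summation into a total score) can be sketched in Python as follows. The function names and the dictionary layout are hypothetical and are not taken from the original analysis code.

```python
import math
from statistics import median

def rms(differences):
    """Root mean square of a list of model-minus-data differences."""
    return math.sqrt(sum(d * d for d in differences) / len(differences))

def total_error_scores(rms_by_measure):
    """Sum median-normalized RMS errors.

    rms_by_measure maps measure name -> {model name: RMS error}.
    Returns {model name: total error score}.
    """
    scores = {}
    for per_model in rms_by_measure.values():
        med = median(per_model.values())   # median over the models for this measure
        for model, err in per_model.items():
            scores[model] = scores.get(model, 0.0) + err / med
    return scores

def rank_models(scores):
    """Model names ordered from lowest (best) to highest total error."""
    return sorted(scores, key=scores.get)
```

Normalizing by the median puts measures with different units on a common scale, so no single measure dominates the summed score. The sketch assumes the median error for each measure is non-zero.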

#### Random Movements

The random movement data were analyzed using entropy and mutual information for a continuous channel (Shannon, 1948). This approach to analysis of sensory stimuli in neural systems has proven to be a powerful framework for nonparametric and nonlinear analysis (e.g., Bialek et al., 1991; Bialek and Rieke, 1992). Following Shannon we adopt the logic that information = uncertainty = entropy and, using base 2 for our logarithms, interpret the entropy as the smallest number of bits needed to communicate our signal of interest, which is tendon displacement or velocity. The ability to transmit these signals that convey the mechanical state of the muscle using muscle spindle firing rate was assessed by mutual information. This quantity measures the average number of bits an observer receives about the tendon displacement or velocity by observing firing rate. The probability density function for a signal was estimated from a histogram of the signal after which entropy was calculated. Mutual information is a linear combination of the respective entropies, I<sub>xy</sub> = H<sub>x</sub> + H<sub>y</sub> − H<sub>xy</sub>, where H is entropy, x is the input signal (tendon displacement or velocity), y is the firing rate of one of the six muscle spindle models, and H<sub>xy</sub> is their joint entropy. The relevant equations can be found in Shannon (1948), and have been reproduced in other sources (e.g., Cover and Thomas, 1991; Rieke et al., 1999). Calculations were done in Matlab using the method of Moddemeijer (1989).
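A histogram-based estimate of entropy and mutual information can be sketched in Python as follows, using the standard identity that mutual information equals the sum of the two marginal entropies minus the joint entropy. This is a simple plug-in estimator written for illustration: it is not the Moddemeijer (1989) routine, it applies no bias correction, and the function names and default bin count are assumptions.

```python
import numpy as np

def entropy_bits(counts):
    """Shannon entropy (bits) of a histogram given as an array of counts."""
    counts = np.asarray(counts, dtype=float)
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log2(p)).sum())

def mutual_information_bits(x, y, bins=16):
    """Plug-in estimate of I(x; y) = H(x) + H(y) - H(x, y) from a 2-D histogram."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    h_x = entropy_bits(joint.sum(axis=1))   # marginal over y
    h_y = entropy_bits(joint.sum(axis=0))   # marginal over x
    h_xy = entropy_bits(joint.ravel())      # joint entropy
    return h_x + h_y - h_xy
```

Here `x` would be the tendon displacement or velocity and `y` the predicted firing rate. Plug-in estimates depend on the bin count and sample size, so comparisons across models are best made with the same binning.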

# RESULTS

# Center-Out Movements

In this section we assess whether a group of six muscle spindle models, which have been developed and used to interpret data from cat studies, give a general and accurate prediction of the responses of human muscle spindles during a step-tracking task.

# Comparing Time Series Data

In the human data set that was analyzed for this study, the subjects were asked to move quickly to the targets with an emphasis on accuracy. The main movement phase to the target was in a straight line with a bell-shaped velocity. Movements to some targets, for which the ECR muscle is the primary agonist, resulted in a decrease in firing rate of the human muscle spindles as illustrated in **Figure 4**. During movement to target 2 (negative tendon displacement, **Figure 3E**), this muscle spindle was temporarily unloaded resulting in a pause in the spike train (**Figure 4A**, left upper trace). A continuous estimate of firing rate generated from the ensemble firing rates of all eight spindles in the data set showed a transient period of decreased firing rate during movement followed by recovery of firing rate during the static hold-target phase (**Figure 4A**, left middle trace). Movement in the opposite direction to target 6 (positive tendon displacement, **Figure 3E**) resulted in a burst of action potentials from the muscle spindle afferent with instantaneous firing rates reaching almost 90 imp/s followed by adaptation to a lower rate during the static phase. The continuous firing rate estimate from the ensemble showed a clear separation of dynamic and static firing rate responses during "movement to" and subsequent "holding on" the target.

FIGURE 5 | Comparing the temporal dynamics of firing rate and variability for a human muscle spindle and six models. Continuous firing rate responses from a single human ECR spindle, recorded during four repeat movements to the target at 270°. A smoothed response was generated from the spike trains using a Gaussian kernel (60 ms bandwidth) and the four repeated movements were averaged to find the mean response and 95% confidence intervals (gray band). The movement duration was 662 ± 95 ms (mean ± *SD*). Overlaid are predicted firing rates from the six models for a minimum jerk trajectory (650 ms duration) to the same target. Baseline firing rate in the models was set to 10 imp/s and the average for the human data was 8.5 imp/s. Records are aligned at movement onset (0.5 s).

The changes in firing rate predicted by the models during movements to these same targets had similar dynamic and static phases. During movements to target 2, five of the six models showed a transient decrease in firing rate during movement with recovery of firing rate during the static phase of holding on the target. Simulations of movements to target 6 showed a transient increase in predicted firing rate that was higher than the rate after reaching the target. The firing rates predicted by four of the models during the dynamic phase were of similar magnitude to the human data, but only three of these predicted rates similar to the human example in the static phase.

While the individual movements to these two targets were nearly 100 ms shorter than the simulated movements (650 ms), the grouped data for all eight spindles during movements to all targets was 634 ± 19 ms (mean ± SD). Therefore, the durations in the experimental data and simulations are of a similar magnitude to allow qualitative comparison. To check how well the models predicted the temporal behavior and variability during this task, the responses of a single spindle during four repeated movements to the same target were averaged (**Figure 5**). The data were aligned to movement onset and the average firing rate with 95% confidence intervals calculated (**Figure 5**, gray band). The human spindle showed a clear dynamic phase where the average maximum was 18 imp/s, up from a baseline of 8.5 imp/s, and a static component where average rate was 12 imp/s.

Since these movements are well described by a minimum jerk trajectory, we calculated the minimum jerk trajectory to the same target, with movement duration equal to 650 ms (**Figure 5**). Three of the models showed qualitative features similar to the human data (**Figure 5**). The remaining three models did not make good predictions of the static phase activity: one was too strong (orange) while the others were too weak (yellow and green) compared to the human data.

These results suggest that some of the models do a better job at predicting the temporal profile of human muscle spindle firing rates during this task. While we have not attempted to match the human kinematics in detail, the predicted minimum jerk trajectories result in predicted firing rates that have clear dynamic and static phases that are similar in magnitude to the average human data. Next we examine whether the models can predict firing rates during movements to all targets in the center-out task.

# Comparing Directional Tuning

The human ECR spindle data were analyzed to measure the mean firing rate during the dynamic and static phases of movement to each of the eight targets. Six of the eight afferents were directionally tuned during the dynamic (move) phase (bootstrap, p < 0.05) while only three were significantly tuned during the static (hold) phase. The data were then used to estimate the directional tuning for the population. The mean vector for the dynamic phase had a normalized length of 0.28 and an angle of 239°. The mean vector for the static phase was 0.11 long with an angle of 240°. A bootstrap test showed that both vectors were significantly tuned (p < 0.05). This result differs from the previous study in which these units were part of a larger sample of ECR spindle afferents. In the previous study, the data were not significantly tuned during the static phase (Jones et al., 2001).

All the models showed directional tuning during the dynamic phase with a mean vector angle of 225°. The lengths of the mean vectors are given in the legend for **Figure 6** where the directional tuning is illustrated in polar coordinates. The central gray area illustrates the distribution of the human data and the black arrow the direction of the mean vector. The accuracy of the mean vector predicted by the models (**Figure 6** red arrow) was surprising given the variability in the human data and the simplifying assumptions for the biomechanical wrist model and simulations. The human data show higher firing rates than those predicted by the models in the direction opposite the preferred direction, which could be due to co-activation of gamma or beta motor neurons not accounted for in these simulations.

green) were not significantly tuned during the hold phase (mean vector length = 0.0), while the others had the same preferred direction (225°) and normalized mean vector lengths of: 0.25 (red), 0.64 (orange), 0.28 (blue), and 0.28 (purple). The mean vector for the human data had an angle of 240° and a length of 0.11, which was significant (*p* < 0.05).

The data in **Figures 4**–**6** provide qualitative evaluation of the ability of the models to match the human ensemble data. RMS errors were calculated for seven different measures during the center-out task during movements to, and holding on, the eight targets to provide a quantitative comparison of the models. The RMS errors were normalized and summed to produce a total error score, which is a relative measure of the models against each other. These data are presented in **Table 2** and indicate that the three models with the lowest error scores in rank order are: red, Chen and Poppele (1978); purple, Hasan (1983); and blue, Prochazka and Gorassini 2 (Prochazka and Gorassini, 1998b). Therefore, these models provide the best overall prediction of the ensemble human muscle spindle data.
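The total error score can be illustrated with a short sketch. It assumes, following the Table 2 legend, that each RMS error is divided by the median error of all models for that measure and that the normalized errors are then summed per model. The function names and the three-model, two-measure example values are hypothetical and are not data from the study.

```python
import math
from statistics import median


def rms_error(predicted, observed):
    """Root-mean-square difference between two equal-length sequences."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def total_error_scores(errors):
    """errors maps model name -> list of RMS errors, one per measure.
    Returns model name -> sum of median-normalized errors."""
    models = list(errors)
    n_measures = len(errors[models[0]])
    # Median across models for each measure (assumed to be nonzero).
    medians = [median(errors[m][j] for m in models) for j in range(n_measures)]
    return {
        m: sum(errors[m][j] / medians[j] for j in range(n_measures))
        for m in models
    }


# Hypothetical RMS errors for three models on two measures.
example = {"A": [1.0, 4.0], "B": [2.0, 2.0], "C": [4.0, 1.0]}
scores = total_error_scores(example)
ranking = sorted(scores, key=scores.get)
```

Dividing by the median puts measures with different units on a common scale before they are summed, so the score ranks models against each other rather than expressing an absolute error.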

# Random Movements

In the previous section we found that earlier models of cat muscle spindles could approximate firing rate and directional tuning of a sample of human ECR muscle spindles. The three models with the lowest overall error scores were used to estimate how much information muscle spindle firing rate carries about movement. This was achieved by simulating random movements that dissociate joint position and velocity to maximize the entropy of the stimulus while remaining an ecologically valid input.

#### TABLE 2 | Overall error for model ranking.

*RMS errors for the seven different measures were calculated then normalized by dividing by the median error of the six models in each column. Errors less than one indicate the model is in the top three for predictive ability for that measure. The last column is the sum of the normalized errors. The red model Chen and Poppele (1978) has the lowest summed error.*

#### Prediction of the Firing Rate Time Series

The time series plots of firing rate predicted by the three spindle models during simulated random movements at two speeds are shown in **Figure 7A**. The predicted firing rates during the fast movements had a striking "peaky" structure that showed sharp rises to a peak firing rate followed by periods of low to zero firing rates. The primary differences predicted by the three models are the dynamic gain and frequency response, with the Chen and Poppele (red) model having the lowest dynamic gain and frequency response. The mean predicted firing rates, over 5 min of simulated movement, for the three models ranged from 10.5 to 13.1 imp/s with peak firing rates <50 imp/s (**Table 3**). Using the mean firing rate to characterize the different models was complicated by periods of zero firing rate. The models predicted periods of unloading, or silencing, that ranged between 12.5 and 35.0% of movement duration. This had a significant effect on the probability distributions for the firing rate signals (**Figure 7B**). The probability distribution of firing for the Chen and Poppele model is unimodal (if the peak at zero is ignored) while the other two models predict a bimodal distribution of firing rates. The differences in distribution are associated with differences in the dynamic gain and frequency response illustrated in the time series plots.
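The descriptive statistics discussed above (mean firing rate with and without silent periods, peak rate, and percentage of time silent) can be computed as in the following sketch. The function name and the sample values are hypothetical. A sample counts as silent here when the predicted rate is exactly zero, which is an assumption about how silence was scored.

```python
def describe_firing(rates):
    """Summarize a firing rate time series sampled at a fixed interval (imp/s)."""
    n = len(rates)
    nonzero = [r for r in rates if r > 0]
    return {
        "mean_all": sum(rates) / n,
        # Mean over the samples during which the model is firing.
        "mean_nonzero": sum(nonzero) / len(nonzero) if nonzero else 0.0,
        "peak": max(rates),
        "percent_silent": 100.0 * (n - len(nonzero)) / n,
    }


# Hypothetical four-sample series with two silent samples.
stats = describe_firing([0.0, 0.0, 10.0, 20.0])
```

Reporting both means makes the effect of unloading explicit: the longer a model stays silent, the further the overall mean falls below the mean rate while it is firing.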

## Assessing the Representation of Tendon Displacement and Velocity Using Entropy and Mutual Information

The final goal of this investigation was to characterize the muscle spindle responses to sensory stimuli using information-theoretic analysis. The amount of information in the sensory stimuli was quantified by the entropy of tendon displacement and velocity at each of the eleven random movement speeds. The average entropy and standard error for tendon displacement was 3.05 ± 0.01 bits per symbol across all eleven speeds while the entropy of the tendon velocity signal increased linearly from 3.88 ± 0.01 at the slowest speed to 5.49 ± 0.01 bits per symbol at the fastest speed.
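Entropy in bits per symbol can be estimated from a sampled signal by discretizing it and applying Shannon's formula to the symbol frequencies. The sketch below is a plug-in estimate under assumed equal-width binning. The number of bins and the function names are illustrative choices and are not taken from the analysis reported here.

```python
import math
from collections import Counter


def discretize(signal, n_bins):
    """Map each sample to an equal-width bin index in [0, n_bins - 1]."""
    lo, hi = min(signal), max(signal)
    if hi == lo:
        return [0] * len(signal)
    width = (hi - lo) / n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in signal]


def entropy_bits(symbols):
    """Plug-in Shannon entropy (bits per symbol) of a discrete sequence."""
    n = len(symbols)
    counts = Counter(symbols)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Hypothetical signal that visits four bins equally often.
symbols = discretize([0.0, 1.0, 2.0, 3.0] * 10, n_bins=4)
h = entropy_bits(symbols)
```

With this kind of estimator the entropy cannot exceed log2 of the number of bins, so the choice of bin count sets the ceiling on the values that can be reported.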

#### TABLE 3 | Descriptive statistics of firing rates during fast random movements.

*The units for mean and peak firing rates are (imp/s). Mean firing rates were calculated in two ways, with and without (in parentheses) the time during which predicted firing rates were zero. %Silent is the percentage of time during the random movements that firing rate is zero. Hrate = total entropy of firing rates (bits). Irate,disp = mutual information between firing rate and tendon displacement. Irate,vel = mutual information between firing rate and tendon velocity. Units for mutual information are bits per symbol. The standard error for mutual information estimates was ≤0.01 in all cases.*

The muscle spindle is a physiological transducer and the mathematical representation of transduction states that the entropy of the firing rates should be less than or equal to the entropy of the source they are encoding (Theorem 7 in Shannon, 1948). The entropies of the firing rates predicted by the models at the fastest speed (**Table 3**) are all greater than the entropy for tendon displacement and less than the entropy for the tendon velocity signal. Theoretically, if the models were perfect inverse transducers of displacement and velocity then the entropy of the firing rates would be equal to the sum of their source entropies, in this case 3.05 + 5.49 = 8.54. Instead the firing rate entropies are about half this value. But these measures of firing rate entropy do not quantify how much information firing rates encode about the stimulus.

To measure the transmission of tendon displacement or velocity information in the firing rates of the models we calculated Shannon's mutual information. The mutual information values were very low between firing rate and tendon displacement (Irate,disp in **Table 3**). On average about 3.3% of the information about tendon displacement is preserved in the firing rates of the models. In comparison about 22.8% of the information about tendon velocity is preserved in the firing rates of the models during the fastest movements (Irate,vel in **Table 3**).
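Mutual information between two discretized signals can be estimated from their joint symbol counts, as sketched below. This is a plug-in estimate without bias correction. Expressing the result as a fraction of the stimulus entropy is one plausible reading of "information preserved" and is an assumption, as are the function names and example sequences.

```python
import math
from collections import Counter


def entropy_bits(symbols):
    """Plug-in Shannon entropy (bits per symbol) of a discrete sequence."""
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in Counter(symbols).values())


def mutual_information_bits(xs, ys):
    """Plug-in mutual information (bits per symbol) between paired sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def fraction_preserved(rate_symbols, stimulus_symbols):
    """Mutual information as a fraction of the stimulus entropy (assumed definition)."""
    h_stimulus = entropy_bits(stimulus_symbols)
    if h_stimulus == 0:
        return 0.0
    return mutual_information_bits(rate_symbols, stimulus_symbols) / h_stimulus


# Hypothetical cases: a response that copies the stimulus, and one unrelated to it.
stimulus = [0, 1] * 50
copied = fraction_preserved(stimulus, stimulus)
unrelated = mutual_information_bits([0, 0, 1, 1], [0, 1, 0, 1])
```

Plug-in estimates are biased upward for short records, so a comparison across models is most meaningful when every model is evaluated on the same number of samples and the same binning.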

histograms all have the same scale starting at zero firing rate and the dotted line indicates 20 imp/s. The means of these distributions, with and without using zero firing rate in the calculations, are given in Table 3. The shapes of firing rate histograms predicted by the three models are different: unimodal and bimodal.

# DISCUSSION

Muscle spindles are an important source of sensory feedback that signal information about the position and movement of joints (Stein et al., 2004). Over the past three decades recordings have been made from single muscle spindle afferents in behaving cats, monkeys and humans. The experiments are technically difficult so valid models are helpful to consolidate understanding and guide further experiments. Until now, none of the six models compared here had been tested against a human data set. Our goal in this report was to compare six models against human data and then predict responses expected from human muscle spindles during a novel experimental paradigm: random pursuit tracking.

Experimental data during a similar task should be plotted in a similar fashion to evaluate the model predictions.

In simulations of center-out movements, we found that the ensemble firing rate and directional tuning of human ECR Ia afferents were adequately captured by three of the six models: Chen and Poppele (red), Prochazka and Gorassini 2 (P&G2, blue) and Hasan (purple). The three remaining models were considered a poorer representation of the ensemble given the error scores (**Table 2**).

# Limitations

We have used minimum jerk trajectories rather than fitting to the actual kinematics; we have excluded any terms that would capture phasic gamma or beta motor neuron activity; we have excluded the finer details of the muscle geometry and biomechanics by using tendon displacement as the main stimulus input to the models; and we have used models that output firing rate as a continuous function rather than a series of action potentials. Despite these simplifications, as a first order approximation these models have clear predictive value for estimating human ECR muscle spindle response during wrist movements. One of the differences we noted between the models and human data was the prediction of zero firing rates during movements opposite to the preferred direction. The human spindle response during movement to target 2 (**Figure 4A**) was atypical for the sample of eight spindles reported here. The majority of human ECR spindles (75%) did not fall silent during movement to targets opposite to the preferred direction. In comparison, the majority of the models predict a period of zero firing rate. This difference between model predictions and human data could be reduced by adding an EMG-linked gamma motor neuron activation term. This should be done in conjunction with human experiments that include a load at the wrist since the "non-forceful" contractions in these step-tracking data do not recruit the ECR muscle strongly. We hypothesize that by creating a load that would recruit the ECR muscle during this task, responses would switch from "hamstring-like" to "triceps surae-like." The triceps surae muscles are more strongly recruited during locomotion and required an EMG-linked term to improve the prediction of the models (Prochazka and Gorassini, 1998b).

After convincing ourselves that the models had predictive value for human spindles, we simulated the random movement task. We had a number of objectives for pursuing these simulations. First, we wanted to address the issue of whether firing rates approaching those reported during cat locomotion would be predicted during human movements that were ecologically valid. The movements in human experiments have been criticized for being unnaturally slow, and this has been used as a possible explanation for the lower overall firing rates (Prochazka, 1999). Second, we wanted to use a task that could predict tuning properties of muscle spindles when joint position and velocity were dissociated. Finally, we wanted to use a task that generated data amenable to analysis by statistical methods other than the "mean vector" directional tuning approach. These objectives could have been achieved in a number of ways, but we chose to match the statistics of the movements to the ergonomics of typing.

# Importance of Natural Movement Statistics?

What is the appropriate stimulus ensemble for evaluating the sensory information encoded in muscle spindle responses? If you subscribe to the notion that sensory neurons have adapted and evolved in an environment where some stimuli are more likely than others, then the answer is found in the statistics of that natural environment. Over the past decade, it has become clear that sensory encoding in the visual and auditory systems is specifically adapted to the statistics of natural environments (Simoncelli and Olshausen, 2001; Olshausen and Reinagel, 2003). In the primary visual system for example, it has been shown that models based on synthetic stimuli do not generalize to natural stimulus statistics (David et al., 2004). Based on these findings from other areas of sensory neurophysiology, we wanted to test the muscle spindle models with natural stimulus ensembles—in a statistical sense.

The random movement simulations resulted in a number of predictions that are amenable to testing. While we have ranked the models according to similarity of responses to human data in the center-out task, the firing rates predicted during random movements will allow more definitive ranking of the models against human data. The information-theoretic analysis was useful for estimating a lower bound on information transmission by muscle spindles. Since the models output continuous functions that are filters of the original spike train, the entropy in the firing rates is less than that of the original spike trains. Less entropy means a lower capacity for transmitting information about the stimulus.

# Conclusions

We have found that models based on cat muscle spindle primary afferents have some value for predicting the dynamic and static features of human muscle spindle firing during a simple task. The remaining discrepancy between cat and human data is the lower overall mean firing rate in humans. These models were then used to predict the information content of natural wrist movements and the mutual information between kinematics and spindle firing. This analysis predicts that the information represented in spindle firing rate is weighted more strongly toward velocity than toward length. This prediction, which needs to be tested empirically, suggests that biomimetic neuroprosthetic systems should concentrate more on providing velocity feedback to central structures responsible for decoding muscle spindle signals.

# ACKNOWLEDGMENTS

The primary support for this work was a grant from The Whitaker Foundation with additional support from the Alberta Heritage Foundation for Medical Research (AHFMR) and the Canadian Institutes of Health Research (CIHR). PM received a scholarship from CIHR-INMHA. We are grateful to Lora Major and Neil Tyreman who provided assistance with preparation of figures. We are also indebted to Professor Prochazka for the implicit and explicit mentoring.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Malik, Jabakhanji and Jones. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

