BIOSIGNAL PROCESSING AND COMPUTATIONAL METHODS TO ENHANCE SENSORY MOTOR NEUROPROSTHETICS

EDITED BY: Mitsuhiro Hayashibe, David Guiraud, Jose L. Pons and Dario Farina

PUBLISHED IN: Frontiers in Neuroscience

#### *Frontiers Copyright Statement*

*© Copyright 2007-2016 Frontiers Media SA. All rights reserved. All content included on this site, such as text, graphics, logos, button icons, images, video/audio clips, downloads, data compilations and software, is the property of or is licensed to Frontiers Media SA ("Frontiers") or its licensees and/or subcontractors. The copyright in the text of individual articles is the property of their respective authors, subject to a license granted to Frontiers.*

*The compilation of articles constituting this e-book, wherever published, as well as the compilation of all other content on this site, is the exclusive property of Frontiers. For the conditions for downloading and copying of e-books from Frontiers' website, please see the Terms for Website Use. If purchasing Frontiers e-books from other websites or sources, the conditions of the website concerned apply.*

*Images and graphics not forming part of user-contributed materials may not be downloaded or copied without permission.*

*Individual articles may be downloaded and reproduced in accordance with the principles of the CC-BY licence subject to any copyright or other notices. They may not be re-sold as an e-book.*

*As author or other contributor you grant a CC-BY licence to others to reproduce your articles, including any graphics and third-party materials supplied by you, in accordance with the Conditions for Website Use and subject to any copyright notices which you include in connection with your articles and materials.*

> *All copyright, and all rights therein, are protected by national and international copyright laws.*

*The above represents a summary only. For the full conditions see the Conditions for Authors and the Conditions for Website Use.*

ISSN 1664-8714 ISBN 978-2-88919-718-7 DOI 10.3389/978-2-88919-718-7

## About Frontiers

Frontiers is more than just an open-access publisher of scholarly articles: it is a pioneering approach to the world of academia, radically improving the way scholarly research is managed. The grand vision of Frontiers is a world where all people have an equal opportunity to seek, share and generate knowledge. Frontiers provides immediate and permanent online open access to all its publications, but this alone is not enough to realize our grand goals.

## Frontiers Journal Series

The Frontiers Journal Series is a multi-tier and interdisciplinary set of open-access, online journals, promising a paradigm shift from the current review, selection and dissemination processes in academic publishing. All Frontiers journals are driven by researchers for researchers; therefore, they constitute a service to the scholarly community. At the same time, the Frontiers Journal Series operates on a revolutionary invention, the tiered publishing system, initially addressing specific communities of scholars, and gradually climbing up to broader public understanding, thus serving the interests of the lay society, too.

## Dedication to Quality

Each Frontiers article is a landmark of the highest quality, thanks to genuinely collaborative interactions between authors and review editors, who include some of the world's best academicians. Research must be certified by peers before entering a stream of knowledge that may eventually reach the public - and shape society; therefore, Frontiers only applies the most rigorous and unbiased reviews.

Frontiers revolutionizes research publishing by freely delivering the most outstanding research, evaluated with no bias from both the academic and social point of view. By applying the most advanced information technologies, Frontiers is catapulting scholarly publishing into a new generation.

## What are Frontiers Research Topics?

Frontiers Research Topics are very popular trademarks of the Frontiers Journals Series: they are collections of at least ten articles, all centered on a particular subject. With their unique mix of varied contributions from Original Research to Review Articles, Frontiers Research Topics unify the most influential researchers, the latest key findings and historical advances in a hot research area! Find out more on how to host your own Frontiers Research Topic or contribute to one as an author by contacting the Frontiers Editorial Office: researchtopics@frontiersin.org

## **BIOSIGNAL PROCESSING AND COMPUTATIONAL METHODS TO ENHANCE SENSORY MOTOR NEUROPROSTHETICS**

Topic Editors: **Mitsuhiro Hayashibe,** University of Montpellier, France; **David Guiraud,** University of Montpellier, France; **Jose L. Pons,** CSIC, Spain; **Dario Farina,** Georg-August University, Germany

PLP treatment based on Myoelectric Pattern Recognition (MPR) and Augmented Reality (AR). a) Surface electrodes and marker placed on the stump.

b) Augmented Reality – the virtual arm is controlled by phantom movements decoded using MPR.

c) Gaming using phantom movements.

d) Matching target limb postures as a rehabilitation task.

Figure taken from: Ortiz-Catalan M, Sander N, Kristoffersen MB, Håkansson B and Brånemark R (2014) Treatment of phantom limb pain (PLP) based on augmented reality and gaming controlled by myoelectric pattern recognition: a case study of a chronic PLP patient. Front. Neurosci. 8:24. doi: 10.3389/fnins.2014.00024

Though there have been many developments in sensory/motor prosthetics, they have not yet reached the level of standard, worldwide use achieved by pacemakers and cochlear implants. One challenging issue in motor prosthetics is the large variety of patient situations, which depends on the type of neurological disorder. To improve neuroprosthetic performance beyond the current limited use of such systems, robust bio-signal processing and model-based control involving the actual sensory motor state (with biosignal feedback) would bring about new modalities and applications, and could be a breakthrough toward adaptive neuroprosthetics. Recent advances in Brain Computer Interfaces (BCI) now enable patients to transmit their intention of movement. However, the functionality and controllability of motor prosthetics themselves can be further improved to take advantage of BCIs.

In this Research Topic we welcome contributions of original research articles, computational and experimental studies, review articles, and methodological advances related to biosignal processing that may enhance the functionality of sensory motor neuroprosthetics. The scope of this topic includes, but is not limited to, studies aimed at enhancing:


**Citation:** Hayashibe, M., Guiraud, D., Pons, J. L., Farina, D., eds. (2016). Biosignal Processing and Computational Methods to Enhance Sensory Motor Neuroprosthetics. Lausanne: Frontiers Media. doi: 10.3389/978-2-88919-718-7

# Table of Contents


*Treatment of phantom limb pain (PLP) based on augmented reality and gaming controlled by myoelectric pattern recognition: a case study of a chronic PLP patient*

Max Ortiz-Catalan, Nichlas Sander, Morten B. Kristoffersen, Bo Håkansson and Rickard Brånemark

*A neurochemical closed-loop controller for deep brain stimulation: toward individualized smart neuromodulation therapies*

Peter J. Grahn, Grant W. Mallory, Obaid U. Khurram, B. Michael Berry, Jan T. Hachmann, Allan J. Bieber, Kevin E. Bennet, Hoon-Ki Min, Su-Youne Chang, Kendall H. Lee and J. L. Lujan

*Post-stroke balance rehabilitation under multi-level electrotherapy: a conceptual review*

Anirban Dutta, Uttama Lahiri, Abhijit Das, Michael A. Nitsche and David Guiraud


Soichiro Morishita, Keita Sato, Hidenori Watanabe, Yukio Nishimura, Tadashi Isa, Ryu Kato, Tatsuhiro Nakamura and Hiroshi Yokoi


Thomas C. Bulea, Saurabh Prasad, Atilla Kilicarslan and Jose L. Contreras-Vidal

*Temporal alignment of electrocorticographic recordings for upper limb movement*

Omid Talakoub, Milos R. Popovic, Jessie Navaro, Clement Hamani, Erich T. Fonoff and Willy Wong


# Editorial: Biosignal processing and computational methods to enhance sensory motor neuroprosthetics

Mitsuhiro Hayashibe<sup>1</sup>\*, David Guiraud<sup>1</sup>, Jose L. Pons<sup>2</sup> and Dario Farina<sup>3</sup>

<sup>1</sup> French Institute for Research in Computer Science and Automation, University of Montpellier, Montpellier, France, <sup>2</sup> Spanish National Research Council, Madrid, Spain, <sup>3</sup> Department of Neurorehabilitation Engineering, University Medical Center Göttingen, Georg-August University, Göttingen, Germany

Keywords: neuroprosthetics, electromyography, electroencephalography, brain-computer interface, neurorehabilitation

Neuroprosthetics is an interdisciplinary field of study that comprises neuroscience, computer science, physiology, and biomedical engineering. Each of these areas contributes to enhancing the functionality of neural prostheses for the substitution or restoration of motor, sensory or cognitive functions that might have been damaged as a result of an injury or a disease. For example, cardiac pacemakers and cochlear implants substitute the functions performed by the heart and the ear by emulating biosignals with artificial pulses. These approaches require reliable bio-signal processing and computational methods to provide functional augmentation of damaged senses and actions.

#### Edited by:

José Del R. Millán, Ecole Polytechnique Fédérale de Lausanne, Switzerland

#### Reviewed by:

Ricardo Chavarriaga, Ecole Polytechnique Fédérale de Lausanne, Switzerland

\*Correspondence:

Mitsuhiro Hayashibe mitsuhiro.hayashibe@inria.fr

#### Specialty section:

This article was submitted to Neuroprosthetics, a section of the journal Frontiers in Neuroscience

Received: 29 July 2015; Accepted: 26 October 2015; Published: 05 November 2015

#### Citation:

Hayashibe M, Guiraud D, Pons JL and Farina D (2015) Editorial: Biosignal processing and computational methods to enhance sensory motor neuroprosthetics. Front. Neurosci. 9:434. doi: 10.3389/fnins.2015.00434

This Research Topic aims at bringing together recent advances in sensory motor neuroprosthetics. This issue includes research articles in all relevant areas of neuroprosthetics: (1) biosignal processing, especially of Electromyography (EMG) and Electroencephalography (EEG) signals, and other modalities of biofeedback information, (2) computational methods for modeling parts of the sensorimotor system, (3) control strategies for delivering the optimal therapy, (4) therapeutic systems aiming at providing solutions for specific pathological motor disorders, (5) man-machine interfaces, such as a brain-computer interface (BCI), as an interaction modality between the patient and the neuroprostheses.

One challenging issue in motor prosthetics is the variability in the clinical presentation of patients, who show a variety of neurological disorders and physiological conditions. In order to improve neuroprosthetic performance beyond the current limited use, reliable biosignal processing for extracting the intended neural information is needed (Farina et al., 2014). This information extraction stage can also be based on a modeling approach. Personalized neuroprosthetics with bio-signal feedback (Hayashibe et al., 2011; Borton et al., 2013; Li et al., 2014) could be a breakthrough toward intelligent neuroprosthetics. Combining different engineering techniques, such as in a hybrid approach (Del-Ama et al., 2014), is essential to expand the range of technological applications for wider patient populations. Recent advances in BCI are also relevant to this field, because they enable patients to transmit their intention of movement for both functional and rehabilitative purposes.

This Research Topic comprises original research activities at different levels of maturity, ranging from hypothesis and proof-of-concept (Dutta et al., 2014; Grahn et al., 2014b) to systems already tested with some patients. It also contains a variety of approaches, from computational methods to experimental studies. Following the recent intensive developments of advanced BCI systems (Leeb et al., 2015; Muller-Putz et al., 2015), many contributions in this Research Topic are provided in the field of BCI, both with the aim of functional replacement and for neurorehabilitation. We overview those contributions for each category.

## 1. SIGNAL PROCESSING OF EMG AND MECHANICAL SENSORS

Cervical spinal cord injury (SCI) paralyzes muscles of the hand and arm, making it difficult to perform activities of daily living. Any reaching system requires a user interface to decode parameters of an intended reach. Corbett et al. (2014) present the benefits of combining different signal sources to control the reach in people with a range of impairments. A multimodal decoding algorithm was developed in which shoulder EMGs and gaze information were used to control an assistive robot that guides the limb during reaching tasks.

Powered prostheses are often controlled using EMG signals, which may introduce high levels of uncertainty even for simple tasks. According to Bayesian theories, higher uncertainty should influence how the brain adapts the motor commands in response to the perceived errors. Johnson et al. (2014) provide a simplified comparison framework of prosthesis and able-bodied control by studying adaptation with three control interfaces: joint angle, joint torque, and EMG. Increased errors and decreased visual uncertainty led to faster adaptation. This result suggests that Bayesian models are useful for describing prosthesis control and the man-machine interaction problem.
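One way to make the Bayesian argument concrete is the standard scalar Kalman-style update for trial-by-trial adaptation. This is a textbook sketch, not the model fitted by Johnson et al. (2014); the symbols $\hat{x}_t$, $e_t$, $\sigma_p^2$ and $\sigma_s^2$ are introduced here for illustration only.

$$
\hat{x}_{t+1} = \hat{x}_t + K\, e_t, \qquad K = \frac{\sigma_p^2}{\sigma_p^2 + \sigma_s^2}
$$

Here $\hat{x}_t$ is the current estimate of the perturbation, $e_t$ is the perceived error on trial $t$, $\sigma_p^2$ is the uncertainty in the estimate and $\sigma_s^2$ is the sensory (for example, visual) uncertainty about the error. As $\sigma_s^2$ shrinks, $K$ approaches 1 and each error produces a larger correction, which is consistent with the faster adaptation reported under decreased visual uncertainty.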

Lambrecht et al. (2014) present the first steps toward a more user-friendly and context-aware neuroprosthesis for tremor suppression and real-time monitoring. This methodology will enable the monitoring of tremor with context awareness by facilitating the automatic identification of the relative orientation of the sensor location.

## 2. COMPUTATIONAL METHODS FOR MODELING TARGETED SENSORIMOTOR SYSTEMS AND CONTROL OF NEUROPROSTHETICS

This section overviews articles that are oriented toward new types of modeling and control for sensory motor neuroprosthetics.

An equilibrium-point control of human elbow-joint movement is proposed in Matsui et al. (2014) by using multichannel functional electrical stimulation (FES). In this study, a computational electrical stimulation method that stimulates units of agonist-antagonist muscle pairs is developed. The muscle co-contraction level and the total force were controlled at the elbow joint with FES. In Klauer et al. (2014), a feedback control system is proposed for Neuro-Muscular Electrical Stimulation (NMES) to enable reaching in people with no residual voluntary control of the arm and shoulder due to high-level SCI. NMES is applied to the deltoid and biceps muscles and integrated with a passive exoskeleton with three degrees of freedom (DoFs), which partially compensates for gravitational forces.

As for sensory modeling, Williams and Constandinou (2014) aimed at combining efficient implementations of biomechanical and proprioceptor models in order to generate signals that mimic human muscular proprioceptive patterns for future experimental work in prosthesis feedback. A neuro-musculoskeletal model of the upper limb with seven DoFs and 17 muscles is presented and generates real-time estimates of muscle spindle and Golgi Tendon Organ neural firing patterns. Alnajjar et al. (2015) address the concept of sensory synergies. In contrast to muscle synergies, the paper hypothesizes that sensory synergies play an essential role in integrating the overall environmental inputs to provide low-dimensional information to the CNS. To examine the hypothesis, posture control experiments involving lateral disturbance were conducted on healthy participants.

Decoding the motor intent from recorded neural signals is essential for the development of neuroprostheses. To facilitate online decoding, Abdelghani et al. (2014) describe a software platform to simulate neural motor signals recorded with peripheral nerve electrodes, such as longitudinal intrafascicular electrodes (LIFEs). The simulator uses stored motor intent signals to drive a pool of simulated motoneurons with various spike shapes, recruitment characteristics, and firing frequencies.

A review article by Grahn et al. (2014a) summarizes neuroprosthetic technology for improving functional restoration following SCI and describes BCIs suitable for control of neuroprosthetic systems with multiple degrees of freedom. Additionally, stimulation paradigms that can improve synergy with higher planning centers and improve fatigue-resistant activation of paralyzed muscles are discussed.

## 3. THERAPEUTIC SYSTEMS TARGETED TO SPECIFIC PATHOLOGICAL MOTOR DISORDERS

In this section, we overview the clinical applications enhanced by advanced computations.

Ortiz-Catalan et al. (2014) address the treatment of phantom limb pain (PLP) based on augmented reality and gaming controlled by myoelectric pattern recognition. The technology applied is non-invasive and combines the prediction of motion intent through the decoding of myoelectric signals, with the inclusion of virtual and augmented reality. As opposed to conventional mirror therapy, this system allows full range of motion and direct volitional control of the virtual limb.

Grahn et al. (2014b) demonstrate a neurochemical closed-loop controller for deep brain stimulation (DBS). This technology report article summarizes the current understanding of electrophysiological and electrochemical processing for control of neuromodulation therapies. Additionally, it describes a proof-of-principle closed-loop controller that characterizes DBS-evoked dopamine changes to adjust stimulation parameters in a rodent model of DBS.

Dutta et al. (2014) summarize post-stroke balance rehabilitation under multi-level electrotherapy. This hypothesis article presents a multi-level electrotherapy paradigm toward motor rehabilitation that postulates that while the brain acts as a controller to drive NMES, the state of the brain can be altered toward improvement of visuomotor task performance with non-invasive brain stimulation (NIBS). This leads to a multi-level electrotherapy paradigm where a virtual reality-based adaptive response technology is proposed for post-stroke balance rehabilitation.

## 4. BCI APPLIED FOR NEUROPROSTHETICS ENHANCEMENT

Here, we overview four articles related to motor intention extraction through brain signals for reaching and sit-standing by different approaches toward BCI-driven neuroprosthetics.

Choi (2013) presents the reconstruction of the joint angles of the shoulder and elbow from non-invasive electroencephalographic signals. The cortical activities were estimated from 64-channel electroencephalography (EEG) signals using hierarchical Bayesian estimation during continuous arm reaching movements. From the estimated cortical activities, a sparse linear regression method was used to reconstruct the electromyography (EMG) signals of nine arm muscles. Then, a modular artificial neural network was used to estimate four joint angles from the estimated EMG signals.

Morishita et al. (2014) address a brain-machine interface (BMI) for controlling a prosthetic arm using monkey electrocorticography (ECoG) during periodic movements. This study demonstrated an improvement of the response time for detecting the motor intention from the cortical signal. It focused on generating a trigger event from decoded muscle activity, with integrated electromyograms (iEMGs) predicted from the ECoGs.

In Lew et al. (2014), single-trial prediction of self-paced reaching directions from EEG signals is demonstrated. The study examines the feasibility of predicting movement directions in self-paced upper limb center-out reaching tasks in single trials. Spontaneous movements, executed without an external cue, are natural motor behavior in humans, so BCIs for self-paced motions are important. The article reports results from non-invasive EEG recorded from mild stroke patients and healthy participants.

Bulea et al. (2014) discuss sitting and standing intention decoded from scalp EEG recorded prior to movement execution. Low frequency signals recorded from non-invasive EEG, in particular movement-related cortical potentials (MRPs), are associated with preparation and execution of the movement. The paper investigated the ability to decode movement intent from the delta-band (0.1–4 Hz) of the EEG signal recorded immediately before the movement execution in healthy volunteers. This study demonstrates that delta-band EEG recorded immediately before the movement carries discriminative information regarding movement type.

The detection of movement-related components is useful in brain-machine interfaces. A common approach is to classify the brain activity into a number of templates or states. However, complex arm movements such as reaching and grasping are prone to cross-trial variability due to the way movements are performed. The paper by Talakoub et al. (2015) presents a method of alignment that accounts for the variability in the way the movements are conducted. Arm speed was used to align neural activity. Four subjects with ECoG electrodes implanted over their primary motor cortex performed the movements using the upper limb contralateral to the site of electrode implantation.

Human learning effects through neurofeedback in BCI are addressed in two articles in this Research Topic. In Prins et al. (2014), an adaptive BMI based on reinforcement learning that can handle inaccuracies in the feedback is described and evaluated in a simulation study. A critic confidence measure, which indicates how appropriate the feedback is for updating the decoding parameters of the user, is introduced. The results show that with the new update formulation, the critic accuracy is no longer a limiting factor for the overall performance.

Restorative BCIs are increasingly used to provide feedback of neuronal states to normalize pathological brain activity and achieve behavioral gains. However, patients often show large variability in BCI control, or even an inability to achieve it. The paper by Bauer and Gharabaghi (2015) presents a Bayesian model of neurofeedback and reinforcement learning for different threshold selection strategies, using simulation to study the impact of threshold adaptation of a linear classifier on optimizing restorative BCIs.

The contributions in this Research Topic describe a large variety of computational methods with unique approaches. Because different applications call for different approaches, there is a significant need to address patient-specific problems in neurorehabilitation and neuroprosthetics. This issue demonstrates a way to manage such complex scientific questions through biosignal processing and computational methods. The relevance of the presented contributions is attested by the fact that this Research Topic is the most viewed among all special issues in the category of neuroprosthetics under Frontiers in Neuroscience (61,182 views as of 20 Oct 2015). We would like to acknowledge all the authors of the 19 papers in this issue. As the neurofeedback loop is essential to improving neuroprosthetic control, the exchanges and discussions in this interdisciplinary field will drive the advancement of neuroprosthetic technology with an active information loop in our society. We hope this Research Topic will help trigger synergistic effects for further development among researchers in this field.

## REFERENCES


a Bayesian simulation. Front. Neurosci. 9:36. doi: 10.3389/fnins.2015.00036


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Hayashibe, Guiraud, Pons and Farina. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Multimodal decoding and congruent sensory information enhance reaching performance in subjects with cervical spinal cord injury

## *Elaine A. Corbett<sup>1,2,3</sup>, Nicholas A. Sachs<sup>4</sup>, Konrad P. Körding<sup>1,2,5</sup> and Eric J. Perreault<sup>1,2,4</sup>\**


#### *Edited by:*

*Jose L. Pons, Consejo Superior de Investigaciones Científicas, Spain*

#### *Reviewed by:*

*Ricardo Chavarriaga, Ecole Polytechnique Fédérale de Lausanne, Switzerland; Martin Lotze, University of Greifswald, Germany*

#### *\*Correspondence:*

*Eric J. Perreault, Sensory Motor Performance Program, Rehabilitation Institute of Chicago, 345 E Superior St., Chicago, IL 60611, USA e-mail: e-perreault@northwestern.edu*

Cervical spinal cord injury (SCI) paralyzes muscles of the hand and arm, making it difficult to perform activities of daily living. Restoring the ability to reach can dramatically improve quality of life for people with cervical SCI. Any reaching system requires a user interface to decode parameters of an intended reach, such as trajectory and target. A challenge in developing such decoders is that often few physiological signals related to the intended reach remain under voluntary control, especially in patients with high cervical injuries. Furthermore, the decoding problem changes when the user is controlling the motion of their limb, as opposed to an external device. The purpose of this study was to investigate the benefits of combining disparate signal sources to control reach in people with a range of impairments, and to consider the effect of two feedback approaches. Subjects with cervical SCI performed robot-assisted reaching, controlling trajectories with either shoulder electromyograms (EMGs) or EMGs combined with gaze. We then evaluated how reaching performance was influenced by task-related sensory feedback, testing the EMG-only decoder in two conditions. The first involved moving the arm with the robot, providing congruent sensory feedback through their remaining sense of proprioception. In the second, the subjects moved the robot without the arm attached, as in applications that control external devices. We found that the multimodal-decoding algorithm worked well for all subjects, enabling them to perform straight, accurate reaches. The inclusion of gaze information, used to estimate target location, was especially important for the most impaired subjects. In the absence of gaze information, congruent sensory feedback improved performance. These results highlight the importance of proprioceptive feedback, and suggest that multi-modal decoders are likely to be most beneficial for highly impaired subjects and in tasks where such feedback is unavailable.

#### **Keywords: eye-tracking, electromyography, spinal cord injury, Kalman filter, proprioceptive feedback**

## **INTRODUCTION**

Injuries to the cervical spinal cord can be devastating, resulting in lost function in both the upper and lower limbs. Many people with such injuries consider the restoration of hand and arm function to be of highest importance for improving their quality of life (Anderson, 2004; Collinger et al., 2013). People with high tetraplegia—injuries at the fourth cervical level (C4) or above—may have no movement in the arm except for possibly shoulder shrug through upper trapezius activity. For these individuals, simple every-day tasks such as feeding and grooming cannot be achieved without assistance. Consequently, methods to improve reaching and grasping could greatly increase the level of independence for this population. One of the major difficulties associated with developing such assistive devices is the limited set of physiological signals available for use in a control interface. Furthermore, sensory feedback of the reaching movement may be vital for control, and is often impaired or absent in these individuals. As more complex systems are being developed that can provide control of continuous reach trajectories to people with high tetraplegia (Crema et al., 2011; Hart et al., 2011; Cooman and Kirsch, 2012; Schearer et al., 2013), finding appropriate signal sources and developing intuitive user interfaces is an even greater challenge.

Many approaches for inferring an intended reach trajectory rely on neural signals, of which electromyograms (EMGs) are an attractive option when a non-invasive or minimally invasive approach is desired (Kilgore et al., 2008). However, when the set of available muscles is extremely limited or unrelated to the intended movements, control can be difficult and unintuitive (Williams and Kirsch, 2008). Brain-machine interfaces (BMIs) have the potential to provide more natural control (Collinger et al., 2012; Ethier et al., 2012; Hochberg et al., 2012), although most BMIs that have successfully controlled reach involve invasive cortical recordings, a technology that is currently inaccessible in most clinical situations. Combining information from disparate sources has been proposed as a solution when there are few signals accessible (Batista et al., 2008; Pfurtscheller et al., 2010; Leeb et al., 2011; Corbett et al., 2013a; Novak et al., 2013; Kirchner et al., 2014). As the set of usable signals from each individual may be different, it is important to be able to take advantage of all the useful channels available. To gain an understanding of how the benefits afforded by different combinations of signal sources are influenced by impairment level, interface approaches must be tested in users with a variety of needs and abilities.

The feedback provided to the user is also critical when controlling trajectories. The type of feedback can vary depending on the function of the interface and the needs of the user. External robotic arms can enable people to interact with their environment (Hochberg et al., 2012), typically providing only visual feedback of the robot during control. Recent research promises to enhance control by artificially providing additional feedback through electrical (Dhillon and Horch, 2005; London et al., 2008; Rossini et al., 2010; Tan et al., 2013) or optogenetic (Gilja et al., 2011) stimulation. However, a subset of users may be able to take advantage of at least some natural proprioceptive information if their arm is moved with the assistive device. This could be achieved by mechanically moving the hand and arm with a robotic exoskeleton (Cavallaro et al., 2006), or using functional electrical stimulation (FES) to stimulate the motor nerves and reanimate paralyzed muscle (Hart et al., 1998; Kilgore et al., 2008; Schearer et al., 2013). Proprioception is critical in normal motor control (Sainburg et al., 1995), and studies suggest that it can also enhance BMI performance in unimpaired monkeys (Suminski et al., 2010) and humans (Ramos-Murguialday et al., 2012). It is still unclear how assisted reaching in paralyzed individuals is affected by whether they are controlling movement of their arm vs. an external device.

The objective of this study was to investigate how the control of reach trajectories in individuals with cervical SCI was affected under various decoding conditions, by testing two critical aspects of the interface. We evaluated the utility of combining disparate signal sources to enhance trajectory control, and also compared two different feedback approaches. The participants had a wide range of impairment levels; some had substantial control of the proximal arm muscles, while others had little or no ability to move the arm. We tested their performance using two decoders (one combining gaze and EMG and another with EMG alone) in a robot-assisted reaching paradigm that we had previously developed (Corbett et al., 2013a). We also evaluated how reaching performance was influenced by task-related sensory feedback by testing the decoder using EMG alone under two conditions, comparing remote control of the robot with control in which the robot moved the arm in the task. By evaluating these tasks in people with a variety of injury characteristics we could examine the benefits of the different assisted reaching approaches with respect to their level of impairment. Portions of this work were presented at the 6th International IEEE EMBS Conference on Neural Engineering (Corbett et al., 2013b).

## **MATERIALS AND METHODS**

To establish the utility of the multimodal decoder, combining gaze and EMG, we compared its performance to a decoder that used EMG alone and one combining EMG with perfect target information. While perfect target information would be unlikely to be available in a practical setting, this condition was useful for comparison, serving as a best-case scenario for the paradigm. These comparisons were performed in subjects with a range of injury levels, so that the benefits afforded by sensor fusion for different impairment levels could be assessed. Additionally, to assess the importance of providing the subjects with congruent sensory feedback of the task we compared the assisted reaching task to a remote control paradigm with the EMG-alone decoder. This was performed by a subset of subjects who could activate sufficient muscles to make control with EMG alone viable. The decoding algorithms have been previously described in detail (Corbett et al., 2012, 2013a, 2014); here we outline the intuition behind them before describing the experiments, which are the main contribution of the present work.

## **DECODING ALGORITHMS**

We used a Bayesian approach to combining signal sources, taking into account the uncertainty inherent in the predictions of the various models. The decoder using EMG alone was a generic Kalman filter (KF) (Kalman, 1960; Wu et al., 2006). The state vector that we were trying to estimate consisted of the reach kinematics. At each time-step the KF propagated a prior estimate of the current state from the previous state posterior estimate using a linear trajectory model that described the probabilistic evolution of the state. This prior estimate was then updated using that time-step's observation—features from the corresponding window of EMG—through a linear observation model, resulting in the current posterior state estimate. For the KF, trained using reaches to a set of targets, the trajectory model biased the movement toward an "average target," while movements in other directions could be generated through the observations (the subject's EMG).
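The predict-then-update cycle described above can be sketched as follows. This is a minimal illustration of a textbook Kalman filter step, not the study's implementation; the function name `kf_step`, the matrix names `A`, `W`, `H`, `Q`, and the toy numbers are assumptions introduced here.

```python
import numpy as np

def kf_step(x_post, P_post, z, A, W, H, Q):
    """One Kalman filter time-step: predict with the trajectory model, then update with EMG features.

    x_post, P_post : previous posterior state mean and covariance
    z              : current observation (EMG feature vector)
    A, W           : linear trajectory model and its noise covariance
    H, Q           : linear observation model and its noise covariance
    """
    # Prediction: propagate the previous posterior through the trajectory model.
    x_prior = A @ x_post
    P_prior = A @ P_post @ A.T + W

    # Update: correct the prior with the innovation (observed minus predicted features).
    S = H @ P_prior @ H.T + Q                 # innovation covariance
    K = P_prior @ H.T @ np.linalg.inv(S)      # Kalman gain
    x_new = x_prior + K @ (z - H @ x_prior)
    P_new = (np.eye(len(x_prior)) - K @ H) @ P_prior
    return x_new, P_new


# Toy scalar example (all numbers are illustrative, not from the study).
A = np.array([[1.0]]); W = np.array([[0.0]])
H = np.array([[1.0]]); Q = np.array([[1.0]])
x, P = kf_step(np.array([0.0]), np.array([[1.0]]), np.array([2.0]), A, W, H, Q)
```

In the study the models were fitted from the training reaches; here they are simply given, so the sketch only shows how the prior and the observation are weighed against each other.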

We created a directional trajectory model by inserting the target position into the state vector (Kemere and Meng, 2005; Mulliken et al., 2008). With perfect knowledge of the target we called this model the KFT. In this case the trajectory model biased the movement toward the target, thus requiring less directional change through the user's EMG in the observation update. This inclusion of the target into the trajectory model also allowed for a more stereotyped model of the reach, where the hand would speed up when the target was distant and slow down when it was close (Corbett et al., 2012).
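To illustrate how placing the target in the state vector lets a linear trajectory model steer toward it, the sketch below uses a one-dimensional state `[position, velocity, target]`. The spring-damper form and the gains `stiffness` and `damping` are hand-picked assumptions for illustration; in the study the model parameters came from training data.

```python
def make_target_model(dt, stiffness, damping):
    """Transition matrix for the 1-D state [position, velocity, target].

    Velocity is pulled toward the target in proportion to the remaining
    distance and damped in proportion to the current speed; the target
    itself is held constant. The gains are hand-picked for illustration.
    """
    return [
        [1.0, dt, 0.0],
        [-stiffness * dt, 1.0 - damping * dt, stiffness * dt],
        [0.0, 0.0, 1.0],
    ]

def propagate(A, state):
    """Apply the linear trajectory model once (matrix-vector product)."""
    return [sum(a * s for a, s in zip(row, state)) for row in A]

A = make_target_model(dt=1 / 60, stiffness=25.0, damping=10.0)
state = [0.0, 0.0, 20.0]          # start at rest, target 20 cm away
speeds = []
for _ in range(180):              # 3 s at 60 Hz
    state = propagate(A, state)
    speeds.append(abs(state[1]))
```

With no observations at all, this model already carries the hand to the target with a speed profile that rises and then falls, which is why less directional correction is needed from the EMG.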

When obtaining target estimates from gaze we had to account for multiple potential targets, as people may also look at other locations before initiating a reach. To achieve this we used a mixture of KFTs (mKFT), where we initiated an instance of the KFT for each potential target and weighted them probabilistically. The weights were proportional to the product of a prior probability for each target, which we obtained from the gaze data, and the likelihood of the observations (EMG) under each model. Therefore, as the reach progressed and more EMG information was integrated, the decoder output converged to the most likely of the possible trajectories. The signal sources used in each of the algorithms are summarized in **Table 1**.
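The probabilistic weighting can be sketched as a recursive Bayesian update in which each component's weight is multiplied by the likelihood of the latest observation and then renormalized. The function names, the log-likelihood values, and the use of a weighted average for the combined output are assumptions for illustration.

```python
import math

def update_weights(weights, log_likelihoods):
    """Bayesian re-weighting: new weight is proportional to old weight times observation likelihood."""
    log_w = [math.log(w) + ll for w, ll in zip(weights, log_likelihoods)]
    peak = max(log_w)                              # subtract the max for numerical stability
    unnorm = [math.exp(lw - peak) for lw in log_w]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def mixture_estimate(weights, component_states):
    """Weighted average of the per-target KFT state estimates."""
    dim = len(component_states[0])
    return [sum(w * s[i] for w, s in zip(weights, component_states)) for i in range(dim)]

# Illustrative numbers only: three candidate targets with gaze-derived priors.
weights = [0.5, 0.3, 0.2]
for _ in range(20):
    # Pretend the EMG is consistently best explained by the second component.
    weights = update_weights(weights, [-3.0, -1.0, -2.5])
estimate = mixture_estimate(weights, [[0.0, 0.0], [10.0, 5.0], [20.0, 0.0]])
```

Although the second candidate starts with a lower prior than the first, repeated EMG evidence moves nearly all of the weight onto it, which mirrors the convergence described above.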

### **SUBJECTS**

Eight subjects with tetraplegia participated in this study. Each subject provided informed consent to the protocol, which was approved by Northwestern University's Institutional Review Board. Before commencing the experiment, we asked the subject a few basic questions to ensure safety in the experiment and to establish the number of years since his/her injury. We asked the subjects to perform shoulder flexion voluntarily, and measured the angle achieved. The subjects were separated into two groups. Group 1 consisted of the subjects who could perform more than 5° of shoulder flexion in their right arm using the deltoid muscle; subjects who had little or no voluntary ability to perform this movement made up Group 2. Shoulder flexion was the degree of freedom that best reflected the subjects' ability to perform the task (see description in Section Experimental setup). Group 1 included subjects who participated in the decoder comparison experiments (Group 1a) and the remote control experiments (Group 1b). There was substantial overlap between these groups but they were not identical due to subject availability (see **Table 2**). Group 2 only participated in the decoder comparison experiments.

#### **Table 1 | Decoders tested and the corresponding signal sources.**


#### **Table 2 | Subject details.**


## **EXPERIMENTAL SETUP**

To generate reaching movements we used a robotic system that served as an assisted reaching prosthesis. Each subject was seated in his/her own wheelchair during the experiments. For the experiments in which the subject's arm was moved through the reaches, his/her right arm was supported against gravity by an elevating mobile arm support (JAECO Orthopedic MASEAL, Hot Springs, AR), while he/she wore a wrist splint that was attached to the handle of a 3 degree-of-freedom robot (HapticMaster; Moog FCS, the Netherlands). A magnet attachment was designed to release if excessive forces were applied at the hand. The velocities predicted by the decoders were used to position the robot handle, enabling a clear comparison of performance issues related to decoders and signal sources.

All experiments involved a reaching task, either with (Assessing the influence of impairment on decoder performance) or without (Assessing the influence of proprioceptive feedback on decoding performance) the subject's arm attached to the robot. The goal of the task was to move the robot to a target on a touch-screen monitor (Planar PT19, Beaverton, OR) in front of the subject (**Figure 1A**). A spring-loaded stylus was attached to the robot end-effector, and used to detect contact with the target. The monitor and HapticMaster positions were recorded using an Optotrak motion analysis system (Northern Digital Inc., Canada) so that positions on the monitors could be transformed into the HapticMaster coordinate system. We recorded eye movements with an EYETRAC-6 head mounted eye tracker (Applied Science Laboratories, Bedford, MA), whose position was also monitored with the Optotrak. The position of the eye was digitized relative to the eye-tracker before its use, so that the gaze data could be projected onto the screen and transformed into the appropriate coordinate systems. All signals were recorded simultaneously and processed at 60 Hz, so as to generate a real-time velocity command signal to control the robot.

Consistent with our previous experiments in able-bodied subjects, we recorded EMGs from the three heads of the deltoid and the upper trapezius from the subjects who could voluntarily activate those muscles, and just the upper trapezius from one subject who had no voluntary control of the deltoid (**Figure 1B**). The EMG signals were amplified and band-pass filtered between 10 and 1000 Hz using a Bortec AMT-8 (Bortec Biomedical Ltd., Canada), anti-alias filtered using 5th order Bessel filters with a cut-off frequency of 500 Hz, and sampled at 2400 Hz. Features were extracted from a 16.6 ms window of each EMG channel for use as observations in each of the decoders. The square-root transformed RMS and number of zero-crossings were selected as amplitude and frequency related features, respectively.
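The two features can be sketched for a single window as below. The window length of 40 samples follows from dividing the 2400 Hz sampling rate by the 60 Hz processing rate; counting a zero-crossing as any sign change between consecutive samples, with no noise threshold, is an assumption, as are the function names and the synthetic signals.

```python
import math

SAMPLE_RATE_HZ = 2400
UPDATE_RATE_HZ = 60
WINDOW = SAMPLE_RATE_HZ // UPDATE_RATE_HZ   # 40 samples, about 16.7 ms

def emg_features(window):
    """Return (square root of the RMS amplitude, number of zero-crossings) for one EMG window."""
    rms = math.sqrt(sum(s * s for s in window) / len(window))
    # Count sign changes between consecutive samples (no noise threshold; an assumption).
    crossings = sum(1 for a, b in zip(window, window[1:]) if a * b < 0)
    return math.sqrt(rms), crossings

def feature_vector(channels):
    """Concatenate the two features of every channel into one observation vector."""
    obs = []
    for window in channels:
        obs.extend(emg_features(window))
    return obs

# Synthetic two-channel example, not recorded data.
ch1 = [math.sin(2 * math.pi * 150 * n / SAMPLE_RATE_HZ) for n in range(WINDOW)]
ch2 = [0.25] * WINDOW
z = feature_vector([ch1, ch2])
```

With four recorded muscles this would give an eight-element observation vector at each 60 Hz time-step.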

#### **PROTOCOLS**

#### *Assessing the influence of impairment on decoder performance*

The goal of the first set of experiments was to establish the utility of combining gaze with EMG, compared to decoding with EMG alone, with subjects spanning a range of needs and abilities. For these experiments the robot moved the subjects' arms along with the decoded reach, providing similar feedback to an exoskeleton, or possibly an FES interface. Before the decoders could be tested, training data was collected to train the models. Each experiment began with a set of training reaches in which EMG and kinematic data were collected. This involved the robot moving automatically along a straight-line trajectory to a set of nine targets spanning the monitor area. Each target appeared four times in random order. The subject was instructed to gently assist the reach as their hand was moved along the trajectory. EMGs were recorded (**Figure 2**) to quantify subject involvement and to train the decoding algorithms. We chose this method because we wanted control to be intuitive; it was important that the recorded EMGs corresponded as closely as possible to those a subject would naturally generate when attempting to make smooth reaching movements in our experimental setup. These same data were used to train all three decoders listed in **Table 1**, as we have described previously (Corbett et al., 2013a).

We presented subjects with a reaching task to evaluate decoding quality. For each trial a target randomly appeared on the monitor, 1 s before an auditory go cue. The goal was to place the stylus as close to the center of the target as possible. After the go cue, the reach was initiated when the square-root-transformed RMS value of any EMG channel increased above twice its level prior to the go cue. For the subject with no voluntary deltoid activation, the contralateral upper trapezius was also recorded to allow her to initiate reaches where she would not normally activate the ipsilateral muscle, by shrugging her left shoulder. However, this muscle was not included as a part of the decoder as it was not involved in the natural reach. Thus, while the subjects were unable to control the robot before the go cue, the reaches were self-paced in the sense that they could initiate them at their leisure after the cue. After initiation, the decoded velocity was used to control the robot's reach.
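The initiation rule amounts to a per-channel threshold test against the pre-cue level. A minimal sketch follows; the function names, the way the baseline is supplied, and the numbers are assumptions for illustration.

```python
def reach_initiated(current, baseline, factor=2.0):
    """True if any channel's square-root-transformed RMS exceeds `factor` times its pre-cue baseline."""
    return any(c > factor * b for c, b in zip(current, baseline))

def first_initiation_step(feature_stream, baseline):
    """Index of the first post-cue time-step at which the reach starts, or None if it never does."""
    for step, current in enumerate(feature_stream):
        if reach_initiated(current, baseline):
            return step
    return None

# Illustrative values for four channels (arbitrary units).
baseline = [0.10, 0.12, 0.08, 0.20]
stream = [
    [0.11, 0.12, 0.09, 0.21],   # quiet
    [0.15, 0.20, 0.10, 0.25],   # rising, but below threshold
    [0.18, 0.26, 0.12, 0.30],   # second channel exceeds twice its baseline
]
start = first_initiation_step(stream, baseline)
```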

Upon initiation of a reach, the decoder was provided with the initial state vector including the robot's current position. When testing the KFT and mKFT, target estimates were also initialized in the state vector. In the case of the KFT, the actual location of the target center was provided. For the mKFT, the gaze data from the half-second period prior to initiation were used to estimate three potential targets with which to initialize a corresponding mixture component. The three-dimensional location of the eye gaze was calculated by projecting its direction onto the monitors. The first, middle and last samples were selected, and all other samples were assigned to a group according to which of the three was closest. The means of these three groups were used to initialize three KFTs in the mixture model and their priors were assigned proportionally to the number of samples in them. If the subject looked at multiple positions prior to reaching, including the target, the correct target would be accounted for in one of the mixture components.
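The grouping of gaze samples into three candidate targets can be sketched as a single nearest-seed assignment. The function name, the choice of index `n // 2` for the middle sample, and the example coordinates are assumptions; at 60 Hz the half-second window would hold about 30 samples.

```python
def gaze_target_candidates(samples):
    """Group pre-reach gaze samples around the first, middle and last sample.

    Returns a list of (mean_position, prior) pairs, one per mixture component.
    Assumes at least three samples; the middle sample is taken at index n // 2.
    """
    n = len(samples)
    seed_idx = [0, n // 2, n - 1]
    seeds = [samples[i] for i in seed_idx]
    groups = [[s] for s in seeds]            # each seed starts its own group

    def sq_dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    for i, sample in enumerate(samples):
        if i in seed_idx:
            continue
        nearest = min(range(3), key=lambda k: sq_dist(sample, seeds[k]))
        groups[nearest].append(sample)

    candidates = []
    for group in groups:
        mean = tuple(sum(coord) / len(group) for coord in zip(*group))
        candidates.append((mean, len(group) / n))   # prior proportional to group size
    return candidates

# Illustrative gaze positions on the monitor in cm, not recorded data.
samples = [(0.0, 0.0)] * 2 + [(10.0, 0.0)] * 5 + [(20.0, 0.0)] * 3
candidates = gaze_target_candidates(samples)
```

Because each seed always belongs to its own group, no component is ever empty, even when the subject fixates a single location for the whole window.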

Each target consisted of a green circle of 1 cm radius surrounded by five rings of various colors 1 cm thick. When the target was attained its color changed to that of the location corresponding to where the stylus touched. For a missed target, or if the reach timed out (after 10 s), the target turned red. For attaining the green circle the subject received a score of 10 points and for outer rings they received 9, 8, 7, 6, and 5 points. Feedback of the cumulative total of the most recent 10 reaches was displayed to increase motivation.
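The scoring scheme maps the contact distance from the target center to points. In the sketch below, awarding zero points outside the outermost ring and treating a contact exactly on a ring boundary as belonging to the outer ring are assumptions, as are the function name and the example distances.

```python
from collections import deque

def reach_score(distance_cm):
    """Points for a stylus contact: 10 in the 1 cm center circle, then 9 to 5 for each 1 cm ring."""
    ring = int(distance_cm)          # 0 = center circle, 1..5 = surrounding rings
    if ring > 5:
        return 0                     # outside all rings: assumed to score nothing
    return 10 - ring

recent = deque(maxlen=10)            # only the 10 most recent reaches are kept
for d in [0.4, 1.2, 2.7, 5.9, 6.5, 0.9, 0.1, 3.3, 4.8, 1.0, 0.2]:
    recent.append(reach_score(d))
running_total = sum(recent)
```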

Each subject in Groups 1a and 2 performed an experiment for the interface with EMG alone (KF) and one for the models incorporating target information (KFT and mKFT). The order of these experiments was randomized across subjects. Due to difficulty obtaining an eye-tracking signal we were unable to test the mKFT with Subject 4, though he performed the KFT experiment. The KFT, with perfect target information, represented an idealized benchmark for the performance of a combined target and EMG decoder. After initial setup, each experiment began with the training protocol described above. In the KF experiments this was followed by between 10 and 30 practice reaches and 60 test reaches. For the experiments with target information, 60 KFT reaches were performed first. This was followed by eye-tracker calibration, up to 10 practice mKFT reaches and finally 60 test reaches with the mKFT. Eye-tracker calibration was checked periodically throughout the experiment and if found to be off, generally due to the headset shifting on the subject's head, we recalibrated the system and repeated any affected trials.

To put the decoder performance in context with the subjects' voluntary reaching abilities, we also asked them to attempt to reach each of the training targets while the HapticMaster was in "free mode," supporting its own weight against gravity. This would differ from their unassisted reach abilities, as their arms were supported against gravity with the mobile arm support.

#### *Assessing the influence of proprioceptive feedback on decoding performance*

Many of the subjects could voluntarily activate sufficient EMG at the shoulder to make control with EMG alone viable. It was unclear whether this would be possible in a different decoding scenario such as an external robotic arm or computer-based interface, where their arm was not being moved in congruence with the decoder. The robot-assisted reaching task was providing these subjects with at least some natural proprioceptive information, and we wanted to establish how important a role this played in our results. Therefore, for the subjects who had more voluntary ability, we compared performance of the KF (with EMG alone) for both remote control of the robot and attached control as described in the previous experiment.

The protocol for attached control was exactly as described above. For remote control, the models were trained by the subjects attempting to mimic the movement of the robot as naturally as possible in the training reaches, without any physical attachment to the robot. In testing, subjects were free to move their arm as they wished while attempting to direct the robot to the targets. At least 20 practice reaches were performed before the testing reaches. This protocol meant that the conditions were compared using models that were trained on different data, a factor that we had previously found to have a small effect on performance in able-bodied subjects (Corbett et al., 2013b). However, we considered it more important to have consistency between training and testing for these subjects as, when unassisted, it may have been impossible for them to replicate the movements generated while attached to the robot in training. The order of the two conditions was randomized across subjects. To see whether any effect of removing feedback would hold when target information was included, we also tested the KFT remotely for two of the subjects.

#### **ANALYSIS**

We used two metrics to quantify performance in both experiments. The first was a measure of how accurately the target was achieved. This was quantified as the shortest distance between the stylus tip and the target center during the reach. As the target center had a 1 cm radius, any distance less than 1 cm would correspond to perfect task performance. The second measure was one of reach straightness, used to measure the efficiency of the generated movement. This was quantified as the path efficiency, the ratio of the straight-line distance to the cumulative distance of the reach, so that a perfectly straight reach has an efficiency of 100%. To put the results in context with the individual subjects' abilities, these measures were compared to their voluntary performance when the weight of the arm was supported by the passive mobile arm support. We then used the grouping system described above for statistical analyses. To compare the performance of the two decoders, and how this was affected by the subjects' impairments, we used an analysis of variance (ANOVA) to look at the effect of the interaction of *algorithm* and *group* on the performance metrics, with *subject* as a random effect. Tukey tests were performed for *post-hoc* comparisons, and all statistical comparisons used a significance level of α = 0.05. To evaluate the effect of the proprioceptive feedback in the second experiment we compared the remote and attached conditions, again using an ANOVA with *condition* as a fixed effect and *subject* as a random effect, with a Tukey *post-hoc*.
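Both metrics can be computed directly from a sampled stylus path. In this sketch path efficiency is taken as the straight-line distance divided by the cumulative path length, so that a perfectly straight reach scores 100%, consistent with the percentages reported in the Results. Measuring the straight-line distance between the first and last samples of the path is an assumption, as are the function names and the example path.

```python
import math

def target_error(path, target):
    """Shortest distance between any stylus sample on the path and the target center."""
    return min(math.dist(p, target) for p in path)

def path_efficiency(path):
    """Straight-line distance from start to end divided by the distance actually travelled."""
    travelled = sum(math.dist(a, b) for a, b in zip(path, path[1:]))
    straight = math.dist(path[0], path[-1])
    return straight / travelled if travelled > 0 else 1.0

# Illustrative 2-D path in cm with one detour, not recorded data.
path = [(0.0, 0.0), (3.0, 0.0), (3.0, 4.0)]
err = target_error(path, target=(3.0, 5.0))
eff = path_efficiency(path)
```

For this path the hand travels 7 cm to cover a 5 cm displacement, giving an efficiency of about 71%.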

## **RESULTS**

## **ASSESSING THE INFLUENCE OF IMPAIRMENT ON DECODER PERFORMANCE**

As would naturally be expected, the subjects' voluntary ability to reach the targets when assisted by the mobile arm support depended on their impairment. In fact, some of the less impaired subjects could reach much of the target area with only this gravity assistance. However, the irregular shape of the trajectories, illustrated by one of the example reaches with typical path efficiencies (**Figure 3**), suggested that they did so with substantial difficulty, correcting for multiple errors over the course of the reach. The errors at the target measured when the subjects reached voluntarily with the mobile arm support increased with subject impairment level (**Figure 4A**). Path efficiencies did not follow a similarly fixed pattern but were clearly lowest for the two most impaired participants (**Figure 4B**). While some of the subjects were clearly unable to reach the targets with gravity support, others did better but left room for improvement, particularly in terms of reach straightness.

The effectiveness of the KF decoder using EMG alone was also strongly dependent on the voluntary abilities of the subjects. Subjects from Group 1 could often guide the robot close to the target with their EMG signals (see example reach, **Figure 5A**), while those in Group 2 had greater difficulty (see example reach, **Figure 5C**). Subjects 1 and 2, the least impaired subjects, were in fact less accurate at the target with the KF than in the gravity assistance condition (**Figure 4A**). However, the decoder clearly provided improvements in reach straightness for these subjects (**Figure 4B**). For all other subjects the EMG-alone decoder provided improvements in both accuracy and straightness relative to the mobile arm support, and this was most pronounced for the two most impaired subjects (**Figures 4A,B**). Nonetheless, the accuracy and straightness of the reaches by the subjects in Group 2 were dramatically lower than those in Group 1 using the KF (**Figures 4C,D**). The EMG control allowed the more impaired subjects to reach toward the target-display monitor, but their accuracy was very poor.

The multimodal decoders were much more consistent across individuals and enabled accurate reaching for all subjects. Unsurprisingly, the distance to the target center was lowest for all subjects when perfect target information was available, and there was very little variability for this condition. The KFT results were within the margin of error for perfect system performance, as the task required an accuracy of 1 cm for a perfect score (**Figure 4**). When gaze and EMG were combined (mKFT) the performance deteriorated slightly from that with perfect target information (*p* = 0.003), although this difference was not statistically significant when the subjects were separated into groups (*p* = 0.09 in Group 1, *p* = 0.39 in Group 2). For this decoder subjects took time to initiate the reach when they were ready, 2.3 ± 0.7 s (mean ± SD) after the go cue, as the target estimates from the gaze position in the 0.5 s before reach initiation allowed them to make effective reaches straight toward the target (**Figures 5B,D**). Subject 1 was the least accurate of the group with the mKFT and was again less accurate than his performance with gravity support; he had some difficulty with the eye-tracking and thought he may have had a "lazy eye" (**Figure 4A**). All other subjects were consistently accurate with the mKFT. Both subject groups showed highly significant improvements between the mKFT and KF (*p* < 0.0001), and the difference between the two groups was minimal when the gaze was incorporated (*p* > 0.99). The incorporation of gaze allowed excellent target acquisition for all subjects, as would be expected with sufficiently accurate target estimates.

Reaches were also straighter for the models incorporating target information than for the one with EMG alone (**Figure 4D**). In both groups, the mKFT and KFT averaged path efficiencies above 99%, and were not statistically different for either group (both *p* > 0.09). The KF, on the other hand, had average path efficiencies of approximately 95% for Group 1 and 92% for Group 2, which were significantly lower than the mKFT (both *p* < 0.001). This indicated that, while dramatically better than the gravity-supported reaches, the KF produced more errors in the trajectories that the users needed to correct for. Incorporating the target into the trajectory model generated more efficient, straight reaches.

Finally, to gain some insight into the subjects' EMG activation during mKFT control, we performed offline decoding using the KF algorithm, trained from the standard training data, of the reaches performed during mKFT control (KFT for Subject 4). We evaluated the accuracy of the reaches by calculating the *R*<sup>2</sup> between the decoded reach and an "ideal" straight-line reach to the target, using the trajectory profile of the training reaches. In the subjects in Group 2 for whom EMG-alone control was clearly ineffective, there was no significant difference between the accuracy of the KF decoded offline and the online KF control (both *R*<sup>2</sup> = 0.6, *p* > 0.9). The KF decoded offline was more accurate for the subjects in Group 1 (*R*<sup>2</sup> = 0.7, *p* = 0.006), although substantially lower than online KF control in Group 1 (*R*<sup>2</sup> = 0.9, *p* < 0.001). It is not surprising that without online feedback from the KF decoder the accuracy of the decoded reaches would be reduced. This result demonstrates that the users interact with each decoder differently, and can exploit the benefits of added target information during mKFT control. Nonetheless, the higher accuracy of the offline KF decoding in Group 1 suggests that for these subjects the EMG information can contribute to the decoding in mKFT control.

**to generate them.** Kinematics and square-root transformed root-mean-squared value (RMS) of EMG for an example reach. Subject 3 with

## **ASSESSING THE INFLUENCE OF PROPRIOCEPTIVE FEEDBACK ON DECODING PERFORMANCE**

The above results show that while there was clearly an accuracy benefit to using the mKFT, for subjects in Group 1 reasonable control could be achieved using their EMG alone. We wanted to test the dependence of that performance on the natural proprioceptive feedback that was provided to the subjects by moving their arms. To do this, we compared the robot-assisted reaching task with the KF decoder to a remote control task where the subject had no mechanical link to the robot. We found that the remote performance was significantly less accurate than the attached condition, with the errors increasing from 3 to 5.5 cm (*p* < 0.001, **Figure 6A**). Path efficiencies were also reduced from 91% to an average of 81% (*p* < 0.001, **Figure 6B**). While it is possible remote control of the robot may have improved with further practice, this is unlikely as we did not see improvements over the course of the experiments, suggesting that the subjects were not learning further. Clearly, congruent proprioceptive feedback was a critical component of the interface for the subjects in Group 1, and reaches were less accurate and less straight without it.

To establish whether the importance of proprioceptive feedback extended to the decoder with target information, two subjects additionally performed remote control with the KFT. In this case errors were less than 1 cm, similar to the attached case above. The proprioceptive feedback was apparently critical only in the absence of target information, when the shoulder EMG alone guided the trajectory. With target information, accurate reaching was possible regardless of whether the subject's own arm or an external effector was being controlled.

period before the reach initiation with EMG control; and Subject 6 with **(C)** KF using only the upper trapezius EMG and **(D)** mKFT.

## **DISCUSSION**

Each person with an SCI will have a unique set of challenges associated with his/her injury, and identifying the best approach to assist with reaching involves careful consideration of a number of factors. In this study we examined the benefits of a multimodal approach to decoding, considering the impact of the various injury characteristics in the group of subjects. We also examined the effect of the proprioceptive feedback that subjects experienced when interacting with the reaching interface. Combining gaze and EMG enabled effective reaching for our participants, even for those who could volitionally activate an extremely limited set of muscles. With proprioceptive feedback of the trajectories, subjects with greater voluntary ability could also perform reaches with their EMGs alone. However, the reaches were less accurate and required the users to correct for errors over the course of the trajectories. When we removed the congruent proprioceptive information, subjects were unable to accurately control trajectories without additional information about the reach target. These results highlight the importance of providing proprioceptive feedback to neuroprosthesis users where possible. Furthermore, they demonstrate the promise of incorporating target information, such as that from gaze, in the absence of sufficient feedback or trajectory-related physiological signals.

## **MULTIMODAL DECODING AND THE INFLUENCE OF SUBJECT IMPAIRMENT**

Enhancing the trajectory model with information about the reach target was extremely useful for generating accurate trajectories in our robot-assisted reaching task. Reassuringly, performance was in agreement with previous tests in able-bodied subjects using similar sets of EMGs (Corbett et al., 2013a). The incorporation of the gaze data consistently enabled more accurate reaching than control with EMG alone. Furthermore, the approach produced significant improvements in path efficiencies, indicating that the reaching required less effort from the user. In particular, gains in accuracy from incorporating gaze (mKFT) were dramatic for the most impaired participants in Group 2. While there was a large difference in performance between the groups with EMG alone, they were equally accurate when the gaze was incorporated.

Subjects adapted well to the multimodal interface, finding it accurate and easy to use. This was perhaps surprising for Subject 6 in particular, who had not moved her arm volitionally in the 2 years since her injury. When asked how she felt about using the interface, she said it felt like she was naturally moving her arm. This is in contrast to performance with EMG alone, where both subjects from Group 2 had little to no control. They were enthusiastic about the mKFT, which was doubtless more intuitive and easier for them to adapt to than the KF. The impressions from the less impaired subjects in Group 1 were more varied. As mentioned in the results, Subject 1 had some difficulty with the eye-tracking interface, which he attributed to a "lazy eye." While the remaining subjects mostly found the mKFT easy to use, a few also enjoyed the challenge of the EMG-only decoder. For those who were particularly effective with the KF, the greater control over the trajectory was more interesting to them despite the fact that overall it was less accurate than the mKFT. The reduced effort that the multimodal decoder required of the user was also reflected in the reduced offline accuracy of the KF decoder on the mKFT reaches. This information could be useful for future attempts to find a balance between accuracy and allowing the user to use his/her capabilities as much as possible, allowing operation at the "challenge point" (Guadagnoli and Lee, 2004). Hence, even though most subjects in this study preferred the mKFT system, this feedback from the subjects emphasizes the importance of considering factors other than accuracy when determining the most appropriate system for a specific individual.

Assistive devices must be targeted to an individual's injury and, especially with support against gravity, some of the subjects in Group 1 could achieve remarkable performance even without a neuroprosthesis. A device controlling the entire movement of the arm as we have tested here would likely restrict their natural abilities and be unnecessary for these subjects. Nonetheless, many of the subjects would benefit from some assistance with reach, particularly with more distal movements. An assistive device working in seamless integration with their voluntary movements could potentially be enhanced with gaze information, possibly providing greatly improved ease of control. While the eye tracking system used in this study was for proof of concept and was not portable, more lightweight systems are available at low cost that will be suitable for chronic use outside of the laboratory (Abbott and Faisal, 2012), although these will require the development of robust calibration protocols. This multimodal approach could be useful in any situation involving selection between a small number of action candidates, and could also be adapted to a number of different signal sources. Cortical recordings have been used to decode both trajectory (Kim et al., 2008) and target information (Hatsopoulos et al., 2004), as have cortical surface potentials (Schalk et al., 2007; Pistohl et al., 2008; Flint et al., 2012) and non-invasive electroencephalogram and magnetoencephalogram-based systems (Hammon et al., 2008; Waldert et al., 2008). Furthermore, context about reach objectives could be found from scanning the environment and identifying potential targets. As it stands, however, the developed interface is far more likely to be useful to people with high tetraplegia injuries at C4 or above.

### **THE INFLUENCE OF PROPRIOCEPTIVE FEEDBACK**

For those less impaired subjects who had reasonable control with their EMG alone we found that the process of moving the arm in congruence with the decoder output was critical to its success, as removing this proprioceptive information resulted in a substantial drop in performance. During unimpaired motor control, people form a sense of their arm position in space through a combination of both visual and proprioceptive cues (Graziano, 1999). Both of these components play an important role in enabling people to reach toward targets in their workspace. However, with many assistive technologies users must rely on visual feedback alone. This is unfortunately unavoidable in many cases, as the most impaired individuals may lose all sense of proprioception. This work therefore highlights the importance of current efforts to restore proprioceptive information through artificial stimulation (London et al., 2008; Gilja et al., 2011; Berg et al., 2013), while emphasizing that it could be extremely effective where possible to provide neuroprosthesis users with natural proprioceptive information about the state of their device.

Some recent work has demonstrated that adding proprioceptive feedback is useful during BMI tasks. BMIs developed for stroke rehabilitation have greater therapeutic impact when the limb is passively moved by a prosthetic device (Birbaumer et al., 2008; Buch et al., 2008). Additionally, Ramos-Murguialday et al. found that providing proprioceptive feedback of hand opening and closing with an exoskeleton improved BMI performance in able-bodied subjects (Ramos-Murguialday et al., 2012). Similarly, in a closed-loop BMI based on intracortical recordings from non-human primates, Suminski et al. found that passively moving the arm improved performance of a 2-dimensional cursor control task (Suminski et al., 2010). Furthermore, Gaunt et al. tested providing proprioceptive feedback to a BMI user with complete paralysis but fully intact sensation. They found that in the absence of vision, moving her arm in congruence with a prosthetic arm improved control (Gaunt et al., 2013). While cortical recordings were not involved in the current study, these findings together highlight the parallels between general neuroprosthesis use, BMIs, and normal motor control.

In a decoding setting where a BMI or other neural interface is used to control an external device, the user must learn the new mapping or coordinate transformation that the decoder performs. It is critical that users are provided with effective feedback of these transformations, as trajectories are planned to be straight in visually perceived space (Flanagan and Rao, 1995). Therefore, if the goal of the BMI is to control a cursor on a screen, as in the studies mentioned above, the planning process involved may be different to that of a real reach. Providing proprioceptive cues may facilitate this planning process. As only the robot was being controlled in our paradigm, through EMG signals that are actively involved in the natural reach, we directly affected the control signals that the subjects could produce by moving their arms. This process may have provided them with greater awareness of the robotic system and facilitated more accurate and natural control, despite the fact that their sense of proprioception was impaired.

## **CONCLUSIONS**

With the amount of available signal sources and sensory information varying widely between potential users of neuroprostheses, the choice of assistive device and decoding approach must be considered separately for each individual's specific needs. Moving the arm through reaching movements clearly enables some users to get great benefit from proprioceptive information, and should be seriously considered for those who can take advantage of it. Unfortunately, this approach would be ineffective for people who have lost their sense of proprioception completely. Often, these same individuals have few signals they can volitionally activate that are related to a desired reach trajectory, making neuroprosthesis control a great challenge. A Bayesian approach taking account of the reach goal clearly has many advantages in improving reach accuracy, regardless of the feedback experienced by the user. Especially when the set of neural command signals is small, or the lack of proprioceptive feedback makes trajectory control difficult, gaze or other systems for identifying potential target locations could provide a significant improvement to a neuroprosthetic interface.

## **ACKNOWLEDGMENTS**

The authors would like to thank Tim Haswell and Ben Walker for their work developing the data acquisition and robot control systems, Aditi Sansanwal for assistance with the experiments and Rachael Lunney for administrative assistance. This material is based upon work supported in part by the National Science Foundation under Grant No. (0939963), and in part by the National Center for Research Resources (NCRR) and the National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH) through Grant Number 3UL1 RR025741. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH.

## **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 31 January 2014; accepted: 06 May 2014; published online: 23 May 2014. Citation: Corbett EA, Sachs NA, Körding KP and Perreault EJ (2014) Multimodal decoding and congruent sensory information enhance reaching performance in subjects with cervical spinal cord injury. Front. Neurosci. 8:123. doi: 10.3389/fnins.2014.00123 This article was submitted to Neuroprosthetics, a section of the journal Frontiers in Neuroscience.*

*Copyright © 2014 Corbett, Sachs, Körding and Perreault. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Does EMG control lead to distinct motor adaptation?

## *Reva E. Johnson<sup>1,2</sup>\*, Konrad P. Kording<sup>3,4,5</sup>, Levi J. Hargrove<sup>1,2,3</sup> and Jonathon W. Sensinger<sup>3,6,7</sup>*

*<sup>1</sup> Center for Bionic Medicine, Rehabilitation Institute of Chicago, Chicago, IL, USA*


#### *Edited by:*

*Dario Farina, University Medical Center Göttingen, Georg-August University, Germany*

#### *Reviewed by:*

*Chet T. Moritz, University of Washington, USA Panagiotis Artemiadis, Arizona State University, USA*

#### *\*Correspondence:*

*Reva E. Johnson, Center for Bionic Medicine, Rehabilitation Institute of Chicago, 345 E. Superior St Room 1309, Chicago, IL 60611, USA e-mail: reva@u.northwestern.edu*

Powered prostheses are controlled using electromyographic (EMG) signals, which may introduce high levels of uncertainty even for simple tasks. According to Bayesian theories, higher uncertainty should influence how the brain adapts motor commands in response to perceived errors. Such adaptation may critically influence how patients interact with their prosthetic devices; however, we do not yet understand adaptation behavior with EMG control. Models of adaptation can offer insights on movement planning and feedback correction, but we first need to establish their validity for EMG control interfaces. Here we created a simplified comparison of prosthesis and able-bodied control by studying adaptation with three control interfaces: joint angle, joint torque, and EMG. Subjects used each of the control interfaces to perform a target-directed task with random visual perturbations. We investigated how control interface and visual uncertainty affected trial-by-trial adaptation. As predicted by Bayesian models, increased errors and decreased visual uncertainty led to faster adaptation. The control interface had no significant effect beyond influencing error sizes. This result suggests that Bayesian models are useful for describing prosthesis control and could facilitate further investigation to characterize the uncertainty faced by prosthesis users. A better understanding of factors affecting movement uncertainty will guide sensory feedback strategies for powered prostheses and clarify what feedback information best improves control.

**Keywords: prosthesis control, EMG, motor adaptation, uncertainty, sensory feedback**

## **INTRODUCTION**

Powered upper limb prostheses offer the possibility of restoring abilities lost due to amputation; however, lack of kinesthetic feedback requires users to devote constant visual attention to every task. Myoelectric prostheses are controlled using electromyographic (EMG) signals, which are highly variable byproducts of muscle contraction (Clancy et al., 2002). Despite recent improvements in prosthesis technology (Weir and Sensinger, 2009) and EMG signal processing (Parker et al., 2006), prosthesis movements are imprecise, and many amputees abandon their devices out of frustration (Biddiss and Chau, 2007; Biddiss et al., 2007). Providing additional sensory feedback is an intuitive solution, but this has not yet been implemented clinically (Antfolk et al., 2013). To provide effective sensory feedback, we need to understand how amputees incorporate feedback information into movement planning.

The role of feedback in able-bodied movement is described well by a sensorimotor adaptation framework. This framework theorizes that the nervous system coordinates movements by predicting the state of the body and correcting this prediction using sensory feedback (Wolpert et al., 1995). The state prediction and feedback processes are each estimated with some uncertainty, caused by many possible factors (Orbán and Wolpert, 2011). The relative uncertainties of state prediction and sensory feedback determine how these two sources of information are combined (Kording and Wolpert, 2004). For example, if sensory feedback is very uncertain (due to increased sensory variability, e.g., blurred vision) the brain will rely more heavily on the feedforward state prediction. Thus, the impact of sensory feedback depends on the uncertainty of both the sensory and motor information.
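The weighting described above can be illustrated with the standard precision-weighted combination of two Gaussian estimates. The sketch below is only an illustration of that principle, not the model fitted in this study; the function name `combine_estimates` and all numerical values are assumptions chosen for the example.

```python
def combine_estimates(pred_mean, pred_var, fb_mean, fb_var):
    """Fuse a feedforward prediction with sensory feedback (both Gaussian).

    Returns the combined mean, the combined variance, and the weight (gain)
    given to the feedback. Hypothetical helper, for illustration only.
    """
    if pred_var <= 0 or fb_var <= 0:
        raise ValueError("variances must be positive")
    gain = pred_var / (pred_var + fb_var)        # weight placed on feedback
    post_mean = pred_mean + gain * (fb_mean - pred_mean)
    post_var = (1.0 - gain) * pred_var           # never larger than pred_var
    return post_mean, post_var, gain


# Reliable feedback (e.g., clear vision): the estimate moves halfway.
clear = combine_estimates(0.0, 100.0, 20.0, 100.0)
# Uncertain feedback (e.g., blurred vision): the estimate stays near the prediction.
blurred = combine_estimates(0.0, 100.0, 20.0, 900.0)
```

With equal variances the feedback receives a weight of 0.5. When the feedback variance is nine times larger, the weight falls to 0.1, so the same observed discrepancy produces a much smaller correction.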

Uncertainty levels are presumably high during prosthesis use, due to EMG signal variability and limited sensory feedback. Some studies suggest that adding sensory feedback reduces uncertainty (Wheeler et al., 2010; Saunders and Vijayakumar, 2011), although others report either no improvement or conflicting results (Antfolk et al., 2013). In many cases, the reasons for the ineffectiveness of sensory feedback remain unclear: Are users perceiving high uncertainty in the feedback? Are users relying entirely on feedforward state predictions, and ignoring feedback? Are users able to generate state predictions at all when using EMG control? To accurately describe prosthesis control and implement effective sensory feedback, we must determine the effects of high motor uncertainty and control signal modality on adaptation.

Several possible factors may affect adaptation with EMG control. High motor variability may affect adaptation rate (Burge et al., 2008) and estimation of error relevance (Wei and Kording, 2009). The lack of direct sensory feedback from EMG activity may increase feedback uncertainty. Central nervous system processes [e.g., efference copy formation (Poulet and Hedwig, 2007) and internal modeling of system dynamics (Kawato, 1999)] are not well-understood for EMG control, which relies on indirect biological signals. This study considers the effect of high motor variability by measuring the effect of mean error on adaptation rate; other factors are considered collectively by measuring the effect of control interface on adaptation rate.

We investigated trial-by-trial adaptation with two levels of feedback uncertainty and three different control interfaces: joint angle, joint torque, and EMG. The control interface influenced the motor uncertainty of the user and enabled a simplified comparison of adaptation behavior between prosthesis and able-bodied control. Trial-by-trial adaptation rate was examined as a function of feedback uncertainty, control interface, and mean error.

## **METHODS**

Eight able-bodied subjects (three female, five male) participated in this experiment, which was approved by the Northwestern University Institutional Review Board. Subjects were between 23 and 32 years old.

### **EXPERIMENTAL PROTOCOL**

Subjects sat comfortably in front of a computer display screen (shown in **Figure 1C**). They used elbow extension movements to control a virtual cursor along a single degree-of-freedom (DOF) circular track (radius = 13 cm). The cursor started at the left side of the circle (180°) and a target remained stationary at the right side of the circle (0°). The start of each trial was indicated by an audio signal triggered by the experimenter. Subjects had 3 s to move the cursor from the starting position to the target. The cursor then returned to the starting position.

Each experiment comprised three phases: familiarization, training, and testing. The familiarization phase consisted of 10 trials, in which the cursor was displayed as one dot that was unperturbed and visible throughout the trial. In the training phase, the cursor was still unperturbed and displayed as one dot, but visual feedback was taken away 0.5 s into the trial. The cursor reappeared after the trial to give 100 ms of terminal feedback (Baddeley et al., 2003; similar to Wei and Kording, 2010; and others). Training continued until the subject was able to complete 10 trials with an average error of under 20° (this usually required 15–20 trials). In the testing phase, subjects were given only terminal visual feedback. The testing phase included 4 blocks of 75 trials each, with approximately 2 min of rest between blocks.

During the testing phase, visual perturbations were applied to the displayed cursor endpoint. Perturbations were randomly selected from −40°, 0°, and 40°. Subjects were encouraged to hit the target as accurately as possible, and were instructed that the terminal visual feedback represented the true cursor position.

Two levels of feedback uncertainty were created in the testing phase by displaying the final cursor position as either one or five dots (an approach used previously by Tassinari et al., 2006; Wei and Kording, 2010; and others). When subjects saw the cursor as one dot, feedback uncertainty was low. When subjects saw five dots, feedback uncertainty was high. The location of the five dots was drawn from a Gaussian distribution with the mean as the cursor position and a standard deviation of 40°. Level of feedback uncertainty was randomly assigned for each trial.
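The terminal feedback for one test trial could be generated along the following lines. This is a hedged sketch rather than the experiment code: the function name `terminal_feedback`, the equal probability of the three perturbation values, the 50/50 split between one-dot and five-dot trials, and the choice to center the dot cloud on the perturbed cursor position are all assumptions.

```python
import random

PERTURBATIONS_DEG = (-40.0, 0.0, 40.0)  # values stated in the protocol
DOT_SD_DEG = 40.0                        # SD of the five-dot cloud


def terminal_feedback(true_endpoint_deg, rng):
    """Return (perturbation, displayed dot positions) for one test trial."""
    perturbation = rng.choice(PERTURBATIONS_DEG)   # assumed equiprobable
    shown_center = true_endpoint_deg + perturbation
    high_uncertainty = rng.random() < 0.5          # assumed 50/50 assignment
    if high_uncertainty:
        # Five dots scattered around the (perturbed) cursor position.
        dots = [rng.gauss(shown_center, DOT_SD_DEG) for _ in range(5)]
    else:
        # A single dot at the (perturbed) cursor position.
        dots = [shown_center]
    return perturbation, dots


rng = random.Random(0)  # seeded so the sequence is reproducible
# 4 blocks of 75 trials, all ending exactly on target in this toy example.
trials = [terminal_feedback(0.0, rng) for _ in range(300)]
```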

### **CONTROL INTERFACES**

Subjects completed the experimental protocol once for each of the control interfaces: joint angle, joint torque, and EMG. Each control interface was tested on a separate day, in randomized order. The experimental setups for each control interface are shown in **Figure 1**.

#### *Joint angle control interface*

In the joint angle control interface, the subject extended the right elbow (isotonic contraction). An electrogoniometer (Biometrics Ltd) measured the elbow angle of the right arm (**Figure 1A**). The end blocks of the goniometer were attached to a hinged two-bar planar linkage. One link was fixed to a flat surface and strapped to the subject's upper arm. The other link was free to rotate, slid easily across the flat surface, and was strapped to the subject's lower arm. A mechanical stop prevented the subject from flexing past 45° and served as the starting position for each trial. The subject's view of the arm was blocked. The angle output of the goniometer was low-pass filtered with a cutoff frequency of 50 Hz. Elbow flexion of 45–135° was mapped to 0–360° of the circular cursor track.

#### *Joint torque and EMG control interfaces*

In the torque and EMG control interfaces, the subject generated isometric extension torque about the elbow (**Figure 1B**). Elbow extension torque was measured by a reaction torque sensor (TFF40, Futek Inc.). EMG activity during isometric elbow extension was measured by a self-adhesive bipolar electrode (Bagnoli™, Delsys Inc.) placed over the lateral head of the triceps brachii. The subject's right arm was strapped into a modified elbow brace that restricted motion (Elbow RANGER Motion Control, ProCare®). The lower arm portion of the brace was fixed to a horizontal link that coupled to the shaft of the torque sensor. The upper arm portion of the brace was fixed to the housing of the torque sensor.

The control signals were calibrated such that equal effort was required to move the cursor for both torque and EMG control interfaces. Subjects exerted approximately 4 N-m of extension torque for 10 s by viewing a screen that indicated their current torque and the goal torque. Both torque and EMG control signals were normalized to the mean absolute values recorded during the 10 s calibration. Control signals were high-pass filtered at 0.1 Hz, rectified, low-pass filtered at 5 Hz, normalized, and mapped to cursor angle with the following transfer function:

$$\frac{\theta(s)}{u(s)} = \frac{1250}{s^2 + 11s}.$$

Similar dynamics are commonly used as an EMG filter for clinical prostheses (Sensinger and Weir, 2008). Parameters were chosen to match the dynamics of a typical prosthetic arm—the LTI Boston Digital™ elbow (Heckathorne, 2004).
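In the time domain this transfer function corresponds to θ″ + 11θ′ = 1250u, an integrator in series with a first-order lag whose time constant is 1/11 s. The sketch below simulates it with forward Euler integration to show how a constant control signal turns into cursor motion. The function name `simulate_cursor`, the 1 ms step, and the unit-amplitude input are assumptions for illustration, and the units of θ and u are not specified in this passage.

```python
def simulate_cursor(u, dt=0.001):
    """Integrate theta'' + 11*theta' = 1250*u with forward Euler.

    u  : sequence of normalized control-signal samples
    dt : sample period in seconds (assumed value)
    Returns the final cursor angle and angular velocity.
    """
    theta, omega = 0.0, 0.0
    for u_k in u:
        theta += dt * omega                            # integrate velocity
        omega += dt * (-11.0 * omega + 1250.0 * u_k)   # first-order lag
    return theta, omega


# Constant unit input held for 2 s.
theta_end, omega_end = simulate_cursor([1.0] * 2000)
```

For a constant input the cursor velocity settles at 1250/11 (about 114 angle units per second per unit of control signal), so the interface behaves as a velocity controller after a brief lag.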

## **RESULTS**

We investigated the influence of control interface on trial-by-trial adaptation to visual perturbations with two levels of feedback uncertainty. Subjects used three control interfaces (elbow extension angle, torque, and EMG) to move a cursor toward a stationary target. Terminal visual feedback was displayed as one dot (low feedback uncertainty) or five dots (high feedback uncertainty). Adaptation rate was assessed as a function of control interface, feedback uncertainty level, and mean absolute endpoint error.

Every subject demonstrated trial-by-trial adaptation for all three control interfaces (**Figures 2**, **4**). When a visual perturbation was applied in the negative direction, the subject typically reacted to the perceived error by overcorrecting on the next trial. Thus, the slope of the regression line (solid line in **Figure 2**) reflects the degree to which the subject adapted to perturbations, and will be referred to here as the adaptation rate. Note that although the slope is always negative, here we present adaptation rates as positive values (correction opposite to perceived error) to avoid any confusion.
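The regression described above can be sketched as an ordinary least-squares fit of the unperturbed error on trial N against the perturbation on trial N−1, with the sign flipped so that the rate is positive. This is a minimal illustration under stated assumptions: the function name `adaptation_rate` and the synthetic data are hypothetical, and a per-condition analysis would first select only the trials that follow one-dot or five-dot feedback.

```python
def adaptation_rate(perturbations, errors):
    """Negated least-squares slope of the error on trial n versus the
    perturbation on trial n-1 (positive = correction opposite to the error)."""
    x = perturbations[:-1]   # perturbation shown on trial n-1
    y = errors[1:]           # unperturbed error on trial n
    n = len(x)
    if n < 2:
        raise ValueError("need at least three trials")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0:
        raise ValueError("perturbations must vary across trials")
    return -sxy / sxx


# Synthetic subject who corrects 30% of each perturbation on the next trial.
perts = [40.0, -40.0, 0.0, 40.0, 0.0, -40.0, 40.0, 0.0]
errs = [0.0] + [-0.3 * p for p in perts[:-1]]
rate = adaptation_rate(perts, errs)
```

A rate of 0 corresponds to no adaptation and a rate of 1 to complete correction of the previous perturbation.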

Higher mean errors significantly increased adaptation rate. The slope of the overall linear relationship between adaptation rate and mean error is statistically significant (*p* < 0.01, **Table 1**) and accounts for a large proportion of variance in adaptation rate (*η*<sub>p</sub><sup>2</sup> = 0.21, **Table 1**). This relationship depends on control interface and feedback condition (**Figure 3A**).

The control interface did not affect adaptation; there were no significant differences in adaptation rate between control interfaces (*p* = 0.7, *η*<sub>p</sub><sup>2</sup> = 0.01, **Table 1**). However, control interface did influence mean error. When using EMG control, subjects' mean errors were significantly higher than when using joint angle or torque control (**Figure 5**, *p* < 0.01, One-Way ANOVA with Tukey *post-hoc* tests).

**FIGURE 2 | Representative data from one subject using the joint angle control interface with low feedback uncertainty.** Individual trials are plotted as circles. The x-axis shows the perturbation size for a trial with one-dot terminal feedback, and the y-axis shows the error on the following trial (error is defined as the unperturbed or true distance between the cursor and the target). Adaptation rate is defined as the slope of the linear regression between the unperturbed error of trial (N) and the perturbation of trial (N-1). The regression is plotted as the bold solid line. If a subject showed no adaptation, the regression slope would equal zero, illustrated by the horizontal dotted line. If a subject showed complete adaptation, the regression slope would equal −1, illustrated by the dashed line. Note that the adaptation rate is negative; however, in this paper we present adaptation rates as positive values to avoid confusion.

#### **Table 1 | Results of Three-Way ANOVA on adaptation rate.**


*Categorical factors test for an offset change in the dependent variable and continuous factors test for a slope change in the dependent variable. Effect sizes were assessed using partial eta squared, η<sub>p</sub><sup>2</sup> (Hentschke and Stüttgen, 2011; Richardson, 2011).*

Feedback uncertainty significantly affected adaptation rate for all three control interfaces (*p* < 0.01, *η*<sub>p</sub><sup>2</sup> = 0.32, **Table 1**). Higher feedback uncertainty decreased the intercept of the adaptation rate curve (**Figure 3A**). This means that subjects adapted less after trials with high feedback uncertainty, i.e., when terminal feedback was presented as five dots instead of one.

Various factors influenced adaptation rate (**Figure 3B**). Because control interface did not have a significant effect on adaptation rate, linear regressions were calculated and plotted across all three control interfaces. Mean error affected the slope of the adaptation curve and feedback uncertainty affected the intercept.

**FIGURE 3 | Adaptation rate as a function of mean absolute endpoint error. (A)** Regression between adaptation rate and error with each control interface, for low feedback uncertainty (solid lines), and high feedback uncertainty (dashed lines). The range of each regression line runs from the minimum mean error to the maximum mean error across subjects. **(B)** Regression between adaptation rate and error across control interfaces for low feedback uncertainty (solid line), and high feedback uncertainty (dashed line). Shaded areas represent 95% confidence intervals of regression.

## **DISCUSSION**

In this work we investigated how prosthesis control affects trial-by-trial adaptation by comparing three different control interfaces: joint angle, torque, and EMG. We found that the control interface did not significantly affect adaptation; instead adaptation rates depended primarily on mean error and on feedback uncertainty.

Subjects were able to develop and adapt a simple internal model using EMG control (**Figure 4**). Previous studies show that amputees maintain the central nervous system capabilities needed for adaptation (Lotze et al., 1999, 2001). Other studies show that subjects adapt to novel transformations when using EMG control (Radhakrishnan et al., 2008). Our results support these findings and motivate future studies of adaptation behavior that requires more complex internal models during powered prosthesis control.

The relationship between mean error, feedback uncertainty, and adaptation rate supports the Bayesian framework, if we assume that mean error influences feedforward uncertainty. Bayesian theory predicts that feedforward uncertainty speeds adaptation and feedback uncertainty slows adaptation (Wei and Kording, 2010). This interaction of feedforward and feedback uncertainty is critical for the high uncertainty levels associated with powered prosthesis control. When viewed in light of this interaction, results of sensory feedback studies begin to form cohesive patterns. Sensory feedback reduces errors if feedforward control is noisy (Saunders and Vijayakumar, 2011) or if vision is removed (Wheeler et al., 2010), but has no significant effect in many other cases (Antfolk et al., 2013).

The patterns observed here have important implications for prosthesis control. When control is more precise, prosthesis users will rely less on feedback and more on their internal feedforward predictions. When sensory feedback is provided, the perceived uncertainty of this feedback determines whether there is any impact on control. Visual feedback also introduces another factor: if the uncertainty of sensory feedback is greater than that of visual feedback, it will not notably improve control over vision alone, since the two senses are integrated according to their uncertainty (Ernst and Banks, 2002).

The mean error of EMG control was significantly higher than that of both angle and torque control; however, adaptation rates of EMG control were not significantly different. The high mean error of EMG control is not surprising because EMG signals have higher variability than angle and torque signals (Vodovnik and Rebersek, 1974; Clancy et al., 2002). We offer two hypotheses for why we did not find a corresponding difference in adaptation rates. First, there may be a ceiling for adaptation rates. If subjects continually see very large errors, they may be so unsure of their feedforward signals that instead of adapting quickly, they do not adapt at all (e.g., Torres-Oviedo and Bastian, 2012). Second, increasing adaptation rate may not be optimal behavior in every situation. In this trial-by-trial adaptation paradigm, increasing adaptation means continually making large corrections in response to large errors. Furthermore, EMG noise is dependent on signal size; larger control signals (from stronger contractions) are more variable. Studies show that subjects learn to use smaller control signals in the presence of such signal-dependent noise (Chhabra and Jacobs, 2006). The noise characteristics of EMG control signals may have altered optimal adaptation behavior.

Higher mean error increased adaptation rates, and higher feedback uncertainty decreased adaptation rates, but control interface did not have a significant effect (**Table 1** and **Figure 3**). Subjects behaved similarly when using different control modalities, including EMG signals. This result is encouraging, because it suggests that improved prosthesis control systems with lower errors may enable skilled, coordinated movement.

This study introduces the application of adaptation paradigms to powered prosthesis control; however, many questions remain. We chose a single DOF task for a simple initial comparison of EMG-controlled and able-bodied adaptation, but multi-DOF tasks might reveal differences and should be investigated. Similarly, only one muscle, the triceps brachii, was used for single-site proportional EMG control, whereas many powered prostheses are controlled by pattern recognition of multiple EMG signal features (Kuiken et al., 2009) or other multi-site control schemes (Zecca et al., 2002). Other limitations include the difficulties of selecting and matching control ranges for performance comparisons. Furthermore, this study included only able-bodied subjects interacting with a virtual environment. For amputees using physical prostheses, everyday tasks may involve higher levels of uncertainty from a greater variety of sources.

Our results provide a strong motivation for further investigation of adaptation behavior during powered prosthesis control. We found that subjects using EMG control adapted to perturbations in a manner consistent with Bayesian predictions. A better understanding of internal model development and adaptation will guide control and sensory feedback strategies to reduce uncertainty for prosthesis users.

## **ACKNOWLEDGMENTS**

This research was supported by the National Science Foundation through the National Robotics Initiative (NSF-NRI 1317379). Reva E. Johnson was supported by the National Defense Science and Engineering Graduate Research (NDSEG) Fellowship from the Department of Defense (DoD). The authors thank Ann Barlow, PhD for help with manuscript preparation and Tommaso Lenzi, PhD for help with the experimental setup.

## **REFERENCES**


multifunction artificial arms. *J. Am. Med. Assoc.* 301, 619–628. doi: 10.1001/jama.2009.116


**Conflict of Interest Statement:** The reviewer Dr. Panagiotis Artemiadis declares that, despite having collaborated with Dr. Levi J. Hargrove on a review paper, the review process was handled objectively. The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 16 March 2014; accepted: 03 September 2014; published online: 30 September 2014.*

*Citation: Johnson RE, Kording KP, Hargrove LJ and Sensinger JW (2014) Does EMG control lead to distinct motor adaptation? Front. Neurosci. 8:302. doi: 10.3389/fnins.2014.00302*

*This article was submitted to Neuroprosthetics, a section of the journal Frontiers in Neuroscience.*

*Copyright © 2014 Johnson, Kording, Hargrove and Sensinger. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Automatic real-time monitoring and assessment of tremor parameters in the upper limb from orientation data

## *Stefan Lambrecht\*, Juan A. Gallego, Eduardo Rocon and Jose L. Pons*

*Neurorehabilitation group, Cajal Institute, Spanish National Research Council (CSIC), Madrid, Spain*

#### *Edited by:*

*Dario Farina, Georg-August University, Germany*

#### *Reviewed by:*

*Ricardo Chavarriaga, Ecole Polytechnique Fédérale de Lausanne, Switzerland Jakob Dideriksen, University Medical Center Goettingen, Germany*

#### *\*Correspondence:*

*Stefan Lambrecht, Neurorehabilitation group, Cajal Institute, Spanish National Research Council (CSIC), Ctra Campo Real km 0.2, La Poveda, Madrid 28500, Spain e-mail: s.lambrecht@csic.es*

Upper limb tremor is the most prevalent movement disorder and, unfortunately, it is not effectively managed in a large proportion of patients. Neuroprostheses that stimulate the sensorimotor pathways are one of the most promising alternatives, although they are still under development. To enrich the interpretation of data recorded during long-term tremor monitoring and to increase the intelligence of tremor suppression neuroprostheses we need to be aware of the context. Context awareness is a major challenge for neuroprostheses and would allow these devices to react more quickly and appropriately to the changing demands of the user and/or task. Traditionally, kinematic features are used to extract context information, and joint angles have recently emerged as particularly promising features. In this paper we present two algorithms that enable the robust extraction of joint angle and related features to enable long-term continuous monitoring of tremor with context awareness. First, we describe a novel relative sensor placement identification technique based on orientation data. We focus on relative rather than absolute sensor location, because in many medical applications magnetic and inertial measurement units (MIMU) are used in a chain stretching over adjacent segments, or are always placed on a fixed set of locations. Subsequently we demonstrate how tremor parameters can be extracted from orientation data using an adaptive estimation algorithm. Relative sensor location was detected with an accuracy of 94.12% for the 4 MIMU configuration, and 100% for the 3 MIMU configurations. Kinematic tracking error values with an average deviation of 8% demonstrate our ability to estimate tremor from orientation data. The methods presented in this study constitute an important step toward more user-friendly and context-aware neuroprostheses for tremor suppression and monitoring.

**Keywords: tremor, MEMS, sensor location, context awareness, real-time estimation**

## **INTRODUCTION**

Pathological tremor encompasses all types of tremors that impair motor performance (e.g., essential tremor and parkinsonian tremor; McAuley and Marsden, 2000), and is the most common movement disorder (Wenning et al., 2005). Sixty-five percent (Elble and Koller, 1990) of tremor patients report serious difficulties in the performance of their activities of daily living (ADL) (McAuley, 2000; E Rocon, 2004; Wenning et al., 2005). Furthermore, patients suffering from pathological tremor experience functional disability to the extent that it can lead to social isolation. In this article we refer to pathological tremor as tremor.

Recently new tremor treatment strategies, based on mechanical loading, have been proposed in addition to the existing therapies. These novel strategies are deemed necessary given the low success rate and side effects induced by both drugs and neurosurgery in some types of patients; in 25% of patients tremor is not managed satisfactorily (Rocon et al., 2007b). Tremor suppression through mechanical loading is based on the principle that tremor amplitude can be modified by altering limb impedance through the application of force or by adding mass (Adelstein, 1981; Prochazka et al., 1992; Rocon et al., 2007a). For example, Rocon et al. demonstrated for the first time that a wearable robot that applied force to the upper limb segments could effectively attenuate upper limb tremors (Rocon et al., 2007a). Other studies have shown that it is possible to attenuate the tremor using the human muscle tissue as actuators, through functional electrical stimulation (Javidan et al., 1992; Popović Maneski et al., 2011; Gallego et al., 2013; Bó et al., 2014). Functional electrical stimulation neuroprostheses avoid a heavier and more obtrusive rigid structure (Gallego et al., 2011).

To avoid constant actuation and the reduction of tremor without functional improvement, total movement must be separated into voluntary and tremulous movement (Rocon et al., 2007a). This is typically performed using adaptive algorithms (see e.g., Gallego et al., 2010; Bo et al., 2011). Tremor suppression devices subsequently intervene only when tremor coincides with voluntary movement. Unlike wearable robots, where most sensors are embedded in the device, neuroprostheses depend on additional sensors. Both microelectromechanical (MEMS) accelerometers and gyroscopes are used to monitor tremor (Grimaldi et al., 2008; Elble, 2009). For example, the neuroprosthesis presented in Gallego et al. (2013) used MEMS gyroscopes to measure tremor. Accelerometers constitute the most popular approach. However, they measure linear acceleration, whereas human motion is usually described as rotations about joints. Furthermore, there is no accepted model to separate gravity from voluntary motion in the accelerometer data (Veltink et al., 1996; Sabatini, 2011). Gyroscopes measure angular velocity and therefore provide a more direct representation of human movement. Gyroscopes are thus better suited than accelerometers to extracting tremor characteristics from motion data; however, they suffer from a low-frequency bias that results in integration drift. This bias does not affect the estimation of tremor, but is inherently present in the voluntary movement component of the signal. The presence of this integration drift inhibits the accurate extraction of joint angles from gyroscope data over longer periods of time (>10 s) (Woodman, 2007).
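The effect of a gyroscope bias on an integrated angle can be shown with a few lines of arithmetic: a constant offset in angular velocity accumulates linearly with time. The sketch below is illustrative only; the function name `integrate_gyro`, the 100 Hz sampling rate, and the 0.5°/s bias are assumed values, not specifications of any particular sensor.

```python
def integrate_gyro(rates_dps, dt):
    """Integrate angular-velocity samples (deg/s) into an angle (deg)."""
    angle = 0.0
    for rate in rates_dps:
        angle += rate * dt
    return angle


FS_HZ = 100       # assumed sampling rate
BIAS_DPS = 0.5    # assumed constant gyroscope bias
DURATION_S = 10

# The limb is stationary, so the true angle is 0 and the output is pure drift.
samples = [BIAS_DPS] * (FS_HZ * DURATION_S)
drift_deg = integrate_gyro(samples, 1.0 / FS_HZ)
```

Under these assumptions the angle estimate has drifted by about 5° after 10 s even though the limb never moved, and the error keeps growing for as long as the integration runs uncorrected.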

MEMS limitations are often addressed by sensor fusion. The most common approach is to correct the gyroscope data with accelerometers and magnetometers (Foxlin, 2002). Currently the most popular fusion method is Kalman filtering. In magnetic and inertial measurement units (MIMU) the accelerometer and magnetometer data is used to reset the bias of the gyroscopes in quasi-static periods or after filtering the accelerometer and magnetometer data (Roetenberg et al., 2005; Sabatini, 2011). We refer to Sabatini (2011) for more information on sensor fusion and the use of MEMS in human motion analysis. MIMUs thus allow us to obtain orientation data by small, relatively unobtrusive sensors that can be incorporated into a garment (see e.g., Gallego and Rocon, 2011).

To enrich the processing of long-term tremor monitoring and to increase the intelligence of the neuroprosthesis we need to be aware of the context. Context is defined as "any information that can be used to characterize the situation of an entity" (Dey, 2001) and can refer to a situation (being in a meeting, driving a car) or an activity the patient is performing. Context awareness is a major challenge for neuroprostheses and would allow these devices to react more quickly and appropriately to the changing demands of the user and/or task. It would also make it possible to monitor the evolution of the therapy provided by the neuroprosthesis, and the evolution of the patient's condition. Kinematic features are traditionally used to increase context awareness. A substantial body of literature supports the use of body-worn sensors for context and ADL classification (Farringdon et al., 1999; Kunze et al., 2005; Kunze and Lukowicz, 2007; Korel, 2010). Their availability and low cost have made accelerometers the most widespread sensor modality used to extract kinematic features. Recent advances in MIMUs and their incorporation into the latest generation of consumer electronics, however, are making robust orientation data easily available. To accurately obtain joint angles over time, we need a robust measurement of the orientation or position of each segment over time. Joint angles and features derived from joint angles have recently demonstrated their potential (Ofli et al., 2012) and are gaining in popularity with the advent of more wearable and affordable motion capture equipment. In this paper we propose a first step toward increased context awareness for neuroprostheses for tremor management.

In order to extract joint angles it is vital to know where the sensors are placed on the body. Automated sensor location identification facilitates the donning and doffing of the sensors, whether by medical doctors for instrumented analysis or by the patients themselves for use of tele-rehabilitation devices or neuroprostheses at home. Little or no research has been done to identify sensor location on the body. So far only one study (Kunze and Lukowicz, 2007) has looked at sensor placement identification in tasks other than walking. A limitation is that a 6 min window was needed to achieve 85% accuracy for 4 sensor locations spread across the body. The majority of ADLs are shorter in duration; moreover, it is not advisable for our application that the patient endures such a lengthy calibration period. Other studies started from the hypothesis that the patient would be walking, and predominantly focused on sensor placement on the lower limbs (Kunze et al., 2005; Kunze and Lukowicz, 2007; Vahdatpour et al., 2011; Weenk et al., 2013). All previous work has been based on accelerometer data. Weenk et al. were the first to also introduce gyroscopes in an attempt to make their classifier more location invariant. Assuming that the body consists of rigid body segments, angular velocity is invariant to location on the segment. Weenk et al. furthermore used characteristics of the walking cycle to achieve orientation invariance. They took advantage of the specific characteristics of walking and made assumptions related to the quality of movement execution. The participant was assumed to be walking in a straight line, the direction of which was subsequently used to transform from local to global sensor orientation. No upper limb task exists that has such stable and repetitive characteristics as walking. Movement disorders moreover severely disrupt task execution in such a way that the dominant direction is corrupted by involuntary movement, and thus call for more easily applicable localization methods.

Here we present a novel method to automatically identify relative sensor location on the upper limb. Our approach is based on an upper limb task, and relies on the observation that movement and tremor are more pronounced distally. We demonstrate that features extracted from the movement we selected can be used to identify relative sensor location based on orientation data. We focus on relative sensor location rather than pure sensor location, because in many medical applications MIMUs are used in a chain stretching over adjacent segments (e.g., analysis of kinematics, tele-rehabilitation applications), or are always placed on a fixed set of locations (e.g., gait segmentation). In the particular application with a neuroprosthesis, this algorithm facilitates the re-instrumentation (placing of the sensors) after cleansing the fabric. The main contribution of this sensor location algorithm in monitoring applications is that it ensures correct and accurate measurements without the need for prior (technical) knowledge. Further, this identification algorithm can be combined with standard MIMU-to-body calibration routines to obtain anatomical joint angles. Subsequently we demonstrate for the first time how tremor can be extracted from orientation data. Therefore, using orientation data we are able to identify sensor location, estimate tremor and derive context information from the same dataset, thus reducing bandwidth requirements.

## **MATERIALS AND METHODS**

#### **SUBJECTS**

A group of 6 patients (3 male, 3 female; 63.2 ± 11.8 years) affected by essential tremor was recruited for this study. The patients were diagnosed by the neurological personnel of the Hospital 12 de Octubre as definite essential tremor, according to the criteria described in Deuschl et al. (1998). Tremor severity was 30.2 ± 13.0 (ranging from 10 to 48) according to the Fahn-Tolosa-Marin rating scale (Fahn et al., 1998). Patients continued taking their regular medications at the time of the recordings. Informed consent was obtained from all patients prior to starting data collection. Approval for this study was obtained through the Ethics committee of the Hospital 12 de Octubre, granting its accordance to the Declaration of Helsinki.

#### **PROTOCOL**

Patients were asked to perform a finger-to-nose test in a repetitive manner while seated. The patient was asked to alternately touch the nose and knee with the tip of his/her right index finger. Contact with nose and knee had to be maintained for a few seconds during each repetition; the total trial duration was 30 s. Two trials of each patient were analyzed, with a single trial consisting of 3 finger-to-nose cycles. Finger-to-nose is typically used in neurological examinations to activate kinetic tremor (Deuschl et al., 1998). Essential tremor is predominantly manifested during task execution. Finger-to-nose furthermore shares the main kinematic pattern with a multitude of ADLs related to the upper limb such as drinking, eating, and personal hygiene.

#### **INSTRUMENTATION**

We used 4 MIMUs (Tech MCS, Technaid S.L., Madrid, Spain) comprising tri-axial accelerometers, gyroscopes, and magnetometers to measure upper limb kinematics (sampling rate: 100 Hz). They are particularly suited for the estimation of tremor due to their low weight (40 g) and small size (11 × 26 × 36 mm). The sensors were attached with double-sided hypo-allergenic tape to the hand, distal forearm, proximal forearm, and humerus (**Figure 1**). Orientation was calculated by the onboard extended Kalman filter (EKF) fusion algorithm. Proper alignment between sensor axes and anatomical axes was ensured upon placing the MIMUs. Fixation on soft tissues was avoided to prevent low pass filtering of the motion signal and to eliminate the influence of undesired soft tissue oscillations (Tong and Granat, 1999). In addition to the configuration shown in **Figure 1**, based on the current design of the neuroprosthesis, we also tested a subset more commonly used in biomechanics with only one sensor per segment (hand, forearm, and humerus).

#### **DATA ANALYSIS**

#### *Sensor location identification*

We have chosen features related to angular velocity and thus need to decompose the orientation data into angular velocity. In a three-dimensional scenario, as is the case with orientation data, we cannot obtain angular velocity by direct differentiation of attitude angles. The non-vectorial nature of finite angular displacements nullifies this assumption. We therefore use the Poisson equation to extract the angular velocity $[\dot{\theta}]$ (Zatsiorsky, 1998):

$$[\dot{\theta}] = [\dot{\mathbb{R}}][\mathbb{R}]^{-1}$$

where $[\dot{\mathbb{R}}]$ represents the rate of change of the direction cosines and $[\mathbb{R}]^{-1}$ is the inverse of the body attitude matrix (equal to its transpose, since the attitude matrix is a rotation matrix). This equation has been used to identify the instantaneous helical axis (Veldpaus et al., 1988).
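As an illustration of the Poisson equation, the sketch below estimates angular velocity from a sequence of rotation matrices. It is a minimal example under stated assumptions, not the implementation used in this study: the time derivative is approximated by a central difference, the rotation matrix is assumed to map sensor coordinates to global coordinates (so the result is expressed in the global frame), the inverse is replaced by the transpose, and the names `poisson_angular_velocity` and `rot_z` are hypothetical.

```python
import math


def rot_z(angle):
    """Rotation matrix for a rotation of `angle` radians about the z axis (demo helper)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def poisson_angular_velocity(rotations, dt):
    """Estimate angular velocity (rad/s) from 3x3 rotation matrices sampled every dt seconds.

    Assumption: R_dot is approximated by a central difference, so only the
    interior samples (index 1 .. n-2) receive an estimate.
    """
    omegas = []
    for k in range(1, len(rotations) - 1):
        r_prev, r_now, r_next = rotations[k - 1], rotations[k], rotations[k + 1]
        # Central-difference approximation of the rate of change of R.
        r_dot = [[(r_next[i][j] - r_prev[i][j]) / (2.0 * dt) for j in range(3)]
                 for i in range(3)]
        # Poisson equation: Omega = R_dot * R^-1, with R^-1 = R^T for rotations.
        omega = [[sum(r_dot[i][m] * r_now[j][m] for m in range(3)) for j in range(3)]
                 for i in range(3)]
        # Keep the skew-symmetric part to suppress finite-difference residue.
        wx = 0.5 * (omega[2][1] - omega[1][2])
        wy = 0.5 * (omega[0][2] - omega[2][0])
        wz = 0.5 * (omega[1][0] - omega[0][1])
        omegas.append((wx, wy, wz))
    return omegas


# Example: a constant rotation of 2 rad/s about z, sampled at 100 Hz.
dt = 0.01
rotations = [rot_z(2.0 * k * dt) for k in range(50)]
angular_velocity = poisson_angular_velocity(rotations, dt)
```

For real MIMU data the finite-difference step amplifies noise, so some smoothing before differentiation would probably be needed; that choice is left open here.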

Based on pilot work on a mechanical mockup and healthy subjects (Lambrecht and Pons, 2014) we selected 18 candidate features (**Table 1**). In an attempt to make our features orientation invariant, we rectified the sensor data and combined information from all axes (|x|,|y|,|z|). Our approach is further based on the observation that the kinematic chain has an additive effect regarding movement of individual segments, i.e., movement of proximal segments is (partly) represented in more distal segments. To an extent this pattern is also noticeable in tremor, being more manifest at distal than at proximal segments.

A total of three sensor configurations were adopted: the current neuroprosthesis (NP) setup as shown in **Figure 1**, and two configurations each with one MIMU per segment. The latter two differed in the location of the second sensor, which was placed distally and proximally on the forearm, respectively. All features were used as ranked values to enhance robustness of the classifiers across intensities of tremor (nearly absent to severe). These classifiers were: random forest, decision tree, and ranking.

Random forest classification generates an ensemble of "bagged" decision trees with random feature and sample selection; each such combination is also referred to as a "bag." In each bag a decision tree is trained on a bootstrap sample or subsample of the initial data set. The benefit of a random forest over a single decision tree is that the ensemble of trees can lead to a better result than the best individual tree. We calculated the accuracy of the prediction of the random forest as the out-of-bag error, reflecting the accuracy in identifying sensor location for data not used in a specific bag. The random forest was programmed using the TreeBagger algorithm in Matlab, selecting 4 leaves and 100 trees.

The decision tree, one of the most successful techniques for supervised classification learning, is more intuitive than the random forest and computationally less demanding. However, one has to be careful not to overtrain the tree. Overtraining occurs when the classifier reaches a maximum accuracy for the training data used, but performs worse on new data than a classifier that was not overfitted to the training sample. To identify and avoid overtraining we computed both the resubstitution and 10-fold cross-validation error. The resubstitution error reflects the accuracy of the classifier on the training data. In the case of decision trees, resubstitution error will keep decreasing upon adding nodes to the tree. The cross-validation error represents the misclassification occurring on new data, not used for training. We optimized the combined cost of resubstitution error and cross-validation error and added a 1 standard deviation window to this value to avoid an over-fitted sub-optimum. We used the classregtree function in Matlab to compute the decision trees.

**TABLE 1 | Candidate features (partial).** Feature 16: principal component coefficients, velocity. Feature 18: principal component coefficients, acceleration.

*To achieve a classification that is robust across various levels of tremor and sensor orientations each feature is based on the rectified values across all axes of each MIMU. The root mean square (RMS) and variance (var) values are calculated over the full trial duration. The values marked in bold are those resulting in high classification performance.*

Ranking can be considered a form of classification, in particular when applied to chains of sensors. The most important benefits of ranking are that no training is needed and that the configuration of sensors can thus be modified without penalty. Ranking furthermore has a negligible computational cost. The advantage is that the chain of sensors can be shortened or elongated, and slid up or down without the need for retraining or changing between classifiers. The only requirement is that the configuration is known beforehand. The features listed in **Table 1** were individually sorted in descending values; the thus obtained vector was then compared to the reference vector. The reference for all of the above methods is the fixed order in which the sensors were placed on the subjects, starting distally (MIMU 1 placed on the hand, see **Figure 1**).
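The ranking procedure can be sketched as follows. This is an illustrative example rather than the code used in the study: it assumes a single feature (the RMS of the rectified angular velocity), it assumes that the three axes are combined by summing their absolute values, and the function names and sensor labels are hypothetical.

```python
import math


def rectified_rms(samples):
    """RMS over the trial of the rectified signal combined across axes.

    Assumption: the axes are combined as |x| + |y| + |z|; the text does not
    specify the exact combination.
    """
    combined = [abs(x) + abs(y) + abs(z) for x, y, z in samples]
    return math.sqrt(sum(v * v for v in combined) / len(combined))


def rank_sensors(recordings):
    """Sort sensor identifiers by descending feature value (distal first).

    `recordings` maps a sensor identifier to a list of (wx, wy, wz) samples.
    """
    features = {sensor: rectified_rms(data) for sensor, data in recordings.items()}
    return sorted(features, key=features.get, reverse=True)


def ranking_score(predicted, reference):
    """Fraction of sensors whose rank matches the known placement order."""
    hits = sum(1 for p, r in zip(predicted, reference) if p == r)
    return hits / len(reference)
```

Because the sensors are only sorted, removing or adding a sensor changes nothing but the length of the reference vector, which is the property that makes ranking attractive for reconfigurable sensor chains.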

#### *Tremor estimation*

The orientation data were processed with the same protocol as used for the classification, after which the angular velocity estimate was upsampled to 1 kHz. To obtain joint motion we subtracted the angular velocity data of the sensors proximal and distal to the respective joint (Rocon et al., 2006). We focused our analysis on the wrist joint because tremor is more present and disabling further down the kinematic chain (Belda-Lois et al., 2004).

To estimate wrist tremor from the raw movement we used the algorithm presented in Gallego et al. (2010). This algorithm assumes that tremulous and voluntary movement can be separated by frequency distribution. The frequency of voluntary movement during the execution of ADL is between 0 and 2 Hz (Riviere, 1996), with mean around 1 Hz (Mann et al., 1989). Tremor frequency ranges between 3 and 12 Hz (Deuschl et al., 1998). By estimating the voluntary movement with a g-h filter (Brookner, 1998), and subtracting it from the raw movement data we obtained an estimate of the tremulous movement. The parameter of the g-h filter was set by optimization with a genetic algorithm over all trials, minimizing the total cost over all patients and trials. Since our intention was to reduce the tremor component in the signal, we set bounds at 0.8 and 1. Lower values would likely result in too high a ratio of tremor to voluntary motion in the signal. Selection of the initial data was done randomly with uniform spacing, using a population size of 100. We further applied a crossover rate of 80% with 2 elitist survivors in mutation, and a roulette method for natural selection. The fitness function minimized the kinematic tracking error (KTE) (see below). The parameters thus obtained for the gyroscope and orientation data are respectively 0.9952 and 0.9958.
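The separation step can be illustrated with a short g-h filter sketch. Several points are assumptions rather than details confirmed by the text: the single filter parameter is taken to be the θ of a critically damped g-h filter, for which g = 1 − θ² and h = (1 − θ)², the filter is initialized at the first sample with zero rate, and all function names are hypothetical.

```python
def critically_damped_gains(theta):
    """Gains of a critically damped g-h filter for a parameter theta in (0, 1).

    Assumption: the optimized parameter in the text plays the role of theta.
    """
    return 1.0 - theta ** 2, (1.0 - theta) ** 2


def gh_filter(samples, dt, theta):
    """Track the slow (voluntary) component of `samples` with a g-h filter."""
    g, h = critically_damped_gains(theta)
    position, rate = samples[0], 0.0  # assumed initial state
    voluntary = [position]
    for z in samples[1:]:
        predicted = position + dt * rate       # prediction step
        residual = z - predicted               # innovation
        position = predicted + g * residual    # position update
        rate = rate + (h / dt) * residual      # rate update
        voluntary.append(position)
    return voluntary


def estimate_tremor(samples, dt, theta):
    """Return (voluntary, tremor), with tremor = raw signal - voluntary estimate."""
    voluntary = gh_filter(samples, dt, theta)
    tremor = [z - v for z, v in zip(samples, voluntary)]
    return voluntary, tremor
```

Values of θ close to 1 give small gains, so the filter follows only slow changes and leaves most of the faster oscillation in the residual, which is consistent with the bounds of 0.8 and 1 described above.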

We compared our results to the online and offline methods based on gyroscope data presented in Gallego et al. (2010). The offline method is considered a gold standard or ideal reference method (Rocon et al., 2006), but cannot be implemented in the control of a neuroprosthetic. The online gyroscope method is used as a reference to compare our results to a practical alternative for real-time tremor estimation. The offline method consists of filtering gyroscope data with a recursive low pass filter (fc = 2 Hz).

The performance of the orientation based tremor estimation was assessed through the KTE. KTE consists of two components that together evaluate the smoothness, response time and execution time of a tracking algorithm relative to a reference method (Rocon, 2006).

$$KTE = \sqrt{\varphi_{|b|}^2 + \sigma_{|b|}^2}$$

where $\varphi_{|b|}$ is the mean of the absolute estimation error $|b|$, with $b = y_k - x_{k+1,k}$, and represents how fast the algorithm is capable of reacting when velocity changes. The offline gyroscope estimation, $x_{k+1,k}$, is used as reference in the error calculations. The second component, $\sigma_{|b|}^2$, is the variance of the absolute estimation error and gauges the smoothness of the estimated variable.
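A minimal sketch of the KTE computation is given below. It assumes that the estimate and the reference are sampled at the same instants and that the variance is the population variance (divided by N rather than N − 1); the function name is hypothetical.

```python
import math


def kinematic_tracking_error(estimate, reference):
    """Return (KTE, mean, variance) of the absolute estimation error.

    Assumption: population variance (divide by N) is used.
    """
    abs_error = [abs(y - x) for y, x in zip(estimate, reference)]
    n = len(abs_error)
    mean = sum(abs_error) / n
    variance = sum((e - mean) ** 2 for e in abs_error) / n
    return math.sqrt(mean ** 2 + variance), mean, variance
```

Under the population-variance assumption, the squared mean plus the variance equals the mean squared error, so the KTE coincides with the RMS of the estimation error; reporting the two components separately shows whether the error is dominated by response time or by lack of smoothness.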

### **RESULTS**

#### **SENSOR LOCATION IDENTIFICATION**

**Table 2** summarizes the results (percentage of MIMUs identified correctly) of the various classifiers for all configurations tested. Performance was unaffected by altering the forearm sensor location from distal to proximal; we therefore report average values in **Table 2**. All classifiers achieved a perfect score for the setup when only 1 MIMU was attached to each segment (left column). In the 4 MIMU configuration, with 2 MIMUs attached to the forearm, a decrease in performance was observed (right column, **Table 2**). The fact that gyroscope data (gray) provides the best classification performance indicates that angular velocity is sufficient to identify relative sensor location. Similar results were achieved using the orientation data (black). The poorer result obtained with the decision tree using orientation data was likely due to an overly conservative correction in the cross-validation. Only taking the resubstitution error into account, the accuracy achieved by the decision trees was 0.975. The results from ranking further support this hypothesis.

*The values in gray refer to the gyroscope data, those in black to the orientation data. A score of 1 corresponds with perfect classification, 0 corresponds with incorrect classification of all sensors.*

Ranking proved to be the best option since it does not need training and reached similar levels of accuracy as the other classifiers. When using orientation data, 10 features each provided the maximum accuracy reported in **Table 2** (**Figure 2A**); these features are marked in bold in **Table 1** and depicted in **Figure 2**. **Figure 2** furthermore shows that only 4 features achieve this level of accuracy when using gyroscope data (**Figure 2B**). Features proved to perform equally well across subjects and to be highly redundant amongst each other (**Figure 2C**). Any of the features marked in bold in **Table 1** thus resulted in a similar classification performance.

#### **TREMOR ESTIMATION**

The plots in **Figure 3** provide an overview of the decomposition process. In **Figure 3A** the joint angle obtained by Euler decomposition is shown in red. This signal predominantly represents the voluntary motion, due to the filtering process done by the EKF used for orientation estimation. Tremor frequency is nonetheless preserved in the orientation (**Figure 3B**). The first peak, at 0–2 Hz, represents the voluntary movement whereas the second, much smaller, peak at ∼5 Hz corresponds to tremulous movement. The decomposed signal using the method presented in this paper is depicted in black in **Figure 3A**. The frequency spectrum of this signal indicates that decomposing in this form allows us to extract the tremor characteristics but with a loss of the voluntary signal. This, however, is not an issue since the voluntary movement is present, with little to no signs of tremor, in the orientation data and can easily be accessed by directly extracting the Euler angles from the rotation matrix.

**FIGURE 2 | Performance of each feature to identify sensor location based on ranking.** The gray bars in **(A,B)** correspond to the 4 MIMU configuration, the black bars to the configurations with 3 MIMUs. The orientation data is presented in **(A)**, and the gyroscope data in **(B)**. In **(C)** the redundancy of the features is demonstrated by contrasting features 1 and 2, which, using orientation data, both result in high scores in **(A,B)**. **(C)** is representative of the redundancy among the 10 features highlighted in **Table 1**.

In **Figure 4** we show a representative trial using both gyroscope references, online and offline, as well as the proposed method using orientation data. The top plot demonstrates the high correspondence of the proposed method with both the online gyroscope method and an offline gold standard method. The first highlight showcases the strength of the orientation based method, following both the online and offline gyroscope tremor estimates closely in amplitude and in frequency. The second highlight draws attention to a limitation of the presented method. It appears that upon changes in velocity the orientation method is slow in adjusting; the orientation based method in black deviates from both the gold standard in red and the online gyroscope method. We assume that this is due to the intrinsic characteristics of the onboard EKF of the MIMUs used. The EKF parameters are set to track voluntary human movement, characterized by a lower frequency than the tremor we are tracking.

To verify the hypothesis that the EKF is a limiting factor when changes in velocity occur, we analyze both components of the KTE separately (**Figure 5**). It is clear that the differences are predominantly present in the first component (KTE1; the mean of the absolute estimation error), and thus related to the response time of the algorithm. We believe that adapting the EKF could increase the performance of the presented method. KTE values of both online methods, comparing each method to the gold standard, did not differ more than 8% with respect to the value of the orientation based KTE. Mean KTE of the online gyroscope method was 0.2963 ± 0.1146 (min: 0.11750; max: 0.5137), and for the method proposed in this paper 0.3704 ± 0.1548 (min: 0.2034; max: 0.6730). A more direct analysis was not possible in the current study since the EKF used was embedded in the MIMU and acted as a "black box." Future studies should further investigate what the effect of the fusion algorithm is on errors in tremor estimation from orientation data. The current results, although preliminary due to the small sample size and high inter-patient variability, appear to indicate that orientation data is suitable for NP control. Orientation data are likely best combined with an impedance modulation control strategy for the NP (Gallego and Rocon, 2011). Impedance control is less reliant on highly precise data than a noise-canceling approach. In impedance control, the viscosity and stiffness of the joints are increased to generate a low pass filter effect on the tremor. This is similar to the co-contractions of healthy subjects to stabilize their upper limbs.

## **DISCUSSION**

We have proposed algorithms that constitute a first step toward a more intelligent neuroprosthesis for tremor suppression. The algorithms are based on orientation data and respectively estimate sensor location and tremor. Relative sensor location was detected, without any a priori information, with an accuracy of 94.12% for the 4 MIMU configuration, and 100% for the 3 MIMU configurations. We were further able to accurately estimate tremor based on orientation data, with a precision comparable to that of state of the art methods. Using orientation data permits us to identify sensor location, estimate tremor and derive context information from the same dataset, thus reducing bandwidth requirements.

Previous work on detecting sensor location focused on absolute location on the body. However, in many applications sensors are placed in a chain or always on the same site(s). This is particularly the case when biomechanical variables are of interest (e.g., NP control, tele-rehabilitation, motion analysis). In such setups we can deduce absolute position of each sensor from their relative position in the chain. We have therefore opted to determine relative sensor location. The benefit of relative vs. absolute sensor location is that it drastically simplifies the classification and classifier.

Four sensors were used in our study, as is the case in the work presented by Kunze et al. (2005) and Kunze and Lukowicz (2007). Several studies have detected more sensors; as many as 17 were identified by Weenk et al. (2013). However, our method is designed with the neuroprosthesis presented in Gallego et al. (2011) in mind, and therefore focuses on one limb consisting of 3 segments. In addition, this is the first attempt to identify various sensors placed on the same segment. The presented method can easily be modified to have fewer or more sensors or segments, as shown in the different configurations adopted in the present work. This is also supported by previous work on healthy subjects, where the trunk was added as a fourth segment (Lambrecht and Pons, 2014).

To our knowledge this is only the second study looking at identifying sensor location that does not rely on walking data. Kunze et al. have previously published a classifier that was able to determine the location of 4 sensors on specific locations spread across the body from arbitrary movement data. They reported an 82% accuracy on 6 min windows (Kunze and Lukowicz, 2007). We judged that for applications in health and telemedicine this window was too long and the accuracy too low. One of our goals is to facilitate the use of wearable sensors by patients, to make them more user-friendly. Our method was tested on 30 s trials, with actual movement ranging between 15 and 18 s. We are hopeful that this window can be further reduced to incorporate only one movement cycle, without a significant decrease in performance. Although we have only included one task, the finger-to-nose test, we believe that our method will perform equally well on related upper limb tasks. The finger-to-nose task shares its dominant kinematic pattern with a variety of ADLs such as eating, drinking, combing your hair, putting on glasses, and answering a phone. Furthermore, no training was needed in the presented algorithm, thus there is no indication as to why it should be limited to the finger-to-nose task. Any task that involves motion of the major joints and that triggers kinetic tremor is expected to perform equally well on an essential tremor population.

In recent work by Weenk et al. (2013) an attempt has been made to investigate the sensitivity of location of the sensor on the segment. Previous work has exclusively relied on accelerometer data but Weenk et al. were the first to use gyroscopes as an additional sensor. A slight drop in performance was reported but they still achieved a 97.2% accuracy. In our work we only rely on orientation data. Our algorithm only uses gyroscope and accelerometer data indirectly, as it is based on orientation data. This is the first time orientation data has been used for sensor classification. To further assess the influence of sensor location we included two configurations with 3 MIMUs (i.e., one MIMU per body segment), where the sensor of the forearm was placed distal or proximal. No difference in accuracy was observed. Given the results from both configurations using 3 MIMUs and the fact that none of the features relies on movement to occur about specific axes, we conclude that our method is location and orientation invariant. The features chosen display a high redundancy amongst each other. Future work to identify informative yet complementary features could further increase the precision of the method presented. The current features are individually very discriminative and a combination of features was thus not needed, especially given the high redundancy among them. We did however place the sensors on ideal locations to enable extraction of tremor characteristics. Placing the sensors on different locations and/or orientation would not affect the location identification. The tremor estimate would require a calibration procedure to align the sensor frame to the body-segment frame. Soft tissue artifacts might further filter part of the signal and/or introduce noise through wobbling masses.

We estimated tremor based on orientation data following the protocol presented in Gallego et al. (2010) for gyroscope data. We compared our results to those obtained using both an online estimation method and an offline reference method. Our results show, for the first time, that it is possible to accurately track tremor collecting only orientation data. The orientation based method does appear to have more difficulties adapting quickly to changing patterns. This observation was supported by the overall slightly larger values for the first component of the KTE (i.e., the mean absolute estimation error), the figure of merit used to compare the performance of the tremor tracking methods. Difference in performance relative to gyroscopes was particularly noticeable upon changes in velocity. This is most likely due to the nature of the EKF and the parameters defining it. Although we did not have access to the exact parameter values, we believe that altering the fusion filter or the filter parameters can improve the performance of the presented method. As is, the EKF is set to perform well for normal human motion, situated below 2 Hz in the frequency spectrum. Higher sensitivity to changes up to 8–10 Hz and a faster response time will most likely preserve the tremulous movement better and thus result in a better estimate. Further work, with customizable fusion algorithms, is needed to confirm this hypothesis.

The ability to track tremor with orientation data simplifies demands for bandwidth and processing power when incorporated in monitoring applications. It constitutes a significant step toward a more intelligent neuroprosthesis for tremor suppression and opens the door for long-term continuous tremor monitoring with context awareness.

Our future work will be directed toward adding a task identifier based on joint angles and joint angle related features to these algorithms; validating the sensor location algorithm on other types of tremor patients and different pathologies; and using the presented work to investigate context and evolution of tremor occurrence.

## **CONCLUSION**

The work described in this paper constitutes the first steps toward a more user-friendly and context-aware neuroprosthesis for tremor suppression and monitoring. We predict that this methodology will enable the monitoring of tremor with context awareness and will facilitate the use of wearable sensors in tele-health and tele-medicine applications.

We have introduced a method to automatically identify relative sensor location. This is the first location detection algorithm based on orientation data, the first that only requires upper limb movement and does not need any training, and only the second to be tested on a patient population.

We furthermore introduced an algorithm to track tremor using orientation data. As a direct application we will use this in the long-term monitoring of tremor characteristics and context.

## **AUTHOR CONTRIBUTIONS**

Stefan Lambrecht designed the study, developed the algorithms, performed the literature review, and drafted the manuscript. Juan Alvaro Gallego and Eduardo Rocon collected the data and revised the manuscript. Eduardo Rocon and Jose Luis Pons supervised this research. All the authors have read and approved the final manuscript.

## **ACKNOWLEDGMENTS**

The authors would like to thank the patients that participated in this study for their time and effort. The authors furthermore thank the staff of the Hospital 12 de Octubre for their help in patient screening and selection. This work has been funded by the European project NeuroTremor (ICT-2011.5.1-287739) and the Spanish Consolider project HYPER (CSD2009-00067).

#### **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 15 March 2014; accepted: 07 July 2014; published online: 24 July 2014. Citation: Lambrecht S, Gallego JA, Rocon E and Pons JL (2014) Automatic real-time monitoring and assessment of tremor parameters in the upper limb from orientation data. Front. Neurosci. 8:221. doi: 10.3389/fnins.2014.00221*

*This article was submitted to Neuroprosthetics, a section of the journal Frontiers in Neuroscience.*

*Copyright © 2014 Lambrecht, Gallego, Rocon and Pons. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Equilibrium-point control of human elbow-joint movement under isometric environment by using multichannel functional electrical stimulation

#### *Kazuhiro Matsui<sup>1</sup>\*, Yasuo Hishii<sup>2</sup>, Kazuya Maegaki<sup>1</sup>, Yuto Yamashita<sup>1</sup>, Mitsunori Uemura<sup>1</sup>, Hiroaki Hirai<sup>1</sup> and Fumio Miyazaki<sup>1</sup>*

*<sup>1</sup> Department of Systems Science, Faculty of Engineering Science, Osaka University, Osaka, Japan <sup>2</sup> Fujitsu Limited, Kanagawa, Japan*

*Edited by:*

*Mitsuhiro Hayashibe, University of Montpellier, France*

#### *Reviewed by:*

*J. Luis Lujan, Mayo Clinic, USA Alessandra Pedrocchi, Politecnico di Milano, Italy*

#### *\*Correspondence:*

*Kazuhiro Matsui, Department of Systems Science, Faculty of Engineering Science, Osaka University, 1-3 Machikaneyama-cho Toyonaka-shi, Osaka 560-8531, Japan e-mail: k\_matsui@robotics.me.es.osaka-u.ac.jp*

Functional electrical stimulation (FES) is considered an effective technique for aiding quadriplegic persons. However, the human musculoskeletal system is highly non-linear and redundant, which makes it difficult to control limbs stably and accurately using FES. In this paper, we propose a simple FES method that is consistent with the motion-control mechanism observed in humans. We focus on joint motion driven by a pair of agonist-antagonist muscles of the musculoskeletal system, and define the "electrical agonist-antagonist muscle ratio (EAA ratio)" and "electrical agonist-antagonist muscle activity (EAA activity)" by analogy with the agonist-antagonist muscle ratio and agonist-antagonist muscle activity, respectively, which are used to extract the equilibrium point and joint stiffness from electromyography (EMG) signals. These notions, the agonist-antagonist muscle ratio and agonist-antagonist muscle activity, are based on the hypothesis that the equilibrium point and stiffness of the agonist-antagonist motion system are controlled by the central nervous system. We performed experiments in an isometric environment with six subjects and derived the transfer function between the input EAA ratio and the force output at the end-point. This transfer-function model is expressed as a dead-time element cascaded with a second-order system. High-speed, high-precision, smooth control of the hand force was achieved through the agonist-antagonist muscle stimulation pattern determined by this transfer-function model.

**Keywords: functional electrical stimulation (FES), equilibrium-point control, EAA ratio, EAA activity, muscle synergy**

## **1. INTRODUCTION**

In recent years, the number of people affected by strokes and spinal cord injuries has increased because of the rapidly aging population and the high incidence of traffic accidents in motorized societies. Many studies have been conducted on movement support and functional compensation for paralyzed individuals. The use of functional electrical stimulation (FES) to induce muscle activity via direct electrical stimulation of peripheral muscles has attracted particular attention. FES has even been used to assist severely paralyzed patients. According to reported applications (Giuffrida et al., 2001; Widjaja et al., 2011), FES can help with spastic paralysis in stroke patients. Muscle stimulation is performed by referring to the antagonistic muscle's electromyogram (EMG). FES can also be used for treating patients with tremor. In this approach, muscle stimulation is performed by referring to limb tremor. Furthermore, many studies on joint trajectory tracking by electrical stimulation of multiple muscles have been reported. They are classified as open-loop (Bernotas et al., 1987; Buckett et al., 1987; Hoshimiya et al., 1989; Miller et al., 1989; Chizeck et al., 1991; Veltink et al., 1992; Smith et al., 1996; Chen et al., 1997; Davoodi et al., 1998; Rakos et al., 1999; Ferrarin et al., 2001; Watanabe et al., 2002a,b), closed-loop (Chizeck et al., 1980; Crago et al., 1980; Wilhere et al., 1985; Lemay et al., 1997), and hybrid type (Lan et al., 1994; Abbas et al., 1995; Kostov et al., 1995; Chang et al., 1997; Jonic et al., 1999; Qi et al., 1999; Adamczyk et al., 2000; Sites et al., 2000; Ianno et al., 2002; Kurosawa et al., 2005) applications. The hybrid type shows promise as a control method that combines the advantages of feedforward control, which allows for quick movement without delay, and feedback control, which reduces the effects of disturbances due to fatigue and load.

However, it is difficult to derive an appropriate model for inclusion in the controller, because (1) the electrically stimulated musculoskeletal system is characterized by high non-linearity between stimulus current values and muscle force/length and (2) the control of joints that are moved by agonist-antagonistic muscle pairs is an ill-posed problem (Kurosawa et al., 2005) owing to the redundancy in joint motion control.

In the field of exercise physiology, the equilibrium point hypothesis states that the stiffness and equilibrium point of the agonist-antagonist drive system are controlled by the central nervous system (Feldman, 1986). In addition, analyses of the muscle agonist-antagonist ratio and muscle agonist-antagonist activity have shown that the ratio is closely related to the joint angle corresponding to the equilibrium point, and that the activity is closely related to the joint stiffness (Iimura et al., 2011; Ariga et al., 2012). The muscle agonist-antagonist ratio is represented by the ratio of the EMGs of the agonist-antagonistic muscle pair groups that make up the musculoskeletal system. The muscle agonist-antagonist activity is represented by the sum of the EMGs of the agonist-antagonistic muscle pair group. The equilibrium point and joint stiffness can be determined independently based on the muscle agonist-antagonist ratio and activity. The muscle agonist-antagonist ratio and activity have been used to control multiple pneumatic artificial muscles (Pham et al., 2014). The concept of the muscle agonist-antagonist ratio or activity can be useful in electrically stimulating the muscle pair group as well.

In this study, we focused on non-linearity and redundancy in developing a method for applying the concept of the muscle agonist-antagonist ratio and activity to electrical stimulation. Problems such as non-linearity and redundancy are encountered when FES is used for controlling the human body. The concept of the muscle agonist-antagonist ratio or activity can be used to determine the equilibrium point and joint stiffness, which are considered in the equilibrium point hypothesis. We assume that we can linearly approximate human motion control by determining the equilibrium point and joint stiffness and by controlling the equilibrium point independently with the help of the EAA ratio and activity, which are based on the concept of the muscle agonist-antagonist ratio and activity. As an example, we use the human elbow joint, which is an antagonistic drive system. We attempt to model the human elbow joint using the proposed method and use the modeling results to control the end-point force (hand force) in an isometric environment. In addition, we conducted experiments to assess the trajectory tracking performance achieved with the method developed.

## **2. MATERIALS AND METHODS**

#### **2.1. EXPERIMENTAL ENVIRONMENT**

The experimental environment and system configuration are shown in **Figure 1**. A stimulator manufactured by Multi Channel Systems, Inc. (STG4008) is used for electrical stimulation of the target muscles. The STG4008 can control the stimulus current value. Based on trials with various modulation schemes, a sinusoidal electrical stimulation pattern with a frequency of 60 (Hz), generated using the AM (Amplitude Modulation) method, was chosen because it yielded the greatest effect and resulted in the least discomfort. We control only the amplitude of the sine-wave, with the base frequency fixed at 60 (Hz). The cathode-side stimulation electrode is placed at a motor point of each stimulated target muscle; the target muscles are the biceps and triceps of the subject's right upper arm (**Figure 2**). A stimulation electrode made by Compex Inc. (Electrode for performance/energy) is used. The motor points are located using a motor-point pen made by Compex Inc. Before we apply the electrodes, we apply electrode gel made by Compex Inc. to the skin to decrease impedance. During the procedure, the right upper arm is held in a horizontal plane by the seat, the wrist is secured with splint material, and the trunk is fixed to the chair with a shoulder belt. The hand force is sampled at a rate of 1000 (Hz) using a three-axis force sensor made by Tech-Gihan, Inc. (USL06-H5-200N). Negative measurements denote flexion and positive measurements denote extension. The experiment is conducted in an isometric environment, and the angle between the upper arm and the body surface is 45°, while the elbow angle is maintained at 90°. Six healthy, right-handed adult males, A (aged 27 years), B (aged 24 years), C (aged 21 years), D (aged 24 years), E (aged 24 years), and F (aged 24 years), volunteered to participate in the experiment. To eliminate the influence of fatigue, the experiments were limited to 1 min in duration. The purpose and the details of the experiment were explained to the subjects, and they agreed to participate in the experiment. The experiments were conducted with the approval of the Osaka University of Engineering Science Ethics Committee and in accordance with their prescribed procedures.

**FIGURE 1 | Experimental setup, top view.**

#### **2.2. ELECTRICAL AGONIST-ANTAGONIST MUSCLE RATIO (EAA RATIO) AND THE EQUILIBRIUM POINT**

We define the elbow joint as the control target. We focus on coordination between the triceps and the biceps, which act during extension and flexion, respectively, of the elbow joint. We intend to stimulate these two muscles simultaneously. To this end, it is important to understand how humans generate this movement. Human muscle groups have multiple degrees of freedom, and humans operate various muscle groups simultaneously when generating a movement. The human body therefore has many possible solutions for producing a given movement, and any of these solutions can be involved in a movement. EMG analysis is one way to determine which motion control strategy humans primarily use, because EMG represents the command signal sent to the muscle from the central nervous system. In this study, we focused on an EMG analysis method that is based on a combination of agonist-antagonist muscles. Iimura et al. defined *m<sub>f</sub>* and *m<sub>e</sub>* as the degrees of flexor and extensor muscle activity, respectively, of agonist-antagonist muscle pairs obtained from EMG. The agonist-antagonist muscle ratio *r* and the agonist-antagonist muscle activity *a* are given by Equations (1, 2), respectively. Iimura et al. showed that *r* and *a* contribute to the joint equilibrium point and the joint stiffness, respectively.

$$r = \frac{m\_e}{m\_f + m\_e} \tag{1}$$

$$a = m\_f + m\_e \tag{2}$$

Electrical stimulation contracts human muscles. In this study, the normalized FES intensities applied to the biceps and triceps are defined as *I<sub>f</sub>* (−) and *I<sub>e</sub>* (−), respectively, and the electrical agonist-antagonist muscle ratio (EAA ratio) *r<sub>E</sub>* and electrical agonist-antagonist muscle activity (EAA activity) *a<sub>E</sub>*, which are defined by analogy with Equations (1, 2), are introduced as the new control variables.

$$r\_E = \frac{I\_e}{I\_f + I\_e} \tag{3}$$

$$a\_E = I\_f + I\_e \tag{4}$$

Note that the stimulus current values are normalized to minimize differences in the characteristics of the flexor and extensor and to facilitate the extraction of the transfer characteristics. The maximum stimulus currents *I′<sub>fmax</sub>* (mA) and *I′<sub>emax</sub>* (mA), at which the subject does not feel pain, and the minimum stimulus currents *I′<sub>fmin</sub>* (mA) and *I′<sub>emin</sub>* (mA), at which muscle contraction commences, are used for normalization, as shown below:

$$I\_f = (I'\_f - I'\_{f\text{min}}) / (I'\_{f\text{max}} - I'\_{f\text{min}}) \tag{5}$$

$$I\_e = (I'\_e - I'\_{e\text{min}}) / (I'\_{e\text{max}} - I'\_{e\text{min}}) \tag{6}$$

Here, *I′<sub>f</sub>* and *I′<sub>e</sub>* are the stimulus current values (mA). If *r<sub>E</sub>* is considered to contribute to the joint equilibrium point in a manner similar to that in EMG analysis, any change in *r<sub>E</sub>* appears as a change in the hand force under constraints on hand movement, i.e., in an isometric environment. In this study, we investigate the hand force in an isometric environment as *r<sub>E</sub>* is changed while *a<sub>E</sub>* remains constant.
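Equations (3)–(6) can be inverted to obtain the two stimulus currents from a commanded EAA ratio and EAA activity. The following Python sketch illustrates one way to do this. The function names are hypothetical, and the numeric current limits are illustrative values chosen so that a sinusoidal EAA ratio with *a<sub>E</sub>* = 1 reproduces the coefficients of Equations (8, 9); they are not copied from Table 1.

```python
def eaa_to_normalized(r_e, a_e):
    """Invert Equations (3, 4): split the activity a_e between extensor and flexor."""
    i_e = r_e * a_e          # extensor share, since r_E = I_e / (I_f + I_e)
    i_f = (1.0 - r_e) * a_e  # flexor takes the remainder, since a_E = I_f + I_e
    return i_f, i_e


def eaa_from_normalized(i_f, i_e):
    """Equations (3, 4): EAA ratio and EAA activity from normalized intensities."""
    a_e = i_f + i_e
    if a_e == 0:
        raise ValueError("EAA ratio is undefined when both intensities are zero")
    return i_e / a_e, a_e


def denormalize(i_norm, i_min, i_max):
    """Invert Equations (5, 6): map a normalized intensity to a current in mA."""
    return i_min + i_norm * (i_max - i_min)


# Hypothetical per-muscle limits in mA: (contraction threshold, pain-free maximum).
FLEXOR_LIMITS = (2.5, 11.5)
EXTENSOR_LIMITS = (5.0, 11.0)

i_f, i_e = eaa_to_normalized(r_e=0.25, a_e=1.0)
flexor_ma = denormalize(i_f, *FLEXOR_LIMITS)
extensor_ma = denormalize(i_e, *EXTENSOR_LIMITS)
print(f"flexor: {flexor_ma:.2f} mA, extensor: {extensor_ma:.2f} mA")
```

Because *a<sub>E</sub>* fixes the sum of the two normalized intensities and *r<sub>E</sub>* fixes their proportion, the pair (*r<sub>E</sub>*, *a<sub>E</sub>*) determines both currents uniquely, which is how the redundancy of the two-muscle drive is removed.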

#### **2.3. ELECTRICAL AGONIST-ANTAGONIST MUSCLE ACTIVITY (EAA ACTIVITY) AND THE JOINT STIFFNESS**

In EMG analysis, Iimura et al. (2011) showed how the muscle activity *a* contributes to joint stiffness. To confirm that the EAA activity *a<sub>E</sub>* contributes to the joint stiffness in the same way, we conducted an experiment in which the EAA ratio *r<sub>E</sub>* was increased or decreased between *r<sub>E</sub>* = 0 and 1.0 in increments of 0.2 every 3 s (*a<sub>E</sub>* = {0.1, 0.3, 0.5, 0.7, 0.9, 1.0}). The averages of three trials for each *a<sub>E</sub>* are shown along with the input EAA ratio in **Figure 3**. The results confirm that the displacement of the hand force increases with *a<sub>E</sub>*. In addition, when we estimated the transfer function (discussed further below), we determined values of the natural angular frequency *ω<sub>n</sub>* for three values of *a<sub>E</sub>* = {0.5, 0.8, 1.0} for subject A. We found that for *a<sub>E</sub>* = 1.0, *ω<sub>n</sub>* = 20.5 (rad/s); for *a<sub>E</sub>* = 0.8, *ω<sub>n</sub>* = 19.0 (rad/s); and for *a<sub>E</sub>* = 0.5, *ω<sub>n</sub>* = 14.3 (rad/s). These findings indicate that *a<sub>E</sub>* contributes to the joint stiffness.

#### **2.4. CONTROL MODEL**

If we consider motion control of the elbow joint from the perspective of the equilibrium point hypothesis, it is possible to define two parameters as the control variables: joint stiffness and equilibrium point. In this paper, we report on a method for controlling one of them, the equilibrium point, using the EAA ratio. As **Figure 4** shows, the movement command *r<sub>E</sub>* delivered to the muscle groups by the external FES current is added to the movement command *r<sub>h</sub>* sent from the central nervous system to the agonist-antagonist muscle groups (the agonist-antagonist muscle ratio). The agonist-antagonist muscle pair is driven by the combined movement command *r* (= *r<sub>h</sub>* + *r<sub>E</sub>*), and as a result, a hand force *f* is generated in an isometric environment. To confirm this theory, a constant value of EAA activity *a<sub>E</sub>* = 1.0 was used.

#### **2.5. MODELING**

#### *2.5.1. Input–output of elbow joint system*

In this study, to achieve elbow joint control using the EAA ratio, we experimentally determined the frequency characteristics between the hand force and the electrical stimulation input to the elbow joint system so that we could determine the transfer function for which the input is the EAA ratio and the output is the hand force in the isometric environment. To obtain the transfer function of the elbow joint system, a sine-wave EAA ratio with various periods *T* (s) was input to the muscles and the steady-state hand force was measured. We then took one cycle of the response to each sine-wave input, performed a sin-cos approximation using a multiple regression model, and expressed the result as a single sine-wave. Thus, we obtained the output amplitude and phase for each period, and from these values the corresponding frequency characteristics. The EAA ratio can be expressed as a function of time as follows.

$$r\_E(t) = -0.5\sin\left(\frac{2\pi}{T}t\right) + 0.5\tag{7}$$

For the input, the EAA ratio was set to a sine-wave taking values from 0 to 1, and the stimulation current values were determined from it. The normalized stimulation currents of each muscle, *I<sub>e</sub>*(*t*) and *I<sub>f</sub>*(*t*), were calculated using Equations (3, 4) from the fixed EAA activity *a<sub>E</sub>* and the EAA ratio *r<sub>E</sub>* given by Equation (7). The stimulus current values *I′<sub>e</sub>*(*t*) and *I′<sub>f</sub>*(*t*) that were actually applied to the muscles were then determined from the maximum stimulation amplitude *I′<sub>max</sub>* (mA) and the minimum stimulation amplitude *I′<sub>min</sub>* (mA), which were determined in advance and are shown in **Table 1**. For subject B, for example, the resulting *I′<sub>e</sub>*(*t*) and *I′<sub>f</sub>*(*t*) are given by Equations (8, 9). The hand force *f*(*t*) that appears as the output is approximated using the multiple regression model and reduced to a single sine-wave of the form of Equation (10) by combining the sine and cosine terms. The output thus takes the same sinusoidal form as the input.

$$I\_e'(t) = -3.0 \sin\left(\frac{2\pi}{T}t\right) + 8.0\tag{8}$$

$$I\_f'(t) = 4.5 \sin\left(\frac{2\pi}{T}t\right) + 7.0\tag{9}$$

$$f(t) = A \sin\left(\frac{2\pi}{T}t + \phi\right) + c\tag{10}$$

In these equations, *A* = √(*a*<sup>2</sup> + *b*<sup>2</sup>), sin *φ* = *a*/*A*, and cos *φ* = *b*/*A*; the output amplitude is *A*, the phase lag is *φ*, and the center value of the output sine-wave is *c*.
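The sin-cos approximation above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: it assumes that *a* and *b* are the regression coefficients of the cosine and sine terms (which is what sin *φ* = *a*/*A* and cos *φ* = *b*/*A* imply for Equation 10), and that the samples cover exactly one period at uniform spacing. Under that assumption the three regressors are orthogonal, so the least-squares coefficients reduce to the simple projections used below. The function name is hypothetical.

```python
import math


def fit_sine_one_cycle(samples):
    """Least-squares fit of f = b*sin(wt) + a*cos(wt) + c to one full period.

    samples: N >= 3 uniformly spaced force values covering exactly one period.
    Returns (A, phi, c) for the equivalent form A*sin(wt + phi) + c.
    """
    n = len(samples)
    if n < 3:
        raise ValueError("need at least three samples per cycle")
    angles = [2.0 * math.pi * k / n for k in range(n)]
    c = sum(samples) / n                                   # center value
    a = 2.0 / n * sum(f * math.cos(x) for f, x in zip(samples, angles))
    b = 2.0 / n * sum(f * math.sin(x) for f, x in zip(samples, angles))
    amplitude = math.hypot(a, b)                           # A = sqrt(a^2 + b^2)
    phi = math.atan2(a, b)                                 # sin(phi) = a/A, cos(phi) = b/A
    return amplitude, phi, c


# Synthetic check: one cycle of a known sine-wave sampled at 200 points.
true_amp, true_phi, true_c = 4.0, -1.2, 0.7
cycle = [true_amp * math.sin(2.0 * math.pi * k / 200 + true_phi) + true_c
         for k in range(200)]
est_amp, est_phi, est_c = fit_sine_one_cycle(cycle)
```

For non-uniform sampling or partial cycles the regressors are no longer orthogonal, and a general least-squares solver would be needed instead.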

**Table 1 | Maximum and minimum stimulation amplitude for the six subjects.**


#### *2.5.2. Estimate of the transfer function*

Three trials involving an input of 10 cycles at each period were performed. The period *T* of the sine-wave EAA ratio represented by Equation (7) was varied from 0.1 to 0.5 (s) in increments of 0.025 (s). The input was started 0.5 (s) after the start of measurement. After the measurement, the output was approximated as a sine-wave using multiple regression analysis. First, the output values from the three trials were averaged; then, the measured data were divided into the 10 cycles of the input sine-wave, and the values of cycles 3–8, which represent steady-state behavior, were averaged. We performed a multiple regression analysis on one cycle of the averaged output, which was approximated by the sine-wave of Equation (10). Subject B's results, with the time axis normalized, are shown together with the input sine-wave EAA ratio in **Figure 5**. The results show that the elbow joint system is controlled stably and smoothly via the simultaneous stimulation of multiple muscles based on the EAA ratio, even when the hand force switches between positive and negative or when the stimulation starts. These situations tend to generate unstable responses when multiple muscles are stimulated at different times. Furthermore, the vibration center of the output is shifted to the positive side (the extension side) when *T* is 0.4 (s) or less, but the amount of shift is approximately 0 (N) when *T* is 0.4 (s) or more. This is due to the difference in the response speeds of the extensor and the flexor, a phenomenon observed only in the high-frequency region of the input. Given that FES is intended to support day-to-day activities, the vibration center shift in the high-frequency input region is not considered to be a serious problem. Therefore, we focus only on the input–output amplitude ratio and the input–output phase difference and attempt to model the input–output relationship of the elbow joint system using a transfer function.

**Figures 6A,B** show the gain diagram and phase diagram for the input and output data shown in **Figure 5**. The gain is nearly constant in the low-frequency region and decreases linearly in the high-frequency region, which is typical of an *n*th-order lag system. The slope in the high-frequency region, calculated using a least squares approximation, is −42.5 (dB/dec). The gain characteristic is therefore approximated using a second-order lag system. In contrast, the phase diagram shows a larger phase lag than that of a second-order lag system. In this study, this additional phase delay, which cannot be represented by a second-order lag system, is modeled as a dead time.
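The cycle-averaging step described at the start of this section can be sketched as follows. This is an illustrative Python outline under stated assumptions: the force record has already been averaged across the three trials, the stimulation starts 0.5 (s) into a recording sampled at 1000 (Hz), and each period corresponds to a whole number of samples. The function name and argument names are hypothetical.

```python
def average_steady_cycles(force, fs_hz, period_s, start_s=0.5,
                          first_cycle=3, last_cycle=8):
    """Average the selected input cycles sample by sample.

    force: sequence of force samples (already averaged across trials).
    Cycles are numbered from 1, counted from the stimulation onset at start_s.
    Returns one cycle of averaged force values.
    """
    samples_per_cycle = round(period_s * fs_hz)
    onset = round(start_s * fs_hz)
    cycles = []
    for cycle in range(first_cycle, last_cycle + 1):
        begin = onset + (cycle - 1) * samples_per_cycle
        segment = force[begin:begin + samples_per_cycle]
        if len(segment) < samples_per_cycle:
            raise ValueError("recording is too short for the requested cycles")
        cycles.append(segment)
    # zip(*cycles) groups the k-th sample of every selected cycle together.
    return [sum(column) / len(cycles) for column in zip(*cycles)]


# Synthetic record: 0.5 s of rest, then 10 cycles of period 0.1 s at 1000 Hz.
# Cycle number n is filled with the value n plus a small within-cycle ramp.
record = [0.0] * 500
for n in range(1, 11):
    record.extend(n + 0.01 * j for j in range(100))
mean_cycle = average_steady_cycles(record, fs_hz=1000, period_s=0.1)
```

The averaged cycle returned here is the input to the sin-cos regression of Equation (10).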

We assumed that the shape of the transfer function could be represented by Equation (11), where *ω<sub>n</sub>* is the natural angular frequency, *K* is a constant, and *τ* is the dead time. We assumed a value of *ζ* = 1 for the damping coefficient. **Figures 7A,B** show the gain and phase characteristics. These are approximated as a second-order system plus a dead time system, as expressed by Equation (12), and are represented by the broken line. We created similar models using the results obtained for subjects A, C, D, E, and F. The results for the six subjects are shown in **Table 2**. The *ω<sub>n</sub>*, *K*, and *τ* values for these five subjects differ from those for subject B, but all of the subjects' results can be modeled by transfer functions consisting of a second-order system plus a dead time system. The estimated range of dead times, 0.045–0.100 (s), is consistent with the measured electrical stimulation latency results. We assumed that the differences in the parameter values between individuals are related to each individual's ratio of slow-twitch to fast-twitch muscle fibers and rate of muscle development. In practice, however, the optimum parameter values can easily be determined for each individual despite these differences. The present modeling method is simple and easy to use.

$$G(s) = K \cdot \frac{\omega\_n^2}{s^2 + 2\zeta\omega\_n s + \omega\_n^2} \cdot e^{-\tau s} \tag{11}$$

$$G(s) = 11.22 \cdot \frac{420.25}{s^2 + 41s + 420.25} \cdot e^{-0.05s} \tag{12}$$
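To make the shape of this model concrete, the gain and phase of Equation (12) can be evaluated numerically. The following Python sketch is only an illustration of the published transfer function; the function name and the choice of example periods are assumptions.

```python
import cmath
import math

K = 11.22        # gain of Equation (12)
OMEGA_N = 20.5   # natural angular frequency (rad/s); 20.5**2 = 420.25
ZETA = 1.0       # damping coefficient assumed in the text
TAU = 0.05       # dead time (s)


def frequency_response(omega):
    """Return (gain in dB, phase in degrees) of Equation (12) at omega (rad/s)."""
    s = 1j * omega
    second_order = OMEGA_N**2 / (s**2 + 2.0 * ZETA * OMEGA_N * s + OMEGA_N**2)
    gain_db = 20.0 * math.log10(K * abs(second_order))
    # The dead time adds -omega * TAU radians of phase and leaves the gain unchanged.
    phase_rad = cmath.phase(second_order) - omega * TAU
    return gain_db, math.degrees(phase_rad)


for period in (0.5, 0.3, 0.1):   # example input periods T (s) within the tested range
    omega = 2.0 * math.pi / period
    gain_db, phase_deg = frequency_response(omega)
    print(f"T = {period:.1f} s: gain = {gain_db:6.2f} dB, phase = {phase_deg:7.1f} deg")
```

The second-order factor alone can contribute at most 180° of phase lag and rolls off at an asymptotic −40 (dB/dec), which is close to the measured slope of −42.5 (dB/dec); the dead-time factor supplies the additional, frequency-proportional phase lag without changing the gain.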

#### **3. RESULTS AND DISCUSSION**

#### **3.1. VERIFICATION**

**Table 2 | Parameter values for the six subjects.**


In this section, to verify the effectiveness of the model, we present the following three types of hand force control results obtained in the isometric environment. The results for subject B are taken as representative because the results for all subjects are substantially similar. The six subjects' multiple coefficients of determination are shown in **Table 3**.

(1) Response to continuously changing input

(2) Response to stepwise changing input

(3) Interaction with central movement command

#### *3.1.1. Response to continuously changing input*

We considered a task in which the direction and magnitude of the hand force change freely. We stimulated the agonist-antagonist muscle pair of the elbow joint using an EAA ratio synthesized from sine-waves of two periods (*T* = 0.3 and *T* = 0.6).

$$r\_E = 0.6\left(-0.5\sin\frac{2\pi}{0.3}t + 0.5\right) + 0.4\left(-0.5\sin\frac{2\pi}{0.6}t + 0.5\right) \tag{13}$$

**Table 3 | Multiple coefficients of determination.**


The hand force value estimated using the model equation (Equation 12) and the measured value of the hand force are shown with the input waveform in **Figures 8A,B**. Only one input–output cycle [0.6 (s) period] of the steady-state waveform is depicted. Because the estimated and measured values of the hand force are nearly equal, the validity of the model can be considered confirmed. On its own, an EAA ratio with a period of *T* = 0.3 (s) leads to a shift in the vibration center due to the difference between the response speeds of the agonist and antagonist muscles. However, for the input of Equation (13), in which it is combined with an EAA ratio of the longer period *T* = 0.6 (s), there is hardly any shift in the vibration center.
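Because the model of Equation (12) is linear, its steady-state response to the two-component input of Equation (13) is the sum of the responses to each sine-wave. The Python sketch below illustrates this superposition. It computes only the oscillating part of the predicted force; the constant offset is omitted because the text does not state how the model treats the mean value of the input. The function names are hypothetical.

```python
import cmath
import math

K, OMEGA_N, ZETA, TAU = 11.22, 20.5, 1.0, 0.05   # parameters of Equation (12)

# Sinusoidal components of Equation (13): (amplitude of r_E, period in s).
COMPONENTS = [(0.6 * 0.5, 0.3), (0.4 * 0.5, 0.6)]


def g(omega):
    """Complex frequency response of Equation (12) at omega (rad/s)."""
    s = 1j * omega
    lag = OMEGA_N**2 / (s**2 + 2.0 * ZETA * OMEGA_N * s + OMEGA_N**2)
    return K * lag * cmath.exp(-TAU * s)


def eaa_ratio(t):
    """Equation (13): weighted sum of two sine-wave EAA ratios."""
    return sum(amp * (1.0 - math.sin(2.0 * math.pi * t / period))
               for amp, period in COMPONENTS)


def force_oscillation(t):
    """Steady-state oscillating part of the predicted hand force (N)."""
    total = 0.0
    for amp, period in COMPONENTS:
        omega = 2.0 * math.pi / period
        response = g(omega)
        # Each input term -amp*sin(omega*t) is scaled by |G| and shifted by arg(G).
        total += -amp * abs(response) * math.sin(omega * t + cmath.phase(response))
    return total


# One 0.6 s cycle sampled every 10 ms.
times = [0.01 * k for k in range(60)]
predicted = [force_oscillation(t) for t in times]
```

The weights 0.6 and 0.4 keep the synthesized EAA ratio within the range 0 to 1, so the normalized stimulation intensities remain valid throughout the cycle.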

#### *3.1.2. Response to stepwise changing input*

We considered a task with stepwise changes in the hand force magnitude. We incremented or decremented the EAA ratio by 0.2 every 3 (s), beginning at *r<sub>E</sub>* = 0. The hand force value estimated using the model equation (Equation 12) and the measured value of the hand force are shown with the input waveform in **Figures 9A,B**. Except when the EAA ratio was near 1 or 0, the difference between the estimated and measured hand force values was 2 (N) or less. The results show that the model can represent steady-state characteristics in a practical manner. When the EAA ratio is near 1 or 0, it is assumed that the extensor or flexor acts alone, which is why the model's estimation error increases.
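For a step input, the model of Equation (12) has a closed-form response because *ζ* = 1 makes the second-order factor critically damped. The Python sketch below evaluates the predicted change in hand force after a single 0.2 step in the EAA ratio. It is an illustration of the linear model only, not a reproduction of the measured curves; the function name is hypothetical.

```python
import math

K, OMEGA_N, TAU = 11.22, 20.5, 0.05   # Equation (12), with zeta = 1


def step_response(t, step_size=0.2):
    """Predicted change in hand force (N) at time t (s) after a step in r_E."""
    delayed = t - TAU
    if delayed < 0.0:
        return 0.0   # no response during the dead time
    # Critically damped second-order step response: 1 - (1 + wn*t) * exp(-wn*t).
    decay = (1.0 + OMEGA_N * delayed) * math.exp(-OMEGA_N * delayed)
    return K * step_size * (1.0 - decay)


for t in (0.02, 0.1, 0.2, 0.4, 3.0):
    print(f"t = {t:4.2f} s: force change = {step_response(t):.3f} N")
```

Under this model, each 0.2 step changes the steady-state force by *K* × 0.2 ≈ 2.24 (N), and more than 99% of that change is reached within 0.4 (s), well inside the 3 (s) hold between steps.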

#### *3.1.3. Interaction with central movement command*

In this study, the elbow joint was controlled by the EAA ratio, which is regarded as setting the equilibrium point, as shown in **Figure 4**. It was assumed that the equilibrium point is determined by the sum of the externally applied FES command and the motion command from the central nervous system. To validate this assumption, we performed an experiment in which electrical stimulation was provided while the subject was intentionally generating hand force. We confirmed that the subject maintained a positive or negative hand force of approximately 10 (N) without feedback. In addition, the input pattern given by Equation (14) was limited to two cycles [1 (s)]. We repeated this pattern three times at intervals of 2.0 (s).

$$r\_E = -0.5\sin\left(\frac{2\pi}{0.5}(t-a)\right) + 0.5\tag{14}$$

The input stimulation was applied only when *a* ≤ *t* ≤ *a* + 1 (s) with *a* = {1, 4, 7}. The hand force estimated using the model equation (Equation 12), the measured value of the hand force, and the input waveform are shown in **Figures 10A–C**. We stimulated the agonist-antagonist muscle pair of the elbow joint based on the EAA ratio given by Equation (14) while the subject was generating approximately +10 (N) of hand force (in the elbow extension direction). The results are shown in **Figure 10B**. In addition, we stimulated the agonist-antagonist muscle pair of the elbow joint based on the EAA ratio given by Equation (14) while the subject was generating approximately −10 (N) of hand force (in the elbow flexion direction). The results are shown in **Figure 10C**. In both cases, the results show that the hand force was maintained at a positive or negative value of approximately 10 (N) in the sections without electrical stimulation. The hand force varied in the sections with electrical stimulation, and the magnitude of these changes was close to the hand force value obtained using Equation (12). These results indicate that FES can be useful in supporting daily human motions. That is, for day-to-day human tasks, we can design the necessary FES-based movement support system so that it works together with motion commands from the central nervous system.
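The burst pattern of Equation (14) can be written as a small function of time. The Python sketch below is illustrative only: the function name is hypothetical, and returning `None` outside the bursts is an assumed way of representing "no stimulation", not something specified in the text.

```python
import math

ONSETS = (1.0, 4.0, 7.0)   # values of a (s) given in the text
BURST_S = 1.0              # each burst lasts two cycles of period 0.5 s
PERIOD_S = 0.5


def eaa_ratio_command(t):
    """Return r_E(t) from Equation (14), or None when stimulation is off."""
    for a in ONSETS:
        if a <= t <= a + BURST_S:
            return -0.5 * math.sin(2.0 * math.pi / PERIOD_S * (t - a)) + 0.5
    return None


# Sample the command every 125 ms over the first 9 s of the trial.
schedule = [(0.125 * k, eaa_ratio_command(0.125 * k)) for k in range(73)]
active = [(t, r) for t, r in schedule if r is not None]
```

Each burst starts and ends at *r<sub>E</sub>* = 0.5, so the stimulation is switched on and off at the balanced ratio, and the 2.0 (s) gaps between bursts expose the subject's voluntary force alone.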

#### **4. CONCLUSIONS**

In this study, muscle co-contraction was employed in FES. We focused on the agonist-antagonist muscle pair that drives the elbow joint. We proposed an electrical stimulation method that stimulates agonist-antagonist muscle pairs as units. The effectiveness of the proposed method was validated with six subjects through experiments requiring control of the hand force of a single elbow joint, with activation of one agonist-antagonist muscle pair, in an isometric environment. Based on the results obtained from performing simultaneous stimulation of multiple muscles based on the EAA ratio, we can draw the following conclusions.


These findings indicate that our proposed method is an effective solution to the problems of redundancy in an agonist-antagonistic drive system and non-linearity between stimulus current values and muscle force/length. They also indicate that high-speed, highly accurate hand force control may be achievable by using this model as an inverse system. This model can also be used for tasks involving joint motion if it is applied together with a rigid body link model (input: joint torque, output: joint angle).

The results of the experiment in which electrical stimulation was conducted together with the conscious application of hand force demonstrate that FES can be used to design a system to provide the necessary movement support for daily human tasks using motion commands from the central nervous system.

It is necessary to ensure that stimulation patterns can be adjusted according to the requirements of the FES application to a variety of tasks. Previous FES studies might have inadvertently neglected the regulation of additional properties involved in coordinating various muscles such as joint stiffness or methods of dealing with muscle redundancy (Jarc et al., 2013). Our proposed method offers the following advantages:


In this study, the environment was limited to being isometric, the moving joint was limited to a single elbow joint, and fatigue was excluded. We normalized the FES intensity to a level at which the subject did not feel pain. In our future research, we plan to normalize the FES intensity at a level at which the force is balanced. We will also apply the proposed method to tasks with joint motion, multiple joints, and long durations to further validate the effectiveness of the method.

## **REFERENCES**


neuro-mechanical control of upper limb movement. *Adv. Rob.* 28, 5. doi: 10.1080/01691864.2013.876940


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 29 January 2014; accepted: 29 May 2014; published online: 17 June 2014. Citation: Matsui K, Hishii Y, Maegaki K, Yamashita Y, Uemura M, Hirai H and Miyazaki F (2014) Equilibrium-point control of human elbow-joint movement under isometric environment by using multichannel functional electrical stimulation. Front. Neurosci. 8:164. doi: 10.3389/fnins.2014.00164*

*This article was submitted to Neuroprosthetics, a section of the journal Frontiers in Neuroscience.*

*Copyright © 2014 Matsui, Hishii, Maegaki, Yamashita, Uemura, Hirai and Miyazaki. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Feedback control of arm movements using Neuro-Muscular Electrical Stimulation (NMES) combined with a lockable, passive exoskeleton for gravity compensation

*Christian Klauer<sup>1</sup>, Thomas Schauer<sup>1</sup>, Werner Reichenfelser<sup>2</sup>, Jakob Karner<sup>2</sup>, Sven Zwicker<sup>3</sup>, Marta Gandolla<sup>4</sup>, Emilia Ambrosini<sup>4</sup>, Simona Ferrante<sup>4</sup>, Marco Hack<sup>5</sup>, Andreas Jedlitschka<sup>5</sup>, Alexander Duschau-Wicke<sup>3</sup>, Margit Gföhler<sup>2</sup> and Alessandra Pedrocchi<sup>4</sup>\**

*<sup>1</sup> Control Systems Group, Technische Universität Berlin, Berlin, Germany*

*<sup>2</sup> Research Group for Machine Design and Rehabilitation, Vienna University of Technology, Vienna, Austria*

*<sup>3</sup> Hocoma AG, Volketswil, Switzerland*

*<sup>4</sup> NeuroEngineering and Medical Robotics Laboratory, NearLab, Department of Electronics, Information, and Bioengineering, Politecnico di Milano, Milan, Italy*

*<sup>5</sup> Fraunhofer Institute for Experimental Software Engineering, Kaiserslautern, Germany*

#### *Edited by:*

*Jose L. Pons, CSIC, Spain*

#### *Reviewed by:*

*Juan C. Moreno, Spanish National Research Council, Spain Diego Torricelli, Consejo Superior de Investigaciones Cientificas, Spain*

#### *\*Correspondence:*

*Alessandra Pedrocchi, NeuroEngineering and Medical Robotics Laboratory, NearLab, Department of Electronics, Information, and Bioengineering, Politecnico di Milano, Via Giuseppe Colombo 40, 20133 Milano, Italy e-mail: alessandra.pedrocchi@polimi.it*

Within the European project MUNDUS, an assistive framework was developed for the support of arm and hand functions during daily life activities in severely impaired people. This contribution aims at designing a feedback control system for Neuro-Muscular Electrical Stimulation (NMES) to enable reaching functions in people with no residual voluntary control of the arm and shoulder due to high-level spinal cord injury. NMES is applied to the deltoid and biceps muscles and integrated with a passive exoskeleton with three degrees of freedom (DoFs), which partially compensates gravitational forces and allows each DoF to be locked. The user is able to choose the target hand position and to trigger actions using an eyetracker system. The target position is selected by using the eyetracker and determined by a marker-based tracking system using Microsoft Kinect. A central controller, i.e., a finite state machine, issues a sequence of basic movement commands to the real-time arm controller. The NMES control algorithm sequentially controls each joint angle while locking the other DoFs. Daily activities, such as drinking, brushing hair, pushing an alarm button, etc., can be supported by the system. The robust and easily tunable control approach was evaluated with five healthy subjects during a drinking task. Subjects were asked to remain passive and to allow NMES to induce the movements. In all of them, the controller was able to perform the task, and a mean hand positioning error of less than five centimeters was achieved. The average total duration for moving the hand from a rest position to a drinking cup, moving the cup to the mouth and back, and finally returning the arm to the rest position was 71 s.

**Keywords: neuro-muscular electrical stimulation, neuroprosthetics, exoskeleton, feedback control, assistive technology, eye tracking**

## **1. INTRODUCTION**

The consequences of Spinal Cord Injury (SCI) can be severe. Depending on the level of the lesion, SCI causes a loss of motor and sensory functions, and results in the immobilization of the patient. The level of lesion in SCI refers to the vertebrae in the spinal column affected by the injury. The higher the injury on the spinal cord, the more dysfunction can occur. Cervical (neck) injuries usually result in a full or partial tetraplegia (paralysis of the arms, legs, and trunk of the body below the level of the associated injury to the spinal cord). Individuals with a complete lesion at the C7 level or above (C6, C5, . . . ) usually depend on attendant care for all daily life activities.

In SCI patients, the neural pathway from the Central Nervous System (CNS) to the muscles is interrupted. The injury may cause complete or partial lesions of the upper and/or lower motor neurons. The upper motor neuron originates in the motor region of the cerebral cortex or the brain stem and carries motor information down to the lower motor neurons. All lower motor neurons (LMNs) related to voluntary movements are located in the ventral horn of the spinal cord and anterior nerve roots (spinal lower motor neurons) and innervate skeletal muscle fibers. They act as a link between upper motor neurons and muscles. In the case of upper motor neuron lesions, Neuro-Muscular Electrical Stimulation (NMES) can be applied to the lower motor neurons that are still intact to cause artificial contractions of the innervated muscles (Sheffler and Chae, 2007). This replaces the missing control signals from the CNS to the muscles.

Restoration of grasp function by NMES in spinal cord injured individuals has been realized by different research groups and is even available in the form of commercial systems (for an overview see Popovic et al., 2002; Rupp and Gerner, 2007). Available neuroprostheses for grasping are able to restore the two most frequently used grasping styles: the palmar and the lateral grasp (Popovic et al., 2002). C7-C5 complete SCI subjects benefit the most from a grasping neuroprosthesis and achieve a high level of independence in Activities of Daily Living (ADL). These individuals have sufficient residual function of the proximal upper limb muscles to perform reaching tasks.

Injuries at the high C3 and C4 level result in a significant loss of function at elbow and shoulder level. The deltoid and biceps muscles are innervated from the C5 and C6 level of the spinal cord. These muscles may also be denervated (lower motor neuron lesion), especially in the case of C4 tetraplegia. However, the extent of denervation is likely to vary across individuals. The feasibility of restoring shoulder and elbow functions at least partially by NMES was demonstrated by Acosta et al. (2001) in persons with C3/C4 tetraplegia using percutaneous stimulating electrodes and by Bryden et al. (2000) in persons with C5/C6 tetraplegia using a fully implanted stimulation system. However, the generated force in individuals with C3 and C4 SCI was not sufficient to hold the arm against gravity. In this context, it should also be noted that long-lasting electrical stimulation of shoulder and arm muscles is generally not appropriate because electrically stimulated muscles fatigue quickly.

In order to enable reaching functions in individuals with SCI at C3 and C4 level, NMES-hybrid orthoses have been investigated. In Hoshimiya et al. (1989), a balanced forearm orthosis (BFO) was used for supporting arm motions. Smith et al. (1996) used a suspended sling to provide shoulder joint stability, and Nathan and Ohry (1990) applied mechanical splinting. All studies reported limited performance because of insufficient shoulder control. The stimulation was commanded by voice control (Nathan and Ohry, 1990), by breathing patterns (Hoshimiya et al., 1989) or by contralateral shoulder motion sensed by a position transducer (Smith et al., 1996).

Schill et al. (2011) developed the system OrthoJacket—an active NMES hybrid orthosis for the paralyzed upper extremity. The system combined NMES controlled grasping with an electrical/pneumatic actuation of shoulder movements and a flexible fluid actuator for support of elbow-joint movements. For control of the orthosis, EMG signals from arm muscles were acquired. This means that only individuals with some residual arm/hand functions could use this system. Furthermore, NMES was not used for movement generation at the shoulder or elbow-joint.

Within the EU project TOBI, a further NMES hybrid orthosis was developed to support both grasping and elbow-joint movements by NMES (Rohm et al., 2010). However, this system required sufficient residual shoulder function to be provided by the user. To avoid an excessive stimulation of the biceps muscle during holding tasks, the orthosis' elbow-joint was self-locking in the direction of flexion and electrically de-lockable. A Brain Computer Interface (BCI) and a shoulder joystick at the non-supported side were provided as interfaces for the control of the orthosis.

In all existing systems, either NMES was applied in an open-loop manner using pre-defined stimulation patterns or the patient had to adjust the stimulation intensity him/herself, e.g., via a position transducer at the contralateral shoulder or through EMG signals of preserved muscles. None of the systems allows the automatic positioning of the hand at arbitrary positions in the reachable workspace. In addition, deviations from the desired behavior, e.g., due to muscular fatigue, are not automatically compensated.

This study aims at developing a fully feedback-controlled arm neuroprosthesis for individuals with no or very weak residual arm and shoulder functions (such as persons with C3/C4 tetraplegia). In contrast to existing arm neuroprostheses, the proposed solution allows the hand to be positioned at arbitrary desired positions within the reachable workspace. This arm neuroprosthesis is a component of the modular assistive framework MUNDUS (Pedrocchi et al., 2013), which has been developed to support and recover arm and hand functions in severely impaired people. The arm reaching functionality can be extended by a robotic or NMES-based module for grasping assistance.

To reduce the amount of required stimulation for the arm and shoulder muscles, a passive light-weight exoskeleton supports the user in addition to NMES. The main purpose of the exoskeleton is the gravity compensation by a passive spring mechanism. In addition to this, the exoskeleton enables all joints to be locked for holding the arm at given positions without NMES. Thus, only point-to-point movements under gravity compensation have to be realized by means of artificial muscle activation, assuming no or insufficient residual motor control by the user over his/her arm and shoulder musculature.

Automatic control of NMES to achieve functional shoulder/arm movements is challenging due to the highly non-linear and time-varying behavior of the electrically stimulated muscles (Lynch and Popovic, 2008). Mimicking physiological movements would require identifying the musculo-skeletal system of the arm for each individual and each time the system is applied. This would require a long-lasting calibration procedure that is infeasible in clinical environments or at home. For the use of NMES in stroke rehabilitation, Iterative Learning Control (ILC) has been proposed in order to generate precise functional reaching movements (Freeman et al., 2012). ILC demands a cyclic movement generation. After every movement cycle, an error trajectory with respect to a given reference movement is determined and used either to update an open-loop applied stimulation pattern or to update the reference trajectory of an underlying feedback controller. The latter approach guarantees a sufficiently small tracking error even for initial ILC trials but again requires a detailed model in order to design the feedback controller. To avoid a large calibration effort, we present a simpler movement generation strategy that involves sequential NMES control of all Degrees of Freedom (DoFs) available in the exoskeleton.

The manuscript is structured as follows: in Section 2.1, an overview of the overall control system architecture is given. Sections 2.2 and 2.3 then describe the employed exoskeleton and the muscle actuation by NMES, respectively, in detail. In Section 2.4, we introduce the kinematic model of the exoskeleton and its parameter identification as well as required coordinate transformations used by the arm controller. In Section 2.5, the feedback controlled generation of arm movements is presented in detail. Then, in Section 2.6, we describe the experimental trials performed on healthy subjects to evaluate the performance of the control system. Section 3 summarizes the results in terms of the positioning error and execution times achieved in the validation trials. The article closes with a discussion and some conclusions.

### **2. MATERIALS AND METHODS**

#### **2.1. CONTROL SYSTEM ARCHITECTURE**

The entire system developed for the support of the reaching movements is depicted in **Figure 1**. Potential users have no or very weak residual voluntary activation of arm, shoulder and hand muscles, but they can still control the head and gaze fixation. They usually sit in a wheelchair in front of a table. The target motions supported by the system are daily life activities, such as drinking, eating, brushing, touching one's own body, pushing an alarm button, and moving an object on the table.

The arm/shoulder movements are induced by NMES while an exoskeleton guides the movement and supports the arm during static postures in the absence of NMES. The control signals (stimulation intensities and on/off state of the exoskeleton brakes) are generated by a real-time controller that receives commands from the Central Controller (CC), which is implemented as a finite state machine. The central controller instructs the real-time controller to move the hand to a given target position in the reachable workspace. Sensors integrated in the exoskeleton measure joint angles that are used as feedback variables by the real-time controller. The NMES control algorithm sequentially controls each joint angle while locking the other DoFs.

The user interacts with the system by means of an eyetracker. For this purpose, a commercial system, the Tobii T60W system (Tobii Technology AB, Sweden), has been extended by a specific GUI for the MUNDUS application. The table-mounted eyetracker is integrated into a 17-inch TFT monitor. During tracking, the Tobii T60 uses infrared diodes to generate reflection patterns on the corneas of the user's eyes. Image processing is used to identify the gaze point on the screen. The three-dimensional positions of the user's hand, of the objects to be manipulated, and of the mouth are continuously monitored by environmental sensors, i.e., two Kinect cameras (Microsoft Corp., Redmond, USA). To this end, colored markers are attached to the hand and the objects. The first Kinect camera provides an image of the working space to the eye-tracking screen. To start an interaction with a specific object, the user has to visually fixate this object on the eyetracker screen for a pre-defined time duration. Once an object is selected, the corresponding Kinect coordinates are sent to the CC, which transforms these coordinates into the global (exoskeleton) 3D coordinate system. The transformed coordinates are then used by the real-time controller for movement generation. The second Kinect camera is placed in front of the user and is used to track the face position.

The fixation detection algorithm has been developed specifically for the MUNDUS application. It comprises user-dependent temporal and spatial threshold settings. The temporal threshold is the time during which the user has to continuously fixate an object or an icon on the screen to select the gazed point. The spatial threshold is the area around the barycenter of the cluster of gaze samples inside which each sample has to lie for a fixation to be detected. To prevent unwanted fixation detections, a confirmation icon is shown on the eyetracking screen after a fixation event is detected, and the user is asked to confirm or cancel the selection. Moreover, the working space where the user can select the object/action to interact with is shown only when the user him/herself has selected the START icon from the standby interface that is provided by the eyetracking screen when MUNDUS is waiting for user interaction.
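The two thresholds described above can be illustrated with a minimal Python sketch. It is not the MUNDUS implementation: the `GazeSample` type, the function name `detect_fixation`, and the pixel and second units are assumptions made for illustration only. The sketch reports a fixation when the most recent gaze samples span at least the dwell time and all lie within a given radius of their barycenter.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class GazeSample:
    t: float  # timestamp in seconds (assumed unit)
    x: float  # horizontal gaze position on the screen in pixels (assumed unit)
    y: float  # vertical gaze position on the screen in pixels (assumed unit)


def detect_fixation(samples: List[GazeSample],
                    dwell_time: float,
                    radius: float) -> Optional[Tuple[float, float]]:
    """Return the barycenter of the current fixation, or None.

    A fixation is reported when the recording covers at least `dwell_time`
    seconds and every sample of the trailing `dwell_time` window lies within
    `radius` of the barycenter of that window.
    """
    if not samples:
        return None
    t_end = samples[-1].t
    # Temporal threshold: the recording must cover the full dwell time.
    if t_end - samples[0].t < dwell_time:
        return None
    # Trailing window covering the last dwell_time seconds.
    window = [s for s in samples if t_end - s.t <= dwell_time]
    cx = sum(s.x for s in window) / len(window)
    cy = sum(s.y for s in window) / len(window)
    # Spatial threshold: every sample must lie close to the barycenter.
    for s in window:
        if (s.x - cx) ** 2 + (s.y - cy) ** 2 > radius ** 2:
            return None
    return (cx, cy)
```

In such a scheme, both `dwell_time` and `radius` would be tuned per user, which matches the user-dependent threshold settings mentioned in the text.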

Special parts of the eye-tracker screen are dedicated to other available tasks (e.g., activating emergency switch off, touching spots of the body). The emergency icon is always displayed in the top-left corner of the screen, and it is continuously selectable to allow the user to stop MUNDUS. If the emergency icon is fixated, a message is sent by the eye-tracker that stops all MUNDUS components. To trigger sub-actions, specific questions are displayed on the screen and the user can reply by fixating a GO or a STOP icon.

The central controller interfaces all modules and interacts with the eyetracker and the real-time controller. For the purpose of system integration, the software components of the CC and the eyetracker module have been integrated in one single MS Windows-based PC. The real-time controller and the data processing of the environmental sensor module are based on a computer system running Linux with RTAI extension<sup>1</sup>. Development and testing of the control system is performed in Scilab/Scicos 4.1.2<sup>2</sup> using the real-time framework OpenRTDynamics<sup>3</sup>. The communication between all modules is established via UDP, and messages are broadcast in XML format.
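As an illustration of UDP messaging with XML payloads, the following Python sketch serializes a command, parses it back, and sends it as a broadcast datagram. The message schema (element and attribute names), the command name used in the usage note, and the port are purely hypothetical; the actual MUNDUS message format is not specified here.

```python
import socket
import xml.etree.ElementTree as ET
from typing import Dict, Tuple


def build_command(name: str, **params: float) -> bytes:
    """Serialize a command as a small XML document (hypothetical schema)."""
    root = ET.Element("message", {"type": "command", "name": name})
    for key, value in params.items():
        ET.SubElement(root, "param", {"key": key, "value": repr(float(value))})
    return ET.tostring(root, encoding="utf-8")


def parse_command(payload: bytes) -> Tuple[str, Dict[str, float]]:
    """Inverse of build_command: return the command name and its parameters."""
    root = ET.fromstring(payload)
    params = {p.get("key"): float(p.get("value")) for p in root.findall("param")}
    return root.get("name"), params


def broadcast(payload: bytes, port: int, address: str = "255.255.255.255") -> None:
    """Send one UDP datagram to a broadcast address (port is an assumption)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(payload, (address, port))
```

A sender would call, for example, `broadcast(build_command("move_to", x=0.3, y=-0.1, z=0.25), port=5005)`, and each receiving module would apply `parse_command` to the datagrams it is interested in.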

#### **2.2. EXOSKELETON**

As a basis for the exoskeleton design, the previously mentioned target motions were analyzed using a motion capture system (Lukotronic, Lutz Mechatronic Technology e.U, Austria) to estimate the required ranges of motion and expected loads at the joints (Karner et al., 2012; Reichenfelser et al., 2013). The 3D mechanical design was done in Catia V5R19 (Dassault Systèmes, France), focusing on modularity, simplicity and light weight. The developed exoskeleton with gravity compensation is shown in **Figure 2A**. The available degrees of freedom (DoFs) of the exoskeleton are:

- shoulder horizontal rotation,
- shoulder flexion/extension (elevation),
- elbow flexion/extension.


The rotation of the forearm around the upper arm axis (humeral rotation) and pronation/supination of the forearm are locked by the exoskeleton, as these DoFs are difficult to control by NMES using surface electrodes. Due to the reduced DoFs, the orientation of the hand is not freely adjustable in the workspace. Thus, to allow a safe handling of objects despite this constraint, special objects with a universal joint in the handle have been developed (e.g., the cup holder shown in **Figure 2B**).

The exoskeleton is equipped with magnetic encoders (Vert-X, Contelec AG, Switzerland) to measure the angles for all three DoFs. Electromagnetic DC brakes (Kendrion, Germany) can lock the shoulder horizontal rotation with a torque of 2.5 Nm, the shoulder flexion/extension with up to 5 Nm and the elbow flexion/extension with 1.5 Nm to hold the arm in any posture when the stimulation is switched off.

To realize gravity compensation, a pressure spring is integrated in a vertical carbon tube that can be either mounted on a wheelchair as shown in **Figure 2** or alternatively attached to a body harness for mobile use. The spring force is transferred to the elevation lever by a rope and pulley mechanism. **Figure 3** depicts an isometric view of the shoulder joint mechanism and shows the occurring torques as a function of shoulder elevation angle. A slight under-compensation (spring torque smaller than gravity torque) is intended, so that the arm moves downwards slowly under gravity when the stimulation and the brakes are turned off. The amount of compensation is adjusted manually by changing the wind-up length of the rope at the spring adjustment module. A linear guide provides the connection between the elevation lever and the upper arm shell and compensates for misalignment between the anatomical and the mechanical shoulder joint. This also minimizes the reaction forces. For the elbow-joint, an elastic band with a variable attachment point acts as weight support.

The exoskeleton has a total weight of 2.2 kg and can be quickly adjusted to different anthropometric dimensions.

#### **2.3. NEURO-MUSCULAR ELECTRICAL STIMULATION**

The desired arm movements are induced by four stimulation channels activating the anterior, posterior and medial deltoid as well as the biceps muscle (cf. **Table 1**). By stimulating the medial deltoid, the shoulder extension can be actuated, while the anterior and posterior deltoid allow arm rotation in the horizontal plane. Stimulation of the biceps is used to flex the elbow-joint. Shoulder flexion as well as elbow extension are induced by gravitational forces.

One pair of self-adhesive hydrogel electrodes (oval shaped with size 4 × 6.4 cm) is used for each stimulated muscle. For the generation of the biphasic stimulation pulses, the current-controlled stimulator RehaStim Pro (HASOMED GmbH, Germany) is used.

<sup>1</sup>http://www.rtai.org

<sup>2</sup>http://www.scilab.org

<sup>3</sup>http://openrtdynamics.sourceforge.net/

#### **Table 1 | Stimulation channels.**


The stimulation frequency for all channels is fixed at 25 Hz, while the individual current amplitudes and pulse widths can be adjusted in real-time using the open ScienceMode protocol<sup>4</sup> through a galvanically isolated USB interface.

The stimulation intensity in terms of pulse charge *ν<sub>i</sub>* serves as the control signal for muscle *i*. **Table 1** shows the control signal notation used. The pulse charge *ν<sub>i</sub>* of muscle *i* is defined as the product of the current amplitude *I<sub>i</sub>* and the pulse width *pw<sub>i</sub>*. In this application, a given charge is distributed equally between pulse width and current amplitude (normalized to their maximal values) as follows:

$$\mathit{pw}\_{i} = \sqrt{\frac{\nu\_{i}\, \mathit{pw}\_{\text{max}}}{I\_{\text{max}}}}, \quad I\_{i} = \sqrt{\frac{\nu\_{i}\, I\_{\text{max}}}{\mathit{pw}\_{\text{max}}}}, \quad 0 \le \nu\_{i} \le I\_{\text{max}} \,\mathit{pw}\_{\text{max}},$$

where *pw*<sub>max</sub> = 500 μs and *I*<sub>max</sub> = 127 mA are the maximal values of pulse width and current amplitude, respectively.
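The charge distribution rule can be checked numerically with a short Python sketch. The function name `split_charge` and the unit choice (charge in mA·μs) are assumptions for illustration; the maximal values are those given in the text.

```python
import math
from typing import Tuple

PW_MAX_US = 500.0  # maximal pulse width in microseconds (from the text)
I_MAX_MA = 127.0   # maximal current amplitude in mA (from the text)


def split_charge(nu: float) -> Tuple[float, float]:
    """Split a pulse charge nu (in mA*us) into (pulse width in us, current in mA).

    Both quantities use the same fraction of their maximal value, and their
    product equals the requested charge.
    """
    nu_max = I_MAX_MA * PW_MAX_US
    if not 0.0 <= nu <= nu_max:
        raise ValueError("pulse charge outside [0, I_max * pw_max]")
    pulse_width = math.sqrt(nu * PW_MAX_US / I_MAX_MA)
    current = math.sqrt(nu * I_MAX_MA / PW_MAX_US)
    return pulse_width, current
```

For example, a quarter of the maximal charge (15875 mA·μs) yields half the maximal pulse width (250 μs) and half the maximal current (63.5 mA), since both normalized quantities equal the square root of the normalized charge.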

In a calibration phase that is always performed before using the MUNDUS system, the maximal tolerated pulse charge *ν̄<sub>i</sub>* of each muscle *i* is determined. Additionally, for the medial deltoid, the stimulation intensity *ν̲<sub>d,m</sub>* that causes the onset of a visible muscle contraction is determined. This value is required for the implementation of the more complex shoulder flexion/extension controller described in Section 2.5.2.

#### **2.4. KINEMATIC MODEL AND COORDINATE TRANSFORMATIONS**

To calculate the hand position from a given set of joint angles or vice versa, a kinematic model of the exoskeleton is required. In addition, a transformation from the Kinect coordinate system to the global (exoskeleton) coordinate system must be determined for the following reason: Objects to interact with may be arbitrarily located on the table in front of the user. The Kinect is required to determine the object position in the local Kinect coordinate system. In order to bring the hand to objects by NMES, the Kinect coordinates must be mapped into exoskeleton 3D coordinates and corresponding exoskeleton angles. The latter are used to describe the hand position in the real-time arm controller.

It is assumed that the placement of the Kinect as well as the settings of the exoskeleton may change from day to day. Therefore, the parameters need to be determined by a simple and fast experimental system identification procedure.

**Figure 4** shows the simplified kinematic exoskeleton/arm model with the global (exoskeleton) coordinate system (*x<sup>g</sup>*, *y<sup>g</sup>*, *z<sup>g</sup>*) and the Kinect coordinate system (*x<sup>k</sup>*, *y<sup>k</sup>*, *z<sup>k</sup>*). Both are Cartesian coordinate systems. Depicted is the right arm reaching forward. The model assumes that the exoskeleton is completely rigid and that the arm is perfectly aligned to the exoskeleton.

<sup>4</sup>http://sciencestim.sf.net

The forward kinematics is given by

$$\mathbf{p}\_{\mathbf{h}}^{\mathbf{g}}(\vartheta\_{u},\varphi\_{u},\vartheta\_{f}) = -(l\_{u}\mathbf{R}(\vartheta\_{u},\varphi\_{u}) + l\_{f}\mathbf{R}(\vartheta\_{u},\varphi\_{u})\mathbf{R}(\vartheta\_{f},\varphi\_{f}))\mathbf{e}\_{\mathbf{z}}.\tag{1}$$

where *p*<sub>h</sub><sup>g</sup> is the hand position in global coordinates, *e<sub>z</sub>* = [0, 0, 1]<sup>T</sup> is a unit vector, and *l<sub>f</sub>* and *l<sub>u</sub>* are the lengths of the forearm and upper arm, respectively. The rotation matrix *R* is defined as follows:

$$\mathbf{R}(\vartheta,\varphi) := \begin{bmatrix} \cos\varphi\cos\vartheta & -\sin\varphi & -\sin\vartheta\cos\varphi \\ \cos\vartheta\sin\varphi & \cos\varphi & -\sin\varphi\sin\vartheta \\ \sin\vartheta & 0 & \cos\vartheta \end{bmatrix} . \tag{2}$$

In the setup used, the humeral rotation angle *ϕ<sub>f</sub>* of the shoulder is constant, as it represents a fixed DoF, and its value is determined by the configuration of the exoskeleton.

Equation (1) can be used to determine the hand position for a given set of exoskeleton angles. The *inverse kinematics* can be obtained by numerically solving Equation (1) to determine the angles *ϑ<sub>u</sub>*, *ϕ<sub>u</sub>*, and *ϑ<sub>f</sub>* for a given hand position *p*<sub>h</sub><sup>g</sup> within the reachable workspace and angle *ϕ<sub>f</sub>*. The solution is unique as the humeral shoulder rotation angle *ϕ<sub>f</sub>* is fixed, and the operational space for *ϑ<sub>f</sub>* is limited by the mechanical constraints to [0, π].
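The forward kinematics of Equations (1) and (2) can be written down directly. The following Python sketch uses plain lists instead of a matrix library. The function names and the segment lengths in the usage note are illustrative assumptions. The inverse kinematics is not shown; it would apply a numerical root finder to `hand_position`.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def rot(theta: float, phi: float) -> Matrix:
    """Rotation matrix R(theta, phi) of Equation (2)."""
    ct, st = math.cos(theta), math.sin(theta)
    cp, sp = math.cos(phi), math.sin(phi)
    return [[cp * ct, -sp, -st * cp],
            [ct * sp, cp, -sp * st],
            [st, 0.0, ct]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Product of a 3x3 matrix and a 3-vector."""
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]


def hand_position(theta_u: float, phi_u: float, theta_f: float,
                  phi_f: float, l_u: float, l_f: float) -> Vector:
    """Hand position in global coordinates according to Equation (1)."""
    e_z = [0.0, 0.0, 1.0]
    r_u = rot(theta_u, phi_u)                  # upper arm orientation
    r_f = matmul(r_u, rot(theta_f, phi_f))     # forearm orientation
    upper = matvec(r_u, e_z)
    fore = matvec(r_f, e_z)
    return [-(l_u * u + l_f * f) for u, f in zip(upper, fore)]
```

With all angles at zero and assumed lengths of 0.30 m and 0.25 m, the arm hangs straight down and the hand is at (0, 0, -0.55). Setting only the elbow angle to π/2 moves the hand to (0.25, 0, -0.30).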

The transformation from Kinect coordinates to global coordinates is visualized in **Figure 4** and can be written as

$$\mathbf{p}^{\mathbb{g}} = \mathbf{R}\_{\mathbf{k}}(\boldsymbol{\phi}, \boldsymbol{\theta}, \boldsymbol{\psi})\mathbf{p}^{\mathbb{k}} + \mathbf{t}\_{\mathbf{k}} \tag{3}$$

where *p<sup>g</sup>* = [*x<sup>g</sup>*, *y<sup>g</sup>*, *z<sup>g</sup>*]<sup>T</sup>, *p<sup>k</sup>* = [*x<sup>k</sup>*, *y<sup>k</sup>*, *z<sup>k</sup>*]<sup>T</sup>, *t<sub>k</sub>* ∈ ℝ<sup>3×1</sup> is a translation vector, and *R<sub>k</sub>* ∈ ℝ<sup>3×3</sup> is a rotation matrix parameterized by the Euler angles *φ*, *θ*, and *ψ*.
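Equation (3) is a rigid-body transformation. The Python sketch below implements it under one explicit assumption: the Euler angle convention is not specified in the text, so the common Z-Y-X order, with the rotation matrix built as Rz(ψ)·Ry(θ)·Rx(φ), is used here for illustration only. The function names are likewise hypothetical.

```python
import math
from typing import List, Sequence


def euler_to_matrix(phi: float, theta: float, psi: float) -> List[List[float]]:
    """Rotation matrix Rz(psi) * Ry(theta) * Rx(phi) (assumed convention)."""
    cf, sf = math.cos(phi), math.sin(phi)
    ct, st = math.cos(theta), math.sin(theta)
    cs, ss = math.cos(psi), math.sin(psi)
    return [[cs * ct, cs * st * sf - ss * cf, cs * st * cf + ss * sf],
            [ss * ct, ss * st * sf + cs * cf, ss * st * cf - cs * sf],
            [-st, ct * sf, ct * cf]]


def kinect_to_global(p_k: Sequence[float], phi: float, theta: float,
                     psi: float, t_k: Sequence[float]) -> List[float]:
    """Map a point from Kinect to global coordinates: p_g = R_k p_k + t_k."""
    r_k = euler_to_matrix(phi, theta, psi)
    return [sum(r_k[i][j] * p_k[j] for j in range(3)) + t_k[i]
            for i in range(3)]
```

Whatever convention is chosen, it has to be the same in the calibration and in the online transformation, because the identified angles are only meaningful with respect to that convention.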

#### *2.4.1. Parameter identification*

The parameters *φ*, *θ*, *ψ*, and *t<sub>k</sub>* of the coordinate transformation as well as the kinematic model parameters *l<sub>u</sub>*, *l<sub>f</sub>*, and *ϕ<sub>f</sub>* are unknown and have to be calibrated for each user each time the system is set up. Therefore, a system identification procedure is applied to determine the nine parameters. During the calibration phase, the arm and the attached unlocked exoskeleton are manually placed by a third person (e.g., the caregiver) at *N* different positions in the reachable workspace that can be reached with the arm attached to the exoskeleton. Since nine parameters need to be identified, *N* ≥ 9 positions must be visited. The reachable workspace is at first defined by the forward kinematics of the exoskeleton. However, this space may be further limited by insufficient NMES-induced muscle force.

For each hand position *i*, the corresponding joint angles (*ϑ<sub>u,i</sub>*, *ϕ<sub>u,i</sub>*, *ϑ<sub>f,i</sub>*) are measured together with the hand position vector

$$\mathbf{p}\_{h,i}^{k} = \begin{bmatrix} x\_{h,i}^{k} & y\_{h,i}^{k} & z\_{h,i}^{k} \end{bmatrix}^{T},\tag{4}$$

which is recorded by the environmental sensor in the Kinect coordinate frame.

The unknown parameter vector *Θ* = [*l<sub>u</sub>*, *l<sub>f</sub>*, *ϕ<sub>f</sub>*, *φ*, *θ*, *ψ*, *t<sub>k</sub>*<sup>T</sup>]<sup>T</sup> is estimated by minimizing a quadratic cost function

$$\hat{\boldsymbol{\Theta}} = \arg\min\_{\boldsymbol{\Theta}} \left( \frac{1}{2} \sum\_{i=1}^{N} \mathbf{e}\_i^T \mathbf{e}\_i \right) \tag{5}$$

where

$$\mathbf{e}\_{i} := \underbrace{-\left(l\_{u}\mathbf{R}(\vartheta\_{u,i},\varphi\_{u,i}) + l\_{f}\mathbf{R}(\vartheta\_{u,i},\varphi\_{u,i})\mathbf{R}(\vartheta\_{f,i},\varphi\_{f})\right)\mathbf{e}\_{\mathbf{z}}}\_{\mathbf{p}^{g}\_{h,i,\text{FK}}} - \underbrace{\left(\mathbf{R}\_{\mathbf{k}}(\phi,\theta,\psi)\,\mathbf{p}^{k}\_{h,i} + \mathbf{t}\_{\mathbf{k}}\right)}\_{\mathbf{p}^{g}\_{h,i,\text{Kinect}}}\tag{6}$$

is the error between the hand position *p*<sup>g</sup><sub>h,i,FK</sub>, obtained by the forward kinematic model (1), and the hand position *p*<sup>g</sup><sub>h,i,Kinect</sub>, obtained from the transformed Kinect measurements, both in global coordinates. The minimization of the cost function is achieved by the Gauss-Newton method with analytically calculated gradients.
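To make the calibration criterion concrete, the following Python sketch evaluates the residual of Equation (6) and a sum-of-squares cost in the sense of Equation (5) for a set of calibration postures. It only evaluates the cost. The Gauss-Newton minimization named in the text is not implemented. The Z-Y-X Euler convention, the parameter ordering, and all function names are assumptions for illustration.

```python
import math
from typing import List, Sequence, Tuple

Vec = List[float]
Mat = List[List[float]]


def rot(theta: float, phi: float) -> Mat:
    """R(theta, phi) as in Equation (2)."""
    ct, st, cp, sp = math.cos(theta), math.sin(theta), math.cos(phi), math.sin(phi)
    return [[cp * ct, -sp, -st * cp], [ct * sp, cp, -sp * st], [st, 0.0, ct]]


def euler_zyx(phi: float, theta: float, psi: float) -> Mat:
    """Assumed Euler convention: R_k = Rz(psi) * Ry(theta) * Rx(phi)."""
    cf, sf = math.cos(phi), math.sin(phi)
    ct, st = math.cos(theta), math.sin(theta)
    cs, ss = math.cos(psi), math.sin(psi)
    return [[cs * ct, cs * st * sf - ss * cf, cs * st * cf + ss * sf],
            [ss * ct, ss * st * sf + cs * cf, ss * st * cf - cs * sf],
            [-st, ct * sf, ct * cf]]


def matmul(a: Mat, b: Mat) -> Mat:
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def matvec(m: Mat, v: Sequence[float]) -> Vec:
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]


def hand_fk(angles: Sequence[float], l_u: float, l_f: float, phi_f: float) -> Vec:
    """Forward kinematics of Equation (1); angles = (theta_u, phi_u, theta_f)."""
    theta_u, phi_u, theta_f = angles
    e_z = [0.0, 0.0, 1.0]
    r_u = rot(theta_u, phi_u)
    upper = matvec(r_u, e_z)
    fore = matvec(matmul(r_u, rot(theta_f, phi_f)), e_z)
    return [-(l_u * u + l_f * f) for u, f in zip(upper, fore)]


def residual(params: Sequence[float], angles: Sequence[float],
             p_kinect: Sequence[float]) -> Vec:
    """Error e_i of Equation (6) for one calibration posture."""
    l_u, l_f, phi_f, phi, theta, psi, tx, ty, tz = params
    p_fk = hand_fk(angles, l_u, l_f, phi_f)
    rotated = matvec(euler_zyx(phi, theta, psi), p_kinect)
    p_kin = [r + t for r, t in zip(rotated, (tx, ty, tz))]
    return [a - b for a, b in zip(p_fk, p_kin)]


def cost(params: Sequence[float],
         measurements: Sequence[Tuple[Sequence[float], Sequence[float]]]) -> float:
    """Half the sum of squared residual components over all postures."""
    return 0.5 * sum(c * c for ang, p_k in measurements
                     for c in residual(params, ang, p_k))
```

A Gauss-Newton or other least-squares solver would repeatedly call `residual` for all *N* postures and update the nine parameters until `cost` stops decreasing.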

#### **2.5. CONTROL SYSTEM**

All NMES generated arm movements are initiated by commands received from the high level control system, the Central Controller (CC), which processes, among others, the information collected by the eye-tracker. The CC movement commands are:


Each command emits an event causing a state transition in a finite state-machine on the real-time control system, which then performs the actual movement.

Based on the elementary movement commands outlined above, complex movement sequences are possible by a combination of multiple commands issued in series. An example for the drinking use case is outlined in **Figure 5**.

In this study, the hand movements were performed voluntarily by the subject. In the complete MUNDUS system, two alternative solutions to support hand functions have been proposed: a hand neuroprosthesis and a robotic hand orthosis (Pedrocchi et al., 2013). The hand neuroprosthesis deploys a new stimulation system for array electrodes (Valtin et al., 2012) in order to produce precise finger movements. However, the description of these hand modules is outside the scope of this study.

It should be noted that the straight lines shown in the center of **Figure 5** do not represent the actual trajectories of the hand. The actual generation of a movement between two points by the real-time controller will be described in the next section.

## *2.5.1. Sequential real-time control strategy*

The real-time control system internally controls the angles of the exoskeleton. Therefore, whenever a command is issued by the CC, new angular references are determined by the real-time control system. If required, this calculation also involves stored angular references from the last movement and the inverse exoskeleton kinematics. The resulting reference angles of the *j*th command are *r*<sup>j</sup><sub>ϑ<sub>u</sub></sub>, *r*<sup>j</sup><sub>ϕ<sub>u</sub></sub>, and *r*<sup>j</sup><sub>ϑ<sub>f</sub></sub> for the shoulder ab-/adduction, the horizontal shoulder rotation, and the elbow flexion/extension, respectively.

Sequential feedback control is used to adjust the stimulation intensities (pulse charges) in order to drive the hand to desired positions in the reachable workspace. Each DoF is controlled separately, one after the other, while all other DoFs are locked by the exoskeleton brakes. This results in a fully decoupled system with regard to crosstalk between the DoFs. For this reason, a simple model with few parameters can be used for each controller design, which dramatically reduces the effort for parameter identification. Each movement to a given 3D position is divided into three consecutive steps:

**FIGURE 5 | The state automaton inside the MUNDUS Central Controller (CC) to realize the drinking use case starting from an arm rest position and returning to this position again.** The states (S3, S5, S7, S9, S10, S12, S14, S15) with arm movements trigger a state machine inside the real-time arm NMES control module (cf. **Figure 6**). The references for the rest position as well as for the mouth position may be stored in the MUNDUS CC as angular references during the system calibration phase. The object position is determined online by the Kinect system by tracking a green marker on the object handle.


The real-time arm NMES controller is a hybrid control system combining a state automaton and continuous-time feedback controllers to reach the desired angle for each DoF in turn (cf. **Figure 6**).
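The sequential strategy can be sketched as a small state automaton in Python. This is a toy illustration under several assumptions: the order in which the DoFs are visited, the `ToyArm` plant (a joint that simply integrates the command), and the proportional update standing in for the actual feedback controllers are all placeholders and are not taken from the MUNDUS system.

```python
from typing import Dict, List

# Assumed visiting order of the three DoFs (illustrative only).
DOF_ORDER = ["shoulder_rotation", "shoulder_elevation", "elbow_flexion"]


class ToyArm:
    """Stand-in for exoskeleton plus stimulated arm: locked joints cannot move."""

    def __init__(self) -> None:
        self.angle: Dict[str, float] = {d: 0.0 for d in DOF_ORDER}
        self.locked: Dict[str, bool] = {d: True for d in DOF_ORDER}

    def step(self, dof: str, command: float) -> None:
        # A braked joint ignores the command; a free joint integrates it.
        if not self.locked[dof]:
            self.angle[dof] += command


def move_sequentially(arm: ToyArm, references: Dict[str, float],
                      gain: float = 0.5, tol: float = 1e-3,
                      max_steps: int = 200) -> List[str]:
    """Drive each DoF to its reference, one at a time; return the visiting order."""
    visited: List[str] = []
    for dof in DOF_ORDER:
        # Release only the active joint, keep all others braked.
        arm.locked = {d: d != dof for d in DOF_ORDER}
        for _ in range(max_steps):
            error = references[dof] - arm.angle[dof]
            if abs(error) < tol:
                break
            arm.step(dof, gain * error)  # placeholder for the feedback controller
        arm.locked[dof] = True  # brake holds the reached angle without stimulation
        visited.append(dof)
    return visited
```

Because only one joint is free at any time, each inner loop sees a single-input single-output system, which is the decoupling property the text relies on to keep the controller models simple.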

#### *2.5.2. Shoulder flexion/extension control*

For the shoulder flexion/extension, a discrete-time controller based on an identified pulse transfer-function model is employed. The control design uses the well-known pole-placement method in polynomial form (Astrom and Wittenmark, 1996). For the *j*th activation of the controller, the relation between the stimulation intensity *ν*<sup>j</sup><sub>d,m</sub> of the medial deltoid and the shoulder elevation angle *ϑ*<sup>j</sup><sub>u</sub> can be approximately described by a second order *autoregressive with exogenous input* (ARX) model (Ljung, 1999) of the form

$$\vartheta\_{\boldsymbol{u}}^{\boldsymbol{j}}(\boldsymbol{k}) = \frac{\mathsf{B}(\boldsymbol{q})}{\mathsf{A}(\boldsymbol{q})} \boldsymbol{\nu}\_{\boldsymbol{d},\boldsymbol{m}}^{\boldsymbol{j}}(\boldsymbol{k}) + \frac{\boldsymbol{q}^{2}}{\mathsf{A}(\boldsymbol{q})} \boldsymbol{e}^{\boldsymbol{j}}(\boldsymbol{k}),$$

$$\underline{\boldsymbol{\nu}}\_{\boldsymbol{d},\boldsymbol{m}} \le \boldsymbol{\nu}\_{\boldsymbol{d},\boldsymbol{m}}^{\boldsymbol{j}}(\boldsymbol{k}) \le \overline{\boldsymbol{\nu}}\_{\boldsymbol{d},\boldsymbol{m}}, \boldsymbol{k} \ge \boldsymbol{0},\tag{7}$$

where *k* is the sample index, *e<sup>j</sup>* (*k*) represents white noise, and

$$\begin{aligned} \mathbf{B}(q) &= b\_0, \\ \mathbf{A}(q) &= (q^2 + a\_1 q + a\_2) q^4 \end{aligned}$$

are polynomials of the forward-shift operator *q* (*qs*(*k*) = *s*(*k* + 1)). This model possesses an input-output time delay of six sampling instants, which is typically observed in the recorded I/O data. The sampling frequency is 25 Hz and equals the stimulation frequency. During the system calibration, the coefficients of the polynomials are estimated from a recorded input step response (changing $\nu_{d,m}$ from $\underline{\nu}_{d,m} + 0.2(\overline{\nu}_{d,m} - \underline{\nu}_{d,m})$ to $\underline{\nu}_{d,m} + 0.8(\overline{\nu}_{d,m} - \underline{\nu}_{d,m})$) using the instrumental variable method (Ljung, 1999).
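As an illustration of the model structure, the noise-free part of the ARX model can be written as the difference equation $\vartheta_u(k) = -a_1\vartheta_u(k-1) - a_2\vartheta_u(k-2) + b_0\nu_{d,m}(k-6)$, which follows from applying $\mathsf{A}(q)$ and $\mathsf{B}(q)$ as shift operators. The following minimal Python sketch simulates this recursion for a step input. The coefficient values and the function name `simulate_arx` are hypothetical and chosen only for illustration; they are not identified values from the study.

```python
import numpy as np

DELAY = 6  # input-output delay in samples implied by A(q) = (q^2 + a1 q + a2) q^4, B(q) = b0


def simulate_arx(nu, a1, a2, b0):
    """Noise-free response theta(k) = -a1*theta(k-1) - a2*theta(k-2) + b0*nu(k-6).

    Zero initial conditions are assumed; nu and theta are deviations from an
    operating point.
    """
    nu = np.asarray(nu, dtype=float)
    theta = np.zeros_like(nu)
    for k in range(len(nu)):
        th1 = theta[k - 1] if k >= 1 else 0.0
        th2 = theta[k - 2] if k >= 2 else 0.0
        u = nu[k - DELAY] if k >= DELAY else 0.0
        theta[k] = -a1 * th1 - a2 * th2 + b0 * u
    return theta


# Hypothetical coefficients (double pole at q = 0.8), not taken from the paper.
a1, a2, b0 = -1.6, 0.64, 0.02
fs = 25                     # sampling frequency in Hz, as stated in the text
step = np.ones(10 * fs)     # unit step in the (normalized) stimulation intensity
theta = simulate_arx(step, a1, a2, b0)
dc_gain = b0 / (1.0 + a1 + a2)  # steady-state gain B(1)/A(1)
```

The first six output samples stay at zero, which reflects the six-sample delay, and the response settles at the steady-state gain $\mathsf{B}(1)/\mathsf{A}(1)$.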

Based on the obtained model, a polynomial controller of the form

$$\nu_{d,m}^{j}(k) = \frac{\mathsf{S}(q)}{\overline{\mathsf{R}}(q)(1-q)} \left( \frac{\mathsf{T}(q)}{\mathsf{S}(q)} r_{\vartheta_{u}}^{j} - \vartheta_{u}^{j}(k) \right) \tag{8}$$

is designed with the controller polynomials $\overline{\mathsf{R}}(q)$, $\mathsf{S}(q)$, and $\mathsf{T}(q)$. **Figure 7** shows the corresponding closed-loop system. The controller has integral action [factor (1 − *q*) in (8)]. This enables the rejection of constant and slowly varying disturbances and compensates for the effects of muscular fatigue. The coefficients of the controller polynomials $\overline{\mathsf{R}}(q)$ and $\mathsf{S}(q)$ are chosen to obtain a desired characteristic polynomial

$$\mathsf{A}\_{cl}(q) = (1-q)\overline{\mathsf{R}}(q)\mathsf{A}(q) + \mathsf{S}(q)\mathsf{B}(q) \tag{9}$$

the roots of which are equal to the closed-loop system poles and should be stable and well damped. For the given system and controller with integrator, the minimal degree controller is given by $\deg(\mathsf{S}) = 6$, $\deg(\overline{\mathsf{R}}) = 5$ and $\deg(\mathsf{A}_{cl}) = 12$. A common approach is to factorize $\mathsf{A}_{cl}(q)$ as follows:

$$\mathsf{A}\_{cl}(q) = \mathsf{A}\_{cl,1}(q)\mathsf{A}\_{cl,2}(q)q^8 \tag{10}$$

where $\mathsf{A}_{cl,1}(q)$ and $\mathsf{A}_{cl,2}(q)$ are second order polynomials specified via rise-time $t_{r,i}$ and damping factor $D_i$ ($i = 1, 2$) of corresponding continuous-time second order systems. Eight of the twelve closed-loop poles are located at the origin (fastest possible mode in discrete-time). The pre-filter polynomial is set to

$$\mathsf{T}(q) = \mathsf{A}_{cl,2}(q)\, q^4\, \mathsf{A}_{cl,1}(1) / \mathsf{B}(1). \tag{11}$$

This yields a unity DC gain from the reference input $r_{\vartheta_u}^{j}$ to the system output $\vartheta_u^{j}$. Furthermore, it cancels six closed-loop poles defined by $\mathsf{A}_{cl,2}(q)q^4$. The resulting transfer function of the closed-loop system is then:

$$\frac{\vartheta\_u^j(k)}{r\_{\vartheta\_u}^j(k)} = \frac{\mathsf{T}(q)\mathsf{B}(q)}{\mathsf{A}\_{cl}(q)} = \frac{\mathsf{A}\_{cl,1}(1)\mathsf{B}(q)}{q^4\mathsf{A}\_{cl,1}(q)\mathsf{B}(1)}.\tag{12}$$

As a result, only the poles defined by the roots of $q^4\mathsf{A}_{cl,1}(q)$ influence the system dynamics with respect to changes in the reference signal. The disturbance rejection and noise properties of the closed-loop system, however, depend on all closed-loop poles defined by Equation (10). First, the rise-time and damping factor for $\mathsf{A}_{cl,1}$ are selected to obtain a desired reference tracking behavior. Then the rise-time and damping factor of $\mathsf{A}_{cl,2}$ are iteratively tuned to yield satisfactory noise sensitivity and disturbance rejection (verified by frequency response plots of the sensitivity and the complementary sensitivity function). For all subjects of this study, we have chosen $t_{r,1} = 0.6$ s, $t_{r,2} = 0.5$ s and a damping factor $D_i = 0.999$ for both polynomials.
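The design steps in Equations (9) to (11) can be sketched numerically. The Python sketch below builds the desired characteristic polynomial, solves the polynomial (Diophantine) equation (9) as a linear system in the controller coefficients, and computes the pre-filter (11). Several points are assumptions and not taken from the text: the plant coefficients are hypothetical, the rule $\omega_0 \approx 3.36/t_r$ for converting a rise time into a natural frequency is a common rule of thumb for near-critically damped second-order systems, and the function names are illustrative only. The factor $(1-q)$ follows the sign convention of Equation (8).

```python
import numpy as np

TS = 1.0 / 25.0  # sampling period for the 25 Hz rate stated in the text


def second_order_poly(t_rise, damping, ts=TS):
    """Discrete characteristic polynomial of a sampled continuous second-order system."""
    w0 = 3.36 / t_rise  # ASSUMPTION: rise-time rule of thumb, not given in the paper
    sigma = damping * w0
    wd = w0 * np.sqrt(1.0 - damping ** 2)
    return np.array([1.0,
                     -2.0 * np.exp(-sigma * ts) * np.cos(wd * ts),
                     np.exp(-2.0 * sigma * ts)])


def design(a1, a2, b0, tr1=0.6, tr2=0.5, damping=0.999):
    """Solve (1 - q) Rbar(q) A(q) + S(q) B(q) = Acl(q) for Rbar (deg 5) and S (deg 6)."""
    A = np.convolve([1.0, a1, a2], [1.0, 0.0, 0.0, 0.0, 0.0])   # degree 6
    Acl1 = second_order_poly(tr1, damping)
    Acl2 = second_order_poly(tr2, damping)
    q8 = np.zeros(9)
    q8[0] = 1.0
    Acl = np.convolve(np.convolve(Acl1, Acl2), q8)              # degree 12
    D = np.convolve([-1.0, 1.0], A)                             # (1 - q) A(q), degree 7
    M = np.zeros((13, 13))
    for i in range(6):          # columns multiplying the coefficients of Rbar
        M[i:i + 8, i] = D
    for j in range(7):          # columns multiplying the coefficients of S (B = b0)
        M[6 + j, 6 + j] = b0
    x = np.linalg.solve(M, Acl)
    Rbar, S = x[:6], x[6:]
    q4 = np.array([1.0, 0.0, 0.0, 0.0, 0.0])
    T = np.convolve(Acl2, q4) * np.polyval(Acl1, 1.0) / b0      # Equation (11), B(1) = b0
    return Rbar, S, T, Acl, D


# Hypothetical plant coefficients, for illustration only.
b0 = 0.02
Rbar, S, T, Acl, D = design(a1=-1.6, a2=0.64, b0=b0)

# Reassemble the left-hand side of Equation (9) as a check.
lhs = np.convolve(D, Rbar)
lhs[6:] += b0 * S
dc_gain = np.polyval(T, 1.0) * b0 / np.polyval(Acl, 1.0)
```

In this sketch `lhs` reproduces `Acl` up to rounding, and `dc_gain` evaluates to one, consistent with the unity DC gain stated for Equation (12).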

The final controller implementation, which is shown in **Figure 8**, takes the following additional aspects into account:


The initial stimulation intensity $\nu_{d,m,init}^{j}$ is adjusted in order to avoid undesired movements when the controller is activated. Thus, before the controller activation and the brake release, the stimulation intensity is increased up to the value which was used before locking the DoF. The ramp-up period lasts about 1.5 s. Furthermore, to avoid unwanted initial transients caused by the controller transfer functions, the initial joint angle $\vartheta_u^{j}(k = 0)$ at controller activation is acquired and then subtracted from the joint angle measurement $\vartheta_u^{j}(k)$ and the output of the trajectory generator.

#### *2.5.3. Trajectory generation*

To obtain smooth shoulder flexion/extension movements, the reference trajectory $r_{\vartheta_u,f}^{j}(k)$ for each activation *j* is chosen to be a sinusoidal reference path starting at $\vartheta_u^{j}(0)$ and converging to the desired target angle $r_{\vartheta_u}^{j}$:

$$r_{\vartheta_{u},f}^{j}(k) = \begin{cases} \vartheta_{u}^{j}(0) & \text{for } 0 \le k < N_{1}, \\ \frac{1}{2} \left(1 - \cos\left(\frac{\pi (k - N_{1})}{2N}\right)\right) \left(r_{\vartheta_{u}}^{j} - \vartheta_{u}^{j}(0)\right) + \vartheta_{u}^{j}(0) & \text{for } N_{1} \le k \le N_{2} = N_{1} + N. \end{cases}$$

The parameter $N_1 = 69$ describes the number of samples (corresponding to 2.76 s) before the sinusoidal shape starts, and $N$ denotes the number of samples for the transient part of the trajectory and is set to 150 (corresponding to 3 s). After the sample $N_2 = N_1 + N$, the reference trajectory is equal to $r_{\vartheta_u}^{j}$. Then, the controller will be deactivated and the brake will be locked as soon as one of the following conditions is fulfilled:


Once the target is reached, the current value of stimulation intensity is stored and the controller of the shoulder flexion/extension is deactivated.

#### *2.5.4. Shoulder horizontal rotation control*

The control of the shoulder horizontal rotation involves the stimulation of the anterior (for inward rotation) and the posterior (for outward rotation) deltoid. Thus, the following switching control law is used

$$\nu\_{d,a}^{j} = \begin{cases} u\_r^j & \text{if } u\_r^j > 0 \\ 0 & \text{if } u\_r^j \le 0 \end{cases} \tag{13}$$

$$\nu_{d,p}^{j} = \begin{cases} -u_r^j & \text{if } u_r^j < 0\\ 0 & \text{if } u_r^j \ge 0 \end{cases},\tag{14}$$

which introduces a mapping of one single virtual actuation variable $u_r^{j} \in [-\overline{\nu}_{d,p}, \overline{\nu}_{d,a}]$ to the two stimulation intensities $\nu_{d,a}^{j}$ and $\nu_{d,p}^{j}$ for the *j*th controller activation.

The virtual actuation variable $u_r^{j}$ is the output of an integral controller with constant integration slopes and is given by

$$u_r^j(k+1) = \underset{-\overline{\nu}_{d,p},\, \overline{\nu}_{d,a}}{\text{sat}} \left( u_r^j(k) + c_r \,\text{sgn} \left( r_{\varphi_u}^j - \varphi_u^j(k) \right) \right), \quad u_r^j(0) = 0,$$

where the positive gain $c_r$ is set to 0.3 µAs in this study. To avoid integrator windup, a saturation function

$$\underset{b_1,\, b_2}{\text{sat}}(x) := \begin{cases} b_1 & \text{if } x \le b_1 \\ x & \text{if } b_1 < x < b_2 \\ b_2 & \text{if } b_2 \le x \end{cases} \tag{15}$$

is used in the integral control law. This prevents the integrator from exceeding the constraints for the actuation variable.
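The switching law (13) and (14) together with the saturated integrator can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function names, the normalized bounds, and the numerical gain are hypothetical, and the sign convention (positive error drives the anterior deltoid) simply follows the equations above.

```python
def sat(x, lower, upper):
    """Saturation function as in Equation (15): clamp x to [lower, upper]."""
    if x <= lower:
        return lower
    if x >= upper:
        return upper
    return x


def sgn(x):
    """Sign function returning -1, 0 or 1."""
    return (x > 0) - (x < 0)


def integrator_step(u, ref_angle, angle, c_r, nu_p_max, nu_a_max):
    """One update of the virtual actuation variable with constant slope c_r."""
    return sat(u + c_r * sgn(ref_angle - angle), -nu_p_max, nu_a_max)


def split_actuation(u):
    """Map the virtual actuation variable to (anterior, posterior) intensities."""
    nu_anterior = u if u > 0 else 0.0
    nu_posterior = -u if u < 0 else 0.0
    return nu_anterior, nu_posterior


# Hypothetical normalized values, for illustration only.
u = 0.0
for _ in range(10):  # angle stays below the reference, so u ramps up and then saturates
    u = integrator_step(u, ref_angle=30.0, angle=10.0, c_r=0.3,
                        nu_p_max=2.0, nu_a_max=2.0)
nu_a, nu_p = split_actuation(u)
```

Because the saturation acts on the integrator state itself, the state never leaves the admissible range, so it can respond immediately once the sign of the error changes.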

Conditions for the deactivation of the controller and the subsequent locking of the brake are in analogy to the ones given in Section 2.5.2.

#### *2.5.5. Elbow extension/flexion control*

The control of elbow extension/flexion is similar to the horizontal shoulder rotation control, but only one muscle, the biceps, is stimulated in order to induce elbow flexion. Downward movements of the forearm (extension movements) are caused by gravity. The stimulation intensity is linearly increased/decreased with the absolute slope rate $c_e = 6.7$ nAs at each sampling instant until the desired angle is achieved. The following integral controller, which also includes an anti-windup strategy, is used:

$$\nu_b^j(k+1) = \underset{0,\, \overline{\nu}_b}{\text{sat}} \left( \nu_b^j(k) + c_e \,\text{sgn} \left( r_{\vartheta_f}^j - \vartheta_f^j(k) \right) \right), \quad \nu_b^j(0) = \nu_{b,init}^j. \tag{16}$$

Here, *j* again represents the *j*th activation of the controller. The initial stimulation intensity $\nu_{b,init}^{j}$ is adjusted in order to prevent the forearm from rapidly falling down when the controller is activated and the brake is released. Thus, before the controller activation, the stimulation intensity is increased up to 50% of the stimulation intensity achieved at the end of the previous activation phase of the elbow controller. The ramp-up phase lasts 1 s.

Conditions for the deactivation of the controller and the subsequent locking of the brake are in analogy to the ones given in Section 2.5.2.

#### **2.6. VALIDATION OF THE CONTROL SYSTEM**

The control system was validated in five healthy subjects (three female and two male), aged 29–40 years (mean ± *SD* 34.5 ± 5.3). Average weight was 61 ± 17 kg. The drinking task was selected to evaluate the performance of the system. Each subject was asked to be completely relaxed during the arm movements, which were entirely induced by the system. At the hand related steps of the procedure, he/she was asked to voluntarily open and close the hand in order to grasp and release the cup. Each subject repeated the trial five times. Before the beginning of the trials, the exoskeleton as well as the amount of gravity compensation were adjusted to the anthropometric measures of each subject. Then, the system was calibrated by performing the following steps:


The experimental protocol was approved by the ethical committee of the Valduce Hospital (Italy) where the validation trials have been performed. All subjects signed a written informed consent.

To evaluate the performance of the system, the positioning error between the target position and the actually reached position at the completion of each movement command was computed for the hand positions 1 to 8 shown in **Figure 5**. Two sets of positioning errors were calculated since two different methods were used to derive the actual position in the global coordinate system: (1) the measured angles were applied to the forward kinematic model; (2) the actual position measured by the Kinect was transformed into the global coordinate system. Furthermore, the time needed to execute all movement commands during the drinking task was computed.

## **3. RESULTS**

**Figure 9** shows, as an example, the recorded angles together with their active references (bands), the applied stimulation intensities and the states of the brakes. Vertical, dashed lines separate the time periods of the controlled arm movements that have been introduced and numbered in **Figure 5**. The stimulation intensities $\nu_{d,a}$, $\nu_{d,m}$, $\nu_{d,p}$, and $\nu_b$ are normalized to their bounds $[0, \overline{\nu}_{d,a}]$, $[\underline{\nu}_{d,m}, \overline{\nu}_{d,m}]$, $[0, \overline{\nu}_{d,p}]$, and $[0, \overline{\nu}_b]$, respectively. The control system performs well in moving the arm such that the joint angles are close to the reference angles. However, in this example, an unwanted slipping of the horizontal shoulder brake can be observed after 43, 80, 92, and 106 s that causes the shoulder horizontal rotation angle $\varphi_u$ to drift away from the previously reached target angle. **Figure 10** shows the desired arm posture at the end of every controlled arm movement in comparison to the real arm position achieved by NMES. The error caused by slipping is clearly visible for the instances of time 2∗, 4∗, 6∗, and 7∗, which represent the endings of the corresponding movements defined in **Figure 5**.

The five trials of the drinking task were successfully completed by all subjects. For each subject, **Table 2** reports the mean and standard deviation values of the position errors in $x_g/y_g/z_g$ directions obtained during the five trials of the drinking task. The controller performance obtained in the two most important reaching subactions, i.e., reaching the object and reaching the mouth, and the overall performance obtained by averaging the results obtained in all of the eight target positions are shown in **Table 2**. The Euclidean norm (i.e., the mean distance error) of the mean positioning error vectors has been calculated from data in **Table 2** and is reported in **Table 3**. The mean distance error for all subjects and positions was less than two centimeters when using the exoskeleton angles to determine the hand position. Based on the Kinect measurements, the observed mean distance error is smaller than five centimeters. For the majority of subjects (B–E), a relatively large mean (systematic) error in the $x_g$-direction of up to 12 cm is observed for the object position (cf. **Figure 5**), resulting in a mean distance of about 8 cm (see **Table 3**). Subject D obtained a large standard deviation for the object positioning error in the $x_g$-direction (see **Table 2**). A larger discrepancy between the errors based on the exoskeleton sensors and the Kinect can be observed for the mouth position in subjects C–E.

In addition to the positioning error analysis, the validity of the identified kinematic model and coordinate transformation is investigated for each individual subject. For the twelve positions chosen during the kinematic model calibration, we calculated the 3D position of the hand in two ways using the found kinematic model parameters: first by applying the kinematic model to the measured exoskeleton joint angles, and second by transforming the Kinect measurements into the global coordinate system. Then, over all twelve positions, the RMS of the distance error between the two estimates for the hand positions is calculated. The results are shown in **Table 2**.
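The error measures described above (per-axis mean and standard deviation, Euclidean norm of the mean error vector, and RMS distance between two position estimates) can be computed as in the following Python sketch. The function names and all numerical values are hypothetical and serve only to illustrate the calculations. Whether the reported standard deviations use the sample or the population formula is not stated in the text; the sketch assumes the sample formula.

```python
import numpy as np


def positioning_error_stats(reached, target):
    """Per-axis mean and SD of the positioning error and the norm of the mean error.

    reached: array of shape (n_trials, 3) with reached x/y/z positions.
    target:  array of shape (3,) with the target position.
    """
    err = np.asarray(reached, dtype=float) - np.asarray(target, dtype=float)
    mean_err = err.mean(axis=0)
    sd_err = err.std(axis=0, ddof=1)  # ASSUMPTION: sample standard deviation
    return mean_err, sd_err, np.linalg.norm(mean_err)


def rms_distance(p_exo, p_kinect):
    """RMS of the Euclidean distance between two sets of 3D position estimates."""
    d = np.linalg.norm(np.asarray(p_exo, dtype=float) - np.asarray(p_kinect, dtype=float), axis=1)
    return np.sqrt(np.mean(d ** 2))


# Hypothetical data in centimeters, not taken from the study.
target = np.array([0.0, 0.0, 0.0])
reached = np.array([[3.0, 4.0, 0.0],
                    [3.0, 4.0, 0.0]])
mean_err, sd_err, distance = positioning_error_stats(reached, target)

p_exo = np.zeros((2, 3))
p_kinect = np.array([[3.0, 4.0, 0.0],
                     [0.0, 0.0, 0.0]])
rms = rms_distance(p_exo, p_kinect)
```

Note that the norm of the mean error vector captures only the systematic part of the error; trial-to-trial scatter is reflected in the standard deviations instead.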

The mean values averaged over five trials of the observed time durations for all sub movements and for each subject are reported in **Table 4**. Each individual sub movement is indicated by a number previously introduced in **Figure 5**. Additionally, the mean values for the total time required to complete a full drinking task (only time durations wherein the controller was activated are counted) are reported per subject. The average time for the execution of all eight arm movement commands was 71.4 s. The total time for donning the system and for calibration was less than 10 min for every subject (calibration alone required about 2 min).

#### **4. DISCUSSION AND CONCLUSIONS**

The experimental evaluation shows that the feedback control of the hybrid NMES-exoskeleton system is feasible. Compared to the results presented in Freeman et al. (2012), no learning phase was required to achieve the desired functional movements. Overall, the evaluation shows that it is possible to support the user in performing the drinking task. Because the drinking task was considered the most complex one, we conclude that other tasks are supported with similar effectiveness.

The observed small position errors at the mouth might be corrected by minor head movements to allow drinking from the cup by means of a straw. When positioning the hand above the object (i.e., the cup handle), larger errors were observed in the $x_g$-direction compared to other directions. But due to the large dimension of the cup handle, the ability to grasp the handle was not restricted. The limited accuracy for placing the hand at objects restricts the possible size and number of objects on the table. Reasons for the observed errors are diverse. One major problem observed is the limited braking torque of 2.5 Nm for the horizontal shoulder rotation that sometimes cannot prevent unwanted slipping. Despite careful placement of the stimulation electrodes, it cannot be avoided that a stimulation of the medial head of the deltoid generates (besides a desired shoulder extension moment) an unwanted horizontal shoulder rotation moment. If the latter exceeds the torque of the locked horizontal shoulder rotation brake, then slipping occurs for this DoF. With the arm pointing forward, an error in the shoulder horizontal rotation leads to a large hand error in the $x_g$-direction, especially for the extended arm. In future research, the use of array electrodes for the deltoid muscle might be an option to achieve a more selective stimulation and to avoid such unwanted stimulation effects and slipping. Another solution is to increase the brake torque by re-designing the exoskeleton.

Even when moving to a position given in Cartesian coordinates, the real-time control system is based on angular control. The position errors determined by the exoskeleton angles are purely related to the control system. The errors determined by the Kinect measurements additionally take into account problems that are related to the kinematic model and coordinate transformations used. The current controller design assumes that the exoskeleton/arm-combination represents a rigid body system. This is certainly only an approximation. Moreover, for the calibration of the kinematic model and the coordinate transformation, the arm/hand is moved by an assisting person to twelve arbitrarily chosen different positions in the workspace. Compared to the later use with NMES, no loading/deformation of the exoskeleton by the arm weight takes place. Any deviation from the rigid body assumption causes a position error due to the use of an incorrect forward kinematics. Such an error can only be detected by an external measurement system, like the Kinect, and not by the exoskeleton's internal angle sensors. The larger errors computed from the Kinect measurements compared to those derived from the exoskeleton sensors are therefore an indicator that the rigid body system assumption is only an approximation.

A shortcoming of the developed system is that elbow extension and shoulder flexion are only induced by gravity. This requires a carefully adjusted weight compensation. Any overcompensation of the weight could drive the arm movement into a deadlock.

Major advantages of the employed control strategy are its robustness and its simple adaptation to new users/sessions. Only a simple single-input single-output dynamical model needs to be identified for the adaptation of the controller. For all subjects, the same tuning parameters, like rise times and damping factors, have been used for the automatic design of the shoulder extension/flexion controller. In addition to this, the same gains have been applied to the controllers of shoulder horizontal rotation and elbow flexion/extension in all subjects. Due to automated and guided procedures, the system can be set up in a few minutes for the individual user. All individual NMES controllers for the three DoFs include an integrator which allows for the compensation of muscular fatigue as long as the stimulation intensities do not saturate. No deterioration of control performance was observed for the healthy subjects during the five performed trials and from day to day. All these advantages come at the cost that the movements do not look very physiological and movement sequences are not time optimal (cf. **Table 4**). However, we hypothesize that this fact is of minor importance for final users, and that the guaranteed functionality outweighs the timing issue for this assistive technology. The personal experience of performing all movements by means of one's own muscles is the major advantage compared to robotic approaches for assistance of reaching function (e.g., Maheu et al., 2011). Regular use of the proposed arm neuroprosthesis and, consequently, of the patient's musculature will be health promoting. It will increase muscle strength and might also improve cardiovascular fitness.

**drinking task.** Shown are the desired arm postures and the actually obtained ones for the endings of the eight arm movements defined

illustrated in blue.

**Table 2 | Mean positioning errors along with their standard deviations in $x_g/y_g/z_g$-direction for five drinking task sequences per subject measured via the exoskeleton sensors and via Kinect.**

**Table 3 | Euclidean norm (distance) of the mean positioning error vector given in Table 2.**

In summary, a feedback controlled hybrid NMES-exoskeleton which does not require any residual function at the shoulder and arm level was developed. By combining NMES with the passive exoskeleton for partial arm weight support, muscular fatigue can be significantly reduced since the required amount of muscular force is smaller compared to normal movements. The use of electrically lockable joints reduces the onset of muscular fatigue even further because no muscle function is required to hold the desired position.

The presented study focused on the achievable control system performance, which was expected to be maximal for healthy individuals due to non-atrophied muscles and the absence of spasticity. During the development of the system, a first test involving one incomplete SCI subject (C4/C5) was performed and showed that the system supported the subject in reaching a cup and bringing it to the mouth. The results of this test have been previously published (Pedrocchi et al., 2013). Tests of the final feedback controller on a group of SCI subjects will be performed to observe the feasibility of the system in supporting daily life activities. To obtain successful results, an initial conditioning phase to ensure that NMES is able to induce some muscle force, and a longer familiarization phase with the system, are envisaged.

## **AUTHOR CONTRIBUTIONS**

Christian Klauer and Thomas Schauer designed and implemented the real-time NMES control system including interfaces to the central controller and to the sensors and brakes of the exoskeleton. They also derived the kinematic model and set-up for the parameter estimation. Werner Reichenfelser, Jakob Karner, and Margit Gföhler designed and built the passive light-weight exoskeleton. Marta Gandolla and Alessandra Pedrocchi developed the eyetracker interface. Emilia Ambrosini, Simona Ferrante, and Christian Klauer carried out the validation study of the control system including data analysis. Marco Hack and Andreas Jedlitschka developed the Kinect interface and object/hand tracking. Sven Zwicker and Alexander Duschau-Wicke realized the central controller, the overall system integration and the inter-module communication. Alessandra Pedrocchi was the manager of the EU project MUNDUS and responsible for the entire system design. All authors contributed to writing and revising the manuscript.

#### **ACKNOWLEDGMENTS**

The research leading to these results has received funding from the European Community's Seventh Framework Programme under grant agreement no. 248326 within the project MUNDUS. We would also like to thank all participants of the study.

#### **SUPPLEMENTARY MATERIAL**

The Supplementary Material for this article can be found online at: http://www.frontiersin.org/journal/10.3389/fnins.2014.00262/abstract

A video of the drinking use case realized by the MUNDUS system showing a healthy subject. The arm movements are generated by means of the described feedback control system. In addition, an NMES hand module is applied to support the grasping of the object.

## **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 14 March 2014; accepted: 04 August 2014; published online: 02 September 2014.*

*Citation: Klauer C, Schauer T, Reichenfelser W, Karner J, Zwicker S, Gandolla M, Ambrosini E, Ferrante S, Hack M, Jedlitschka A, Duschau-Wicke A, Gföhler M and Pedrocchi A (2014) Feedback control of arm movements using Neuro-Muscular Electrical Stimulation (NMES) combined with a lockable, passive exoskeleton for gravity compensation. Front. Neurosci. 8:262. doi: 10.3389/fnins.2014.00262*

*This article was submitted to Neuroprosthetics, a section of the journal Frontiers in Neuroscience.*

*Copyright © 2014 Klauer, Schauer, Reichenfelser, Karner, Zwicker, Gandolla, Ambrosini, Ferrante, Hack, Jedlitschka, Duschau-Wicke, Gföhler and Pedrocchi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Computationally efficient modeling of proprioceptive signals in the upper limb for prostheses: a simulation study

#### *Ian Williams <sup>1</sup> \* and Timothy G. Constandinou1,2*

*<sup>1</sup> Department of Electrical and Electronic Engineering, Imperial College London, London, UK*

*<sup>2</sup> Center for Bio-Inspired Technology, Institute of Biomedical Engineering, Imperial College London, London, UK*

#### *Edited by:*

*Mitsuhiro Hayashibe, University of Montpellier, France*

#### *Reviewed by:*

*Eric Jon Perreault, Northwestern University, USA Dingguo Zhang, Shanghai Jiao Tong University, China*

#### *\*Correspondence:*

*Ian Williams, Center for Bio-Inspired Technology, Institute of Biomedical Engineering, Imperial College London, B422 Bessemer Building, South Kensington Campus, London, SW7 2AZ, UK e-mail: i.williams10@imperial.ac.uk*

Accurate models of proprioceptive neural patterns could one day play an important role in the creation of an intuitive proprioceptive neural prosthesis for amputees. This paper looks at combining efficient implementations of biomechanical and proprioceptor models in order to generate signals that mimic human muscular proprioceptive patterns for future experimental work in prosthesis feedback. A neuro-musculoskeletal model of the upper limb with 7 degrees of freedom and 17 muscles is presented and generates real time estimates of muscle spindle and Golgi Tendon Organ neural firing patterns. Unlike previous neuro-musculoskeletal models, muscle activation and excitation levels are unknowns in this application and an inverse dynamics tool (static optimization) is integrated to estimate these variables. A proprioceptive prosthesis will need to be portable and this is incompatible with the computationally demanding nature of standard biomechanical and proprioceptor modeling. This paper uses and proposes a number of approximations and optimizations to make real time operation on portable hardware feasible. Finally, technical obstacles to mimicking natural feedback for an intuitive proprioceptive prosthesis, as well as issues and limitations with existing models, are identified and discussed.

**Keywords: proprioceptive feedback, neuroprosthesis, neuromusculoskeletal model, upper limb, biomechanics, muscle spindles, golgi tendon organ, static optimization**

### **1. INTRODUCTION**

A device capable of giving an amputee a sense of feeling back from their prosthetic limb could help millions of people live happier, more productive lives (Blank et al., 2010; Weber et al., 2012). Graded sensory feedback of almost any sort could feasibly provide the user with proprioceptive information about their prosthesis, and haptic, visual, auditory, vibratory and electrocutaneous feedback have all been explored (Clippinger et al., 1974; Ohnishi et al., 2007). However, sensory substitution methods such as these partially deprive the user of another of their senses and suffer from long training periods and high cognitive load, as they require the user to learn and interpret the information encoded by the feedback stimuli. Despite decades of experimentation, sensory substitution has not seen significant clinical application (Ohnishi et al., 2007), and it is an approach that is likely to be increasingly difficult to implement in future as prosthesis complexity increases and the quantity of feedback increases correspondingly.

Direct neural feedback in the form of a neural prosthesis has the potential to provide high quality and intuitive feedback. The incredible capability of neural prostheses to transform lives has already been vividly demonstrated in recent years by the rise of cochlear implants for the deaf and the tantalizing progress in retinal implants for the blind. A proprioceptive prosthesis, on the other hand, could in theory provide a user with feedback of their limb's position, motion and the forces it is exerting, as well as potentially providing therapeutic benefit for phantom limb issues (Dhillon and Horch, 2005). A number of groups worldwide are working on developing just such a device (Dhillon and Horch, 2005; Hsiao et al., 2011; Weber et al., 2012; Williams and Constandinou, 2013b).

The ideal for a sensory neural prosthesis would be to mimic naturally occurring neural patterns and stimulate the appropriate neurons with those patterns—providing the user with comprehensive feedback that is as intuitive as possible. However, major obstacles remain to be overcome (see section 4.1), and it seems likely that neural prostheses will rely on the brain's ability to interpret limited and abnormal feedback for some time yet.

Mimicking the function and signals of specialized neurons is an active area of focus for cochlear and retinal prostheses in order to enhance the user's ability to interpret the feedback. The brain's ability to adapt and learn is impressive, but fitting in with its pre-existing neural processing may offer better performance. However, this progression (from simple graded stimulation to systems of modulation that mimic natural patterns) has not yet been addressed for proprioception, despite tantalizing indications that limited but appropriate neural stimulation can generate limb state representations in the brain (Weber et al., 2011).

The aim of this paper is to create a real time model of proprioceptive signals from specific receptors to demonstrate its feasibility and to support future work investigating the possible benefits of mimicking natural signals. Our concept for developing a proprioceptive prosthesis for a transhumeral amputee is shown in **Figure 1** and involves mapping the motion of a prosthetic onto a model of the human arm so that equivalent representations of muscle, tendon and receptor modulation can be calculated. Section 4.1 discusses this approach and looks at some of the issues involved. This paper will focus on the processing element: creating an efficient model to convert data from sensors on a prosthetic limb into estimates of proprioceptive neural signals from muscle spindles and Golgi Tendon Organs (GTOs).

In the human body the generation of proprioceptive neural signals is implicitly linked with musculoskeletal biomechanics as well as muscle and proprioceptor dynamics (Proske and Gandevia, 2012). The neural signal generation in our proprioceptive prosthesis is likewise based on these three factors and shares much in common with neuro-musculoskeletal models developed for research in the field of human motor control (Lan et al., 2005; Frigon and Rossignol, 2006; Koo and Mak, 2006; Song et al., 2008; Colacino et al., 2010).

The integration of sensory feedback models with representations of musculoskeletal components is still a relatively new field and most publications have focused on the lower limb and locomotion. Upper limb models considering only 1 degree of freedom have previously been proposed (Lan et al., 2005; Koo and Mak, 2006; Colacino et al., 2010) and a more complex 3 degree of freedom, 15 muscle "Virtual Arm" model covering shoulder and elbow joints was proposed by Song et al. (2008). These studies focus on understanding limb motion and control and therefore use simplifications and qualitative representations of proprioceptive signals, which are unlikely to be suitable for implementation in a proprioceptive prosthesis due to limitations that include: using models fitted to feline firing patterns despite much lower observed firing rates in humans; using individual receptor or ensemble firing patterns interchangeably even though there may be multiple orders of magnitude difference between the two; and using simple piecewise linear approximations to population firing rates that do not capture the observed dynamics or non-linearities of proprioceptive receptors. These models also have muscle activation or excitation as inputs, and limb movement as the output. However, in a proprioceptive prosthesis the situation is reversed, with limb movement as an input and muscle activation an unknown. Therefore, despite the similarities in the underlying sub-models, the final model implemented here differs substantially.

The addition of static optimization (an inverse dynamics tool) is proposed to estimate muscle forces and activations. However, standard implementations are computationally demanding and unsuited to the real-time, portable, and low-power nature of a proprioceptive prosthesis, so approximations are proposed to address this.

Numerous models of muscle spindles have been proposed in the literature [see Prochazka and Gorassini (1998) and Mileusnic et al. (2006a) for review]. Here the anatomically derived Mileusnic et al. muscle spindle model (Mileusnic et al., 2006a) will be used and adjusted to fit human spindle firing rates for a variety of muscles in the upper limb. The model is relatively computationally intensive, so an approximation to it will be proposed to reduce the computational load.

There are relatively few GTO models in the literature [see Mileusnic et al. (2006b) for review]; here the model described in Lin and Crago (2002) (based on a transfer function model by Houk and Simon) was selected for implementation. A method to fit this model to human data and adjust the model according to optimal isometric muscle strength is proposed.

This paper proposes a system for modeling ensemble average proprioceptor signals (see section 4.1 for discussion of this approach) for a simplified representation of the upper limb with 7 degrees of freedom and 17 muscles. Approximations to existing models and tools are proposed with the aim of creating a real time system capable of running on portable hardware.

## **2. MODELS AND METHODS**

The system described here is shown in **Figure 2** and broadly consists of biomechanical modeling combined with two previously described proprioceptor models. The sensor data from the prosthetic limb consists of joint angles and torques mapped onto the joints of the modeled human limb; the role of the biomechanical modeling is to convert this data into estimates of muscle length and force. These parameters are in turn converted by the receptor models into estimates of neural firing patterns.

## **2.1. MUSCULOSKELETAL MODEL**

#### *2.1.1. Skeletal structure*

The biomechanical modeling is underpinned by data from a 3D musculoskeletal model of the upper limb in OpenSim (Delp et al., 2007). The OpenSim model used here is a reduced form of the Stanford VA Upper Limb model which is based on the measurements and proposals in Holzbaur et al. (2005). The reduced form of the model is shown in **Figure 3** and consists of the following skeletal elements: thorax, sternum, scapula, clavicle, humerus, radius, ulna, wrist bones and 2nd to 5th metacarpals. Mass and inertial properties were obtained from Chandler et al. (1975) and Winter (2009) and the mass and inertia of the not-included finger and thumb segments were approximated as a lumped mass at the center of gravity of the hand.

## *2.1.2. Joints and degrees of freedom*

The model covers 7 degrees of freedom in the upper limb: 3 at the shoulder (describing elevation angle, shoulder elevation and shoulder rotation), 2 at the elbow (covering elbow flexion and forearm pronation), and 2 at the wrist (covering flexion and deviation). Joint kinematics and ranges of motion were unchanged from the original model.

## *2.1.3. Muscle model*

A standard three component dimensionless Hill type muscle model (0° pennation angle) was used and scaled to fit individual muscles as proposed by Zajac (1988). This approach allows all muscles to be modeled by the same functions with the differences between each muscle described by only a few variables. Normalized muscle length ($\bar{L}^M$), tendon length ($\bar{L}^T$), muscle force ($\bar{F}^M$) and muscle velocity ($\bar{v}^M$) were respectively calculated using:

$$\bar{L}^{M} = \frac{L^{M}}{L\_o^{M}}, \quad \bar{L}^T = \frac{L^T}{L\_s^T}, \quad \bar{F}^{M} = \frac{F^{M}}{F\_o^{M}}, \quad \bar{v}^{M} = \frac{v^{M}}{v\_{max}^{M}} \tag{1}$$

where $L\_o^M$ is the optimal muscle length, $L\_s^T$ is the tendon slack length, $F\_o^M$ is the muscle's maximum isometric force and $v\_{max}^M$ is the muscle's maximum shortening velocity. All muscle paths and muscle insertion points are as specified in the OpenSim model. Parameters for the muscles were obtained from the OpenSim model and $v\_{max}^M$ was assumed to be seven optimal fiber lengths per second [a figure approximately midway between that recorded for slow and fast twitch fibers (Brooks and Faulkner, 1988; Thelen, 2003)].

The torque (*T*) produced by the 17 muscles (*m*) around joint (*j*) is modeled as:

$$T\_j = \sum\_{m=1}^{17} \left( \left[ a\_m \cdot f\_l(\bar{L}\_m^M) \cdot f\_v(\bar{v}\_m^M) + f\_p(\bar{L}\_m^M) \right] R\_{m,j} \cdot F\_{o,m}^M \right) \tag{2}$$

where $f\_l$, $f\_v$, and $f\_p$ are functions describing the muscle's force-length, force-velocity, and passive force-length relationships; $a\_m$ is the level of muscle activation (between 0 and 1); $R\_{m,j}$ is the muscle's moment arm around joint $j$; and $F\_{o,m}^M$ is muscle $m$'s maximum isometric force. Equations for calculating force-velocity, force-length, and passive force are as described by Thelen (2003).
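As a minimal sketch of Equation (2), the Python function below sums each muscle's contribution to the torque about a single joint. The function name, the dictionary keys and the flat placeholder curves are assumptions made for illustration only; in practice the force-length, force-velocity and passive curves of Thelen (2003) would be passed in instead.

```python
def joint_torque(muscles, f_l, f_v, f_p):
    """Torque about one joint, following the structure of Equation (2).

    Each muscle is a dict with hypothetical keys:
      a    activation (0..1)
      L    normalized muscle length
      v    normalized muscle velocity
      R    moment arm about this joint (m)
      F_o  maximum isometric force (N)
    """
    total = 0.0
    for m in muscles:
        # Normalized force: active part scaled by activation, plus passive part.
        normalized_force = m["a"] * f_l(m["L"]) * f_v(m["v"]) + f_p(m["L"])
        total += normalized_force * m["R"] * m["F_o"]
    return total


# Placeholder curves (NOT the Thelen curves): unit active force, no passive force.
flat = lambda x: 1.0
zero = lambda x: 0.0

muscles = [
    {"a": 0.5, "L": 1.0, "v": 0.0, "R": 0.02, "F_o": 1000.0},   # flexor-like
    {"a": 0.2, "L": 1.0, "v": 0.0, "R": -0.01, "F_o": 500.0},   # antagonist
]
torque = joint_torque(muscles, flat, flat, zero)
print(torque)
```

With these placeholder values the flexor contributes 10 N·m and the antagonist contributes 1 N·m in the opposite direction.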

#### **2.2. BIOMECHANICAL MODELING**

#### *2.2.1. Musculotendon length and muscle moment arms*

In OpenSim the musculotendon length and muscle moment arms are calculated based on the muscle's origin and insertion points as well as anatomical wrapping points and constraints. However, running a 3D model is computationally intensive. A more efficient (although less accurate) approach based on fitting a polynomial surface to the length-joint angles relationship and another for the moment arm-joint angles relationship was described in van den Bogert et al. (2011).

In order to determine the polynomial coefficients for this relationship, the OpenSim musculoskeletal model was swept through the full range of motion of the various joints and at each pose the lengths and moment arms of all the muscles were recorded. This data was then processed in Matlab with the polyfitn function to generate polynomial surfaces fitted to this data. The polyfitn function outputs the polynomial surface coefficients ($c\_i^L$ for length and $c\_i^{MA}$ for moment arm) relating the musculotendon lengths ($L^{MT}$) and muscle moment arms ($R\_{m,j}$) to the joint angles ($q\_j$) for muscle $m$ such that:

$$L\_m^{MT} = \sum\_{i=1}^{N^L} c\_i^L \prod\_{j=1}^7 q\_j^{e\_\theta^L}, \quad R\_{m,j} = \sum\_{i=1}^{N^{MA}} c\_i^{MA} \prod\_{j=1}^7 q\_j^{e\_\theta^{MA}} \tag{3}$$

where $N^L$ and $N^{MA}$ are the number of polynomial terms for length and moment arm respectively, while $e\_\theta^L$ and $e\_\theta^{MA}$ are the integer exponents for length and moment arm respectively. A cubic polynomial fit was used, therefore giving $e\_\theta^L$ and $e\_\theta^{MA}$ values between 0 and 3.
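Evaluating a fitted surface of the form in Equation (3) only requires a sum of products, which is why it is so much cheaper than running the 3D model. The sketch below assumes the fit is stored as a list of (coefficient, exponents) pairs with one exponent per joint angle; this storage layout, the function name and the numbers are illustrative assumptions rather than the output format of polyfitn.

```python
def eval_poly_surface(terms, q):
    """Evaluate sum_i c_i * prod_j q_j ** e_ij.

    terms: list of (coefficient, exponents) pairs, one exponent per joint.
    q:     joint angles, same ordering as the exponents.
    """
    total = 0.0
    for coeff, exponents in terms:
        product = coeff
        for angle, exponent in zip(q, exponents):
            product *= angle ** exponent
        total += product
    return total


# Hypothetical two-joint example (the paper uses seven joint angles).
length_terms = [
    (0.30, (0, 0)),    # constant term
    (-0.02, (1, 0)),   # linear in joint 1
    (0.005, (1, 1)),   # cross term between joints 1 and 2
]
length = eval_poly_surface(length_terms, (0.5, 1.0))
print(length)
```

The same routine can serve both the length and the moment arm surfaces, since they differ only in their coefficient tables.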

The length of the muscle ($L^M$) was calculated from the musculotendon length, by subtracting an estimate of the tendon length under tension. This estimate of tendon length was based on the recorded strain curve of normalized tendons as described in Thelen (2003). For efficient modeling this force-strain relationship was approximated by computing the equilibrium position (removing differential equations) using a cubic polynomial fitted to the strain curve:

$$
\bar{L}^T = 0.04879 \left(\bar{F}^{M}\right)^3 - 0.1009 \left(\bar{F}^{M}\right)^2 + 0.1003 \bar{F}^{M} + 1. \tag{4}
$$
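The tendon correction can be sketched in a few lines. The code below evaluates the cubic of Equation (4) and subtracts the stretched tendon length from the musculotendon length; the function names and the example lengths are assumptions for illustration, and the zero pennation angle stated earlier is what allows a plain subtraction.

```python
def normalized_tendon_length(f_norm):
    """Cubic fit of Equation (4): tendon length relative to slack length."""
    return 0.04879 * f_norm ** 3 - 0.1009 * f_norm ** 2 + 0.1003 * f_norm + 1.0


def muscle_length(l_mt, f_norm, tendon_slack_length):
    """Muscle length = musculotendon length minus stretched tendon length."""
    return l_mt - tendon_slack_length * normalized_tendon_length(f_norm)


# Hypothetical muscle: 0.30 m musculotendon length, 0.20 m tendon slack length.
relaxed = muscle_length(0.30, 0.0, 0.20)
loaded = muscle_length(0.30, 1.0, 0.20)
print(relaxed, loaded)
```

At zero force the tendon sits at its slack length, and at maximum isometric force the fit gives a strain of a little under 5%, so the loaded muscle is slightly shorter.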

#### *2.2.2. Muscle activations and forces*

It has been widely noted that there is redundancy in the human musculoskeletal system and hence there is typically not a unique combination of muscle forces to generate any particular motion. The situation is further complicated by the fact that muscles are often multi-articular and produce moments around each of the joints they span. Methodologies such as static or dynamic optimization [which rely on minimizing or maximizing some optimization criteria (Erdemir et al., 2007)] are often used to address this redundancy and complexity problem. Due to the real time nature of this system, the static optimization technique will be used here.

Probably the simplest proposed optimization criterion is to minimize the total amount of muscle activation ($\sum\_{m=1}^{17} a\_m$). This approach has the advantage of being linear and hence solvable by fast linear programme solvers; however, it does not produce results that are representative of observed patterns of muscle activation (Rasmussen et al., 2001). There is still no clear agreement on the best optimization criterion for all joints, motions and loads; however, representative muscle activations have been produced by systems minimizing the sum of activations squared, cubed or raised to a higher power ($\sum\_{m=1}^{17} a\_m^n$, where $n$ is an integer greater than 1). Solving these criteria requires significantly more computational power than the linear criterion. Rasmussen et al. (2001) proposed a min-max optimization criterion which approximates the high order polynomial, but which can be solved using efficient linear techniques. This criterion can be formulated by introducing an artificial criterion variable ($\beta$): Minimize $\beta$ subject to:

$$\begin{aligned} &a\_m \le \beta, \quad \forall m \in \{1, 2, \dots, 17\} \\ &0 \le a\_m \le 1 \\ &T\_j^\* = \sum\_{m=1}^{17} \left( \left[ a\_m \cdot f\_l(\bar{L}\_m^M) \cdot f\_v(\bar{v}\_m^M) + f\_p(\bar{L}\_m^M) \right] R\_{m,j} \cdot F\_{o,m}^M \right), \\ &\forall j \in \{4, 5, 6, 7\} \end{aligned} \tag{5}$$

where $T\_j^\*$ is the measured torque around joint $j$ and the muscle activations ($a\_m$) are the variables for the algorithm.

A weakness of this optimization criterion is that once a minimum $\beta$ value has been calculated, the optimization process does not try to reduce muscle activations below this value, e.g., if muscle $i$ needs to be fully activated ($a\_i = 1$, $\beta = 1$) there is no optimization penalty for setting other muscles to be fully activated. To address this the optimization criterion was modified, with the new aim being to minimize $\beta + 0.01 \sum\_{m=1}^{17} a\_m$.
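To make the structure of this linear programme concrete, the sketch below assembles the objective and constraints in standard matrix form, with the decision vector holding the activations followed by the criterion variable. The function name, the `gain` and `passive` tables and the toy numbers are assumptions for illustration; the returned names mirror the argument names of common LP interfaces such as `scipy.optimize.linprog`, and no solver is called here.

```python
def build_minmax_lp(gain, passive, target_torque, weight=0.01):
    """Assemble the min-max static optimization as a linear programme.

    Decision vector x = [a_1, ..., a_n, beta].
    gain[m][j]    torque about joint j per unit activation of muscle m
                  (f_l * f_v * R_mj * F_om in the notation of Equation 5)
    passive[m][j] activation-independent torque (f_p * R_mj * F_om)
    """
    n = len(gain)
    n_joints = len(target_torque)
    # Objective: beta + weight * sum(a_m).
    c = [weight] * n + [1.0]
    # a_m - beta <= 0 for every muscle.
    A_ub = [[1.0 if k == m else 0.0 for k in range(n)] + [-1.0] for m in range(n)]
    b_ub = [0.0] * n
    # Torque balance per joint, with passive torque moved to the right-hand side.
    A_eq = [[gain[m][j] for m in range(n)] + [0.0] for j in range(n_joints)]
    b_eq = [target_torque[j] - sum(passive[m][j] for m in range(n))
            for j in range(n_joints)]
    # 0 <= a_m <= 1, beta >= 0.
    bounds = [(0.0, 1.0)] * n + [(0.0, None)]
    return c, A_ub, b_ub, A_eq, b_eq, bounds


# Toy problem: two muscles acting on one joint, 8 N*m measured torque.
gain = [[10.0], [5.0]]
passive = [[0.5], [0.0]]
c, A_ub, b_ub, A_eq, b_eq, bounds = build_minmax_lp(gain, passive, [8.0])

# For this toy case the min-max solution shares load equally: a_1 = a_2 = beta = 0.5.
candidate = [0.5, 0.5, 0.5]
objective = sum(ci * xi for ci, xi in zip(c, candidate))
print(objective)
```

Tighter per-muscle bounds can be substituted for the default 0 to 1 range in `bounds` without changing the rest of the formulation.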

The open source simplex package lp\_solve was used to solve this linear programme around the four joints in the elbow and wrist. Given the limited subset of shoulder spanning muscles being modeled (and the target application being a transhumeral amputee with extant shoulder musculature), it was not possible to resolve the torques at the three shoulder joints (*j* = 1*,* 2*,* 3). These joints were, however, included in the system because their configuration influences the lengths of and moments developed by the bicep and tricep muscle groups around the elbow and hence have an impact on all distal muscle activations.

#### *2.2.3. Activation dynamics*

Muscle activation dynamics were accounted for by using a method similar to that used by Thelen (2003). In that paper an idealized muscle excitation signal (*u*) was used as an input and the muscle activation (*a*) was modeled by a non-linear first order differential equation:

$$\frac{da}{dt} = \frac{u - a}{\tau\_a(a, u)}\tag{6}$$

where $\tau\_a$ is a time constant that varies depending on the muscle activation level and on whether the activation level is increasing or decreasing:

$$\tau\_a = \begin{cases} \tau\_{act}(0.5 + 1.5a) & u > a \\ \tau\_{deact}/(0.5 + 1.5a) & u \le a \end{cases} \tag{7}$$

where $\tau\_{act}$ is 15 ms and $\tau\_{deact}$ is 50 ms.
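For reference, Equations (6) and (7) can be advanced in time with a single explicit step. The sketch below uses a forward Euler update, which is an assumption made here for simplicity; the function names are illustrative.

```python
TAU_ACT = 0.015    # s, activation time constant
TAU_DEACT = 0.050  # s, deactivation time constant


def tau_a(a, u):
    """Activation-dependent time constant of Equation (7)."""
    if u > a:
        return TAU_ACT * (0.5 + 1.5 * a)
    return TAU_DEACT / (0.5 + 1.5 * a)


def step_activation(a, u, dt):
    """One forward-Euler step of da/dt = (u - a) / tau_a(a, u)."""
    return a + dt * (u - a) / tau_a(a, u)


# Activation rising from rest under full excitation, 1 ms step.
a_next = step_activation(0.0, 1.0, 0.001)
print(a_next)
```

Activation is fastest from rest (effective time constant 7.5 ms) and deactivation is fastest from full activation (25 ms).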

In our work we do not have the muscle excitation ($u$) available, so the problem was addressed by determining the feasible range of activation levels ($a\_{min} \to a\_{max}$) each muscle could have after a time $dt$. This was approximated from the differential equation describing the muscle activation by setting $u = 0$ to determine $a\_{min}$ and $u = 1$ to determine $a\_{max}$, giving:

$$\begin{cases} a\_{min} = a - \frac{dt \cdot a \cdot (0.5 + 1.5a)}{\tau\_{deact}}\\ a\_{max} = a + \frac{dt \cdot (1 - a)}{\tau\_{act}(0.5 + 1.5a)} \end{cases} \tag{8}$$

The feasible activation range was calculated for each muscle and included as constraints in the linear programme solver.
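A minimal sketch of how these bounds might be computed per muscle before each solve is given below. The clamping to the 0 to 1 range is an assumption added here, since a single explicit step at a coarse sample interval can overshoot the physical limits; the function name and sample interval are illustrative.

```python
TAU_ACT = 0.015    # s
TAU_DEACT = 0.050  # s


def activation_bounds(a, dt):
    """Feasible activation range after dt, following Equation (8)."""
    a_min = a - dt * a * (0.5 + 1.5 * a) / TAU_DEACT
    a_max = a + dt * (1.0 - a) / (TAU_ACT * (0.5 + 1.5 * a))
    # Clamp to the physical range (an added safeguard, not part of Equation 8).
    return max(0.0, a_min), min(1.0, a_max)


dt = 1.0 / 120.0  # one sample of a 120 Hz data stream
lo_mid, hi_mid = activation_bounds(0.5, dt)
lo_rest, hi_rest = activation_bounds(0.0, dt)
print(lo_mid, hi_mid, lo_rest, hi_rest)
```

Each pair would replace the default 0 to 1 bound on the corresponding activation variable in the linear programme.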

#### **2.3. PROPRIOCEPTOR MODELING**

#### *2.3.1. Muscle spindles*

The muscle spindle outputs were simulated using a model based on that proposed by Mileusnic et al. (2006a). The model inputs are muscle length ($L^M$) and fusimotor activation levels ($\gamma\_{static}$ and $\gamma\_{dynamic}$). The model uses these to estimate the tension in the transduction zone of each spindle fiber type (bag<sub>1</sub>, bag<sub>2</sub>, and chain), and the resulting action potential firing. The output of the model is a nonlinearly summed contribution from each fiber type to the primary (Ia) and secondary (II) axons that innervate the spindle. The model works essentially by modeling the fibers as a spring mass system and solving a second order differential equation (Equation (6) in Mileusnic et al., 2006a) that describes the tension in each fiber. However, integration of this differential equation requires a small time step size and therefore a high number of calculations. We propose that the tension in the system can be approximated by assuming that all the stretch happens in the polar regions of the fiber (which have a much lower spring constant) and then calculating the equilibrium tension in the fibers by modifying Equation (3) in Mileusnic et al. (2006a) to:

$$T = M \cdot \ddot{L}^{M} + \beta \cdot C \cdot (L^M - R - L\_o^{sr}) \cdot \left|\dot{L}^{M}\right|^{0.3} \cdot \text{sign}(\dot{L}^{M}) + K^{pr} \cdot (L^M - L\_o^{pr} - L\_o^{sr}) + \Gamma \tag{9}$$

where $T$ is the fiber tension, $L^M$ is the muscle length, $C$ is the coefficient of asymmetry in the muscle force-velocity curve, $R$ is the muscle length below which force production is zero, $L\_o^{pr}$ and $L\_o^{sr}$ are the rest lengths of the polar and sensory parts of the fiber, $K^{pr}$ is the polar region spring constant and $\Gamma$ is the tension produced due to fusimotor input. The remaining symbols ($M$ and $\beta$) are as defined in Mileusnic et al. (2006a). This modification means that there are no differential equations to solve, so the time step for calculating the spindle output can be increased by orders of magnitude and the computational efficiency is likewise improved.
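Because the approximation is purely algebraic, the fiber tension can be evaluated directly from the current muscle length and its derivatives. The sketch below is one possible reading of Equation (9), taking the mass term to act on the muscle acceleration (an assumption made here); the function name and all parameter values are placeholders chosen only to exercise the code, not fitted model parameters.

```python
def equilibrium_fiber_tension(L, dL, ddL, *, M, beta, C, R,
                              L0_sr, L0_pr, K_pr, gamma):
    """Algebraic intrafusal fiber tension, following Equation (9).

    L, dL, ddL: muscle length and its first and second time derivatives.
    gamma:      tension contribution from fusimotor input.
    """
    sign = (dL > 0) - (dL < 0)
    damping = beta * C * (L - R - L0_sr) * abs(dL) ** 0.3 * sign
    spring = K_pr * (L - L0_pr - L0_sr)
    return M * ddL + damping + spring + gamma


# Placeholder parameters (illustrative only).
params = dict(M=0.0002, beta=0.06, C=1.0, R=0.46,
              L0_sr=0.04, L0_pr=0.76, K_pr=0.15, gamma=0.0)

static_tension = equilibrium_fiber_tension(1.0, 0.0, 0.0, **params)
stretch_tension = equilibrium_fiber_tension(1.0, 0.1, 0.0, **params)
print(static_tension, stretch_tension)
```

In the static case only the polar spring and fusimotor terms remain, and a lengthening velocity adds a positive damping contribution on top.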

The parameter "G" in the Mileusnic model is a scaling term mapping ideal normalized spindle firing rates to feline data in the paper, and was estimated based on changes in spindle firing rates of up to 150 pulses per second (pps) that occur due to fusimotor stimulation in a feline muscle. There is limited data about the fusimotor sensitivity of human muscle spindles, but the maximum observed change in spindle output due to fusimotor signals has been observed to be <30 pps (Prochazka and Hulliger, 1998) and as such we scaled the Mileusnic et al. (2006a) derived values of "G" by a factor of 1/5 to better fit human spindle firing rates.

#### *2.3.2. Golgi tendon organs*

The GTO model used here is based on the model described by Lin and Crago (2002), which in turn is based on work by Houk and Simon (1967) studying the feline soleus muscle. The model consists of two stages.

Firstly a non-linearity:

$$R^{\rm NL} = k\_1 \cdot \ln\left(\frac{F^{\rm M}}{k\_2} + 1\right) \tag{10}$$

where $R^{NL}$ is the output of this stage, while $k\_1$ (60 impulses per second) and $k\_2$ (4 Newtons) are constants scaling the GTO firing rate to the force applied. However, these parameters are based on data from the feline soleus muscle, and given the limited amount of data from human recordings it is difficult to determine human appropriate values for these parameters. We propose to modify this non-linearity to use normalized muscle force:

$$R^{NL} = k\_1 \cdot \ln \left( \bar{F}^{M} \cdot \frac{F\_{o,s}^M}{k\_2} + 1 \right) = k\_1 \cdot \ln(\bar{F}^{M} \cdot k\_3 + 1) \tag{11}$$

where $F\_{o,s}^M$ is the maximum isometric muscle force of the feline soleus [measured as 25.8 N (Scott et al., 1996)], giving $k\_3$ a value of 6.45. In addition we propose to adjust $k\_1$ to reflect the lower observed firing rates in human GTOs compared to feline GTOs (Jami, 1992). Feline GTOs have been observed firing at rates of up to 300 pps, but in normal motion don't significantly exceed 120 pps (Jami, 1992). A review of the literature did not find any examples of human GTOs being subject to tests that would produce maximal firing rates; however, a review of microneurographic recordings showed firing patterns that rarely exceed 50 pps in normal motions (al Falahe et al., 1990; Jami, 1992). We therefore propose a correspondingly scaled $k\_1$ value of 25 pps.
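The constants in this stage follow from simple ratios, which the sketch below reproduces: the force constant is the feline soleus strength divided by the original 4 N constant, and the rate constant scales the original 60 pps by the ratio of typical human to feline peak rates (50/120). The function and variable names are illustrative assumptions.

```python
import math

K1_FELINE = 60.0          # impulses per second
K2_FELINE = 4.0           # N
F_MAX_SOLEUS = 25.8       # N, feline soleus maximum isometric force

K3 = F_MAX_SOLEUS / K2_FELINE           # dimensionless force scaling
K1_HUMAN = K1_FELINE * 50.0 / 120.0     # scaled to human firing rates


def gto_static_rate(f_norm, k1=K1_HUMAN, k3=K3):
    """Static GTO non-linearity of Equation (11) for normalized muscle force."""
    return k1 * math.log(f_norm * k3 + 1.0)


rate_at_max = gto_static_rate(1.0)
print(K3, K1_HUMAN, rate_at_max)
```

At maximum isometric force this static stage outputs roughly 50 pps, in line with the upper end of the human recordings cited above.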

Secondly, the output of the non-linearity is then fed into a linear dynamics transfer function:

$$H(s) = \frac{1.7s^2 + 2.58s + 0.4}{s^2 + 2.2s + 0.4}.\tag{12}$$

For efficient implementation this transfer function was transformed into the z-domain in Matlab using a bilinear approximation with a sample frequency of 1 kHz and warped to fit at 6 Hz giving a z-domain transfer function of:

$$H(z) = \frac{1.69942 - 3.39626z^{-1} + 1.69684z^{-2}}{1 - 1.99780z^{-1} + 0.99780z^{-2}}.\tag{13}$$
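The discretization can be checked without Matlab by writing out the pre-warped bilinear substitution for a second-order section, using s = K(1 - z⁻¹)/(1 + z⁻¹) with K = ω₀/tan(ω₀/(2fₛ)). The sketch below is a hand-rolled illustration under that assumption, not the Matlab routine used by the authors, and the function name is hypothetical.

```python
import math


def prewarped_bilinear_biquad(b, a, fs, f_match):
    """Bilinear transform of (b0 s^2 + b1 s + b2) / (a0 s^2 + a1 s + a2),
    pre-warped so that the responses agree at f_match (Hz)."""
    w0 = 2.0 * math.pi * f_match
    K = w0 / math.tan(w0 / (2.0 * fs))

    def transform(c0, c1, c2):
        # Coefficients of z^0, z^-1, z^-2 after multiplying by (1 + z^-1)^2.
        return [c0 * K * K + c1 * K + c2,
                2.0 * c2 - 2.0 * c0 * K * K,
                c0 * K * K - c1 * K + c2]

    num = transform(*b)
    den = transform(*a)
    # Normalize so that the leading denominator coefficient is 1.
    return [x / den[0] for x in num], [x / den[0] for x in den]


bz, az = prewarped_bilinear_biquad([1.7, 2.58, 0.4], [1.0, 2.2, 0.4],
                                   fs=1000.0, f_match=6.0)
print([round(x, 5) for x in bz], [round(x, 5) for x in az])
```

A useful sanity check is that the numerator and denominator coefficients sum to the same value, because the continuous-time filter has unity gain at zero frequency (0.4/0.4). The printed coefficients can be compared with Equation (13).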

## **3. RESULTS**

#### **3.1. MODEL VALIDATION**

The focus of the work presented here is on choosing and modifying existing validated models to create a real time system. As such the approximations will be validated against the original models and the computational efficiency compared. To generate a dataset for realistic comparison, 30 s (at 120 Hz) of motion capture data from the mocapdata.com website (product\_id = 15,044 showing an actor swinging his arms while walking across a room, then brushing his teeth before walking back to the original spot) was scaled, fed into OpenSim and the resulting joint angles for the upper limb were used for all simulations. This data set was chosen because it has a range of fast and slow upper limb motions and because tooth brushing represents an example of where a prosthesis user would not be able to visually monitor their limb and so feedback could provide significant benefit.

## *3.1.1. Length and moment arm validation*

The polynomial approximation for estimating length showed close conformance with the values generated by OpenSim's 3D model throughout the dataset, giving a coefficient of determination ($R^2$) in excess of 0.99 for all muscles. The fit of the moment arm approximation was slightly worse, with $R^2$ values in excess of 0.9 for the two bicep muscles and above 0.98 for all other muscles for the four joints of interest.
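For readers wishing to repeat this kind of comparison, the coefficient of determination between a reference trace and an approximation can be computed as below. This uses the common definition (one minus the ratio of residual to total sum of squares), which is an assumption since the exact formula used is not stated; the data values are made up for illustration.

```python
def r_squared(reference, approximation):
    """Coefficient of determination of an approximation against a reference."""
    mean_ref = sum(reference) / len(reference)
    ss_res = sum((r - p) ** 2 for r, p in zip(reference, approximation))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    return 1.0 - ss_res / ss_tot


# Hypothetical muscle lengths: full 3D model versus polynomial surface.
reference = [1.0, 2.0, 3.0, 4.0]
approximation = [1.1, 1.9, 3.2, 3.8]
score = r_squared(reference, approximation)
print(score)
```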

## *3.1.2. Static optimization validation*

**Figure 4** shows a comparison between a baseline static optimization tool and results obtained from the model proposed here. The baseline results were obtained by running the built-in OpenSim static optimization tool using a sum of activation squared optimization criterion. Muscle forces and moment arms from OpenSim were used to estimate the joint torques at each point in time and these torques were used as the inputs to the model described here. The results indicate the proposed system can produce results qualitatively very similar to existing standard techniques; however, they also highlighted an issue in the load sharing between the tricep muscles (see Discussion).

## *3.1.3. Spindle model validation*

To test how well the equilibrium spindle approximation corresponded to the original Mileusnic et al. model, a number of the validation runs from the original paper were repeated and the results are shown in **Figures 5**, **6**. Differences are generally small and this is consistent with the observed performance for more complex data (as shown in **Figure 7**), although the proposed spindle model is stiffer and as a result rapid movements do give rise to minor discrepancies, the most notable of which is shown in **Figure 5F**.

There is little data available to assess the model's performance for human spindles; however, the limited validation possible did highlight a potential limitation of the Mileusnic model which is discussed below. **Figure 8** shows a comparison with the recordings, from two sets of nine primary afferents and two sets of seven secondary afferents, published in Edin and Vallbo (1990). The recordings are from the radial nerve during imposed motions about the metacarpophalangeal (MCP) joint. Predicted firing patterns were generated by using the original Holzbaur et al. model in OpenSim to estimate the lengths of the two extensor muscles innervated by the radial nerve [extensor digitorum communis interossei (EDCI) and extensor indicis proprius (EIP)] during MCP joint motions as described in the paper.

showing primary and secondary afferent firing in the presence or absence of static or dynamic fusimotor activation at 70 pps. Labels are as in original paper.

**FIGURE 7 | Comparison of Mileusnic modeled data and proposed approximation against recorded data from a cat (Figure 4A of Prochazka et al., 1979).** Fusimotor activity was assumed to be absent.

Spindle lengths vary significantly less from muscle to muscle than muscle fascicle lengths do. This discrepancy is possible because the spindle attachment to muscle endpoints or perimysium is varied to provide consistent proprioceptive acuity across a range of joints and muscles (Proske et al., 2000). The Mileusnic et al. model was optimized for muscles whose fiber length varies above and below the optimal fiber length, whereas both the EDCI and EIP muscles are physiologically constrained to be longer than their optimal lengths. As such the unmodified model overestimates spindle firing rates throughout the range of motion (even in the absence of fusimotor input). Spindle rest and threshold lengths in the model were therefore adjusted to correspond to the fiber length when the MCP joint is in a physiologically neutral position (0° flexion) rather than the optimal muscle length. This approach gives the results shown in **Figure 8**.

## *3.1.4. Full system output*

The system was run to generate primary afferent and GTO neural signals to enable 3rd party validation of this work, and the neural firing patterns are shown in **Figure 9**.

## **3.2. COMPUTATIONAL EFFICIENCY**

For the purposes of this analysis the code was broken down into two parts: a deterministic part and a non-deterministic part. The deterministic part consisted of the newly written code (calculating muscle lengths, moment arms, force-velocity/length relationships, muscle spindle output, etc.), while the non-deterministic part (formulating and solving the optimization linear programme) used an open source library.

The 30 s 120 Hz dataset (consisting of 3600 samples) was processed by a 2.1 GHz laptop in under 1.09 s (i.e., 27.5 times faster than real time). Profiling showed that 15% of that time was spent in the deterministic part of the code and 85% in the non-deterministic optimization code. A manual estimate of the number of instructions required to process each time sample (based on the source code) was conducted for the deterministic part of the code, yielding a value of approximately 77,000 instructions. A conservative estimate of the total number of instructions (deterministic and non-deterministic) necessary for processing each sample was made by scaling the estimated number of instructions by the relative processing duration of the deterministic and non-deterministic parts and then multiplying by a factor of 2; this provided an estimate of 1.03 million instructions per sample.
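The arithmetic behind these figures can be reproduced in a few lines. In the sketch below the scaling step is read as dividing the deterministic instruction count by its 15% share of run time, which is an interpretation of the description above; the implied instruction rate for real-time operation is a derived figure, not one reported in the text.

```python
DURATION_S = 30.0
SAMPLE_RATE_HZ = 120.0
PROCESSING_TIME_S = 1.09
DETERMINISTIC_INSTRUCTIONS = 77_000
DETERMINISTIC_TIME_FRACTION = 0.15
SAFETY_FACTOR = 2.0

samples = int(DURATION_S * SAMPLE_RATE_HZ)
speedup = DURATION_S / PROCESSING_TIME_S

# Scale the deterministic count up to the whole run time, then double it.
total_instructions = (DETERMINISTIC_INSTRUCTIONS / DETERMINISTIC_TIME_FRACTION
                      * SAFETY_FACTOR)

# Derived: instruction rate needed to keep up with a 120 Hz input stream.
instructions_per_second = total_instructions * SAMPLE_RATE_HZ

print(samples, round(speedup, 1), round(total_instructions),
      round(instructions_per_second / 1e6, 1))
```

On this reading, real-time operation at 120 Hz would need on the order of 120 million instructions per second.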

## *3.2.1. Comparison with standard implementations*

Comparisons between the proposed spindle model and a C-code implementation of the standard Mileusnic model (using a standard Euler method solver) showed that the majority of the improvement in processing speed was due to differences in the time steps that can be used (rather than reduction in the number of calculations per time step). The standard Mileusnic model can become unstable if too large a time step is chosen (experimentation showed that a maximum timestep in the order of 0.1–1 ms is required, depending on the dataset), whereas the proposed solution is stable regardless of the timestep. As such it was necessary to upsample the 120 Hz dataset used here by a factor of 8 to obtain results with the standard model, but not for the proposed model. However, the maximum timestep for the proposed model will be upper bounded by the limb position update frequency and the maximum firing frequency of the spindle. The efficiency improvement is therefore data dependent, but in this situation was in the region of an eightfold improvement.
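The time step sensitivity of an explicit Euler solver can be illustrated with a generic stiff test equation, dx/dt = -λx, for which forward Euler is stable only when the step is below 2/λ. The sketch below is not the spindle model; the rate constant is a hypothetical value chosen so that a 120 Hz step diverges while the eightfold upsampled step (960 Hz) decays.

```python
def euler_decay(rate, dt, duration, x0=1.0):
    """Forward-Euler solution of dx/dt = -rate * x; returns the final value."""
    x = x0
    for _ in range(int(round(duration / dt))):
        x += dt * (-rate * x)
    return x


RATE = 1000.0  # 1/s, hypothetical stiffness of the test equation

coarse = euler_decay(RATE, 1.0 / 120.0, 0.1)  # step above the 2/rate limit
fine = euler_decay(RATE, 1.0 / 960.0, 0.1)    # step below the limit
print(coarse, fine)
```

The true solution decays to essentially zero within 0.1 s, so the coarse result reflects numerical instability rather than any property of the underlying system.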

The fitting of cubic polynomials to calculate length and moment arm makes these elements of the processing almost negligible and appears to represent a reduction in required calculations by multiple orders of magnitude compared to 3D modeling. Observed execution time for all the biomechanical modeling (length, moment arm and also the static optimization) presented here was around five orders of magnitude faster than in OpenSim; however, this is a deeply unfair comparison, pitting optimized C-code against the performance of the general purpose OpenSim package.

## **4. DISCUSSION**

## **4.1. GENERATING PROPRIOCEPTIVE FEEDBACK FOR A PROSTHESIS**

Providing intuitive and comprehensive feedback which is familiar or trivially easy to interpret is the ultimate goal for any neural prosthesis. However, it is unknown what the most effective and achievable format for providing proprioceptive neural feedback to prosthesis users is in the near term and there are numerous challenges in implementing, comparing and optimizing competing methods.

Stimulating a small number of neurons with a pattern that is linearly related to prosthetic limb parameters (e.g., joint angle or end effector force) is possibly the simplest approach and has been demonstrated to provide benefit in a laboratory control task (Clippinger et al., 1974; Dhillon et al., 2004; Dhillon and Horch, 2005; Dhillon et al., 2005; Rossini et al., 2010). Stimulation of non-proprioceptive neurons with these non-biologically representative patterns makes this approach similar to sensory substitution feedback. It is unknown whether sensory substitution feedback using direct neural stimulation will offer any substantial benefit over non-implanted implementations although the ability to stimulate truncated nerves and neural pathways that would otherwise be silent may confer advantages.

An alternative feedback modulation method is one that aims to approximate all the naturally occurring neural feedback patterns in the human limb. This approach is in its infancy and there is a need for physiological experimentation to verify the suitability of this approach and investigate how close to this ideal the feedback needs to be in order to demonstrate benefit compared to simpler forms of modulation, however, it seems a logical, albeit distant, ideal to aim for. Major obstacles remain such as limitations in our understanding of proprioceptors and the modulating signals from the brain, as well as our limited ability to interface with and selectively modulate large numbers of neurons. Our concept for implementing this approach breaks the process down into three steps:

## 1. **Mapping from a prosthesis to a model of a normal limb**

Prosthetic limb properties can differ substantially from those of a human limb. For the purposes of this mapping the differences between the human and prosthesis can be grouped into three main categories: (a) physical properties—weight, moments of inertia, size and shape (including for instance the number of digits); (b) actuation properties—strength, speed, joint coupling and actuator non-linearities; and (c) kinematic properties—degrees of freedom, range of motion, axis of rotation and joint structure (including for instance joint complexes). These differences were most evident in the days when cable and hook prostheses dominated the market. However, under the twin pressures of prosthesis users' desire for cosmesis and functionality (in a world of tools and equipment designed to be operated by the human hand), there has been a strong trend toward anthropomorphic convergence. Prototype upper limbs such as the DEKA arm or commercially available prostheses such as the i-Limb, clearly demonstrate the progress that has been made toward approximating the human upper limb. Even the rise in underactuation for finger joints—which was largely driven by actuator weight considerations—moves prosthetics closer to the human form and with anthropomorphic design as a guiding principle, it is a trend that looks set to continue.

This has important implications because the closer the match between prosthetic limbs and human limbs, the easier the process of mapping the state of one to the other becomes.

## 2. **Modeling proprioceptor behavior**

Assuming a modeled human limb can match the states and motion of the prosthesis, then modeling the neural signals can be considered a problem of biomechanics and proprioceptor modeling (assuming feedback is applied in the Peripheral Nervous System).

Receptors in the muscles, joints and skin all sense tissue deformation and provide proprioceptive feedback to the CNS. Ideally all these receptors would be modeled for a proprioceptive prosthesis so that appropriate stimulation could be applied to any receptors interfaced with. However, in a system constrained by power, portability and (as a result) complexity, it is necessary to prioritize. Discriminating criteria include the importance of each receptor type to motor control, our ability to model the underlying tissue deformation and our understanding of the receptor firing patterns.

Here we elected to focus on muscle spindles and GTOs, both of which stand out on the grounds of the quality of information they provide to the CNS and the quality of the models available for musculo-tendon and proprioception modeling. However, even for these relatively well understood receptors modeling limitations are evident such as a lack of parameters for human receptors, difficulties fitting parameterized models to different muscles and uncertainty regarding fusimotor input (see section 4.3 for further discussion).

### 3. **Applying appropriate neuromodulation to enough target neurons**

When electrodes are implanted in or around a nerve, it is unknown which neurons will be stimulated and what sensation or motor effect they can elicit. The main factors in determining which neurons are stimulated are the electrode-neuron distance, the stimulus strength and the neuronal diameter (with larger neurons recruited at lower thresholds). There is typically a trade-off between the number of neurons stimulated and the selectivity achieved, with stimulation of non-target neurons with a synchronous barrage leading to unusual or potentially noxious sensations (Smith and Leslie, 1990). Possible techniques to reduce the number of non-proprioceptive neurons stimulated include: careful choice of implantation nerve or nerve branch to increase the ratio of proprioceptive neurons stimulated (versus motor or exteroceptive neurons); electrodes with designs that increase selectivity and provide greater ability to target stimulation at different fascicles or at the sub-fascicular level; and waveforms that alter the distance and diameter recruitment order.

The stimulation pattern to apply potentially depends on how selectively individual neurons can be targeted. Our approach is to target fascicle level selectivity because higher selectivity electrodes typically need to penetrate the perineurium which introduces a break in the blood-nerve barrier and has been observed to cause endoneurial accumulation, fibrous build up due to tissue rejection and neural damage caused by relative motion (boring at the electrode tip) (Biran et al., 2005; Polikov et al., 2005). At the fascicular level we propose to stimulate with ensemble average signals, with the aim of making use of the ability of the central nervous system to integrate feedback. The extent to which the CNS can interpret a subset of normal feedback (even in the presence of contradictory feedback) is largely unquantified, but is demonstrated by single muscle tendon vibration trials and numerous psychophysical experiments over the years examining proprioceptive performance under varying conditions including local anesthesia.

## **4.2. APPROXIMATIONS**

The system presented here is focused on real-time prediction of neural stimulation patterns and firing rates that would be suitable for human nerve stimulation and that are based on data that could be available from a prosthetic limb. A number of models and approximations were used to achieve this aim. The cubic polynomial approximations for muscle length curves fitted the OpenSim data closely; however, the equivalent moment arm approximations showed substantially worse correlation. It was observed that the quality of a muscle's moment arm approximation decreased as the number of joints the muscle spanned increased, and that the surface fits for moment arms about the shoulder joints were particularly poor (although that did not matter for this application). Considering these differences in input data and the difference in optimization criteria employed, the output of the static optimization stage showed a reasonably good match with the standard OpenSim tool. However, substantial differences were visually evident in the distribution of load between the three triceps muscle branches, and this is reflected in the very low R2 values for these branches. This was likely due to the linear nature of the optimization, which becomes increasingly poor at load sharing as the number of joints and muscles increases. Examination of the results showed that simply averaging the activations of the three triceps branches would have closely fitted the OpenSim results (R2 values of 0.912, 0.970, and 0.906 for the lateral, long and medial heads, respectively). In more complex systems (with greater numbers of joints and muscles) it may be necessary to compartmentalize the optimization process and run multiple iterations, or to implement some alternative method of sharing load. Results for biceps long and short head activation were also well below average, which may be a result of the poorer moment arm fit observed for these muscle branches.
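As an illustration of the kind of approximation discussed above, the following minimal sketch fits a cubic polynomial to a muscle length curve and reports the R2 of the fit. It assumes a single joint angle as input, whereas the approximations in this work may span several joints; the synthetic `length` curve is a hypothetical stand-in for OpenSim output, and the helper names `fit_cubic` and `r_squared` are chosen here for illustration only.

```python
import numpy as np

def fit_cubic(angle, length):
    """Least-squares cubic fit; returns coefficients, highest power first."""
    return np.polyfit(angle, length, deg=3)

def r_squared(y, y_hat):
    """Coefficient of determination between data y and prediction y_hat."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot

# Hypothetical single-joint example: a smooth length-vs-angle curve (metres).
angle = np.linspace(0.0, 2.0, 50)          # joint angle in radians
length = 0.30 + 0.04 * np.sin(angle)       # stand-in for musculoskeletal model output
coeffs = fit_cubic(angle, length)
r2 = r_squared(length, np.polyval(coeffs, angle))
```

Evaluating the fitted polynomial at run time costs only a few multiplications per muscle, which is the motivation for this kind of approximation on low-power hardware.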

The simplified version of the spindle model closely matched the outputs of the original model for the validations proposed in the original paper as well as for some real movement data. Discrepancies were visible during rapid movements; however, given the duration of these transient differences and peak firing rates in humans of approximately 100 Hz, these discrepancies represent only a small number of missed action potentials. As mentioned in the results, the efficiency improvement provided by the proposed model appears to be largely data dependent and related to the sample frequency of the system, the maximum spindle output frequency and the maximum step size for stable solving of the differential equations in the standard model. It should be noted that the analysis here assumed a standard Euler method for solving these equations, but that many alternative numerical methods exist and could improve or guarantee stability.
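The dependence of stability on step size can be illustrated with a minimal sketch. It integrates the linear test equation dx/dt = -kx with the forward Euler method, for which the solution decays only if the step size is below 2/k. The test equation and the rate constant `k` are assumptions made for illustration and are not the spindle model equations.

```python
def euler_decay(k, dt, steps, x0=1.0):
    """Forward-Euler integration of dx/dt = -k * x; returns the final |x|."""
    x = x0
    for _ in range(steps):
        x = x + dt * (-k * x)
    return abs(x)

k = 200.0                                        # hypothetical rate constant (1/s)
stable = euler_decay(k, dt=0.004, steps=200)     # dt < 2/k = 0.01 s: solution decays
unstable = euler_decay(k, dt=0.015, steps=200)   # dt > 2/k: solution grows each step
```

Each Euler step multiplies x by (1 - k·dt), so the magnitude shrinks only when |1 - k·dt| < 1. Faster dynamics therefore force a smaller step and a higher computational cost per second of simulated time.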

The proposed parameter change and adjustment to rest and threshold lengths allowed the model to estimate human spindle recordings to within a standard deviation, but without a significantly greater quantity of human data it is unclear how widely applicable these adjustments are.

## **4.3. ISSUES AND AREAS FOR FURTHER WORK**


• The integration of motion capture, a musculoskeletal model, static optimization and proprioceptor models enables some proprioceptive signals to be non-invasively modeled for real movements. Further integration of inverse dynamic modeling to estimate joint torques, as well as a suitable musculoskeletal model of the feline hind limb (a preferred experimentation model), would enhance this system, allowing novel experimentation and extensive validation.

## **5. CONCLUSION**

Realistic models that link human motion to proprioceptor signals could one day form the basis for a proprioceptive neural prosthesis, in much the same way that retinal and cochlear implants aim to mimic retinal and auditory cells. In contrast to previous neuromusculoskeletal models, this work has proposed: the integration of static optimization; modifications to approximate human proprioceptors; and a variety of approximations and optimizations to reduce computational complexity without substantial degradation of the output. A key uncertainty in aiming to provide natural-feeling proprioceptive feedback to a prosthesis user is how close to normal it needs to be in order to provide benefit over simpler forms of feedback modulation. This work aims to build capability to explore this question.

The model presented here is able to simulate muscle lengths, moment arms and activations as well as the corresponding muscle spindle and GTO neural signals in real time on low-power hardware. This system potentially enables physiological experimentation into intuitive proprioceptive feedback as well as novel forms of proprioceptive and motor control, and may one day form part of a system capable of giving amputees feeling in their prosthetic limbs.

#### **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 05 March 2014; accepted: 09 June 2014; published online: 25 June 2014. Citation: Williams I and Constandinou TG (2014) Computationally efficient modeling of proprioceptive signals in the upper limb for prostheses: a simulation study. Front. Neurosci. 8:181. doi: 10.3389/fnins.2014.00181*

*This article was submitted to Neuroprosthetics, a section of the journal Frontiers in Neuroscience.*

*Copyright © 2014 Williams and Constandinou. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Sensory synergy as environmental input integration

## *Fady Alnajjar\*, Matti Itkonen, Vincent Berenz, Maxime Tournier, Chikara Nagai and Shingo Shimoda*

*Intelligent Behavior Control Unit, Brain Science Institute-TOYOTA Collaboration Center of RIKEN, Nagoya, Japan*

#### *Edited by:*

*David Guiraud, Institut National de la Recherche en Informatique et Automatique, France*

#### *Reviewed by:*

*Vittorio Sanguineti, University of Genova, Italy Franck Multon, University Rennes2, France*

#### *\*Correspondence:*

*Fady Alnajjar, Intelligent Behavior Control Unit, Brain Science Institute-TOYOTA Collaboration Center of RIKEN, 2271-130 Anagahora, Shimoshidami, Moriyama-ku, Nagoya, Aichi, 463-0003, Japan e-mail: fady@brain.riken.jp*

The development of a method to feed proper environmental inputs back to the central nervous system (CNS) remains one of the challenges in achieving natural movement when part of the body is replaced with an artificial device. Muscle synergies are widely accepted as a biologically plausible interpretation of the neural dynamics between the CNS and the muscular system. Yet the sensorineural dynamics of environmental feedback to the CNS have not been investigated in detail. In this study, we address this issue by exploring the concept of sensory synergy. In contrast to muscle synergy, we hypothesize that sensory synergy plays an essential role in integrating the overall environmental inputs to provide low-dimensional information to the CNS. We assume that sensory synergy and muscle synergy communicate using these low-dimensional signals. To examine our hypothesis, we conducted posture control experiments involving lateral disturbance with nine healthy participants. Proprioceptive information, represented by the changes in muscle lengths, was estimated using the musculoskeletal model analysis software SIMM. Changes in muscle lengths were then used to compute sensory synergies. The experimental results indicate that the environmental inputs were translated into two-dimensional signals and used to move the upper limb to the desired position immediately after the lateral disturbance. Participants who showed high skill in posture control were likely to have a strong correlation between sensory and muscle signaling as well as high coordination between the utilized sensory synergies. These results suggest the importance of integrating environmental inputs into suitable low-dimensional signals before providing them to the CNS. This mechanism should be essential when designing the sensory system of a prosthesis in order to make the controller simpler.

**Keywords: prosthetic arms, sensorineural feedback, muscle synergy, sensory synergy, posture control, automatic posture response**

## **INTRODUCTION**

Neuroprosthetics faces considerable challenges, especially when it is necessary to account for neurological disorders (Ring and Rosenthal, 2005). These challenges concern mainly the immense variety of possible neural damage, which makes it hard to define a reliable sensorimotor pathway for controlling external devices (Musallam et al., 2004). Conventional prosthetics focuses mainly on motor control and pays less attention to the role of integrating sensory information as feedback. Without sensory feedback, even the simplest actions, such as controlling a prosthetic arm, can be slow and clumsy due to the lack of tactile sense (Kwok, 2013). Some researchers have proposed direct sensory feedback through air pressure or electrical stimulation, though these methods have a number of limitations. Neurophysiological studies have found that the body position in space is estimated by integrating information from multiple sensory modalities rather than through direct sensory input (Zupan et al., 2002; Mergner et al., 2003; Kuo, 2005; Ting, 2007). This integrated sensory feedback can encode noise-robust, useful, and cost-effective information in low-dimensional signals that are simple enough to accelerate the construction of the desired control signal (Kargo and Giszter, 2000). Adding a proper sensory integration mechanism to the design of the prosthesis may, therefore, provide access to a simpler controller.

In recent years, several studies have indicated that muscle synergy is a likely neural strategy that the central nervous system (CNS) has adopted to simplify the control of our redundant musculoskeletal system (D'Avella and Bizzi, 2005; Safavynia et al., 2011; Alnajjar et al., 2013b). The concept of muscle synergy, therefore, has been widely adopted as a quantitative interpretation of motor control strategies on a neural level. Muscle synergy has been investigated in detail in several areas of research, including the clarification of the corresponding anatomical concept (Bizzi and Cheung, 2013), the classification of the motor skills of healthy subjects (Torres-Oviedo and Ting, 2007; Alnajjar et al., 2013a), the synergetic motor control paradigm for managing joint redundancy (Hayashibe and Shimoda, 2014) and the identification of the degree of brain damage in stroke survivors (Cheung et al., 2012).

One of the remaining unsettled debates concerning muscle synergies is how they are selected and evaluated by the CNS to adapt to the surrounding environment, including body dynamics (Latash, 2008). Answering this fundamental question is essential to understanding the mechanism used by the CNS to handle the complexity of sensorimotor interactions in the body. To answer this question, we introduce the term "sensory synergy" to supplement muscle synergy in order to understand the mechanism of mapping stimuli to behavior (**Figure 1**). In contrast to muscle synergy, which defines suitable combinations of muscles to adapt the behavior to the environment, we hypothesize that sensory synergy plays an essential role in integrating a compendium of sensory feedback to simplify the construction of muscle synergy. We define a single sensory synergy as a group of weighted sensory inputs whose function is to provide the quality of the resulting motion as feedback to the CNS through a single synergy recruitment signal in order to facilitate the generation of the next command, thus reducing the search time for the optimal muscle synergy. The study of sensory synergies could offer a simplified way to understand sensory signaling. In nature, sensory signals of different modalities are in general redundant and plastic to ensure the delivery of appropriate environmental information to the CNS (Day and Guerraz, 2007). If one sensory modality is disrupted or becomes unavailable, another modality can take over (Dickstein et al., 2001; Lanska, 2002). In some cases one of the sensory modalities can even override all other modalities and drive them (Diedrichsen et al., 2007).

To conduct this study, we recorded the kinematic patterns and muscle activities of nine healthy participants in an automatic posture response (APR) experiment. The results highlight the synergy characteristics common to all individuals, which were found to depend on the quality of their APR skills. The results revealed a potential link between the sensory and muscle synergies in terms of synergy size that may enhance sensorimotor transformations. This study should help to inspire the development of sensory systems for effective neural prosthetic devices that can be operated with a simple controller.

## **MATERIALS AND METHODS**

#### **EXPERIMENTAL SETUP**

In this study, an experiment was conducted to determine the relation between the APR measured skills of the participants and their computed sensory and muscle synergies. The participants were nine healthy men (mean age, 34.5 ± 9 years). All the participants were right-handed and had no history of major neurological disorders or posture balance impairment. All experimental protocols were approved by the RIKEN ethics committee.

During the experiment, the participants were instructed to stand upright in the akimbo position (**Figure 2A**) on a movable platform, placing their feet on foot-ground contact sensors located approximately 10 cm apart (**Figure 2B**). We chose this standing position, in which the hands are placed a little above the hips and the elbows are bowed outward, to reduce any contribution of the arms to restoring body balance and to facilitate the capture of the motion markers attached to the participants' bodies. The platform was programmed to perform lateral displacements of 11 cm with a velocity of 6.4 cm/s. The participants were also instructed to make an effort to maintain their balance in an upright posture during the platform displacements and to avoid any body movements other than lateral hip flexion/extension and ankle inversion/eversion. The direction and timing of each displacement were chosen at random and were therefore unpredictable to the participants during the experiment. Before the experiment, each participant was asked to practice balancing on the platform for 20 min to become familiar with the experimental environment. At the time of the experiment, each participant experienced leftward and rightward platform displacements (mean ± SD: 18 ± 4 cm), and electromyograms obtained in five trials of leftward displacement were used for data analysis.

**FIGURE 2 | (A)** [...] standing on the movable platform. Body motion was captured with a VICON motion capture system using 42 markers attached to various parts of the participant's body (see Supplementary Figure 1 for more information about marker positions). **(B)** Experimental setup, muscle locations, joint locations, platform motion pattern and displacement speed, EMG record range, and [...] to the platform displacement. The participant's EMG responses occurred with a latency of approximately 50 ms following the displacement. The sensory synergy was computed in the period between 0 and 50 ms (shaded area in the muscle length plots), and muscle synergy was computed in the period between 50 and 100 ms (shaded area in the muscle activation plots).

#### **DATA RECORDING**

#### *Surface electromyography (EMG)*

Data on muscle activity were collected by wireless surface EMG (BTS FREEEMG 300, BTS Bioengineering, Italy). EMG electrodes were used to record data from six dominant leg and lumbar muscles (Hoy et al., 1990): the flexor hallucis longus (FHL) and tibialis anterior (TA), which mainly control the ankle strategy in lateral perturbation; and the tensor fasciae latae (TFL), gluteus medius (GM), rectus abdominis (RA), and erector spinae (ES), which control the hip strategy and the lumbar joint in lateral perturbation (Runge et al., 1999). The EMG electrodes were placed in accordance with the guidelines of the Surface Electromyography for the Non-Invasive Assessment of Muscles (SENIAM) European Union project (Hermens et al., 1999). The entire time-series EMG data were rectified and processed using a low-pass filter with a cutoff frequency of 32 Hz. EMGs were normalized by their respective maxima measured during the experiment. All signals were resampled to 1 kHz.
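The processing chain described above (rectification, low-pass filtering at 32 Hz, normalization by the maximum) can be sketched as follows. The filter type and order are not stated in the text, so the first-order exponential-smoothing filter used here is an assumption, as are the function name `preprocess_emg` and the synthetic input signal. Resampling to 1 kHz is not shown.

```python
import numpy as np

def preprocess_emg(raw, fs, cutoff_hz=32.0):
    """Rectify, low-pass filter, and peak-normalize one EMG channel."""
    rectified = np.abs(np.asarray(raw, dtype=float))   # full-wave rectification
    # First-order low-pass filter (exponential smoothing); the filter type
    # is an assumption, only the 32 Hz cutoff comes from the text.
    dt = 1.0 / fs
    alpha = dt / (dt + 1.0 / (2.0 * np.pi * cutoff_hz))
    envelope = np.empty_like(rectified)
    envelope[0] = rectified[0]
    for i in range(1, len(rectified)):
        envelope[i] = envelope[i - 1] + alpha * (rectified[i] - envelope[i - 1])
    peak = envelope.max()
    return envelope / peak if peak > 0 else envelope   # normalize by the maximum

# Hypothetical input: 2 s of noise sampled at 1 kHz standing in for raw EMG.
rng = np.random.default_rng(0)
raw = rng.standard_normal(2000)
env = preprocess_emg(raw, fs=1000.0)
```

The resulting envelope is non-negative with a maximum of 1, which satisfies the non-negativity requirement of the factorization used later to extract synergies.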

#### *Motion capture system*

Kinematic patterns of the participants' movements were captured with a motion capture system (Workstation 5.2.4, *VICON*). Forty-two markers (spheres covered with reflective tape) were attached to various parts of the participant's body prior to the experiment (see Supplementary Figure 1 for more information about marker positions). The motion capture system consisted of six cameras, and it tracked and reconstructed the motion of each of the recorded markers in 3D space.

#### *Foot-ground contact sensors*

The ground reaction forces for each participant were calculated based on data obtained from foot-ground contact sensors (FingerTPS, Pressure Profile Systems, Los Angeles, CA) distributed over three segments of each foot.

#### **ESTIMATION OF THE CHANGES IN MUSCLE LENGTH**

Software for Interactive Musculoskeletal Modeling (SIMM), a graphical software system for developing and analyzing models of musculoskeletal structures, was used in this study (Delp and Loan, 1995; Neptune et al., 2008). SIMM uses a full-body model built from a set of bones from an adult male subject. Muscle parameters in the middle trunk and the lower limb were adjusted according to the bone scaling computed from the markers recorded from each subject. Each participant's body weight was used to allocate the body segments of the model (de Leva, 1996). SIMM was then used to perform inverse dynamics calculations driven by the data collected in the experiments (i.e., motion capture data and foot-ground contact sensor data); see **Figure 3**. The change in muscle length obtained through these calculations, defined as a positive muscle stretch from the resting value and taken as a representation of the activation of proprioceptors (muscle spindles), was used as the sensory data for computing the sensory synergies. Although 92 muscles and 34 degrees of freedom were considered in the inverse dynamics calculations, the applied task is simple: the lateral disturbance of a body standing upright can be simplified as a three-link inverted pendulum model (Jiang et al., 2002), and both sensory and muscle data were monitored over a short time period. We therefore considered it sufficient at this stage to compute the sensory synergies from the lengths of the six dominant muscles for which EMG data were recorded (**Figure 3**). Although it has been argued that the structure of synergies depends on the number and choice of muscles included in the synergy analysis (Steele et al., 2013), we assume that the selected dominant muscles mitigate this issue, since they have more influence in carrying out the motion concerned (Alnajjar et al., 2013b; Wojtara et al., 2014). Adding other muscles to the synergy calculations should not significantly affect the results (Steele et al., 2013); see also Supplementary Figure 2.

#### **COMPUTING SENSORY SYNERGIES**

The core feature of sensory synergies is the reduction of the dimensionality of sensory signals provided as feedback to the CNS. Let us express the sensory data of *s* sensors by using a matrix *S*:

$$S \in \mathbb{R}^{s \times t},\tag{1}$$

where *s* and *t* are the number of sensors and the number of samples, respectively. The output of the sensory synergy computation, *<sup>I</sup>C*, is described as follows:

$${}^{I}C = W S, \tag{2}$$

where

$$\,^I C \in \mathbb{R}^{n \times t}, \,\, W \in \mathbb{R}^{n \times s} \tag{3}$$

We consider that the sensory synergy computation adds meaning to a specified combination of the sensor inputs. The meaningful signal *<sup>I</sup>C* is translated into the muscle synergy input *<sup>O</sup>C* in the low-dimensional space. This signal transfer from *<sup>I</sup>C* to *<sup>O</sup>C* can be considered as the semantic compression of the body-environment interaction in the real environment, described as the M-S transition. *<sup>I</sup>C* can be uniquely estimated from *S* when well-tuned sensory and muscle synergies are used to control the body. Therefore, we assume that the following equation is optimized as the inverse of the muscle synergy computation:

$$S = {}^{I}W\,{}^{I}C + E, \tag{4}$$

where

$$\,^I W \in \mathbb{R}^{s \times n}, E \in \mathbb{R}^{s \times t} \tag{5}$$

*W* can be regarded as the pseudo-inverse of *<sup>I</sup>W*. We consider that *W* is uniquely computed from *<sup>I</sup>W* when the motion is well tuned. **Figure 1** describes the relationships of *S*, *W*, and *<sup>I</sup>C*.
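The relation between Equations (2) and (4) can be checked numerically with a minimal sketch. It assumes noise-free data (*E* = 0) and randomly generated, hypothetical values for *<sup>I</sup>W* and *<sup>I</sup>C*; the variable names mirror the symbols in the text.

```python
import numpy as np

rng = np.random.default_rng(0)
s, n, t = 6, 2, 50                  # sensors, synergies, samples
IW = rng.random((s, n))             # hypothetical sensory synergies, s x n
IC_true = rng.random((n, t))        # hypothetical recruitment signals, n x t
S = IW @ IC_true                    # Equation (4) with E = 0

W = np.linalg.pinv(IW)              # pseudo-inverse, n x s as in Equation (3)
IC = W @ S                          # Equation (2)
```

When *<sup>I</sup>W* has full column rank, the product of *W* and *<sup>I</sup>W* is the identity, so the recruitment signals are recovered exactly in the noise-free case. With a non-zero *E*, the same product gives the least-squares estimate of *<sup>I</sup>C*.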

In Equation (4), *n* signals are used to represent *s* sensors by using the sensory synergy *<sup>I</sup>W* and the synergy recruitment *<sup>I</sup>C*. To reduce the dimensionality of the sensory data, we set *n* to be smaller than *s*. The error between *S* and *<sup>I</sup>W* *<sup>I</sup>C* is expressed as *E*, which must be small enough to represent *s* sensors. The magnitude of *E* can be described by an index of similarity *L* (Equation 6), which is sensitive to both the shape and the magnitude of the measured and reconstructed sensory patterns (Torres-Oviedo and Ting, 2007):

$$L = 100 \left( 1 - \frac{1}{s} \sum_{i=1}^{s} \frac{\sqrt{\frac{1}{t} \sum_{j=1}^{t} E_{ij}^{2}}}{\sqrt{\frac{1}{t} \sum_{j=1}^{t} {S'}_{ij}^{2}}} \right), \tag{6}$$

where *S'* = *<sup>I</sup>W* *<sup>I</sup>C*, and *E<sub>ij</sub>* and *S'<sub>ij</sub>* are the elements of matrices *E* and *S'*, respectively. The range of *L* is 0 *< L <* 100. When the magnitude of *E* decreases, *L* increases. We considered a value of *L >* 75% to indicate a good fit to the original data. Through preliminary trial runs, we found that this criterion ensured that each muscle would be reconstructed well. A reasonable value for *n* was chosen by using the index *L* with the non-negative matrix factorization (NMF) algorithm (Lee and Seung, 2001). See **Figure 5** for an example.
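The factorization of Equation (4) and the index of Equation (6) can be sketched as follows. The sketch uses the multiplicative update rules of Lee and Seung for the squared-error objective; the iteration count, the random initialization, and the rule of taking the smallest *n* that exceeds the 75% threshold are assumptions made for illustration, as are the function names.

```python
import numpy as np

def nmf(S, n, iters=500, seed=0, eps=1e-9):
    """Factor non-negative S (s x t) into IW (s x n) and IC (n x t)
    using multiplicative updates for the squared reconstruction error."""
    rng = np.random.default_rng(seed)
    s, t = S.shape
    IW = rng.random((s, n)) + eps
    IC = rng.random((n, t)) + eps
    for _ in range(iters):
        IC *= (IW.T @ S) / (IW.T @ IW @ IC + eps)
        IW *= (S @ IC.T) / (IW @ IC @ IC.T + eps)
    return IW, IC

def similarity_index(S, IW, IC):
    """Index L of Equation (6), averaged over the s rows (sensors)."""
    S_rec = IW @ IC                                  # S' in the text
    E = S - S_rec
    rms_e = np.sqrt(np.mean(E ** 2, axis=1))
    rms_s = np.sqrt(np.mean(S_rec ** 2, axis=1))
    return 100.0 * (1.0 - np.mean(rms_e / rms_s))

def choose_n(S, threshold=75.0):
    """Smallest n whose reconstruction exceeds the threshold on L (assumed rule)."""
    for n in range(1, S.shape[0] + 1):
        IW, IC = nmf(S, n)
        if similarity_index(S, IW, IC) > threshold:
            return n
    return S.shape[0]

# Hypothetical data: six sensors driven by two underlying non-negative signals.
rng = np.random.default_rng(1)
S = rng.random((6, 2)) @ rng.random((2, 50))
IW, IC = nmf(S, n=2)
L = similarity_index(S, IW, IC)
```

Because the updates only multiply by non-negative factors, both matrices stay non-negative throughout, which is the property that confines the synergy space to positive vector components.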

#### **COMPUTING MUSCLE SYNERGY**

Muscle synergy was calculated following similar steps as for sensory synergy. The number of signals for representing *m* muscles can be reduced by applying the NMF algorithm using the following matrix:

$$M = {}^{O}W\,{}^{O}C + E, \tag{7}$$

where, in this case,

$$\,^{O}W \in \mathbb{R}^{m \times n}, \,^{O}C \in \mathbb{R}^{n \times t}, E \in \mathbb{R}^{m \times t} \tag{8}$$

Again, *n* signals are used to represent *m* muscles by using the muscle synergy *<sup>O</sup>W* and the synergy recruitment *<sup>O</sup>C*. To reduce the dimensionality of muscle data, we set *n* to be smaller than *m*.

#### **SENSORY SYNERGY SIZE**

The synergy coordination index (SCI) was used to evaluate the resulting synergy space. The space here is represented by the angle θ between the utilized synergies (**Figure 4**). Let us assume that sensory synergy W is expressed as

$$\boldsymbol{W} = \left[ \boldsymbol{W}^{(1)} \; \boldsymbol{W}^{(2)} \; \boldsymbol{W}^{(3)} \; \cdots \; \boldsymbol{W}^{(n)} \right],$$

where *W*<sup>(*i*)</sup> ∈ ℝ<sup>*s*</sup> is a basis vector of the synergy space. Because we use NMF to estimate *W*, the synergy space exists only for positive vector components. Furthermore, vectors *W*<sup>(*i*)</sup> (*i* = 1 ··· *n*) are in general not orthogonal to each other. The size of the synergy space depends on the relative angles between the vectors *W*<sup>(*i*)</sup>. To quantify the size of the synergy space, we define the space size as the sum of the inner products of *W*<sup>(*i*)</sup> and *W*<sup>(*j*)</sup>:

$$\text{SCI} = \frac{2}{n(n-1)} \sum_{i \neq j}^{n} W^{(i)} \cdot W^{(j)}. \tag{9}$$

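The range of *SCI* is from 0 to 1, which holds when the vectors *W*<sup>(*i*)</sup> have unit length and each pair of vectors is counted once. *SCI* = 1 implies that all vectors *W*<sup>(*i*)</sup> are identical, whereas *SCI* = 0 implies that all vectors *W*<sup>(*i*)</sup> are orthogonal to each other. The synergy space is smaller for larger values of *SCI*.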
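A minimal sketch of the index in Equation (9) is given below. It makes two assumptions so that the result lies between 0 and 1 as stated in the text: the synergy vectors are scaled to unit length, and each unordered pair of vectors is counted once. The function name is chosen here for illustration.

```python
import numpy as np

def synergy_coordination_index(W):
    """SCI for a matrix whose columns are the synergy vectors W^(i).

    Assumptions: columns are scaled to unit length and each unordered
    pair (i < j) is counted once, so that the result lies in [0, 1]
    for non-negative vectors. Requires at least two columns."""
    W = np.asarray(W, dtype=float)
    n = W.shape[1]
    U = W / np.linalg.norm(W, axis=0)       # unit-length columns
    G = U.T @ U                             # all pairwise inner products
    pair_sum = np.sum(np.triu(G, k=1))      # sum over pairs with i < j
    return 2.0 * pair_sum / (n * (n - 1))

identical = synergy_coordination_index([[1.0, 2.0], [1.0, 2.0], [0.0, 0.0]])
orthogonal = synergy_coordination_index([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
```

For two synergies, as found in this study, the index reduces to the cosine of the angle θ between the two vectors, so a larger value corresponds to a smaller synergy space.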

#### **SIMILARITY BETWEEN THE SENSORY/MUSCLE SYNERGY RECRUITMENTS (S/M SIMILARITY)**

The S/M similarity describes the similarity between the sensory and muscle synergy recruitment signals *<sup>I</sup>C* and *<sup>O</sup>C* (**Figure 1**). The S/M similarity is calculated using the correlation coefficient:

$$r\left(\mathbf{x}, \mathbf{y}\right) = \left| \frac{\sum_{i=1}^{m} (\mathbf{x}_i - \overline{\mathbf{x}})(\mathbf{y}_i - \overline{\mathbf{y}})}{m\, S_{\mathbf{x}} S_{\mathbf{y}}} \right|, \tag{10}$$

Here, *x* and *y* are the two vectors to be compared (in this case, *<sup>I</sup>C* for *x* and *<sup>O</sup>C* for *y*), *x̄* and *ȳ* are their mean values, and *S<sub>x</sub>* and *S<sub>y</sub>* are their standard deviations. The S/M similarity ranges from 0 to 1.

A high similarity value indicates that muscle synergy recruitment is highly correlated with sensory synergy recruitment. To avoid the ordering issue in the NMF algorithm, we re-sorted the resulting synergies to obtain the highest similarity.
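The similarity measure and the re-sorting step can be sketched as follows. The sketch assumes that the rows of *<sup>I</sup>C* and *<sup>O</sup>C* contain the same number of samples, that the standard deviations are population values, and that the similarity of a given ordering is the mean over the paired rows; these choices and the function names are illustrative assumptions.

```python
import numpy as np
from itertools import permutations

def abs_correlation(x, y):
    """Absolute correlation coefficient, following Equation (10)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    num = np.sum((x - x.mean()) * (y - y.mean()))
    return abs(num / (len(x) * x.std() * y.std()))   # population std (ddof=0)

def sm_similarity(IC, OC):
    """Mean |r| between paired rows of IC and OC under the best row ordering."""
    n = IC.shape[0]
    best_score, best_order = -1.0, None
    for order in permutations(range(n)):
        score = np.mean([abs_correlation(IC[i], OC[j])
                         for i, j in enumerate(order)])
        if score > best_score:
            best_score, best_order = score, order
    return best_score, best_order

# Hypothetical recruitment signals: OC holds rescaled copies of IC in swapped order.
t = np.linspace(0.0, 1.0, 50)
IC = np.vstack([np.sin(6.0 * t), t])
OC = np.vstack([2.0 * t + 1.0, -np.sin(6.0 * t)])
score, order = sm_similarity(IC, OC)
```

Trying every ordering is practical here because only two synergies are used; for larger *n* an assignment algorithm would be a more efficient alternative.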

#### **MEASURING THE APR's SKILL OF THE PARTICIPANTS**

To quantify the APR's skill of each participant, a numerical scoring system, based on visual observation by an examiner, was developed (**Table 1**). To encourage the participants to perform at their best and to maintain a high level of motivation, the scores were also displayed to the participants on a screen throughout the experiment. To ensure the effectiveness and reliability of the scoring system, all the experiments were video recorded, and the examiner used the recordings to re-score each participant's performance offline and compare the result with the original scores. Similarity ratios were higher than 98% for all experiments (for an example, see Supplementary Video 1). The scoring system was designed to measure the participants' skills in responding to the designed APR task; it was not used to assess the overall balance ability of the participants.

#### **Table 1 | Numerical scoring system to quantify the APR's skill of participants.**


## **RESULTS**

#### **NUMBER OF UTILIZED SYNERGIES**

All the participants successfully completed the assigned tasks, and their respective APR scores varied considerably. The number of utilized synergies *n* was the same across the participants. For sensory synergy, two synergies were enough to project the collected sensory data (**Figure 5A**). Similarly, two muscle synergies were enough to represent the measured muscle activations (**Figure 5B**). From these findings, the sensory or muscle synergies were analyzed on the assumption that two synergies were enough for each participant to complete the assigned task.

**Figure 6** shows an example of the resulting pair of synergies for two representative participants. **Figures 6A,B** show the sensory and muscle synergies computed from data for participant #1 (relatively good balance, score = 1.15), and **Figures 6C,D** show the sensory and muscle synergies computed from data for participant #7 (relatively poor balance, score = −0.9).

As seen in **Figure 6**, notably different strategies were adopted by each of the participants. These appear to represent their level of skill in responding to the disturbance. Participant #1, for instance, seems to have utilized two muscle synergies: one to control the lumbar region with the hip joints (*<sup>O</sup>W*<sup>(1)</sup>) and another to evoke the ankle and hip strategies (*<sup>O</sup>W*<sup>(2)</sup>). Similar strategies were also represented by the sensory synergies, where the ankle and the hip muscle length sensors were grouped together, and the hip and lumbar joint sensors were in another group. A correlation between the sensory and muscle synergy recruitment signals *<sup>I</sup>C* and *<sup>O</sup>C* was also observed. The control signal for precise posture control appeared with a delay of approximately 20 ms after the first signal. In contrast to these trends, participant #7 utilized an independent synergy for the ankle strategy alone (*<sup>O</sup>W*<sup>(2)</sup>), and another synergy to control the hip and the lumbar joints (*<sup>O</sup>W*<sup>(1)</sup>). Thus, the coordination between these two muscle synergies seems to be weaker in participant #7 than in participant #1. Also, the sensory and muscle synergy recruitment signals seem to show a poor match for this participant. The following two sections highlight the details of these characteristics and relate them to the balancing skills of the participants.

#### **RELATION BETWEEN APR's SKILL LEVEL AND SYNERGY SIZE**

**Figure 7A** shows the relation between the APR's skill level of the participants and their computed sensory synergy size, where the two appear to be inversely related (the sensory synergy size is smaller for high-skill participants than for low-skill participants).

**FIGURE 5 | Similarity** *L* **between the recorded and reconstructed (A) sensory data and (B) muscle activation patterns from all possible computed numbers of synergies (Equation 6).** The plots show means and SD for each participant. The horizontal dashed line indicates the predefined threshold (75%), and the vertical dashed line indicates the selected number of utilized synergies.

**Figure 7B** shows the relation between the sensory and muscle synergy sizes for all the participants; the sensory synergy size appears to be consistent with the muscle synergy size. The correlations between sensory and muscle synergies are stronger when the synergy size is smaller.

#### **RELATION BETWEEN APR's SKILL AND I/O SIMILARITY**

**Figure 8** shows the relationship between the participants' scores and the correlation of their sensory and muscle synergy recruitments, *<sup>I</sup>C* and *<sup>O</sup>C*, respectively. The figure shows that good performers have a higher correlation between the sensory and muscle synergy recruitments than poor performers. This high correlation could be the result of the smaller size of the sensory and muscle synergies, which facilitates the mapping between environmental input and motor control.

**FIGURE 7 | (A)** Relation between balance skill level and sensory synergy size. **(B)** Relation between sensory synergy size and muscle synergy size. (P, Participant).

## **DISCUSSION**

#### **RELATION BETWEEN SENSORY AND MUSCLE SYNERGIES**

This paper formulates a sensory synergy framework and emphasizes its advantage as a biologically plausible model that offers low-dimensional environmental input feedback that may improve on current approaches to neural prosthetic development. The main challenge in computing sensory synergy is to determine the relation between sensory and muscle synergies in a low-dimensional space. To that end, we adopted a simple task in which we used the changes in muscle length as a measure of the activation of proprioceptors over a period of 50 ms to estimate the sensory synergies. A period of 50 ms from the onset of muscle activity was used to compute the muscle synergies. Only the dominant muscles, which have more influence in carrying out the motion, were selected for the synergy calculations. A posture control experiment with nine healthy participants was conducted to examine the relation between sensory and muscle synergies. The results suggest that the degree of coordination between the resulting sensory synergies (synergy size) can serve as an effective marker for characterizing the extent to which the behavior is adapted to the environment.

Results reveal that participants with high APR scores showed well-tuned sensory synergies that project, in a smaller synergy size, a compendium of sensory data as feedback indicating the body posture. This smaller size suggests the existence of a sophisticated controller that simplifies and accelerates the transformation of the signal into a motor command; accordingly, a correlation between the input *<sup>I</sup>C* and output *<sup>O</sup>C* was observed, and a control signal for precise posture recovery emerged (**Figures 6A,B**). The smaller synergy size suggests that joints are not controlled independently, thereby ensuring a coordinated output movement (**Figure 7B**). In participants with weak scores (P7, P8, and P9 in **Figures 7B**, **8**), on the other hand, we observed a larger synergy size, which suggests a less trained controller that was hardly able to handle the introduced sensorimotor signaling. The large synergy size in this group of participants appears to pass larger amounts of unnecessary sensory information, which may obstruct the formation of an optimal mapping from sensory signaling to the desired motor control.

As a future direction, we plan to examine the contribution of information from other sensory modalities, such as vision and center of pressure, during a balance training phase that can be applied to the participants who showed weak scores (Alnajjar et al., 2013b). We expect to observe an automatic convergence of the neural representation during the participants' training that would increase the sensory weights only for those dominant sensors that mainly contributed to triggering the muscle response and decrease the weights for those that were less efficient. This tuning of sensation weights could depend on the task and the environment. The next stage of this study will also target some of the limitations of this preliminary work. For instance, the subjective scoring system can be enhanced by abstracting it from the motion capture system. The time needed for the participant to recover his/her balance, or the degree of sway caused by the platform disturbance, could be used to design a more robust scoring system.

From our initial results, we believe that sensory synergies are important for extracting low-dimensional, meaningful signals that simplify the work of the CNS when recruiting proper muscle synergies. They are also key to determining how well the body adapts to the surrounding environment. Designing prostheses based upon the concept of sensory and muscle synergies can lead to simpler controllers.

## **SENSORY SYNERGY AND THE FUTURE OF NEUROPROSTHETICS**

A critical aspect of functional forearm prostheses is the ability to perform sensorimotor tasks. Mainstream powered forearm prostheses are controlled using surface EMG signals. The interface commonly uses EMG sensors to switch between different activation states of the prosthesis. With this control method, the user often experiences difficulty in learning how to control the prosthesis or how to generate an activation signal for a larger number of degrees of freedom and/or finer control of speed and force. Although research has been focusing on the motor control aspect, it is also very necessary to account for somatosensation, especially for proprioceptive and tactile modalities (Peerdeman et al., 2011).

Work on artificial hands indicates that a reduction in dimensionality can decrease the complexity of controlling a prosthesis (Jerde et al., 2003; Katsiaris et al., 2012). The integration of tactile sense and proprioception is regarded as essential for implementing the ability to perceive environmental input (Rincon-Gonzalez et al., 2011). The identification of the sensory synergy onset may provide valuable cues that make it possible to extract the intent of the action, for example, the target of a reaching movement. Using sensory synergies is expected to allow for earlier recognition of the goal than using muscle synergies, as the latter are the result of modulation. This difference may be essential for implementing continuous and gentle movements in an activated system.

**Figure 9** shows an example of future practical applications of this study. The neural sensorimotor synergy system extends the system in **Figure 1** by including prosthetic and exoskeletal artifacts. The dimensionality of the sensory stimulus is reduced through sensory synergy. A controller modulates sensory synergies to motor commands, and the modulation takes place in a space of reduced dimensionality compared to that of the input and output spaces. Motor commands are recruited at activators. In our ongoing research, we are applying this new principle of control to forearm prosthesis (**Figure 9**), and we are currently conducting clinical experiments involving the control of the forearm prostheses in accordance with the user's intention through the neural sensorimotor synergy system (Oyama et al., 2013; Iwatsuki et al., 2014). In short, from **Figure 9**, the dimensionality of the sensory stimulus to the prosthetic device is reduced by sensory synergy as part of the sensory system of the users, as illustrated in **Figure 9A**. The output from the sensory synergy is used as the input to both the CNS and an artificial controller. Compared with raw environmental inputs, the output from the sensory synergy should be easier to communicate to the CNS when sensory synergy is well defined. The control signals for the prosthetic device are created through motor synergy (**Figure 9B**). This synergy combines the signals from the CNS and the prosthesis controller and creates a higher-dimensional signal to control the prosthetic device. The prosthesis controller (**Figure 9C**) modulates the signal from the sensory synergy to the motor synergy. One of the roles of this prosthesis controller is the generation of reflexive motions to protect the users in case of unpredictable environmental changes.

## **AUTHOR CONTRIBUTIONS**

Conceived and designed the experiments: Fady Alnajjar, Shingo Shimoda. Performed the experiments: Fady Alnajjar, Shingo Shimoda. Analyzed the data: Fady Alnajjar, Matti Itkonen. Formed the equations: Fady Alnajjar, Shingo Shimoda. Wrote the paper: Fady Alnajjar, Matti Itkonen. Revised and discussed the paper: Fady Alnajjar, Matti Itkonen, Vincent Berenz, Maxime Tournier, Chikara Nagai, Shingo Shimoda.

### **ACKNOWLEDGMENTS**

We gratefully acknowledge funding by TOYOTA Motor Co. We are very grateful for their technical and financial assistance. A part of this work was supported by JSPS Grants-in-Aid for Scientific Research Grant Number 25871112.

#### **SUPPLEMENTARY MATERIAL**

The Supplementary Material for this article can be found online at: http://www.frontiersin.org/journal/10.3389/fnins.2014.00436/abstract

#### **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 26 March 2014; accepted: 11 December 2014; published online: 13 January 2015.*

*Citation: Alnajjar F, Itkonen M, Berenz V, Tournier M, Nagai C and Shimoda S (2015) Sensory synergy as environmental input integration. Front. Neurosci. 8:436. doi: 10.3389/fnins.2014.00436*

*This article was submitted to Neuroprosthetics, a section of the journal Frontiers in Neuroscience.*

*Copyright © 2015 Alnajjar, Itkonen, Berenz, Tournier, Nagai and Shimoda. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## A functional model and simulation of spinal motor pools and intrafascicular recordings of motoneuron activity in peripheral nerve

#### *Mohamed N. Abdelghani<sup>1</sup>, James J. Abbas<sup>2</sup>\*, Kenneth W. Horch<sup>1</sup> and Ranu Jung<sup>1</sup>*

*<sup>1</sup> Adaptive Neural Systems Lab, Department of Biomedical Engineering, Florida International University, Miami, FL, USA <sup>2</sup> Center for Adaptive Neural Systems, School of Biological and Health Systems Engineering, Arizona State University, Tempe, AZ, USA*

#### *Edited by:*

*David Guiraud, Institut National de la Recherche en Informatique et Automatique, France*

#### *Reviewed by:*

*Silvestro Micera, Scuola Superiore Sant'Anna, Italy David Guiraud, Institut National de la Recherche en Informatique et Automatique, France Olivier Rossel, Institut National de Recherche en Informatique et en Automatique, DEMAR, France*

#### *\*Correspondence:*

*James J. Abbas, Center for Adaptive Neural Systems, School of Biological and Health Systems Engineering, Arizona State University, Engineering Center, G Wing, 501 E. Tyler Mall, PO Box 879709, Tempe, AZ 85287-9709, USA e-mail: james.abbas@asu.edu*

Decoding motor intent from recorded neural signals is essential for the development of effective neural-controlled prostheses. To facilitate the development of online decoding algorithms we have developed a software platform to simulate neural motor signals recorded with peripheral nerve electrodes, such as longitudinal intrafascicular electrodes (LIFEs). The simulator uses stored motor intent signals to drive a pool of simulated motoneurons with various spike shapes, recruitment characteristics, and firing frequencies. Each electrode records a weighted sum of a subset of simulated motoneuron activity patterns. As designed, the simulator facilitates development of a suite of test scenarios that would not be possible with actual data sets because, unlike with actual recordings, in the simulator the individual contributions to the simulated composite recordings are known and can be methodically varied across a set of simulation runs. In this manner, the simulation tool is suitable for iterative development of real-time decoding algorithms prior to definitive evaluation in amputee subjects with implanted electrodes. The simulation tool was used to produce data sets that demonstrate its ability to capture some features of neural recordings that pose challenges for decoding algorithms.

**Keywords: prosthesis, neural control, neural recordings, electrode, decoding, fascicle, peripheral nerve, simulation**

## **INTRODUCTION**

Most commercially-available powered prostheses for upper limb amputees provide control of a single degree-of-freedom (DOF) (MotionControl, 2007). A few provide more than one DOF, but they require extensive training and exert a high demand on attention (OttoBock, 2011; TouchBionics, 2013). All of these systems fall far short of restoring the functionality of the native limb. This limitation has driven a substantial research and development effort to develop advanced powered prostheses (JHUAPL, 2014) and techniques to control the prostheses with biological signals. To afford a greater degree of control, some efforts have explored techniques to utilize signals recorded from residual or reinnervated muscle (Kuiken et al., 2009; Khokhar et al., 2010; Rehbaum et al., 2012), while others have investigated the use of signals recorded from the central nervous system (CNS) (Wolpaw et al., 1991; Zhu et al., 2005; Huang et al., 2012; Onose et al., 2012) or the peripheral nervous system (PNS) (Dhillon and Horch, 2005; Durand et al., 2008; Micera et al., 2008, 2011; Kamavuako et al., 2010; Tang et al., 2011; Wodlinger and Durand, 2011).

Although approaches that utilize recordings from muscle, CNS and/or PNS may prove to be suitable for controlling advanced prostheses, the PNS interfaces have the potential advantage of providing access to a sufficient number of signals without the risks associated with implants into the brain or spinal cord. Signals from the PNS have been recorded using various types of electrode technologies (Hoffer and Loeb, 1980; Veraart et al., 1993; Tyler and Durand, 2002). Dhillon et al. (Dhillon et al., 2004; Dhillon and Horch, 2005) demonstrated that amputees could control a one-DOF robotic arm in a graded fashion using real-time decoding of signals recorded from longitudinal intrafascicular electrodes (LIFEs) implanted in the peripheral nerve stumps. These electrodes, which are fine wires that are inserted into and along the long axis of a fascicle, enable recordings from small groups of axons (up to approximately 10). Subsequent demonstrations with other electrode systems (Durand et al., 2008; Micera et al., 2008, 2011; Kamavuako et al., 2010; Tang et al., 2011; Wodlinger and Durand, 2011) further support the investigation of PNS interfaces for prosthetic control.

For control signals derived from the PNS (as well as from muscle or from the CNS), the recorded neural signals must be transformed in order to derive the control signals to be sent to the motorized prosthesis. The transformation includes, either implicitly or explicitly, a decoding of the recorded signal to infer the intent of the user. A wide variety of algorithms to decode the biological signals for use in controlling prostheses have been developed (e.g., Wood et al., 2004; Fraser et al., 2009). The specific features of the decoding algorithm may differ depending on the type of biological signal recorded, the properties of the machine-tissue interface, and the targeted function of the prosthesis.

To evaluate the performance of a candidate decoding algorithm, the ultimate test is to use it for real-time decoding of signals recorded from an amputee as s/he performs a functional task. However, the ability to extensively utilize such a testing paradigm may be limited due to the experimental nature and limited deployment of the neuromuscular interface technologies, as well as other factors. Furthermore, such testing may not afford direct control over factors that could help to differentiate the performance of candidate decoding algorithms, such as spike overlap or the number of fibers that contribute to the signal recorded on a given electrode. Computer models of the peripheral neuromuscular system and the neural interface can be used to efficiently explore a greater range of approaches than can be readily achieved in living systems (Durand et al., 2008; Zhou et al., 2010).

In this work, we have developed a system to simulate neural recordings. This system is intended to accelerate the development and evaluation of candidate decoding algorithms by enabling the production of data sets of simulated neural recordings with known characteristics. By affording direct control over several key features of recorded neural signals, the system could be used to methodically generate data sets that could identify the advantages and disadvantages of candidate decoding algorithms.

Our simulation framework enables modeling and simulation of spinal cord motor pools and recordings by LIFEs (or other electrode technologies) from subpopulations of motor axons. The simulator can be used to produce simulated recordings from multiple electrodes for multi-DOF tasks with known motor intent, neural spike train characteristics, levels of encapsulation and signal-to-noise ratios (SNRs). This simulator was designed to facilitate direct comparison of candidate neural decoding algorithms by enabling comprehensive assessment of the effect of spike overlap, noise level, and electrode receptive field properties on algorithm performance. Here, we present a description of the model and the simulation tool as well as results of several simulations using the tool. These results demonstrate that the simulation tool can be used to systematically vary motor intent, neural firing patterns, and electrode recording characteristics in order to produce data sets that could facilitate the development and assessment of decoding algorithms for systems that use peripheral neural interfaces.

#### **MATERIALS AND METHODS**

A model and simulation system were developed to simulate the activity of motoneuron pools based on a multi-DOF input of motor intent. **Figure 1** presents a schematic of the modeled system, in which multiple LIFEs are implanted in the peripheral nerve of an amputee to record the activity of motoneurons that is driven by motor intent signals. The simulator (**Figures 2**, **3**) consists of three primary components: the motoneuron activation unit, the motoneuron output unit and the electrode unit. Each of these is described in the sections that follow.

The simulator is implemented in MATLAB®. Simulation and user specified parameters and functions are defined using several Excel® or text documents.

#### **MOTONEURON ACTIVATION UNIT**

The motoneuron activation unit models the transformation from motor intent to a variable that represents the membrane potential, or state, of the motoneurons.

**FIGURE 1 |** Motor intent is represented as a multi-dimensional signal from centers in the brain to motoneuron pools in the spinal cord, which produce firing patterns in motoneurons (M: M1, M2, M3). Axons from a given motor pool tend to cluster together along the length of the peripheral nerve fascicle. The diagram shows a LIFE electrode that has been implanted into one of the fascicles of the nerve.

Motor intent is the voluntary intention of a person that leads to activation of the neuromotor system to attain a motor goal (Jankowska, 1992; Carp and Wolpaw, 2010). For example, motor intent could be an attempt to flex the biceps strongly, to partially extend the wrist or to reach and grasp an object. In an amputee, motor intent may produce activity in the peripheral motor axons of the residual limb that could be recorded using a neural interface. In our simulation framework, we define motor intent as an effort to stabilize and control a single joint or coupled sets of joints. As such, realization of the motor intent would involve formulation of two essential components: an intended action and a level of effort. The intended action is the DOF to be controlled while intended effort is the intensity of that action.

The motoneuron activation unit (**Figure 2A**) is modeled by,

$$\mathbf{x}(t) = \mathbf{G}\mathbf{u}(t),\tag{1}$$

where *u*(*t*) is an *n* × 1 vector and *n* is the number of motor intent signals. The individual components of this vector represent motor intent normalized by maximum intended effort. *G* maps the motor intent signals to motoneurons. It is an *m* × *n* matrix, with *m* ≥ *n*, where *m* represents the number of spinal cord motoneurons in a motor pool. *x*(*t*) is the *m* × 1 vector of the motoneuron activation states. Here, the activation state, *x<sub>i</sub>*(*t*), represents the membrane potential of motoneuron *i* at the site of action potential initiation (axon hillock). This represents the time-varying state that will determine the instantaneous firing rate of a motoneuron. Note that the motor intent vector represents direct inputs to motor pools through the connectivity matrix, *G*. Uniform values in a row of *G* would be used to simulate uniformity of inputs across the set of the motoneurons (Fuglevand et al., 1993); variations in these values would simulate a situation in which some motoneurons in the pool received stronger input than others. Indirect motor pathways are not included in the current implementation of the simulator and motor intent signals are modeled as graded values, not firing patterns.


In the simulator, the motor intent vector *u*(*t*) in Equation (1) is a set of independent functions over a time interval [0, *T*] that is specified by the user prior to the start of the simulation. Users have the option to set each component of the vector *u*(*t*). For example, motor intent can be set as a square wave, ramp-and-hold, sinusoid, etc. Alternatively, a dynamic model can be used to generate motor intent signals for a task such as reaching. The structure and values of *G* are specified in a parameter file.
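As an illustration of Equation (1), the following minimal Python sketch (the simulator itself is implemented in MATLAB) evaluates a ramp-and-hold motor intent signal and maps it to activation states through a connectivity matrix. The function names, the values in `G`, and the timing parameters are hypothetical and chosen only for illustration; they are not taken from the simulator's parameter files.

```python
from typing import List


def ramp_and_hold(t: float, t_start: float, t_ramp: float, level: float) -> float:
    """Rise linearly from 0 to `level` over `t_ramp` seconds, then hold."""
    if t <= t_start:
        return 0.0
    if t >= t_start + t_ramp:
        return level
    return level * (t - t_start) / t_ramp


def activation(G: List[List[float]], u: List[float]) -> List[float]:
    """Equation (1): x = G u, with G an m x n matrix and u of length n."""
    return [sum(g * u_j for g, u_j in zip(row, u)) for row in G]


# Hypothetical connectivity: m = 4 motoneurons, n = 2 motor intent signals.
# Each row is one motoneuron; each column is one motor intent component.
G = [
    [1.0, 0.0],
    [0.8, 0.0],
    [0.0, 1.0],
    [0.0, 0.6],
]

# Motor intent at t = 0.75 s: first DOF halfway up its ramp, second DOF at rest.
u_t = [ramp_and_hold(0.75, t_start=0.5, t_ramp=0.5, level=1.0), 0.0]
x_t = activation(G, u_t)  # activation state of each motoneuron at this instant
```

Evaluating `activation` at every sample of *u*(*t*) yields the time-varying activation state *x*(*t*) that drives the motoneuron output unit.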


#### **MOTONEURON OUTPUT UNIT**

This component of the model (**Figure 2A**) represents the transformation from motoneuron state (time-varying membrane potential just prior to the axon hillock) to time-varying extracellular potential just outside the axon. This unit is responsible for generating spike events based on the state of the motoneurons and producing the extracellular voltage waveform based on the spike events.

Alpha motoneurons, which comprise the motor pool, fall into three subclasses according to the contractile properties of the muscle fibers they innervate: fast-twitch fatigable (FF), fast-twitch fatigue-resistant (FR), and slow-twitch fatigue-resistant (S). These three fiber types differ in size (of the muscle fiber and the motoneuron), recruitment characteristics, and range of firing rates. The recruitment of motoneurons in a motor pool is postulated to follow the size principle (Henneman and Mendell, 2011): small motoneurons fire first and, as motor drive increases, larger motoneurons are recruited and contraction strength increases. Variations in excitability of motoneurons within the pool may be the primary mechanism for this orderly recruitment (Fuglevand et al., 1993). Small motoneurons connect to slow fibers while larger ones innervate fast twitch fibers (Brown et al., 2006; Carp and Wolpaw, 2010). The firing rates observed in slow fibers are lower than the rates observed in large fibers (Cisi and Kohn, 2008).

The motoneuron output unit (**Figures 2A,B**) is modeled by

$$
\mathbf{y}(t) = \mu\left(\mathbf{x}(t)\right) \tag{2}
$$

where *y*(*t*) is an *m* × 1 vector that represents the extracellular potentials at each axon and *μ* is a function that maps the activation states, *x*(*t*), of the motoneurons to *y*(*t*).

The motoneuron output model includes three components: the first component determines the mean firing rate based on activation state and motoneuron properties; the second produces a train of spike events based on mean firing rate and the specification of a point process function for spike event timing; the third produces the time series of the extracellular potential based on the spike event timings and the spike template. Each of these components of the model is described in more detail below.

The mapping of motor unit activation to the firing rate of the various types of motoneurons is represented schematically in **Figure 4**. For each motoneuron, the firing rate is given by

$$f(x) = \begin{cases} 0, & 0 \le x < x_{\text{thr}}, \\ \kappa_f x, & x_{\text{thr}} \le x < x_{\text{sat}}, \\ f_{\text{sat}}, & x \ge x_{\text{sat}}, \end{cases} \tag{3}$$

where the slope *κ<sub>f</sub>* of the input/output response curve is given by,

$$\kappa_f = \frac{f_{\text{sat}} - f_{\text{thr}}}{x_{\text{sat}} - x_{\text{thr}}} \tag{4}$$

where *f* is the frequency of firing in Hz. *x<sub>thr</sub>* is the activation state above which a motoneuron begins to fire. *f<sub>thr</sub>* and *f<sub>sat</sub>* are the minimum and maximum frequency of firing for a motoneuron, while *x<sub>sat</sub>* is the activation level at which the firing rate saturates. The activation state *x* is normalized between 0 and 1, where 1 represents maximum activation. *x<sub>thr</sub>* determines the effort at which a particular motoneuron is recruited. In the simulator, *x<sub>thr</sub>*, *f<sub>thr</sub>*, *x<sub>sat</sub>*, and *f<sub>sat</sub>* can be set by the user for each motoneuron and can therefore be used to specify the input/output properties of a motor pool.

The firing rate, *f*, represents the time-varying mean instantaneous firing rate, but the actual spike timing is produced by one of several point process functions (Fuglevand et al., 1993; Cisi and Kohn, 2008; Zhou et al., 2010) described below. Let

$$N(\xi) \sim \begin{cases} \xi \\ \text{Poisson}(\xi) \\ \text{TruncatedGaussian}(\xi, \sigma) \\ \text{Gamma}(\xi, \sigma) \\ \text{Uniform}(\xi, w) \end{cases} \tag{5}$$

be a stochastic point process having one of the distributions listed above. The activation state *x* determines the mean interspike interval (ISI, *ξ* = 1/*f*). The simulator provides the option of selecting one of the different point processes for spike trains: Identity, Poisson, Truncated-Gaussian, Gamma, or Uniform [Equation (5)]. The Identity process produces a regular spike train for testing other simulator functionalities. The Poisson process produces an irregular spike train where the variability is dependent on the mean firing rate. The Gaussian distribution has been selected for use in prior modeling studies (Fuglevand et al., 1993) based on some reports of firing rate variability. In the last three processes, the variability in ISI can be set to be independent of the mean ISI. This is useful for evaluating the performance of decoding algorithms under different levels of ISI variability while the mean ISI remains fixed.

The third component of the motoneuron output model produces the time series of the extracellular potential based on the spike event timings and the spike template. Each motoneuron output spike has a characteristic morphology, amplitude and duration. The shape of the extracellular spike is influenced by the size of the axon, the number and type of voltage gated channels, whether or not it is myelinated, and the general health of axons, since atrophy after amputation can alter spike shape (Dhillon et al., 2004).

Extrinsic factors that influence the shape of a spike recorded from an extracellular electrode are the recording electrode material type, geometry, location, and orientation with respect to neural sources as well as characteristics of the tissue-electrode interface such as the degree and type of encapsulation. To simplify the real-time simulation process, we have chosen to include the effects of electrode type in the shape of the spike templates. Therefore, the spike template represents the shape and duration of the extracellular effect of the axonal spike train as viewed by an electrode. Note that the effect of electrode location and other extrinsic factors that affect amplitude are represented in the electrode unit. This structure streamlines the simulation process by incorporating all of the temporal aspects of a spike in one template; all other processes that affect the recorded signals involve only addition (superposition) and multiplication (scaling).

Extracellular waveforms occupy a frequency bandwidth between 100 Hz and 10 kHz depending on the recording electrode (Horch and Dhillon, 2004; Plonsey et al., 2007; Gosselin, 2011). Some examples of shapes of action potentials recorded using LIFEs have been provided in the literature (Malagodi et al., 1989; Lefurge et al., 1991; Lawrence et al., 2004; Dhillon and Horch, 2005; Micera et al., 2008).

In the simulator, spike shapes are specified by the user in a process that includes several steps. First, the user selects normalized spike morphologies (**Figure 5**). Spike morphologies are generated by differentiating Gaussian and Gamma functions, which can produce a variety of spike wavelets similar to spike shapes reported in the literature. The spike wavelets are normalized in amplitudes between (−1, 1) and normalized in duration between (0, 1). The spike morphologies are then scaled in amplitude and duration by the simulator using parameters that can be specified by the user (**Figure 2C**).

Let **Ψ**(*t*) be an *m* × 1 vector function that encodes the spike shapes of the motoneurons. Each component of **Ψ**, *ψ<sub>i</sub>*(*t*), will have the following properties,

$$\int_{-\infty}^{\infty} \psi_i(s) \, ds = 0,\tag{6}$$

and

$$\int_{-\infty}^{\infty} \psi_i^2(s) \, ds < \infty.\tag{7}$$

Now, we can define the output of a motoneuron *i* as follows (**Figure 2**):

$$
\mu_i(x_i(t)) = \int_0^t \psi_i(t-\tau)\, dN \left( f_i(x_i(\tau)) \right). \tag{8}
$$

**FIGURE 4 |** The mapping from motor intent to firing frequency is implemented using a piecewise linear function with threshold and saturation. The plot shows examples of curves representing the mapping from motor intent to firing frequency for three motoneuron pools, one for each of the three fiber types. Note that the mapping values specified in this example will produce sequential recruitment of the slow (S), fast fatigue-resistant (FR), and fast fatigable (FF) motoneurons as motor intent is increased.
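The threshold-and-saturation mapping of Equations (3) and (4) can be sketched in a few lines of Python. Equation (3) writes the linear segment as *κ<sub>f</sub> x*; the sketch below instead assumes that the segment runs from (*x<sub>thr</sub>*, *f<sub>thr</sub>*) to (*x<sub>sat</sub>*, *f<sub>sat</sub>*), which is what the slope in Equation (4) and the definitions of *f<sub>thr</sub>* and *f<sub>sat</sub>* suggest. The class name, function name, and all parameter values are hypothetical and are not the values used to produce Figure 4.

```python
from dataclasses import dataclass


@dataclass
class Motoneuron:
    x_thr: float  # activation at which the unit is recruited (0..1)
    x_sat: float  # activation at which the firing rate saturates (0..1)
    f_thr: float  # minimum firing rate in Hz
    f_sat: float  # maximum firing rate in Hz


def firing_rate(mn: Motoneuron, x: float) -> float:
    """Piecewise linear input/output curve with threshold and saturation."""
    if x < mn.x_thr:
        return 0.0
    if x >= mn.x_sat:
        return mn.f_sat
    kappa_f = (mn.f_sat - mn.f_thr) / (mn.x_sat - mn.x_thr)  # Equation (4)
    # Assumption: the segment starts at f_thr when x == x_thr.
    return mn.f_thr + kappa_f * (x - mn.x_thr)


# Illustrative (made-up) parameters giving S -> FR -> FF recruitment order.
pools = {
    "S": Motoneuron(x_thr=0.05, x_sat=0.50, f_thr=8.0, f_sat=20.0),
    "FR": Motoneuron(x_thr=0.30, x_sat=0.80, f_thr=10.0, f_sat=30.0),
    "FF": Motoneuron(x_thr=0.60, x_sat=1.00, f_thr=15.0, f_sat=50.0),
}

rates_at_40_percent = {name: firing_rate(mn, 0.4) for name, mn in pools.items()}
```

With these values, an activation of 0.4 recruits the S and FR units while the FF unit remains silent, which reproduces the sequential recruitment described in the caption.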

**FIGURE 5 | Examples of spike templates.** Three spike morphologies with normalized amplitudes between (−1, 1) and normalized duration between (0, 1) are scaled in time and amplitude to form a multitude of spike templates. A spike template is a characteristic of a neuron. Spike morphologies are classified in terms of symmetry and the number of peaks and troughs. Plots **(A–C)** present spike morphologies that are: symmetric with one peak and one trough, symmetric with two peaks and two troughs, and asymmetric with one peak and one trough. Other spike morphologies are possible and can be directly programmed in the simulator. After scaling in amplitude and time, each spike morphology can be used to generate several spike templates, as shown in plots **(D–F)**, each of which has three spike templates generated from one spike morphology.
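A minimal sketch of how a spike template could be built from a normalized morphology is given below. It assumes a first derivative of a Gaussian as the morphology (one peak and one trough) and linear interpolation for the time scaling; the function names, the Gaussian width, the 50 (arbitrary units) amplitude, and the 1 ms duration at 20 kHz are hypothetical choices, not simulator defaults.

```python
import math
from typing import List


def gaussian_derivative_morphology(n_samples: int = 101, width: float = 0.15) -> List[float]:
    """First derivative of a Gaussian on normalized time [0, 1],
    scaled so that its largest absolute value is 1."""
    ts = [i / (n_samples - 1) for i in range(n_samples)]
    raw = [
        -(t - 0.5) / width ** 2 * math.exp(-((t - 0.5) ** 2) / (2 * width ** 2))
        for t in ts
    ]
    peak = max(abs(v) for v in raw)
    return [v / peak for v in raw]


def make_template(morphology: List[float], amplitude: float,
                  duration_s: float, fs_hz: float) -> List[float]:
    """Scale a normalized morphology in amplitude and stretch it to
    `duration_s` seconds sampled at `fs_hz` (linear interpolation)."""
    n_out = max(2, round(duration_s * fs_hz))
    n_in = len(morphology)
    template = []
    for k in range(n_out):
        pos = k * (n_in - 1) / (n_out - 1)  # fractional index into morphology
        lo = int(pos)
        hi = min(lo + 1, n_in - 1)
        frac = pos - lo
        template.append(amplitude * ((1 - frac) * morphology[lo] + frac * morphology[hi]))
    return template


morphology = gaussian_derivative_morphology()
template = make_template(morphology, amplitude=50.0, duration_s=1e-3, fs_hz=20_000.0)
```

Because the morphology is antisymmetric about its midpoint, its samples sum to approximately zero, which is the discrete counterpart of the zero-mean property required of each *ψ<sub>i</sub>*.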

If *N* is a Poisson process, then we can rewrite the function *μ<sub>i</sub>* as

$$\mu_i\left(x_i(t)\right) = \sum_{j=0}^{\infty} \int_0^t \psi_i(t-\tau)\,\delta(\tau-\tau_j)\,d\tau. \tag{9}$$

where *δ*(*τ* − *τ<sub>j</sub>*) is the delta function and *τ<sub>j</sub>* is the spike event time, a function of the input/output response map *f<sub>i</sub>* and the Poisson process.
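Equation (9) amounts to placing a copy of the spike template at each spike event time. The sketch below illustrates this for a constant firing rate; it is an assumption-laden simplification, since the simulator drives the point process with a time-varying rate. Only the Identity, Poisson, and truncated Gaussian processes are shown, the truncation is implemented by resampling non-positive intervals, and all names and parameter values are hypothetical.

```python
import random
from typing import List, Optional


def spike_times(rate_hz: float, t_end: float, process: str = "poisson",
                sigma: float = 0.0, rng: Optional[random.Random] = None) -> List[float]:
    """Draw spike times on [0, t_end) with mean interspike interval xi = 1/f."""
    rng = rng or random.Random(0)
    if rate_hz <= 0.0:
        return []
    xi = 1.0 / rate_hz
    times: List[float] = []
    t = 0.0
    while True:
        if process == "identity":
            isi = xi  # perfectly regular train
        elif process == "poisson":
            isi = rng.expovariate(rate_hz)  # exponential ISIs
        elif process == "gaussian":
            isi = rng.gauss(xi, sigma)
            while isi <= 0.0:  # truncate at zero by resampling
                isi = rng.gauss(xi, sigma)
        else:
            raise ValueError(f"unknown process: {process}")
        t += isi
        if t >= t_end:
            return times
        times.append(t)


def axon_potential(times: List[float], template: List[float],
                   fs_hz: float, t_end: float) -> List[float]:
    """Discrete form of Equation (9): add one template per spike event."""
    n = int(round(t_end * fs_hz))
    y = [0.0] * n
    for t_j in times:
        start = int(round(t_j * fs_hz))
        for k, v in enumerate(template):
            if start + k < n:
                y[start + k] += v  # overlapping spikes simply add
    return y


regular = spike_times(8.0, 1.0, process="identity")
y = axon_potential(regular, template=[1.0, -1.0], fs_hz=1000.0, t_end=1.0)
```

Swapping `process="identity"` for `"poisson"` or `"gaussian"` changes only the interspike interval statistics; the superposition step is unchanged.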

To implement the motoneuron output units in the simulator, for each motoneuron the user specifies an input/output response curve, a firing model (e.g., Poisson, Gaussian) and a spike template. Spike templates are generated by a subunit of the simulator (**Figures 2C**, **5**).

#### **ELECTRODE UNIT**

The output of the electrode unit is the summation of signals from the motoneurons in the vicinity of the electrode. The number of units and their relative contributions will depend upon the design of the electrode, its location in or near the fascicle, and the properties of the tissue between the motor axon and the electrode. The model of the electrode unit is designed to represent each of these factors, which are described below.

#### *Characteristics of LIFEs*

In this study, we have implemented a model of the LIFE electrode. In studies that performed peripheral nerve recordings in animal models, LIFEs have been fabricated from 25, 50, or 100 μm diameter insulated 90%Pt–10%Ir wire. A 1 mm recording site is made by removing part of the insulation (Malagodi et al., 1989; Lefurge et al., 1991). Each LIFE is placed in a fascicle so that it is aligned with the axons.

Superposition is the summation of neural signals from multiple sources on a single recording electrode. The amount of superposition depends on the structure and relative position of the electrode with respect to the neural sources. A LIFE with these dimensions and placement typically records from 6 to 10 axons (Lefurge et al., 1991). The amplitude of the component from each axon will depend upon the strength of the signal and the distance of that axon from the electrode.
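Superposition on a single electrode can be sketched as a weighted sum of the extracellular potentials of the axons within its receptive field. In the sketch below, the function name and the weights are hypothetical; in practice the weights would stand in for signal strength and axon-to-electrode distance, and how the simulator derives them is not assumed here.

```python
from typing import List, Sequence


def electrode_recording(axon_signals: Sequence[Sequence[float]],
                        weights: Sequence[float]) -> List[float]:
    """Weighted sum of axon potentials sampled on a common time base."""
    if len(axon_signals) != len(weights):
        raise ValueError("one weight is required per axon")
    n = len(axon_signals[0])
    return [
        sum(w * y[k] for w, y in zip(weights, axon_signals))
        for k in range(n)
    ]


# Two hypothetical axons: the nearer one (weight 0.5) contributes more
# than the farther one (weight 0.25).
y_near = [1.0, 0.0, -1.0]
y_far = [0.0, 2.0, 2.0]
z = electrode_recording([y_near, y_far], weights=[0.5, 0.25])
```

Setting a weight to zero removes an axon from the electrode's receptive field, so the same function can represent electrodes that record from different subsets of the motor pool.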

Crosstalk occurs when a neural electrode picks up neural signals from motor axons emanating from different motor pools. This may lead to superposition of two or more intended motor actions on a single electrode recording. However, it has been reported that peripheral nerves are somatotopically organized even at the fascicular and subfascicular level (Hallin, 1990). Given this organization, an electrode that records from a small number of fibers is likely to record primarily from motor axons derived from the same or a related motor pool (Topp and Boyd, 2012).

Spike overlap refers to the temporal coincidence of two spikes from different motor axons on one electrode. The overlap of spikes from two or more waveforms could be constructive, which would result in one large spike, or destructive, which would result in a small amplitude spike. Either of these could distort spike shapes, lead to failures in spike detection, and alter the apparent firing frequencies in recorded neural activity.

A system of multiple LIFEs implanted in multiple peripheral nerve fascicles could record from multiple motor pools and reflect different motor actions. The knowledge of nerve gross anatomy helps guide the placement of electrodes into nerves that carry information related to the targeted motor actions, but it is not currently possible to surgically target specific regions within a fascicle of a nerve or motoneurons from a specific muscle. The relationship between motor intent and the signal recorded on each electrode must be determined (decoded) experimentally. Similar decoding procedures have been carried out for cortical and other peripheral interfaces (Allison et al., 1992; Donoghue, 2002; Dhillon et al., 2004; Dhillon and Horch, 2005; Velliste et al., 2008; Blakely et al., 2009; Halder et al., 2011; Krusienski and Shih, 2011; Hochberg et al., 2012).

Drift is unwanted relative motion between the neural interface and neural sources. Drift can affect the recorded firing patterns and crosstalk. Any increase in the distance between the axon and the electrode would attenuate its contribution to the recorded signal.

Encapsulation is the accumulation of biological matter on the neural interface as a result of the tissue response to the electrode (Lefurge et al., 1991; Polikov et al., 2005). Encapsulation attenuates neural signals and can lead to dysfunctional electrodes.

The noise in recordings from LIFEs (or other electrodes) emanates from a number of sources: activity of muscles in the vicinity of the electrode, electrocardiac signals, background neural activity from motor or sensory axons, tissue thermal noise, thermal and impedance properties of the neural interface, the recording system and the recording environment (e.g., power hum and flicker noise).

#### *Electrode unit: model*

In the simulator, a mapping matrix is used to direct signals from one or more motor axons to each LIFE (**Figures 2A**, **3**). The value of each element in the matrix represents the combined effects of distance from the axon to the electrode, drift, and encapsulation. Noise is incorporated as an additive signal.

The neural component of the signal recorded on each electrode is a weighted sum of the extracellular signals generated by the motoneurons (**Figure 2A**) and is described by

$$\mathbf{z}(t) = \mathbf{H}\left(\mathbf{y}(t)\right) + \mathbf{W}(t) \tag{10}$$

where **z**(*t*) is a vector representing the signals recorded on each of *l* electrodes, *y*(*t*) is the vector representing the activity of *m* motor axons, *H* is an *l* × *m* matrix that maps motor axon activity to electrodes, and *W*(*t*), which represents noise, is an *l* × 1 vector. The values for *H* reflect the location of the electrodes with respect to the motor axons. For example, a small value for an element of *H* would indicate an axon that is distant from the electrode and would therefore contribute weakly to the recorded signals.

*H* can be configured by the user to test different electrode configurations and recording scenarios. For example, the recording from a LIFE electrode may include substantial contributions from 6 to 10 motor axons, whereas the recording from an electrode on a Utah array may include substantial contributions from 1 to 6 motor axons. To simulate recordings from fibers that are close to an electrode with a low degree of encapsulation, the elements of *H* should be set to high values (close to 1); the effect of increased distance or encapsulation can be simulated with lower values to achieve signal attenuation.
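As an illustration of Equation (10), the following minimal sketch builds a small mapping matrix and applies it to placeholder axon activity. The array sizes, the random placeholder signals, the noise level, and all variable names are assumptions made for this example and are not taken from the simulator.

```python
import numpy as np

rng = np.random.default_rng(0)

# m motor axons, l electrodes, T time samples (illustrative sizes only)
n_axons, n_electrodes, n_samples = 6, 2, 1000

# y: activity of the m motor axons over time (m x T); random placeholder signals
y = rng.standard_normal((n_axons, n_samples))

# H: l x m mapping matrix. Electrode 0 is close to the axons (weights near 1);
# electrode 1 is distant or heavily encapsulated (strongly attenuated weights).
H = np.vstack([
    np.linspace(0.5, 1.0, n_axons),
    np.linspace(0.05, 0.2, n_axons),
])

# W: additive noise, one row per electrode (l x T); amplitude is arbitrary here
W = 0.1 * rng.standard_normal((n_electrodes, n_samples))

# Equation (10), evaluated for all time samples at once
z = H @ y + W
```

In this sketch the neural component on the second electrode has a much smaller amplitude than on the first, which is how increased distance or encapsulation would appear in the simulated recording.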

*H* can be defined as the product of two matrices:

$$\mathbf{H} = \mathbf{C}\mathbf{B} \tag{11}$$

where *B* is a *l* × *m* matrix, *C* is an *l* × *l* matrix, *m* is the number of motor axons, *l* is the number of electrodes. The matrix *B* maps activity from a subset of related motor axons (i.e., the same motor pool) into activity on a set of virtual electrodes *v*(*t*) (**Figure 3**).

$$\mathbf{v}(t) = \mathbf{B}\left(\mathbf{y}(t)\right). \tag{12}$$

In this formulation, signals detected by the virtual electrodes represent pure motor commands destined for a particular muscle. The mapping matrix *C* specifies the degree of crosstalk between motor pools or, in this case, virtual electrodes. This representation enables explicit specification of crosstalk that is separate from the specification of the mapping to virtual electrodes.

Since LIFEs record from a small number of fibers that are likely to be in the same motor pool, we assume that *C* is nearly the identity matrix. That is, crosstalk between motor pools is negligible. In this case, the LIFE's electrode signal **z**(*t*) is given by

$$\mathbf{z}(t) = \mathbf{B}\left(\mathbf{y}(t)\right) + \mathbf{W}(t). \tag{13}$$

In Equation (10), *W* is the sum of all noise sources in the environment. In the simulator, noise is modeled as power-law noise (i.e., 1/*f*<sup>*β*</sup>) whose amplitude and *β* parameter can be specified by the user. Alternatively, the user can specify band-limited Gaussian white noise together with an SNR, or provide an additive noise time series using an input file. When an SNR is specified, the standard deviation of the noise is determined by

$$\sigma_{noise} = \frac{Q_{99.9} - Q_{0.1}}{3\,\text{SNR}}, \tag{14}$$

where *Q*<sub>99.9</sub> and *Q*<sub>0.1</sub> are the 99.9th percentile and 0.1th percentile of the pure neural signal recorded by the electrode.
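Equation (14) can be sketched in a few lines. The function names below are hypothetical, and the band-limiting step mentioned in the text is omitted, so this example only shows how the noise standard deviation would follow from the percentiles of the noise-free signal and a user-specified SNR.

```python
import numpy as np

def noise_sigma(neural_signal, snr):
    """Noise standard deviation per Equation (14)."""
    q_hi, q_lo = np.percentile(neural_signal, [99.9, 0.1])
    return (q_hi - q_lo) / (3.0 * snr)

def add_white_noise(neural_signal, snr, rng):
    """Add Gaussian white noise scaled per Equation (14).

    Assumption: no band-limiting filter is applied in this sketch.
    """
    neural_signal = np.asarray(neural_signal, dtype=float)
    sigma = noise_sigma(neural_signal, snr)
    return neural_signal + rng.normal(0.0, sigma, size=neural_signal.shape)

# Example with a placeholder "pure" signal that spans 0..1
pure = np.linspace(0.0, 1.0, 1001)
sigma = noise_sigma(pure, snr=3.0)
noisy = add_white_noise(pure, snr=3.0, rng=np.random.default_rng(0))
```

Using the 99.9th and 0.1th percentiles rather than the absolute maximum and minimum makes the estimate of signal range less sensitive to a few extreme samples.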

## **OPERATION OF THE SIMULATOR**

In order to implement a simulation run, the user must specify the following simulation parameters:


#### **DEMONSTRATION OF SIMULATOR CAPABILITIES**

The simulator was used on specific models in various scenarios to demonstrate its capabilities with a particular emphasis on producing data sets with characteristics that could pose challenges for neural decoding algorithms, such as: recordings from multiple axons with different spike morphologies and spike train characteristics (Simulation run **1**), recordings produced by motor intent commands with more than one DOF (Simulation run **2**), recordings produced by motor intent commands at slowly varying or different levels of quasi steady-state activity (Simulation run **3**), and recordings with substantial spike overlap (Simulation run **4**).

Simulation run **1** was set up to demonstrate the different spike trains from fast and slow motor units and to demonstrate superposition of signals from different motor pools. This model included 5 electrodes in the vicinity of S and FF motor units with a ramp in motor intent (1 DOF). Six motor units of each type (S and FF) were simulated; the spike morphology used for the contribution of each motor unit was the shape shown in **Figure 5A**. The parameters for each motor unit were selected from a uniform random distribution across a pre-specified range. The ranges of values used for spike duration were 4–6 ms for S and 2–4 ms for FF; the ranges for spike amplitudes were 45–65 for S and 95–105 for FF; the ranges for firing frequencies at threshold were 1–5 Hz for S and 12–19 Hz for FF; the ranges for firing frequencies at saturation were 16–18 Hz for S and 25–30 Hz for FF; the ranges for motor intent threshold were 0–10% for S and 35–65% for FF; the ranges for motor intent saturation were 40–50% for S and 80–100% for FF. The Poisson model for spike timing variability was used. The weights on contributions of the neurons to the recorded signal (values in the *H* matrix) were assigned amplitudes that were equally spaced over the range from 0.5 to 1 and each electrode had additive noise with SNR = 3 (average across the set of electrodes). These simulations demonstrate recordings from electrodes that record from 1 S unit, 6 S units, 1 FF unit, 6 FF units, and 3 S and 3 FF units, respectively.
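The parameter selection described for Simulation run 1 could be reproduced along the following lines. The ranges are those quoted above; the dictionary layout, the key names, and the function `sample_motor_units` are hypothetical and chosen only for this sketch.

```python
import numpy as np

# (low, high) ranges quoted in the text for Simulation run 1
RANGES = {
    "S": {
        "spike_duration_ms": (4, 6),
        "spike_amplitude": (45, 65),
        "rate_at_threshold_hz": (1, 5),
        "rate_at_saturation_hz": (16, 18),
        "intent_threshold_pct": (0, 10),
        "intent_saturation_pct": (40, 50),
    },
    "FF": {
        "spike_duration_ms": (2, 4),
        "spike_amplitude": (95, 105),
        "rate_at_threshold_hz": (12, 19),
        "rate_at_saturation_hz": (25, 30),
        "intent_threshold_pct": (35, 65),
        "intent_saturation_pct": (80, 100),
    },
}

def sample_motor_units(unit_type, n_units, rng):
    """Draw each parameter independently from a uniform distribution."""
    return [
        {name: rng.uniform(lo, hi) for name, (lo, hi) in RANGES[unit_type].items()}
        for _ in range(n_units)
    ]

rng = np.random.default_rng(1)
s_units = sample_motor_units("S", 6, rng)
ff_units = sample_motor_units("FF", 6, rng)

# Weights for one electrode (one row of H): equally spaced from 0.5 to 1
h_row = np.linspace(0.5, 1.0, 6)
```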

Simulation run **2** was set up to demonstrate multiple DOF motor intent and a composite of two motor pools onto one electrode. This model included 3 electrodes and a 2 DOF motor intent signal: one electrode was modeled to be in the motor pool of the first motor intent signal; another electrode was modeled to be in the motor pool of the second motor intent signal; the third electrode was modeled to be in the vicinity of both motor pools. The parameters used for this run were the same as for the mixed-fiber electrode (3 S and 3 FF) in simulation run 1, except that for the third electrode the crosstalk matrix was set to equally weight inputs from the two motor intent signals (0.5 for all matrix elements).
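One possible way to express this three-electrode configuration with the factorization *H* = *CB* is sketched below. The layout is an assumption made for illustration: the third virtual electrode is given no axons of its own in *B*, and the third row of *C* mixes the first two virtual electrodes with equal weights of 0.5. The actual matrices used in the simulator may be organized differently.

```python
import numpy as np

n_per_pool = 6                           # axons per motor pool (3 S + 3 FF)
w = np.linspace(0.5, 1.0, n_per_pool)    # per-axon weights, as in run 1
zeros = np.zeros(n_per_pool)

# B (l x m): maps axons to virtual electrodes, one virtual electrode per pool.
# The third row is empty because, in this sketch, the third electrode has no
# motor pool of its own.
B = np.vstack([
    np.concatenate([w, zeros]),      # virtual electrode 1 <- motor pool 1
    np.concatenate([zeros, w]),      # virtual electrode 2 <- motor pool 2
    np.concatenate([zeros, zeros]),  # virtual electrode 3 (unused)
])

# C (l x l): crosstalk between virtual electrodes. Electrodes 1 and 2 are
# crosstalk-free; electrode 3 weights both pools equally.
C = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.5, 0.5, 0.0],
])

H = C @ B  # Equation (11)
```

Keeping *C* separate from *B* means the degree of crosstalk on the third electrode can be changed by editing a single row of *C* without touching the per-axon weights.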

Simulation run **3** was set up to demonstrate the effect of motor intent commands at slowly varying or different levels of quasi steady-state activity. These simulated examples are also used to demonstrate the qualitative similarity of the simulated traces to recordings from the peripheral nerve of an amputee. This model included a single DOF motor intent and additive noise; the parameters of the model were specified to approximate the characteristics of actual recordings from the peripheral nerve of an amputee (Dhillon et al., 2004; Dhillon and Horch, 2005). The first recording was from a trial in which an amputee was requested to produce a ramp in motor intent. To produce the simulated data set, we configured the electrode to record from 6 motor axons, since experimental data have indicated that a LIFE with these dimensions and placement typically records from 6 to 10 axons (Lefurge et al., 1991). Additionally, the SNR in the simulator was set to be equal to the SNR calculated from neural data. Motor intent was estimated from the real neural data using a simple moving average decoder (i.e., the time series was low pass filtered using a 200 ms moving average window). Then, we produced a simulated motor intent signal that closely resembled the extracted motor intent in time and amplitude but free of noise and irregularities. We used this motor intent signal to generate the simulated neural data using the specified set of neuron characteristics and electrode characteristics.
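A moving average decoder of the kind described above might look like the following sketch. The text specifies only the 200 ms window; the rectification step (taking the absolute value before averaging), the function name, and the sampling rate in the example are assumptions introduced here.

```python
import numpy as np

def moving_average_decoder(recording, fs_hz, window_s=0.2, rectify=True):
    """Estimate motor intent by smoothing a recording with a moving average.

    Assumption: the recording is rectified first so that biphasic spikes do
    not average to zero; the source text does not state this step.
    """
    x = np.asarray(recording, dtype=float)
    if rectify:
        x = np.abs(x)
    n = max(1, int(round(window_s * fs_hz)))  # window length in samples
    kernel = np.ones(n) / n
    # mode="same" keeps the output aligned with, and as long as, the input
    return np.convolve(x, kernel, mode="same")

# Example: a constant-amplitude placeholder signal sampled at 1 kHz
fs_hz = 1000.0
estimate = moving_average_decoder(np.ones(1000), fs_hz)
```

Near the start and end of the trace the window extends past the data, so the estimate is biased toward zero over roughly half a window length at each edge.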

The second group of simulations in this run was set up to mimic a sequence of 3 trials in which an amputee was requested to produce a steady-state value in motor intent at a low, moderate and then high level, respectively. This model used a set of 30 electrodes with varying number of axons (2, 4, 6, 8, or 10) and varying properties of the motor pool (all S, all FF, or an equal mix of S & FF). Each was simulated under three conditions (motor intent levels of low, moderate, and high steady-state values). For each steady-state trial, the power spectrum was calculated from the bandpass-filtered (4th order; 80 Hz–4 kHz) time-series data using the Welch method (0.5 s window; 50% overlap) and the total power, mean frequency and estimated motor intent for each trial were calculated. In all trials in this run, neural recording amplitudes of both simulated and experimental data were scaled using the standard deviation of the quiescent phase (i.e., a null motor intent) on that electrode.
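The spectral measures described above can be sketched with SciPy as follows. The filter type (Butterworth), the causal filtering, the 20 kHz sampling rate in the example, and the function name are assumptions; the text specifies only the filter order, the pass band, the Welch window length, and the overlap. Note that SciPy's `butter` with `N=4` and a band-pass design yields an 8th-order filter overall, which may or may not match the original implementation.

```python
import numpy as np
from scipy import signal

def spectral_summary(recording, fs_hz, band_hz=(80.0, 4000.0)):
    """Return (total power, mean frequency) of a band-pass-filtered recording."""
    # Butterworth band-pass (assumed filter family), second-order sections
    sos = signal.butter(4, band_hz, btype="bandpass", fs=fs_hz, output="sos")
    filtered = signal.sosfilt(sos, recording)

    # Welch estimate: 0.5 s segments with 50% overlap (Hann window by default)
    nperseg = int(round(0.5 * fs_hz))
    freqs, psd = signal.welch(
        filtered, fs=fs_hz, nperseg=nperseg, noverlap=nperseg // 2
    )

    df = freqs[1] - freqs[0]
    total_power = np.sum(psd) * df                      # integral of the PSD
    mean_frequency = np.sum(freqs * psd) / np.sum(psd)  # power-weighted mean
    return total_power, mean_frequency

# Example: a 1 kHz unit-amplitude sinusoid as a placeholder recording
fs_hz = 20000.0
t = np.arange(0.0, 2.0, 1.0 / fs_hz)
power, mean_f = spectral_summary(np.sin(2 * np.pi * 1000.0 * t), fs_hz)
```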

Simulation run **4** uses a large set of simulation runs that was designed to demonstrate the effect of firing rates and the number of axons per electrode on spike overlap. This model used a set of 15 electrodes with varying number of axons (2, 4, 6, 8, or 10) and varying properties of the motor pool (all S, all FF, or an equal mix of S & FF). Each was simulated under 10 conditions (motor intent levels of up to 100% in increments of 10%). For each simulation run, the composite firing rate (total number of spikes from the set of neurons contributing to the electrode) and the % spike overlap (the percentage of the time in the simulation run where a spike was present on more than one axon contributing to that electrode) were calculated.
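The two summary measures used in this run can be computed from spike times as sketched below. The function name, the sample-based discretization, and the normalization of the composite rate by run duration are assumptions; the text defines percent overlap as the fraction of time in which a spike is present on more than one contributing axon.

```python
import numpy as np

def overlap_and_composite_rate(spike_times_s, spike_durations_s,
                               run_duration_s, fs_hz=20000.0):
    """Return (% spike overlap, composite firing rate in Hz) for one electrode.

    spike_times_s: one sequence of spike onset times (s) per axon
    spike_durations_s: one spike duration (s) per axon
    """
    n_samples = int(round(run_duration_s * fs_hz))
    active_count = np.zeros(n_samples, dtype=int)
    total_spikes = 0
    for times, duration in zip(spike_times_s, spike_durations_s):
        # Mark the samples during which this axon has a spike in progress
        active = np.zeros(n_samples, dtype=bool)
        width = int(round(duration * fs_hz))
        for t in times:
            start = int(round(t * fs_hz))
            active[start:start + width] = True
        active_count += active
        total_spikes += len(times)
    # Overlap: samples where two or more axons are active at the same time
    percent_overlap = 100.0 * np.mean(active_count >= 2)
    composite_rate_hz = total_spikes / run_duration_s
    return percent_overlap, composite_rate_hz

# Example: a 4 ms spike at 100 ms and a 2 ms spike at 102 ms share 2 ms
pct, rate = overlap_and_composite_rate(
    spike_times_s=[[0.100], [0.102]],
    spike_durations_s=[0.004, 0.002],
    run_duration_s=1.0,
    fs_hz=1000.0,
)
```

Because longer spikes occupy more samples, this measure grows faster with firing rate for axons with long spike durations, consistent with the dependence on spike duration discussed in the Results.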

## **RESULTS**

**Figure 6** shows simulated LIFE recordings from fast and slow motoneurons in response to a slow ramp and hold motor intent (Simulation run **1**). Each motor axon contributes different firing patterns to a LIFE electrode recording. S fibers have sparse firing, longer spike durations and smaller amplitudes, while FF fibers have larger amplitudes, shorter spikes and denser firing patterns. A LIFE electrode, depending on where it is placed in a nerve fascicle, could record activity from S, FR, or FF motor axons, or from a mix of them. In this simulation, the motor intent signal was a ramp up to a maximum contraction (**Figure 6A**). **Figure 6B** shows action potentials from a single S motor axon and **Figure 6C** shows a recording from the LIFE that is the superposition of signals from six S motor axons. **Figure 6D** shows firings of a single FF motor axon and **Figure 6E** shows a recording from the LIFE that is the superposition of signals from six axons of FF motoneurons. **Figure 6F** is a LIFE recording from a set of three S and three FF motoneurons. These plots demonstrate that the properties of the motoneurons as specified for the FF and S fibers produce different contributions to the LIFE recording and demonstrate the superposition of signals from many motoneurons onto a single LIFE recording.

**Figure 7** demonstrates the ability of the simulator to generate simulated LIFE recordings for a multiple-DOF task (Simulation run **2**). The motor intent signals were independently specified to represent a ramp-and-hold for the first DOF (**Figure 7A**) and a series of contractions and relaxations for the second DOF (**Figure 7B**). Each of these motor intent signals produced activation in a motor pool. Three electrodes were placed such that the first recorded signals from the first motor pool (**Figure 7C**); the second electrode recorded from the second motor pool (**Figure 7D**); and the third recorded signals from both motor pools (**Figure 7E**).

To demonstrate the ability of the simulator to produce neural recordings that can mimic actual neural recordings (Simulation run **3**), we compared simulated traces to data acquired by a LIFE implanted in an amputee (Dhillon et al., 2004; Dhillon and Horch, 2005). **Figure 8** demonstrates that the simulated ramp data (**Figure 8B**) is qualitatively similar to the actual data (**Figure 8A**). In addition, a moving-window sign-test (200 ms) was used to compare the squares of simulated and experimental data. This analysis indicated that the experimental and simulated data are not significantly different from each other (*p* ≈ 1). **Figure 8C** shows the decoded motor intent signals from simulated and real data. Note that this comparison between simulated and real data is limited by the nature of the recorded neural data, because we did not have an independent measure of motor intent. Thus, the motor intent signal used to generate the simulated neural recording is the result of the simplified decoding scheme and is not a true representation of the original motor intent signal.

Data from trials in which an amputee was asked to produce steady-state levels of motor intent are summarized in **Figure 9**. **Figures 9A–C** present the normalized total power, mean frequency and estimated motor intent values for each motor intent level. The values calculated for the 30 electrodes used in the simulation run are presented in the form of box-and-whisker plots (quartiles and 99 percentile ranges); the data calculated from trials at each motor intent level from two amputee subjects are superimposed. These data indicate that, for both the simulated and actual recordings, as the level of motor intent increased from low to medium to high, total power increased, the mean frequency decreased, and the estimated motor intent value increased.

**Figure 10** presents the calculated degree of spike overlap vs. frequency of firing of motoneurons across a set of simulations using several electrode and motor intent settings (Simulation run **4**). **Figure 10A** presents spike overlap as a function of motor intent for each of the electrodes. The plots demonstrate that percent overlap increases as a result of increased motor intent and the number of axons that contribute to a particular electrode. Note that the overlap in electrodes that record solely from S fibers reaches a plateau at motor intent = 0.5 (since this value was specified as the saturation point for that motor pool); the electrodes that record solely from FF fibers show overlap only for values of motor intent greater than 0.5 (since this value was specified as the threshold value point for that motor pool); and the electrodes that record from a combination of S and FF show a gradual increase in spike overlap throughout the range. Also note that the maximum values recorded for spike overlap were approximately the same for the three groups of electrodes (S, FF, and S & FF) due to the fact that the effect on spike overlap of lower firing rates of the S axons was offset by their longer spike durations. This effect is also demonstrated in **Figure 10B**, which demonstrates that overlap on electrodes that recorded from S axons increased more rapidly as a function of composite firing rate than those that recorded from FF axons; the rate of increase in overlap for the S & FF electrodes was at an intermediate level.

hold; plot **(B)** shows motor intent pertaining to 2nd DOF with a series of contractions and relaxations. Plot **(C)** shows recording from a LIFE electrode recording from motor axons associated with the first DOF, while plot **(D)** shows a LIFE recording associated with the second DOF. Plot **(E)** shows a recording from a LIFE electrode picking up signals from the two motor pools associated with the first and second DOFs.


**FIGURE 8 | Simulated recordings from slowly varying commands in motor intent and comparison with data recorded using LIFE electrodes in an amputee (Simulation run 3).** Experimental data from a ramp and hold task (Dhillon et al., 2004) is plotted in **(A)**. A simulated recording from a ramp and hold task is plotted in **(B)**. Both simulated and experimental data were scaled using the standard deviation of the quiescent phase (i.e., a null motor intent). Plot **(C)** shows a plot of decoded motor intent: the blue trace is from the actual LIFE recording in **(A)**, the green trace is from the simulated data shown in **(B)**.

## **DISCUSSION**

#### **A TOOL TO FACILITATE THE DEVELOPMENT OF DECODING ALGORITHMS**

The purpose of this simulator is to facilitate the development of effective and reliable decoders for the control of prostheses by neural signals. Neural interfaces may improve the functionality of advanced prosthetic limbs and reduce the attentional demands required to operate them. Some of the key technical challenges in developing these neural interface technologies are to obtain a large number of independently controllable signals, to obtain them reliably and to interpret them appropriately. This work was directed at creating a tool to be used in the development of technology for interpreting, or decoding, the recorded neural signals.

In a neural-controlled prosthesis, the role of the decoder is to estimate the intent of the user from the recorded neural signal. According to our general definition as well as our specific implementation, motor intent is a multi-dimensional signal that can take on graded values along each dimension. The recorded neural signals are a set of waveforms, each of which is a composite of spike trains from several motoneuron sources. In general, an increase in the intensity of motor intent along any dimension is likely to increase the level of activity on one or more electrodes. Therefore, one challenge for the decoding process is to identify changes in activity level in the recorded signals, which would indicate a change in the intensity of motor intent. A second challenge for the decoding process is to accomplish a mapping from a multi-dimensional space defined by electrode recordings to a space defined by the dimensions of motor intent.

**FIGURE 9 | Characteristics of simulated steady-state contractions at different levels of motor intent and comparison with data recorded using LIFE electrodes in an amputee (Simulation run 3).** Normalized total power **(A)** and mean frequency **(B)** calculated from the power spectra from simulated and experimental data. The box-and-whisker plots at each level of motor intent present the mean, quartiles and 99-percentile ranges of data from 30 simulated electrodes. The calculated values for total power and mean frequency from the spectra of experimental data from two amputee subjects are superimposed (red symbols). Similarly, the estimated motor intent values from simulated and experimental trials are presented in **(C)**.

Consider the first challenge—that of identifying changes in activity level on a given electrode. In electrodes that record composite signals, any overlap in the action potentials in neighboring axons will produce distortion in the morphology of a given spike. Some candidate decoding algorithms may be more sensitive than others to such distortions due to spike overlap. In evaluating a decoder on actual recordings from nerves, the amount of overlap is not known and cannot be experimentally controlled. The simulator described here will enable comprehensive assessment of candidate algorithms with respect to their ability to identify changes in motor intent and with respect to their sensitivity to distortions caused by spike overlap. The simulator can be used to generate data sets with a collection of motor intent signals and a variety of electrode configurations. These data sets can be created to present specific and well-characterized challenges for decoding, such as spike overlap, in order to assess the ability of the algorithm to address that specific issue.

**FIGURE 10 | Percent overlap as a function of motor intent and spike frequency (Simulation run 4).** These plots present data from a set of simulations using different motoneurons pools (S, FF, and mixed S & FF) that provide signals to a set of LIFEs. S motoneurons had spike durations of 4 ms and had firing frequencies that ranged from 5 to 18 Hz over the lower half of the motor intent range; FF motoneurons had spike durations of 2 ms and had firing frequencies that ranged from 18 to 35 Hz over the upper half of the motor intent range. 15 electrodes were simulated with different combinations of fiber type (all S, all FF, or a mix of S & FF) and number of neurons contributing (2, 4, 6, 8, 10). Percent overlap represents the percentage of the recording time in which there was overlap of 2 or more spikes. Composite frequency was calculated as the total number of spikes summed across all neurons that contribute to a particular electrode. Plot **(A)** shows the percent overlap on recordings from LIFE electrodes as a function of motor intent. Note that percent overlap is higher for electrodes that record from more neurons and that it increases as a function of motor intent. Plot **(B)** presents results from the same set of simulations, but with the data plotted as a function of composite frequency. On the plots, the black, red, and blue lines/markers indicate values derived from electrodes that record from S, mixed and FF motoneurons, respectively. Note that with this specification of motoneurons (spike duration and rates), the highest value for percent overlap is less than 20% and that electrodes that record signals from S motoneurons have higher values of spike overlap for a given composite frequency than those that record from a mixed population or from only FF motoneurons, because of the difference in spike durations.

Next, consider the second challenge for the decoding process: accomplishing a mapping from a multi-dimensional electrode space to motor intent space. In the situation where there is crosstalk, i.e., when signals from two or more motor pools contribute substantially to the signal recorded by one electrode, the decoding algorithm must be able to identify both components of the signal. Once again, it is likely that some candidate algorithms would address this problem better than others and the simulator would facilitate a comprehensive comparison.

In both of these cases, these capabilities of the simulator are particularly important because it is not possible to perform such a set of experiments in humans or an animal model. Distortions due to spike overlap and cross-talk of several motor pools onto one electrode cannot be controlled experimentally nor can they be quantitatively identified when they occur.

#### **A MODEL THAT CAPTURES THE KEY FEATURES OF RECORDED NEURAL SIGNALS, YET CAN BE EFFICIENTLY SIMULATED**

Many previous reports have described the design and development of simulation systems for spinal motor pools (Capaday and Stein, 1987; Fuglevand et al., 1993; Bashor, 1998; Nussbaumer et al., 2002; Ivashko et al., 2003; Lowery and Erim, 2005; Subramanian et al., 2005; Stienen et al., 2007; Uchiyama and Windhorst, 2007; Cisi and Kohn, 2008) and models of recordings of extracellular potentials (Plonsey et al., 2007). To the best of our knowledge, these two types of models have not been integrated in a manner that would meet our stated needs. The models of spinal motor pools include several efforts directed at studying the neuromotor control system (Fuglevand et al., 1993; Ivashko et al., 2003; Rybak et al., 2006) and others directed at designing biomimetic control systems (Ijspeert, 2008). The models of neural recordings have focused primarily on understanding and optimizing the electrode-tissue interface (Perez-Orive and Durand, 2000). Although our model and simulation system draws upon many of the concepts implemented in previous studies, we did not directly implement these other models.

In designing the model and the simulator, our intent was to capture the key features of recorded signals that may differentiate the performance of various decoding algorithms in a system. For the overall structure and for the individual elements, there are clear tradeoffs between biological fidelity and operational efficiency. Models that have a high degree of biological fidelity can often incur high costs in terms of effort required to develop the software, effort required to configure the software for a simulation run, and computational complexity. In developing this system, we focused on the key features of biological fidelity while striving to achieve reasonable operational efficiency. The key features of the neural/electrode system that we believe are suitably captured include: gradation of motor intent, multidimensionality of motor intent, variability in firing rates of motor pools from different fiber types, recruitment properties of different fiber types, variability in spike morphology across motor axons and electrodes, jitter in spike train timing, superposition of spike trains from multiple motor axons onto one electrode, spike overlap, crosstalk from multiple motor pools onto one electrode, variability in the number and relative strengths of motor axons contributing to different electrodes, and noise superimposed on the relevant neural signals. These features are captured in a model that requires specification of parameters that affect the properties of the system in a straightforward manner. For example, in this system the user directly specifies the range of firing rates for a motor neuron of a particular type; in a model with a high degree of biological fidelity that included a model of the biophysical properties of the membrane and channel dynamics, the range of firing rates would emerge from the specification of a large number of interdependent model parameters and components. 
In this example, the model with higher biological fidelity would incur what we believe to be unnecessary costs in development, configuration, and implementation. We believe that the design of our simulator captures the key system features in a manner that is operationally efficient.

The transformation from motor intent to neural recordings certainly involves a large number of nonlinear, dynamic processes. The model we have implemented includes three nonlinear processes: the piecewise linear mapping from motoneuron state to mean firing rate, the spike event times based on motoneuron state, and the morphology of the spike template for a given neuron. All other processes involve linear transformations: the connectivity between motor intent and motoneuron activation, the convolution of spike events with spike templates, and the connectivity between motor axons and electrodes.
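The chain of nonlinear and linear steps listed above can be sketched for a single motor axon as follows. This is an illustration only: it assumes a single DOF with unit connectivity (so motoneuron state equals motor intent), approximates the Poisson process by an independent Bernoulli draw per sample, and uses a placeholder sinusoidal spike template. All function names and parameter values are hypothetical.

```python
import numpy as np

def firing_rate(state, threshold, saturation, rate_min_hz, rate_max_hz):
    """Piecewise-linear map from motoneuron state to mean firing rate:
    zero below threshold, linear up to saturation, constant above."""
    state = np.asarray(state, dtype=float)
    frac = np.clip((state - threshold) / (saturation - threshold), 0.0, 1.0)
    rate = rate_min_hz + frac * (rate_max_hz - rate_min_hz)
    return np.where(state >= threshold, rate, 0.0)

def poisson_spike_events(rate_hz, fs_hz, rng):
    """Spike events (0/1 per sample); a Bernoulli approximation that is
    reasonable when rate_hz / fs_hz is much smaller than 1."""
    return (rng.random(rate_hz.shape) < rate_hz / fs_hz).astype(float)

def axon_signal(events, template):
    """Linear step: convolve spike events with the spike template."""
    return np.convolve(events, template, mode="full")[: len(events)]

fs_hz = 20000.0
t = np.arange(0.0, 2.0, 1.0 / fs_hz)
intent = np.clip(t / 2.0, 0.0, 1.0)        # ramp in motor intent from 0 to 1

# S-type unit with illustrative values inside the ranges used in run 1
rate = firing_rate(intent, threshold=0.05, saturation=0.45,
                   rate_min_hz=3.0, rate_max_hz=17.0)
events = poisson_spike_events(rate, fs_hz, np.random.default_rng(2))

# Placeholder 5 ms biphasic template (not the shape shown in Figure 5A)
template = 55.0 * np.sin(2 * np.pi * np.linspace(0.0, 1.0, int(0.005 * fs_hz)))
y = axon_signal(events, template)
```

The resulting `y` corresponds to one row of the axon activity that is subsequently mapped to the electrodes.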

In neural recordings, the morphology of a recorded spike is influenced by the relative spacing (and orientation) of the electrode and the nodes of Ranvier as well as the electrical properties of the tissue. Alterations in the relative spacing, orientation or tissue properties could have a nonlinear effect on the spike morphology. As implemented, the system allows for linear scaling of the contribution of a motor unit to an electrode, but nonlinear effects that would modify spike morphology would have to be accommodated by a change in the spike template.

In a system that uses more than one electrode in a fascicle, it is possible that one neuron may produce signals that contribute substantively to the recordings on more than one electrode. In this scenario, the morphology of the spike templates from that neuron will be different on each electrode. As designed, our simulator allows for a scaled version of the same template on different electrodes, but it does not allow for one axon to produce different morphologies on different electrodes. This limitation, which may be particularly important if using the simulator to study recordings on densely packed intrafascicular electrode arrays, could be addressed by modifying the simulator to allow one point process to produce more than one simulated spike train, thus producing simultaneous spikes on different electrodes with different shapes.

#### **SPECIFICATION OF MOTOR INTENT**

In this simulator, we have implemented motor intent as a signal that has two essential components: an intended action and a level of effort. The intended action is the DOF to be controlled while intended effort is the intensity of that action. Motor intent could be used to represent an action that is formulated in joint torque space. That is, motor intent signals could be used to represent quantities such as elbow flexion moment or wrist abduction moment. We believe that this form of representation will directly facilitate translation to a system where an amputee controls a motorized prosthesis. There are many possible representations of motor intent (in task space, joint space, muscle space, or other body-referenced coordinate systems) and there is evidence to support the existence of such representations at various points in the neuromotor control system circuitry. We believe that the joint torque representation is suitable because it will directly transfer to a constrained experimental paradigm in which an amputee is asked to issue specific motor commands, and the motor commands are directly related to the required actions of the prosthesis. For example, if an amputee is asked to think about elbow flexion and wrist extension while seated quietly, motoneurons in the residual limb that used to innervate elbow flexors and wrist extensors are likely to fire. Subsequently, when attempting to perform a functional task with a neural-controlled, powered prosthesis those same motoneurons are likely to fire if the task requires elbow flexion and wrist extension. These recorded commands can then be directly mapped to motors on the prosthesis to execute the desired movement.

#### **ON-GOING AND FUTURE WORK**

We are currently using the simulator to develop data sets that will be useful in comparative assessment of decoding algorithms for neural-controlled prostheses. Although our current effort is directed at systems that would utilize LIFE electrodes, we believe that the simulator can be readily configured to simulate recordings from a Utah array, tfLIFE, TIME, or other electrodes designed to record from peripheral nerves. The primary differences in configurations for the different electrode types would be alterations of the spike template morphologies, the number as well as the relative contributions of motoneurons that contribute to a recorded signal, and the noise characteristics.

Several modifications to the existing simulation system might enhance its utility as a tool to characterize the benefits of various decoder designs. For example, the system described here uses a linear function to map motor intent to motoneuron activation. While this may be sufficient to test most of the key features of the decoding system, it may fail to capture other influences on the transformation that may impact decoder performance. Future efforts will seek to identify such opportunities for improving the utility of the simulation system.
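
As a minimal sketch of the distinction drawn above, the following Python compares a purely linear intent-to-activation map with one hypothetical non-linear alternative that adds recruitment thresholds and rate saturation. The function names, gains, thresholds, and maximum rate are illustrative assumptions and are not taken from the simulator.

```python
import math

def linear_rates(effort, gains):
    """Linear map: each motoneuron's firing rate (spikes/s) scales with effort."""
    return [g * effort for g in gains]

def saturating_rates(effort, gains, thresholds, max_rate=30.0):
    """Hypothetical alternative: silent below a recruitment threshold,
    then rising toward max_rate as effort increases."""
    rates = []
    for g, th in zip(gains, thresholds):
        drive = g * (effort - th)
        if drive <= 0.0:
            rates.append(0.0)
        else:
            rates.append(max_rate * (1.0 - math.exp(-drive / max_rate)))
    return rates

gains = [20.0, 40.0, 60.0]      # assumed gains, spikes/s per unit effort
thresholds = [0.0, 0.2, 0.5]    # assumed recruitment thresholds (effort in 0..1)

for effort in (0.1, 0.5, 1.0):
    lin = linear_rates(effort, gains)
    sat = saturating_rates(effort, gains, thresholds)
    print(effort, [round(r, 1) for r in lin], [round(r, 1) for r in sat])
```

A decoder tuned on data from the linear map could be re-tested on data from the thresholded map to see how sensitive it is to this modeling choice.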

#### **ACKNOWLEDGMENTS**

This work is sponsored by the Defense Advanced Research Projects Agency (DARPA) Microsystems Technology Office (MTO) through the Space and Naval Warfare Systems Center, Pacific, Grant/Contract N66001-12-C-4195 to Ranu Jung.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 28 March 2014; accepted: 29 October 2014; published online: 14 November 2014.*

*Citation: Abdelghani MN, Abbas JJ, Horch KW and Jung R (2014) A functional model and simulation of spinal motor pools and intrafascicular recordings of motoneuron activity in peripheral nerve. Front. Neurosci. 8:371. doi: 10.3389/fnins.2014.00371*

*This article was submitted to Neuroprosthetics, a section of the journal Frontiers in Neuroscience.*

*Copyright © 2014 Abdelghani, Abbas, Horch and Jung. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Restoration of motor function following spinal cord injury via optimal control of intraspinal microstimulation: toward a next generation closed-loop neural prosthesis

## *Peter J. Grahn<sup>1</sup>, Grant W. Mallory<sup>2</sup>, B. Michael Berry<sup>1</sup>, Jan T. Hachmann<sup>2</sup>, Darlene A. Lobel<sup>3</sup> and J. Luis Lujan<sup>2,4</sup>\**

*<sup>1</sup> Mayo Clinic College of Medicine, Mayo Clinic, Rochester, MN, USA*

*<sup>2</sup> Department of Neurologic Surgery, Mayo Clinic, Rochester, MN, USA*

*<sup>3</sup> Department of Neurosurgery, Cleveland Clinic, Cleveland, OH, USA*

*<sup>4</sup> Department of Physiology and Biomedical Engineering, Mayo Clinic, Rochester, MN, USA*

#### *Edited by:*

*Mitsuhiro Hayashibe, University of Montpellier, France*

#### *Reviewed by:*

*Ken Yoshida, Indiana University-Purdue University Indianapolis, USA*

*Lee Fisher, University of Pittsburgh, USA*

#### *\*Correspondence:*

*J. Luis Lujan, Departments of Neurologic Surgery and Physiology and Biomedical Engineering, Mayo Clinic, 200 First Street SW, Rochester, MN 55905, USA e-mail: luis.lujan@mayo.edu*

Movement is planned and coordinated by the brain and carried out by contracting muscles acting on specific joints. Motor commands initiated in the brain travel through descending pathways in the spinal cord to effector motor neurons before reaching target muscles. Damage to these pathways by spinal cord injury (SCI) can result in paralysis below the injury level. However, the planning and coordination centers of the brain, as well as peripheral nerves and the muscles that they act upon, remain functional. Neuroprosthetic devices can restore motor function following SCI by direct electrical stimulation of the neuromuscular system. Unfortunately, conventional neuroprosthetic techniques are limited by a myriad of factors that include, but are not limited to, a lack of characterization of non-linear input/output system dynamics, mechanical coupling, limited number of degrees of freedom, high power consumption, large device size, and rapid onset of muscle fatigue. Wireless multi-channel closed-loop neuroprostheses that integrate command signals from the brain with sensor-based feedback from the environment and the system's state offer the possibility of increasing device performance, ultimately improving quality of life for people with SCI. In this manuscript, we review neuroprosthetic technology for improving functional restoration following SCI and describe brain-machine interfaces suitable for control of neuroprosthetic systems with multiple degrees of freedom. Additionally, we discuss novel stimulation paradigms that can improve synergy with higher planning centers and improve fatigue-resistant activation of paralyzed muscles. In the near future, integration of these technologies will provide SCI survivors with versatile closed-loop neuroprosthetic systems for restoring function to paralyzed muscles.

**Keywords: spinal cord injury, brain machine interface, closed-loop control, feedback control, neuroprosthetics, sensors, implantable systems**

## **INTRODUCTION**

Approximately 300,000 individuals in the United States, and more than 2.5 million individuals worldwide, are affected by traumatic spinal cord injury (SCI) (National Spinal Cord Injury Statistical Center, 2013). Overall health-care related cumulative costs are estimated to exceed \$9 billion annually in the United States alone (DeVivo, 2012). In 2010, 36.5% of SCI resulted from motor vehicle accidents, 28.5% from falls, 14% from violence (including gunshot wounds), 9% from sports accidents, and 11% from other incidents not reported in detail (National Spinal Cord Injury Statistical Center, 2013). The demographic profile has shifted over the last 40 years toward older individuals. However, males still account for the majority of injuries (Sekhon and Fehlings, 2001; DeVivo, 2012; Lenehan et al., 2012; National Spinal Cord Injury Statistical Center, 2013).

Traumatic SCI can occur when an excessive load to the spinal column is transmitted (directly or indirectly) to the spinal cord (Rowland, 1991; Watson et al., 2009). Damage to the spinal cord begins at the moment of injury, when displaced fragments of bone, disc material, or ligaments typically cause bruises or tears to spinal cord tissue (McDonald and Sadowsky, 2002). However, paralysis has been observed with no radiographic evidence of damage to the spinal cord or vertebral column (Pang and Wilberger, 1982; Mirovsky et al., 2005; Mahajan et al., 2013). Regardless of the injury mechanism, SCI involves permanent sensorimotor and autonomic deficits (Scivoletto et al., 2014), with long term complications including muscle atrophy and increased risk of cardiovascular disease (Phillips et al., 1998; Chen et al., 1999).

Most spinal cord injuries do not completely sever the spinal cord (Marino et al., 2003; National Institute of Neurological Disorders and Stroke, 2013). Instead, key pathways necessary for signal transmission between the brain and the rest of the body are disrupted. Spinal cord injuries can be classified as complete or incomplete (Marino et al., 2003). Complete injuries are indicated by a total lack of sensory and motor function below the level of injury. In contrast, the ability to convey messages to or from the brain is not completely lost in cases of incomplete injury. That is, limited sensation and movement remain below the level of injury. Although SCI interrupts connections between the brain and effector muscles, key planning, coordination, and effector centers above and below the injury remain intact (Krajl et al., 1986; Triolo et al., 1996; Jilge et al., 2004; Minassian et al., 2004; Fisher et al., 2008, 2009; Yanagisawa et al., 2011; Wang et al., 2013; Collinger et al., 2014). Functional electrical stimulation (FES) is a form of therapy that delivers external currents to intact neuromuscular circuitry below the level of injury, activating intact neural components to cause muscle contractions that can lead to restoration of motor function (Jackson and Zimmermann, 2012).

This manuscript reviews current therapeutic applications of electrical stimulation of the spine for providing functional coordination of muscle contraction and restoring function to paralyzed muscles. Additionally, this manuscript describes the development of neurostimulation technologies and control strategies, combining brain signals, optimal control algorithms, and emerging FES strategies to develop a clinically-translatable FES system that optimizes restoration of neurologic function following SCI (**Figure 1**).

## **ELECTRICAL STIMULATION OF EXCITABLE TISSUE**

The use of electrical stimulation for investigating the function of the nervous system began with the Italian physician and scientist Luigi Galvani (Galvani and Aldini, 1792). Galvani discovered that nerves and muscles are electrically excitable, and was able to evoke muscle contractions in frog legs by stimulating them with brief jolts of electricity, produced by static generators (Hambrecht, 1992). Since then, it has been well established that nerve cells can be activated using electrical currents delivered into neural tissue via stimulating electrodes (Glenn et al., 1976; Branner et al., 2001; Brill et al., 2009; Kilgore et al., 2009; Kent and Grill, 2013; Nishimura et al., 2013). Active nerve cells fire electrical impulses, also known as action potentials, that travel along the nerve axon and propagate across neuromuscular junctions via neurotransmitter signaling (Bean, 2007; Meriney and Dittrich, 2013). In turn, this signaling mechanism causes muscle fibers connected to nerve fibers (i.e., motor unit) to contract (Hughes et al., 2006).

**FIGURE 1 | Neuroprosthetic system.** The neuroprosthetic system is capable of interpreting volitional movement signals from the brain, integrating these commands with sensor feedback (e.g., joint angle, limb velocity, etc.), and delivering appropriate commands into intact neural circuitry below the level of injury.

## **ELECTRICALLY EVOKED MUSCLE ACTIVATION**

The strength of stimulation-evoked muscle contractions can be controlled by varying the frequency, amplitude, and pulsewidth of the external stimuli (Grobelnik, 1973; Kralj et al., 1988; Kralj and Bajd, 1989; Bhadra and Peckham, 1997). At low frequencies, individual muscle twitches are evoked with each stimulus pulse. At higher frequencies, responses to individual stimuli fuse and muscles respond with smooth contractions. Higher stimulus frequencies produce stronger muscle contractions, but also increase the rate of muscle fatigue (Tanae et al., 1973; McDonnall et al., 2004; Bamford, 2005). Activation of motor units can be achieved using different stimulation modalities: transcutaneous stimulation, percutaneous stimulation, intramuscular stimulation, peripheral nerve stimulation, and spinal stimulation.
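
The transition from individual twitches to a fused contraction can be illustrated with a toy model. The sketch below assumes that identical twitches simply add, which is a deliberate simplification of real muscle. The twitch shape, the 30 ms time constant, and the two test frequencies are assumptions chosen for illustration.

```python
import math

def twitch(t, tau=0.03):
    """Force of one twitch at time t (s) after a pulse; peaks at 1.0 when t == tau."""
    if t < 0.0:
        return 0.0
    return (t / tau) * math.exp(1.0 - t / tau)

def force_trace(freq_hz, duration=1.0, dt=0.001):
    """Sum the twitches evoked by a constant-frequency pulse train."""
    pulse_times = [k / freq_hz for k in range(int(round(duration * freq_hz)))]
    n_steps = int(round(duration / dt))
    return [sum(twitch(i * dt - tp) for tp in pulse_times) for i in range(n_steps)]

def ripple_and_mean(trace):
    """Relative ripple and mean force over the second half of the trace."""
    tail = trace[len(trace) // 2:]
    peak, trough = max(tail), min(tail)
    return (peak - trough) / peak, sum(tail) / len(tail)

ripple_5hz, mean_5hz = ripple_and_mean(force_trace(5.0))
ripple_50hz, mean_50hz = ripple_and_mean(force_trace(50.0))
print(f"5 Hz:  ripple {ripple_5hz:.2f}, mean force {mean_5hz:.2f}")
print(f"50 Hz: ripple {ripple_50hz:.2f}, mean force {mean_50hz:.2f}")
```

In this model the low-frequency train leaves force returning almost to zero between pulses, while the high-frequency train gives a nearly smooth and larger force. The model does not represent fatigue.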

#### **TRANSCUTANEOUS STIMULATION**

Transcutaneous stimulation, also known as surface stimulation, relies on stimulating electrodes placed on the skin surface directly over the muscle motor points (i.e., locations that produce an optimal balance between contraction strength and stimulation amplitude) (Hirokawa et al., 1990; Scremin et al., 1999; Mangold et al., 2005). This non-invasive, reversible, and inexpensive technique has been successfully used in locomotion and hand grasp systems (Kralj and Bajd, 1989; Popovic et al., 2005). However, transcutaneous muscle stimulation has multiple practical limitations. Specifically, the skin offers a high resistance compared to muscle tissue (Bîrlea et al., 2014). For this reason, higher stimulation currents (>30 mA) are required to achieve desired motor responses using surface stimulation (Triolo et al., 2001; Lujan and Crago, 2009). Additionally, the limited degree of selectivity can lead to activation of antagonist muscle groups or an inability to selectively activate deep muscle groups (Schmit and Mortimer, 1997; Triolo et al., 2001). Furthermore, current spread due to suboptimal electrode placement and limited stimulation specificity can result in pain (Niddam et al., 2001).

#### **PERCUTANEOUS STIMULATION**

Percutaneous stimulation systems rely on intramuscular needle electrodes that pass through the skin and stimulate target muscles (Caldwell and Reswick, 1975; Stanic et al., 1978; Malezic et al., 1984; Marsolais and Kobetic, 1986; Bogataj et al., 1989). This allows activation of deep muscles and provides isolated, selective, and repeatable muscle contractions. Percutaneous stimulation requires lower stimulation intensities compared to transcutaneous stimulation. However, increased risks of infection, lead breakage, and movement restrictions limit the use of percutaneous stimulation outside of a laboratory environment (Knutson et al., 2002).

#### **IMPLANTED INTRAMUSCULAR AND PERIPHERAL NERVE STIMULATION**

Implanted neurostimulation systems are associated with electrical current delivery via both intramuscular and nerve cuff electrodes (Rabischong and Ohanna, 1992; Peckham et al., 2002; Guiraud et al., 2006). As the name implies, intramuscular stimulation relies on electrodes implanted directly into the muscle (Crago et al., 1980; Hobby et al., 2001; Peckham et al., 2001, 2002; Peckham and Knutson, 2005; Kilgore et al., 2008). Peripheral nerve stimulation relies on electrode cuffs that are surgically placed around nerves innervating target muscles (Stein et al., 1975; Hoffer et al., 1996; Strange and Hoffer, 1999; Sinkjaer, 2000; Branner et al., 2001; Brill et al., 2009; Fisher et al., 2009; Polasek et al., 2009). Although capable of evoking strong, selective, and repeatable muscle activation, intramuscular and nerve cuff stimulation techniques often recruit the largest and most fatigable motor units first, resulting in early fatigue onset (Popovic et al., 2002). Discontinuous activation of muscle compartments and interleaved frequency stimulation have both been reported to delay fatigue onset (Boom et al., 1993; McDonnall et al., 2004). Saigal et al. demonstrated fatigue-resistant stepping in a spinalized cat by stimulating the lumbosacral cord via interleaved stimulation (Saigal et al., 2004). Interleaved stimulation reduces muscle fatigue by decreasing the stimulation frequency delivered through each individual electrode (Mushahwar and Horch, 1997; Tai et al., 2000). The asynchronous nature of interleaved stimulation is designed to evoke fused contractions despite a lack of tetanic firing in individual motor units. However, the limited number of controllable degrees of freedom, high power consumption, and other technological and practical limitations have restricted the widespread application of electrical stimulation therapy outside research environments (Peckham and Knutson, 2005; Ragnarsson, 2008; Creasey and Craggs, 2012).
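
The timing idea behind interleaved stimulation can be sketched as a round-robin pulse schedule. The function name, the three-electrode configuration, and the 30 Hz composite rate below are hypothetical values for illustration, not parameters reported in the cited studies.

```python
def interleaved_schedule(n_electrodes, composite_hz, duration_s):
    """Round-robin pulse schedule as (time_s, electrode_index) pairs.

    Pulses reach the muscle at composite_hz overall, while each
    electrode fires at only composite_hz / n_electrodes.
    """
    n_pulses = int(round(duration_s * composite_hz))
    return [(k / composite_hz, k % n_electrodes) for k in range(n_pulses)]

schedule = interleaved_schedule(n_electrodes=3, composite_hz=30.0, duration_s=1.0)
per_electrode = {e: [t for t, el in schedule if el == e] for e in range(3)}

for e, times in per_electrode.items():
    interval = times[1] - times[0]
    print(f"electrode {e}: {len(times)} pulses, {1.0 / interval:.1f} Hz")
```

Each electrode, and therefore each group of motor units it recruits, is driven at one third of the composite rate, while the staggered timing keeps the summed pulse train at the composite rate.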

## **SPINAL CORD STIMULATION**

Direct stimulation of the spinal cord may be advantageous over conventional FES techniques as spinal stimulation provides an opportunity to directly activate higher level circuitry, which oversees and coordinates motor function (Minassian et al., 2004, 2007; Bamford, 2005; Gerasimenko et al., 2008; Bamford and Mushahwar, 2011; Holinski et al., 2011; van den Brand et al., 2012; Angeli et al., 2014). Two modalities of spinal stimulation have been described: epidural and intraspinal stimulation.

In epidural stimulation, stimulating electrodes are placed directly over the spinal cord (Lavrov et al., 2008; Hachmann et al., 2013). Two recent studies reported that neuromodulation of spinal circuitry via epidural stimulation, combined with intense physical rehabilitation, allowed individuals with incomplete and complete SCI to process conceptual, auditory and visual feedback to regain voluntary control of paralyzed muscles for short periods. Results of these studies suggest some degree of residual connectivity through the area of SCI (Harkema et al., 2011; Angeli et al., 2014). These studies, although promising, require rigorous patient selection and replication in larger patient populations.

In intraspinal microstimulation (ISMS), stimulating electrodes are implanted within the ventral gray matter of the spinal cord (Bamford and Mushahwar, 2011). ISMS is hypothesized to directly activate alpha motor neurons, preferentially activating fatigue resistant muscle fibers (Gorman, 2000; Bamford, 2005). Several studies have highlighted the potential of ISMS to restore bladder and respiratory function, as well as upper and lower extremity function in animal models (Mushahwar and Horch, 2000a,b; Mushahwar et al., 2002; Moritz et al., 2007; Bamford et al., 2010; Bamford and Mushahwar, 2011; Nishimura et al., 2013; Sunshine et al., 2013).

## **INTRASPINAL MICROSTIMULATION (ISMS)**

Intraspinal stimulation has been extensively used to study the effects of electrical stimulation on the central nervous system, as well as synaptic delays and network interconnections across spinal pathways (Renshaw, 1946; Jankowska and Roberts, 1972a,b; Gustafsson and Jankowska, 1976). More recently, ISMS has been used to investigate the organization of motor circuitry within the spinal cord in amphibian, rodent, and feline animal models (Bizzi et al., 1991; Giszter et al., 1993; Tresch and Bizzi, 1999; Lemay et al., 2001, 2009; Saltiel et al., 2001; Lemay and Grill, 2004).

Similarly, over the past 15 years, ISMS has been used to investigate restoration of motor function in spinalized and anesthetized rodents and cats (Mushahwar et al., 2002; Bamford, 2005; Pikov et al., 2007; Yakovenko et al., 2007; Holinski et al., 2011; Kasten et al., 2013; Sunshine et al., 2013). Work performed by Lau et al. demonstrated that ISMS is capable of producing standing in cats for over 20 min (Lau et al., 2007). The lower stimulation amplitudes associated with intraspinal stimulation (on the order of a few microamperes) are believed to be, at least in part, responsible for the longer periods of muscle contraction observed (Bamford, 2005). Other studies suggest that the fatigue resistance observed with ISMS techniques is the result of preferential activation of type I slow-twitch fatigue-resistant motor fibers (Mushahwar, 2000; Mushahwar and Horch, 2000a; Saigal et al., 2004; Bamford, 2005; Nishimura et al., 2013). Moreover, Bamford et al. showed ISMS recruitment of up to 44% fatigue-resistant muscle fibers compared to less than 1% fatigue-resistant muscle fibers recruited using peripheral nerve cuff stimulation (Caldwell and Reswick, 1975; Marsolais and Kobetic, 1986; Bamford, 2005). As such, when combined with interleaved stimulation, ISMS has been associated with a further decrease in muscle fatigue (Rack and Westbury, 1969; McDonnall et al., 2004; Lau et al., 2007; Mushahwar et al., 2007).

The close proximity of spinal motor centers to higher control centers responsible for controlling motor function, together with the improved fatigue response, makes ISMS an excellent alternative for restoring locomotor function in individuals with SCI (Etlin et al., 2014; Guertin, 2014). However, before spinal or other electrical stimulation technology can be used clinically to improve quality of life for individuals with SCI, appropriate stimulation control paradigms must be established.

## **OPTIMAL CONTROL PARADIGMS**

Electrical stimulation systems have been previously used to assist respiratory function (Kaneyuki et al., 1977; Gorman, 2000; Posluszny et al., 2014), hand grasp (Avestruz et al., 2008; Skarpaas and Morrell, 2009; Rosin et al., 2011; Gan et al., 2012; Basu et al., 2013; Grant and Lowery, 2013), locomotion (Behrend et al., 2009), as well as bladder and bowel function (Lee et al., 2004; Shon et al., 2010a,b; MacDonald et al., 2013) in patients with SCI. These FES systems have relied on a variety of control strategies, ranging from linear models to adaptive controllers, all aimed at enhancing stimulation-evoked functional responses. Many neuroprosthetic control systems rely on feedforward configurations (Moro et al., 1999; Molinuevo et al., 2000), in which controller output depends only on user inputs (e.g., stimulus parameters). These controllers have fast response times, but do not make corrections if the target and actual outputs differ (Lee et al., 2009). Furthermore, these controllers will not alter their response in the face of unexpected internal or external perturbations (Blaha and Phillips, 1996; Lee et al., 2006). However, the highly non-linear nature of muscle responses, coupled with environmental perturbations found in activities of daily living, requires that optimal neuroprosthetic control paradigms rely on feedback signals. Feedback-based control systems continuously monitor musculoskeletal system outputs and adjust stimulation parameters if the stimulation-evoked musculoskeletal system outputs (e.g., limb position, force) differ from the desired outputs (Lujan and Crago, 2009; Griessenauer et al., 2010; Chang et al., 2012). This allows the system to respond to and compensate for unforeseen perturbations.
Feedback control has been previously used for control of hand grasp (Lujan and Crago, 2009), standing posture (Fraix et al., 2006; Rosin et al., 2011), and locomotion (Roham et al., 2007; Takmakov et al., 2010; Fitzgerald, 2014) in SCI individuals. Simple feedback control can be improved by using adaptive systems (Karniel and Inbar, 2000; Kobravi and Erfanian, 2012). Adaptive algorithms modify controller behavior in response to changes in the system and the environment (Chizek et al., 1988; Narendra, 1990; Narendra and Parthasarathy, 1990; Teixeira et al., 1991; Kostov et al., 1995; Davoodi and Andrews, 1998, 1999; Jonić et al., 1999; Abbas and Riener, 2001).
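
The difference between feedforward and feedback behavior under a perturbation can be shown with a small simulation. The sketch below uses an assumed first-order plant, a constant disturbance switched on halfway through the run, and a proportional-integral (PI) controller as one simple example of feedback. All gains, time constants, and function names are illustrative assumptions rather than values from any cited system.

```python
def simulate(controller, target=1.0, steps=2000, dt=0.01, tau=0.2,
             disturbance=-0.3, disturb_at=1000):
    """First-order plant; a constant disturbance is added from step disturb_at onward."""
    y, integral, outputs = 0.0, 0.0, []
    for k in range(steps):
        error = target - y
        integral += error * dt
        u = controller(target, error, integral)
        d = disturbance if k >= disturb_at else 0.0
        y += (dt / tau) * (u + d - y)
        outputs.append(y)
    return outputs

def feedforward(target, error, integral):
    """Open loop: the command depends only on the target (plant gain of 1 assumed)."""
    return target

def pi_feedback(target, error, integral, kp=2.0, ki=5.0):
    """Proportional-integral feedback on the measured error."""
    return kp * error + ki * integral

ff = simulate(feedforward)
fb = simulate(pi_feedback)
print(f"feedforward: before {ff[999]:.3f}, after disturbance {ff[-1]:.3f}")
print(f"PI feedback: before {fb[999]:.3f}, after disturbance {fb[-1]:.3f}")
```

Both controllers reach the target before the disturbance. Afterward the feedforward output settles at an offset equal to the disturbance, whereas the integral term of the feedback controller removes the steady-state error.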

Studies have demonstrated the ability of neural networks to successfully control motor neuroprostheses, both in paraplegic (Riess and Abbas, 1999, 2000, 2001; Nataraj et al., 2013) and tetraplegic individuals (Fujita et al., 1998; Lujan and Crago, 2009). Artificial neural networks (ANNs) can model static and dynamic non-linear systems (Durfee, 1989; Funahashi, 1989; Hornik et al., 1989; Chakraborty et al., 1992; Barron, 1993; Lan et al., 1994; Piche, 1994; Graupe and Kordylewski, 1995; Hassoun, 1995; Kostov et al., 1995; Chang et al., 1996; Chen et al., 1997; Demuth and Beale, 2000). Additionally, ANNs can generalize from experimental input/output data, eliminating the need for analytical models of the system (Funahashi, 1989; Hornik et al., 1989; Graupe and Kordylewski, 1995; Hassoun, 1995; Narendra, 1996; Demuth and Beale, 2000). Furthermore, ANNs are relatively insensitive to noise and can be readily implemented in hardware (Narendra, 1996). Moreover, ANN-based controllers allow changes to the controller without requiring changes in data collection or controller training methods. Backpropagation neural networks have been used to model the non-linear relationship between stimulus intensity and stimulation-evoked responses (Fujita et al., 1998; Lujan and Crago, 2009). Additionally, ANNs have been successfully used to create inverse dynamic models of musculoskeletal systems for neuroprosthetic control (Chang et al., 1997; Yoshida et al., 2002). These models are particularly useful for learning the characteristics of electrically-activated muscles in coupled multijoint systems acted upon by redundant muscles (Adamczyk and Crago, 1997, 2000; Lujan and Crago, 2009).

Thus, optimal neuroprosthetic control systems should rely on a combination of non-linear feedforward and feedback techniques in order to pre-emptively reduce the amount of error in real-time while minimizing time delays inherent to feedback control systems. Development of such optimal closed-loop neuroprosthetic controllers will require high-quality sensors that can withstand daily use across a wide range of activities.
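
A minimal sketch of this combination follows. It assumes a sigmoidal recruitment curve as the nominal muscle model, uses the analytic inverse of that curve as the feedforward term (standing in for a learned inverse model), and adds an integral feedback correction. The "real" muscle is taken to be 20% weaker than the model to mimic fatigue. All names and numerical values are hypothetical.

```python
import math

PW50, SLOPE = 100.0, 20.0   # assumed recruitment-curve midpoint and slope (microseconds)

def recruitment(pw):
    """Nominal normalized force evoked by pulse width pw (sigmoid, assumed)."""
    return 1.0 / (1.0 + math.exp(-(pw - PW50) / SLOPE))

def inverse_recruitment(force):
    """Feedforward term: pulse width the nominal model predicts for a force in (0, 1)."""
    return PW50 + SLOPE * math.log(force / (1.0 - force))

def run(target, use_feedforward, use_feedback, fatigue=0.8, ki=50.0, steps=200):
    """Return the force error at every update; the real muscle is weaker than the model."""
    correction, errors = 0.0, []
    base = inverse_recruitment(target) if use_feedforward else 0.0
    for _ in range(steps):
        force = fatigue * recruitment(base + correction)
        error = target - force
        errors.append(error)
        if use_feedback:
            correction += ki * error   # integral feedback on pulse width
    return errors

ff_only = run(0.5, True, False)
fb_only = run(0.5, False, True)
combined = run(0.5, True, True)
print(f"first-update error: feedback only {fb_only[0]:.3f}, combined {combined[0]:.3f}")
print(f"final error: feedforward only {ff_only[-1]:.3f}, combined {combined[-1]:.6f}")
```

In this toy setting the feedforward term places the first command close to the target, and the feedback term removes the residual error that the mismatched model leaves behind.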

## **FEEDBACK SIGNALS FOR OPTIMAL CONTROL OF NEURAL PROSTHESES**

Neuroprosthetic systems with feedback control are capable of identifying, decoding, and extracting features from appropriate input signals in order to respond to unforeseen perturbations and changes in the environment (Bhadra et al., 2002; Dominici et al., 2012; Holinski et al., 2013). However, optimal feedback modulation for clinical application will require fully implantable smart sensors that provide consistent and reliable chronic information to the control system (Shih et al., 2012; Peckham and Kilgore, 2013). There is already a wide range of sensors that can detect and measure information about the system and its environment. The most commonly used sensors include electrophysiological sensors, chemical sensors, force transducers, and magnetic sensors. Electrophysiological sensors measure potential differences generated by muscle (i.e., myoelectric signals) and neural tissue (e.g., electroencephalogram, electrocorticogram, electroneurogram) (Leuthardt et al., 2004; Müller-Putz et al., 2005; Holinski et al., 2013). These sensors can monitor muscle state and evaluate expected muscle responses. In turn, this allows adaptation of stimulation parameters in the presence of muscle fatigue (Hayashibe et al., 2011; Zhang et al., 2013). Chemical sensors (e.g., carbon fiber microelectrodes coupled to fast scan cyclic voltammetry devices) can detect changes in stimulation-evoked analytes (e.g., neurotransmitters) (Bledsoe et al., 2009; Chang et al., 2012) that can be used to modulate stimulation levels. Force transducers (e.g., piezoelectric devices, accelerometers) can be used to detect changes in limb position, ground reaction forces, heel strike, and other events that are critical for event detection and optimal control of stimulation (Tan et al., 2004). Magnetic sensors detect changes in magnetic fields and can be used to detect limb position and orientation (Bhadra et al., 2002; Tan et al., 2004).
However, having reliable sensors is not enough to develop an optimal feedback controller. In order for the signals measured by these sensors to be of clinical use, they must be properly decoded and integrated with both existing and novel neuroprosthetic control systems (Shadmehr et al., 2010). This will most likely happen by way of a brain machine interface.

## **BRAIN MACHINE INTERFACES**

Brain machine interfaces (BMI) are neural interface systems that can record, analyze, and decode brain signals (Wang et al., 2010) to infer volitional intent, which in turn can be used to control limb movement and assistive devices (**Figure 2**) (Leuthardt et al., 2004; Hochberg et al., 2006; Schwartz et al., 2006; Miller et al., 2010; Carmena, 2012; Fifer et al., 2012). Brain commands may be recorded using sensors located on the scalp (electroencephalogram), the surface of the brain (electrocorticogram), or the brain parenchyma using intracortical electrodes that record activity from single neurons (single unit recording) or groups of neurons (local field potentials) (**Figure 3**). Electroencephalographic recordings offer a non-invasive recording technique that is safe and easy to implement. However, controlling multiple degrees of freedom with electroencephalographic signals has proven difficult due to challenges with extracting and classifying individual signal features as well as an inherently low spatial resolution (Yang et al., 2011). Single unit recordings and local field potentials offer excellent signal resolution, but are highly invasive (Buzsáki et al., 2012). Single unit recordings capture the activity of distinct neurons. The high spatial and temporal resolution provided by single unit recordings allows for precise measurements of neuronal spikes (Buzsáki et al., 2012). The drawback of single unit recordings is the difficulty of isolating specific neural activity due to crosstalk from neighboring cells (Bai and Wise, 2001). Furthermore, single unit recordings can be biased toward activity from larger neurons adjacent to the intended neuron (Buzsáki et al., 1983). Finally, electrode migration, immune responses (e.g., glial scarring), and disruption of surrounding neural tissue interfere with signal quality and limit reliable single unit activity to acute recording conditions (Carter and Houk, 1993; Polikov et al., 2005).
Local field potentials reflect a weighted average of integrative processes and associations between cells that can be detected over longer distances through extracellular space (Logothetis, 2003a,b; Andersen et al., 2004; Bronte-Stewart et al., 2009; Buzsáki et al., 2012; Rosa et al., 2012). Unfortunately, the longer recording range of local field potential techniques is associated with decreased spatial resolution. Electrocorticography presents a good balance between risks and benefits, as it provides good spatiotemporal resolution without damaging underlying cortical tissue (Leuthardt et al., 2004; Wilson et al., 2006; Schalk et al., 2008; Moran, 2010; Slutzky et al., 2010).

Extracted brain signals must undergo filtering to remove movement artifacts and electrical noise before they can be used by a BMI and neuroprosthetic controller to generate motor commands (Kowalski et al., 2013). Filtered signals must be analyzed using classifiers and signal processing algorithms that identify unique features or signatures (Kowalski et al., 2013). In turn, these features are mapped to specific functions and/or degrees of freedom that control neuroprosthetic systems and assistive devices (Pfurtscheller et al., 2003; Musallam et al., 2004; Müller-Putz et al., 2005; Jackson et al., 2006; Moritz et al., 2008; Daly et al., 2009; Chadwick et al., 2011).
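
The filter, feature extraction, classification, and mapping steps described above can be sketched end to end on synthetic data. In the example below, the signal model (a 10 Hz rhythm that is weaker during attempted movement), the sampling rate, the single band-power feature, the nearest-centroid classifier, and the command names are all assumptions made for illustration. Practical systems use more elaborate filters, features, and classifiers.

```python
import math
import random

FS = 100.0   # assumed sampling rate (Hz)
N = 200      # samples per 2 s analysis window

def make_trial(label, rng):
    """Synthetic recording: the 10 Hz rhythm is assumed weaker during 'move'."""
    amp = 1.0 if label == "rest" else 0.3
    return [0.5 + amp * math.sin(2 * math.pi * 10.0 * i / FS) + rng.gauss(0.0, 0.1)
            for i in range(N)]

def remove_offset(x):
    """Crude filtering step: subtract the mean to remove the DC offset."""
    mean = sum(x) / len(x)
    return [v - mean for v in x]

def band_power(x, f_lo, f_hi):
    """Average DFT power over the bins between f_lo and f_hi (Hz)."""
    n = len(x)
    powers = []
    for k in range(1, n // 2):
        if f_lo <= k * FS / n <= f_hi:
            re = sum(v * math.cos(2 * math.pi * k * i / n) for i, v in enumerate(x))
            im = sum(v * math.sin(2 * math.pi * k * i / n) for i, v in enumerate(x))
            powers.append((re * re + im * im) / (n * n))
    return sum(powers) / len(powers)

def feature(x):
    """One feature per trial: log power in the 8-12 Hz band."""
    return math.log(band_power(remove_offset(x), 8.0, 12.0))

def train(rng, trials_per_class=5):
    """Class centroids of the feature, estimated from labeled synthetic trials."""
    centroids = {}
    for label in ("rest", "move"):
        values = [feature(make_trial(label, rng)) for _ in range(trials_per_class)]
        centroids[label] = sum(values) / len(values)
    return centroids

def decode(x, centroids):
    """Nearest-centroid classification, then a lookup from class to device command."""
    f = feature(x)
    label = min(centroids, key=lambda c: abs(f - centroids[c]))
    return {"rest": "hold", "move": "close_hand"}[label]

rng = random.Random(0)
centroids = train(rng)
print(decode(make_trial("move", rng), centroids))
print(decode(make_trial("rest", rng), centroids))
```

With a single feature the classifier reduces to a threshold on band power. Controlling several degrees of freedom would require several separable features, which is the difficulty the surrounding text describes.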

Pioneering work by Georgopoulos et al. used single unit recordings to establish a high degree of correlation between arm movement and cortical activity within a non-human primate (Georgopoulos et al., 1986). Subsequently, several studies in non-human primates and SCI survivors have demonstrated stable, chronic, intracortical recordings using microelectrode arrays such as the Utah and Michigan arrays (Wessberg et al., 2000; Serruya et al., 2002; Taylor et al., 2002; Pfurtscheller et al., 2003; Suner et al., 2005; Cheung, 2007; Cheung et al., 2007; Moritz et al., 2008; Langhals and Kipke, 2009; Sharma et al., 2010, 2011; Do et al., 2011; Hochberg et al., 2012). Cortical signatures can be identified from their spatial, temporal, and frequency-dependent features (Nicolas-Alonso and Gomez-Gil, 2012). However, BMI application to complex neuroprosthetic control has been limited due to the difficulty of extracting sufficient numbers of unique signatures for control of systems with multiple degrees of freedom (Shih et al., 2012). Ongoing efforts in decoding algorithms, together with advances in neural training techniques such as motor imagery, have recently improved feature extraction, allowing SCI survivors to control complex movements using BMI (Wang et al., 2009, 2013; Chao et al., 2010; Yanagisawa et al., 2011).
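
One widely used way to illustrate how movement direction can be read out from single unit activity is the population vector, in which each cell votes for its preferred direction in proportion to its firing rate. The sketch below assumes idealized, noise-free cosine tuning and eight cells with evenly spaced preferred directions. These choices, along with the baseline rate and tuning depth, are assumptions for illustration and are not taken from this review.

```python
import math

def tuned_rates(direction, preferred, baseline=20.0, depth=15.0):
    """Cosine tuning: each cell fires most for movement along its preferred direction."""
    return [baseline + depth * math.cos(direction - p) for p in preferred]

def population_vector(rates, preferred, baseline=20.0):
    """Sum preferred-direction unit vectors weighted by rate above baseline;
    return the angle of the resulting vector in radians."""
    x = sum((r - baseline) * math.cos(p) for r, p in zip(rates, preferred))
    y = sum((r - baseline) * math.sin(p) for r, p in zip(rates, preferred))
    return math.atan2(y, x)

preferred = [2 * math.pi * i / 8 for i in range(8)]   # 8 evenly spaced cells (assumed)
true_direction = math.radians(60.0)
decoded = population_vector(tuned_rates(true_direction, preferred), preferred)
print(round(math.degrees(decoded), 3))
```

With noisy rates or unevenly distributed preferred directions the decoded angle would deviate from the true one, which is part of why larger recorded populations and more capable decoders are needed for multiple degrees of freedom.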

## **CONCLUSIONS**

Recent advances in the fields of BMIs and electrical stimulation therapy provide a promising outlook for patients with SCI. However, it is clear that successful restoration of independence for SCI survivors requires integration of selective electrical stimulation techniques, feedback control, and optimal control algorithms. As is the case in normal human neurophysiology, selective muscle activation as well as integration of force feedback, balance, proprioception, and reduction of muscle fatigue are all critical for motor function. Therefore, next-generation closed-loop neuroprosthetic systems must integrate fully implantable multichannel stimulators and feedback sensors with adaptive control systems. Furthermore, control algorithms must be designed for seamless integration with BMI systems and real-time processing, integration, and transmission of feedback control signals. Devices that are capable of coupling such novel stimulation, intention detection, proprioceptive sensing, and control algorithms are currently under development, with clinical translation just beyond the horizon. Ultimately, these technologies will provide SCI survivors with increased independence in daily life, improved overall health, and enhanced quality of life.

## **ACKNOWLEDGMENTS**

This work was supported by NIH R21 NS087320, The Grainger Foundation, and a gift from Louise Chapman.

## **REFERENCES**


complete paralysis in humans. *Brain* 137, 1394–1409. doi: 10.1093/brain/awu038


Demuth, H., and Beale, M. (2000). *Neural Network Toolbox: For Use with MATLAB.* Natick, MA: The Mathworks, Inc.


Hassoun, M. H. (1995). *Fundamentals of Artificial Neural Networks.* Google Books.


Riess, J. A., and Abbas, J. J. (1999). "Control of cyclic movements as muscles fatigue using functional neuromuscular stimulation," in *IEEE* (Atlanta, GA), 659.


electrical stimulation. *IEEE Trans. Biomed. Eng.* 60, 2299–2307. doi: 10.1109/TBME.2013.2253777

**Conflict of Interest Statement:** Intellectual property licensed to Boston Scientific (J. Luis Lujan). The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 06 April 2014; accepted: 31 August 2014; published online: 17 September 2014.*

*Citation: Grahn PJ, Mallory GW, Berry BM, Hachmann JT, Lobel DA and Lujan JL (2014) Restoration of motor function following spinal cord injury via optimal control of intraspinal microstimulation: toward a next generation closed-loop neural prosthesis. Front. Neurosci. 8:296. doi: 10.3389/fnins.2014.00296*

*This article was submitted to Neuroprosthetics, a section of the journal Frontiers in Neuroscience.*

*Copyright © 2014 Grahn, Mallory, Berry, Hachmann, Lobel and Lujan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Treatment of phantom limb pain (PLP) based on augmented reality and gaming controlled by myoelectric pattern recognition: a case study of a chronic PLP patient

## *Max Ortiz-Catalan1,2\*, Nichlas Sander 1, Morten B. Kristoffersen1,2, Bo Håkansson1 and Rickard Brånemark2*

*<sup>1</sup> Biomedical Engineering Division, Department of Signals and Systems, Chalmers University of Technology, Gothenburg, Sweden*

*<sup>2</sup> Department of Orthopaedics, Centre of Orthopaedic Osseointegration, Sahlgrenska University Hospital, Gothenburg, Sweden*

#### *Edited by:*

*David Guiraud, Institut National de la Recherche en Informatique et Automatique, France*

#### *Reviewed by:*

*Martin Lotze, University of Greifswald, Germany Alireza Mousavi, Brunel University, UK*

#### *\*Correspondence:*

*Max Ortiz-Catalan, Division of Biomedical Engineering, Department of Signals and Systems, Chalmers University of Technology, Hörsalsvägen 11, 41296 Gothenburg, Sweden e-mail: maxo@chalmers.se*

A variety of treatments have been historically used to alleviate phantom limb pain (PLP) with varying efficacy. Recently, virtual reality (VR) has been employed as a more sophisticated mirror therapy. Despite the advantages of VR over a conventional mirror, this approach has retained the use of the contralateral limb and is therefore restricted to unilateral amputees. Moreover, this strategy disregards the actual effort made by the patient to produce phantom motions. In this work, we investigate a treatment in which the virtual limb responds directly to myoelectric activity at the stump, while the illusion of a restored limb is enhanced through augmented reality (AR). Further, phantom motions are facilitated and encouraged through gaming. The proposed set of technologies was administered to a chronic PLP patient who has shown resistance to a variety of treatments (including mirror therapy) for 48 years. Individual and simultaneous phantom movements were predicted using myoelectric pattern recognition and were then used as input for VR and AR environments, as well as for a racing game. The sustained level of pain reported by the patient was gradually reduced to complete pain-free periods. The phantom posture initially reported as a strongly closed fist was gradually relaxed, interestingly resembling the neutral posture displayed by the virtual limb. The patient acquired the ability to freely move his phantom limb, and a telescopic effect was observed where the position of the phantom hand was restored to the anatomically correct distance. More importantly, the effect of the interventions was positively and noticeably perceived by the patient and his relatives. Despite the limitation of a single case study, the successful results of the proposed system in a patient for whom other medical and non-medical treatments have been ineffective justifies and motivates further investigation in a wider study.

**Keywords: phantom limb pain, augmented reality, virtual reality, myoelectric control, electromyography, pattern recognition, neurorehabilitation**

## **BACKGROUND**

Phantom limb pain (PLP) is a common and deteriorating condition suffered by ∼70% of amputees (Dijkstra et al., 2002), regardless of the cause of amputation (Clark et al., 2013). In recent years, virtual reality (VR) has been used to treat PLP as a more technologically sophisticated version of the well-known "mirror" therapy introduced in 1996 (Ramachandran and Rogers-Ramachandran, 1996). VR has clear advantages over the physical constraints imposed by the conventional mirror box, as it allows a wider range of motion and rehabilitation exercises. In addition, VR allows interactive games that challenge patients with varying levels of difficulty, while keeping them entertained and motivated (Sveistrup, 2004). Contemporary reviews of the use of VR in neuromuscular rehabilitation are given in Sveistrup (2004) and Holden (2005).

To date, VR mirror therapy has relied on patients commanding the same motor execution in both limbs. A virtual representation of the missing limb is then created to match the motions of the contralateral limb, thus delivering visual feedback (Murray et al., 2006a,b; Mercier and Sirigu, 2009; Bach et al., 2010). Since the sound limb is required, this approach is only suitable for unilateral amputees. The patients have no direct volitional control of their phantom limb virtual representation. Instead, they simultaneously execute the same motions in both limbs. In this setup, the real effort and commitment of the patient to produce phantom limb motions is not part of the intervention, i.e., the mirror limb will move as long as the sound limb does, and regardless of the intention of the phantom limb. Additionally, it has been suggested that the variable efficacy of this therapy across subjects is mainly due to the difference in individual susceptibility to the visual feedback, rather than the physiological condition itself (Mercier and Sirigu, 2009). We hypothesize that the higher degree of realism provided by augmented reality (AR), together with direct volitional control through the prediction of motion intent using myoelectric signals at the stump, could improve the efficacy of this therapy. Furthermore, the addition of game control by phantom limb motions should help to engage the patient in executing these movements and, since only the amputated limb is involved, it is also suitable for bilateral amputees.

VR-based treatment in which the virtual limb is controlled by the affected side has been previously explored with motion tracking technology (Cole et al., 2009), which inherently, and considerably, restricts the number of predictable motions. Here we show that myoelectric pattern recognition allows for the accurate prediction of hand, wrist, and elbow motions as intended in an intact limb.

The utilization of the stump musculature to control conventional myoelectric prostheses has long been thought to reduce PLP (Lotze et al., 1999), even though, most commonly, the controlling muscle contractions are not originally related to the end actuation (i.e., in a trans-humeral amputee, an electrode over the biceps muscles controls the closing of a prosthetic hand). However, even if the musculature for physiologically appropriate actuation is no longer present, it has been shown that amputees are able to distinguish between imagining a phantom movement and actually executing it. This suggests that the ability to naturally execute a movement is maintained after amputation, but more importantly, the effect on neuroplasticity and inter-hemispheric communication is different when practicing motor execution and motor imagery (Raffin et al., 2012a,b). Experiments with implanted neural interfaces, which rely on the physiology of motor execution, have been shown to reduce PLP (Di Pino et al., 2012). This supports the use of direct volitional control through myoelectric signals at the stump, with the advantage that the system presented here is non-invasive and allows the equivalent of physiologically appropriate control (i.e., muscle synergies generated with the intention of closing the missing hand result in closing of the virtual hand).

It has been suggested that incongruencies between the visual stimulus and sensory perception produce varying results in terms of pain relief, in some cases increasing the pain (Desmond et al., 2006). This problem is avoided in our proposed myoelectrically controlled AR environment (MCARE), where a conventional webcam captures the whole environment around the patient and integrates it in the rehabilitation task. To the best of our knowledge, this is the first time that AR, gaming, and the prediction of motion intent using myoelectric pattern recognition have been used together as a treatment for PLP. Comprehensive reviews of PLP are given in Nikolajsen and Jensen (2001) and Flor et al. (2006). In this work, the results of using MCARE in a chronic, treatment-resistant PLP patient are reported.

A chronic PLP patient for whom other treatments have proven ineffective was recruited to this study. The patient (male, 72 years old) lost his arm just below the elbow joint in 1965 due to a traumatic injury. He has experienced PLP since the amputation and reported a strongly closed fist as the permanent posture of his phantom hand. The PLP has continued over the years, despite conventional mirror therapy, different drug-based treatments, acupuncture, and self-suggested hypnosis. The patient has reported living with constant burning pain of an intensity of 3 on a scale from 0 to 10 (SF-MGPQ; Melzack, 1987), with episodes that escalated up to the maximum intensity approximately every hour for a few minutes, reported as excruciating pain. In addition, the patient was normally woken at night due to intense episodes of pain.

## **METHODS**

#### **PAIN TRACKING**

Pain perception was monitored after every session using the short-form McGill pain questionnaire (SF-MGPQ) (Melzack, 1987) translated into Swedish (Burckhardt and Bjelle, 1994). The questionnaire was administered by a facilitator, with the exception of pain intensity where the patient noted the rating directly on the visual analogue scale. A percentage of total time at each level of pain was also reported. Additionally, the patient was free to self-report any comments on the system and the treatment.
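The reported share of time at each pain level can be condensed into one time-weighted value per session. The following is a minimal sketch of that arithmetic only; the function name and the example percentages are hypothetical and are not the patient's data.

```python
def time_weighted_pain(distribution):
    """Return the time-weighted mean pain level.

    `distribution` maps a pain level (0-10 scale) to the percentage of
    total time reported at that level. Percentages must sum to 100.
    """
    total = sum(distribution.values())
    if abs(total - 100.0) > 1e-6:
        raise ValueError(f"percentages must sum to 100, got {total}")
    return sum(level * pct for level, pct in distribution.items()) / 100.0


# Hypothetical session report (NOT data from this study):
# 60% of the time at level 3, 30% at level 6, 10% at level 10.
example_report = {3: 60.0, 6: 30.0, 10: 10.0}
print(time_weighted_pain(example_report))
```

Such a summary would complement, not replace, the full distribution, since it hides the short episodes of maximum intensity.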

#### **CONTROL SOURCE**

The prediction of motion intent was made using BioPatRec, an open source platform initially developed for advanced prosthetic control strategies based on pattern recognition algorithms (Ortiz-Catalan et al., 2013). The myoelectric activity at the patient's stump was utilized as the sole input to determine the intended phantom limb motions. Once the intended motion is known, it can be used to command a variety of virtual environments and robotic devices. A custom-made AR environment was developed for this study to interface with BioPatRec and allow the patient to visualize himself (in real-time) with a virtual arm superimposed on his stump. The AR environment uses a conventional webcam which inputs a video feed that is analyzed to track a fiducial marker, thus allowing the virtual arm to remain in the anatomically correct position while the patient moves (see video in **Additional File 1**). The fiducial marker can be printed with a conventional printer. The virtual arm is superimposed on the marker and changes scale and rotation based on the tracking of the marker. These parameters can also be adjusted in real-time with the keyboard in order to improve the fitting of the virtual arm.
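How scale and rotation might follow from a tracked marker can be illustrated with simple geometry. The sketch below is an assumption-laden illustration, not the AR environment's actual code: it supposes that a tracker already returns the image coordinates of two adjacent marker corners, and all function and parameter names are hypothetical.

```python
import math


def marker_pose(top_left, top_right, reference_width_px):
    """Estimate overlay scale and in-plane rotation from two marker corners.

    top_left, top_right: (x, y) pixel coordinates of adjacent marker corners,
        assumed to come from some marker tracker (hypothetical input).
    reference_width_px: marker edge length in pixels at which the virtual
        arm is drawn at scale 1.0 (an assumed calibration value).
    """
    dx = top_right[0] - top_left[0]
    dy = top_right[1] - top_left[1]
    scale = math.hypot(dx, dy) / reference_width_px
    rotation_deg = math.degrees(math.atan2(dy, dx))
    return scale, rotation_deg


def adjust(scale, rotation_deg, scale_offset=0.0, rotation_offset_deg=0.0):
    """Apply manual corrections, e.g. ones entered from the keyboard."""
    return scale + scale_offset, rotation_deg + rotation_offset_deg


# Marker edge seen as 100 px wide and horizontal, calibrated at 50 px.
print(marker_pose((0, 0), (100, 0), 50.0))
```

A full implementation would also need the marker position and out-of-plane orientation; only the two parameters named in the text are covered here.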

#### **MYOELECTRIC RECORDINGS**

Eight bipolar electrodes (self-adhesive Ag/AgCl, Ø = 1 cm, and ∼2 cm inter-electrode distance) and the marker were placed around the stump, as shown in **Figure 1**. The location of the electrodes was defined by asking the patient to perform different movements and palpating the corresponding muscular activity. We have empirically found that this procedure, rather than pre-defined selective placement, helps to deal with the difficulties of a commonly altered anatomy at the most distal part of the stump. The movements requested were hand open/close, wrist pro/supination, wrist flexion/extension and elbow flexion/extension.

The amplifiers used were developed in-house (MyoAmpF2F4-VGI8) with embedded active filtering: a 4th order high-pass filter at 20 Hz, a 2nd order low-pass filter at 400 Hz, and a notch filter at 50 Hz. The signals were amplified with a gain of 2000 and digitized at 2 kHz and 16 bits.
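The filters above are part of the amplifier hardware. For readers who want to reproduce a comparable step in software, the sketch below shows a digital second-order notch at 50 Hz for signals sampled at 2 kHz. It is an illustrative counterpart only: the biquad design formulas are the widely used "audio EQ cookbook" ones, the quality factor `q` is an assumed value, and nothing here describes the amplifier's actual circuit.

```python
import math


def notch_coefficients(f0_hz, fs_hz, q):
    """Second-order (biquad) notch coefficients, normalized so a[0] == 1."""
    w0 = 2.0 * math.pi * f0_hz / fs_hz
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = [1.0 / a0, -2.0 * math.cos(w0) / a0, 1.0 / a0]
    a = [1.0, -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0]
    return b, a


def biquad_filter(b, a, samples):
    """Apply the biquad in direct form I, starting from zero state."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x0 in samples:
        y0 = b[0] * x0 + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out


FS_HZ = 2000.0  # sampling rate stated in the text


def tone(freq_hz, n_samples):
    """Unit-amplitude test sinusoid sampled at FS_HZ."""
    return [math.sin(2.0 * math.pi * freq_hz * i / FS_HZ) for i in range(n_samples)]


b, a = notch_coefficients(50.0, FS_HZ, q=10.0)  # q is an assumption
hum = biquad_filter(b, a, tone(50.0, 4000))      # mains interference
in_band = biquad_filter(b, a, tone(150.0, 4000)) # a frequency inside the EMG band

# After the start-up transient, the 50 Hz tone is suppressed while the
# 150 Hz tone passes almost unchanged.
print(max(abs(v) for v in hum[-1000:]), max(abs(v) for v in in_band[-1000:]))
```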

**FIGURE 1 | Setup for the myoelectrically controlled augmented reality environment (MCARE). (A)** Surface electrodes and a fiducial marker placed at the stump. **(B)** Environment captured by the webcam and displayed on a computer screen, with the addition of the virtual limb superimposed on the fiducial marker. **(C)** Patient playing a racing game in which he drives the car by phantom motions (Trackmania Nations Forever, free version). **(D)** Patient using the Target Achievement Control (TAC) test as a rehabilitation tool.

The protocol for myoelectric signal acquisition and processing is described in Ortiz-Catalan et al. (2013). The classifiers used were Linear Discriminant Analysis in a One-Vs-One topology (LDA-OVO), and Multi-layer Perceptron in a dedicated topology per degree of freedom (MLP-AAM), for individual and simultaneous movements, respectively. These classifiers have been shown to be successful at both tasks in real-time studies and are further described in Ortiz-Catalan et al. (in press).
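A one-vs-one topology trains one binary classifier for every pair of classes and lets them vote on each new sample. The sketch below illustrates only that voting structure and is not BioPatRec code: a nearest-centroid rule stands in for each pairwise LDA (an assumption that keeps the example free of dependencies), and the class labels and feature vectors are hypothetical.

```python
from collections import Counter
from itertools import combinations


def n_pairwise(n_classes):
    """Number of binary classifiers in a one-vs-one topology."""
    return n_classes * (n_classes - 1) // 2


def train_centroids(samples):
    """samples: dict label -> list of feature vectors. Returns label -> mean."""
    return {
        label: [sum(column) / len(vectors) for column in zip(*vectors)]
        for label, vectors in samples.items()
    }


def squared_distance(u, v):
    return sum((x - y) ** 2 for x, y in zip(u, v))


def predict_ovo(centroids, x):
    """Let every pair of classes 'duel' on x and return the most voted class."""
    votes = Counter()
    for first, second in combinations(sorted(centroids), 2):
        d_first = squared_distance(x, centroids[first])
        d_second = squared_distance(x, centroids[second])
        votes[first if d_first <= d_second else second] += 1
    return votes.most_common(1)[0][0]


# Hypothetical two-feature training data for three classes.
training = {
    "rest": [[0.0, 0.0], [0.1, 0.0]],
    "open hand": [[1.0, 0.0], [0.9, 0.1]],
    "close hand": [[0.0, 1.0], [0.1, 0.9]],
}
centroids = train_centroids(training)
print(n_pairwise(9), predict_ovo(centroids, [0.95, 0.05]))
```

With nine classes (eight movements plus "no movement"), this topology requires 36 pairwise classifiers. With the nearest-centroid stand-in the vote always agrees with a plain nearest-centroid decision; with genuine pairwise LDA models, each trained on its own pair of classes, the vote is no longer redundant.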

#### **INTERVENTION**

Once the electrodes and marker were in place, and the quality of the EMG signals was verified by short real-time myoelectric recordings, the subject was asked to perform the eight movements while being guided on the length and timing of the contractions by a virtual limb. The instructions given to the patient were to perform the motions "as if he still had the missing limb," thus aiming for physiologically appropriate myoelectric activity to be used for control. The LDA-OVO was trained with this information and the patient had a 10-min session in the AR environment in which the facilitator prompted the patient to perform the recorded movements one by one in random order (**Figures 1A,B**).

After the AR environment session, new EMG recordings were made for simultaneous movements using wrist pro/supination and elbow flexion/extension. This information was used to train the MLP-AAM, whose real-time predictions were used to play a racing game (Trackmania Nations Forever, free version). The game was controlled by using wrist pro/supination to turn left/right, while elbow flexion/extension controlled the car acceleration/deceleration. After a gaming session of ∼10 min, the same procedure was repeated for hand open/close and wrist flexion/extension, always in combination with elbow flexion/extension (**Figure 1C**). These combinations of motions were also used in the Target Achievement Control (TAC) test initially introduced by Simon et al. (2011b), with modifications described in Ortiz-Catalan et al. (in press). The artificial limb speed was two degrees/prediction (new predictions every 50 ms) and the target posture was displaced in 1 and 2 degrees of freedom (DoF). The velocity-ramp algorithm was used to facilitate controllability (Simon et al., 2011a). In this work, the TAC test was used for rehabilitation and training, rather than as an evaluation tool (**Figure 1D**).
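Two degrees per prediction at one prediction every 50 ms corresponds to a maximum joint speed of 40 degrees per second. The sketch below works through that arithmetic and adds a simplified decision-based velocity ramp. The ramp is only an illustration of the general idea (speed builds up while the same class keeps being predicted and drops for the others); the ramp length and the increment and decrement sizes are assumptions, and the code should not be read as the algorithm of Simon et al. (2011a) or as BioPatRec's implementation.

```python
PREDICTION_PERIOD_S = 0.050  # new prediction every 50 ms (from the text)
MAX_STEP_DEG = 2.0           # limb displacement per prediction (from the text)


def max_speed_deg_per_s():
    """Maximum joint speed implied by the two values above."""
    return MAX_STEP_DEG / PREDICTION_PERIOD_S


class VelocityRamp:
    """Simplified velocity ramp: one counter per movement class."""

    def __init__(self, classes, ramp_length=10, up=1, down=2):
        # ramp_length, up and down are assumed values for illustration.
        self.counters = {name: 0 for name in classes}
        self.ramp_length = ramp_length
        self.up = up
        self.down = down

    def step(self, predicted):
        """Update counters and return the displacement (degrees) for this prediction."""
        for name in self.counters:
            if name == predicted:
                self.counters[name] = min(self.ramp_length, self.counters[name] + self.up)
            else:
                self.counters[name] = max(0, self.counters[name] - self.down)
        return MAX_STEP_DEG * self.counters[predicted] / self.ramp_length


ramp = VelocityRamp(["elbow flexion", "elbow extension"], ramp_length=4)
stream = ["elbow flexion"] * 5 + ["elbow extension"]
print(max_speed_deg_per_s(), [ramp.step(p) for p in stream])
```

In this example an isolated misclassification (the final "elbow extension") moves the limb by only a fraction of the full step, which is the property that makes such a ramp helpful for controllability.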

Once the TAC tasks were completed, a new set of movements was recorded using all eight movements to conduct a "Motion Test" (Kuiken et al., 2009), as implemented in BioPatRec (Ortiz-Catalan et al., 2013). Similarly to the TAC test, the Motion Test was intended as a rehabilitation tool. Questionnaires were administered by the facilitator at the end of the Motion Tests, which concluded the session.

This protocol was applied once a week starting in March 2013 and this work includes the results up to week 18. In the last 5 weeks, two sessions a week were held, while in week eight no session was conducted because the patient was unavailable for reasons unrelated to treatment. A video showing examples of the interventions is available as **Additional File 1**.

## **RESULTS**

An increase in pain was reported by the patient after the first session; however, the pain decreased slightly below the original level in the second session, after which a slow yet consistent improvement was seen in the sustained level of pain. **Figure 2** illustrates the progress in pain reduction. After 4 weeks, the patient reported starting to experience episodes of lower pain intensity. After 10 weeks, episodes of almost absent pain started occurring and this then developed into completely pain-free periods a couple of sessions later. This was reported by the patient as the most dramatic effect: "*These pain-free periods are something almost new to me and it is an extremely pleasant sensation.*" In addition, pain-free periods of 15–60 min were reported immediately after the rehabilitation sessions.

As the patient is very active in agricultural activities, despite his disabilities, he performs physical tasks that involve the use of his prosthesis. These activities often induced episodes of pain during the following days. Each week, the patient reported that the periods of pain that normally came in the days following the activities had been dramatically reduced and that he was able to work harder without being afflicted by PLP.

Surprisingly, the patient was capable of sequential control of three DoF from the first session, which evolved to four DoF and simultaneous control after four sessions. The patient reports that he is now able to control the motion of his phantom limb at will in the trained DoF. This is even possible in the absence of the visual feedback provided by the system, as is the case when he drives. More importantly, he reports being able to control (stop) the pain episodes considerably more effectively than before the interventions. Furthermore, he no longer wakes up at night due to PLP. The patient's life partner reports that it is her belief that "*My husband can live 10 years more than I expected, as pain now plays a less important role in his life and those close to him can see it.*"

We have previously observed that patients using BioPatRec reported a telescopic effect on the position of the phantom limb. The patient initially reported the perception of his phantom hand at the stump height, which over the course of the treatment extended to the anatomically unaltered position. Interestingly, when he rests his arm over a table producing sensory feedback, the perceived position of the hand moves back to the end of the stump. However, as soon as he starts producing phantom motions, the perceived position is once again restored to the unaltered anatomy. This is a phenomenon that is now permanently present and it indicates the complexity of self-perception and how it can be altered by sensory feedback and motor execution. It is worth noting that the presence of a phantom limb map on the stump is weak, mixed, and fairly difficult for the patient to identify, thus providing limited sensory information from the phantom hand.

**FIGURE 2 | Evolution of pain intensity over time. (A)** The distribution of pain intensity over time shows that at the beginning of the treatment, the patient had a sustained level of pain (∼30%) during more than half of the time, and periods with higher levels of pain the rest of the time. Over the course of the treatment, a reduction of time at higher pain intensity levels was reported, as well as the appearance of periods of lower or absent pain. **(B)** The sustained level of pain was also the lowest pain perceived by the patient, and it gradually decreased to around 10% over the course of the interventions. Episodes of reduced pain started occurring after 4 weeks of treatment and gradually became pain-free periods. In week 11, a problem with his socket prosthesis caused him to use an old, tighter socket that had previously been shown to induce pain.

The initial state of the phantom hand was described by the patient as a permanently, strongly closed fist and this has been the case for the last 48 years. After six sessions, this state evolved to a mid-open hand position, which coincides with the neutral (relaxed) position shown by the virtual hand. This is now the permanent perception of phantom hand posture and it is greatly appreciated by the patient (patient self-report). We do not have enough data to argue that the constant visualization of such a position as the normal virtual state has influenced its perception as the default phantom limb state, or whether this is instead the result of the patient's skill at moving the phantom limb. In any case, the relaxation of such a stressed position occurred at the same time as the appearance of reduced pain periods and it could therefore be regarded as one of the causes of reduced PLP.

As expected, the ability of the patient to control the different motions improved over the course of the treatment. It is worth mentioning that no muscles directly responsible for the more distal movements were available due to the level of amputation (e.g., hand open and close). However, the patient was capable of voluntarily controlling the virtual limb to produce those motions. We hypothesize that the patterns of myoelectric activity produced by muscle synergies are distinctive enough to allow the classifiers to differentiate directly related movements from those occurring more distally at the hand. **Figure 3** illustrates the learning curve through the improvement of the classification accuracy of nine classes (eight movements plus "no movement"). In this case, the classification accuracy indicates how well motions can be discerned from each other using information from the recorded sessions (offline), whereas the real-time performance once the patient has acquired experience with the system is shown in **Table 1**. Despite the relatively low level of offline accuracy at the beginning of the treatment, the interventions were still possible because only a few movements were discriminated together for each rehabilitation task, thereby making the differentiation easier for the classifiers, i.e., only two DoF (four movements) were used for game control and the TAC test.
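Offline classification accuracy of this kind is usually derived from a confusion matrix computed on held-out recordings. The following minimal sketch shows that calculation under the assumption that rows are true classes and columns are predicted classes; the three-class matrix is hypothetical and does not reproduce the study's nine-class results.

```python
def overall_accuracy(confusion):
    """Fraction of all windows whose predicted class equals the true class.

    confusion[i][j] = number of windows of true class i predicted as class j.
    """
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


def per_class_accuracy(confusion):
    """Fraction of each true class that was predicted correctly."""
    return [row[i] / sum(row) for i, row in enumerate(confusion)]


# Hypothetical counts for three classes (NOT data from this study).
confusion = [
    [8, 1, 1],
    [2, 7, 1],
    [0, 0, 10],
]
print(overall_accuracy(confusion), per_class_accuracy(confusion))
```

Per-class values make it easier to see which movements are confused with each other, which matters when only a subset of movements is used in a given task.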

The motion test results after week 15 show performance comparable to that of 17 able-bodied subjects previously evaluated (Ortiz-Catalan et al., 2013), where the signal processing, features and motion test settings were the same. Although these results cannot be compared directly, they serve as an indication of the ability of the patient to produce distinctive motions in real-time. It is worth noting that the number of electrodes has been reduced to four since this report without a noticeable effect on classification performance.

**FIGURE 3 | Offline accuracy.** The offline discrimination accuracy over time is presented in box plots where the central mark represents the median value; the edges of the box are the 25th and 75th percentiles; the whiskers give the range of data values; "∗" represent average values.

#### **Table 1 | Motion test results.**

## **DISCUSSION**

A complete understanding of the root causes and underlying mechanisms of PLP has evaded the scientific community for decades (Nikolajsen and Jensen, 2001; Flor et al., 2006). This understanding will undoubtedly enable the creation of more effective treatments for the condition. In this context, the system proposed here provides empirical information on the effect of reactivating brain areas related to motor execution, enabling visual feedback that "tricks" the brain into believing that there is a limb responding to motor commands, and exercising the stump musculature, which is normally neglected. Mirror therapy is based on the assumption that visual feedback can potentially correct tactile deafferentation (to some degree) due to brain plasticity. Evidence has been reported on the correlation between cortical reorganization and PLP (Flor et al., 1995), which was further investigated to argue that extensive myoelectric prosthetic use prevents it, and thus reduces PLP (Lotze et al., 1999). The patient reports wearing his body-powered prosthesis all the time he is awake and he has done so for decades. In this case, although limited visual feedback is provided by the prosthesis, the prosthesis does not respond to physiologically appropriate commands and the motion of the missing limb is thus neglected. This might explain why, although the patient has used his prosthesis extensively, the PLP has remained. This underlines the importance of a congruent relationship between feedback and motor execution, as well as the intention to perform motion execution itself. All this is synergistically provided by our system.

In this case study, we present a system that can be used for PLP treatment and has had relative success in a patient with chronic PLP who had unsuccessfully explored several other treatments. Despite the fact that the pain has not disappeared completely at the time of this report, its reduction and temporal absence have considerably improved the patient's condition (patient self-report). It still remains to be seen whether the pain disappears completely after the long-term use of the system. The ideal medical treatment would be to administer it for a defined period of time and permanently cure the condition. We have intentionally avoided terminating the sessions to evaluate the long-term effect, as we feel this would be unethical, given the satisfaction reported by the patient after 48 years of chronic pain. As an alternative, the patient has been provided with a stand-alone system to be used at home and he has been instructed to use it at his own discretion. Follow-ups will be conducted every 2 months for a year and every year after that for 5 years.

The combination of myoelectric control of a virtual limb using physiologically appropriate signals, the enhanced illusion given by AR, and the entertainment provided by gaming has enabled the patient to develop the skill to control the motion of his phantom limb at will, even outside the lab. It is not clear whether this skill alone is enough to reduce PLP, because (1) this was acquired through visual feedback forcing the brain into the illusion that the limb is present, thus facilitating phantom limb motions (Brodie et al., 2003); and (2) the intervention inevitably results in motion intent and a workout of muscles at the stump which are normally neglected. It has been argued that the second factor alone is a cause of PLP relief (Sherman, 1980). The independent contribution of these two factors could be difficult to isolate in the proposed system. Motor intention alone has been shown to similarly reduce PLP when comparing mirror therapy with and without visual feedback (Brodie et al., 2007). When visual feedback was used, however, the capabilities of phantom limb motion increased. On the other hand, VR interventions in which the controlling side is the amputated one have shown signs of PLP relief, even though the musculature at the stump was not directly involved (Cole et al., 2009). Treatment-wise, the combination presented here, which includes all of the above, was successful in a particular but complicated case, and it requires further investigation in a wider clinical study.

The possibility of decoding distal movements using muscle synergies was initially explored decades ago (Wirta et al., 1978). In 1982, Saridis and Gootee used pattern recognition to decode wrist pro/supination from biceps and triceps muscles (Saridis and Gootee, 1982); however, they reported that they were not able to decode hand open/close, possibly due to the limited number of electrodes used (two bipolar electrodes). In our experience, patients can quickly learn to control a few distal motions and the results presented here suggest that they are able to develop that skill further to several motions. It is worth noting that one limitation of this non-invasive approach is that a certain degree of musculature is required, i.e., patients with shoulder disarticulations would hardly be treatable unless they were recipients of targeted muscle reinnervation (Kuiken et al., 2004, 2009). On the other hand, the proposed treatment can be used seamlessly in any patient requiring neuromuscular rehabilitation, in cases such as stroke and incomplete spinal cord injuries (Lee et al., 2011; Liu and Zhou, 2013), again, given the availability of myoelectric signals.

VR treatments are commonly justified and encouraged by the assumption that sensory stimulation boosts neuromuscular rehabilitation. At the current stage, the system employs only visual feedback stimuli, mostly due to the technical difficulties involved in providing proper somatosensory stimulation. In our experience, patients invariably prefer a virtual limb to any other visual feedback and we are therefore presently developing rehabilitation games based on AR that are specifically designed to exercise selected motions in a controlled manner.

The proposed system incorporates different advantages of computational rehabilitation systems, such as progress tracking, adjustable task difficulty, engaging rehabilitation tasks, and portability. Furthermore, the VR environment and all the source code necessary for motion prediction using sEMG (including game control) are freely available and open source in BioPatRec (Ortiz-Catalan et al., 2013), which aims to enable researchers worldwide to use this technology.

#### **CONCLUSIONS**

PLP has historically been a difficult condition to treat and it affects the majority of amputees. In this work, we introduce a non-invasive technological proposal that combines the prediction of motion intent through the decoding of myoelectric signals, virtual and augmented reality, and gaming. As opposed to conventional mirror therapy, this system allows full range of motion and direct volitional control of the virtual limb, and it is applicable to bilateral amputees, in addition to having the known motivational benefits of gaming and progress tracking by computerized systems. This system is presented with a case study of a chronic PLP patient with known resistance to conventional PLP treatments. Having shown that the system has considerably increased the quality of life of a single patient, where other previous conventional treatments had proved unsuccessful, we believe that it offers sufficient justification to further explore its efficacy on a wider PLP population.

#### **AUTHOR CONTRIBUTIONS**

Max Ortiz-Catalan designed the study, developed the motion prediction technology (software and hardware), performed the literature review, and drafted the manuscript. Nichlas Sander developed the virtual reality environment. Morten B. Kristoffersen developed the augmented reality environment. Max Ortiz-Catalan, Nichlas Sander, and Morten B. Kristoffersen performed the interventions and analyzed the results. Rickard Brånemark and Bo Håkansson supervised this research and revised the manuscript. All the authors have read and approved the final manuscript.

#### **ACKNOWLEDGMENTS**

The authors would like to thank the patient who participated in this study for his helpful feedback and cooperation. The authors also thank Alejandra Zepeda E. for her help in conducting the latest sessions, and Kerstin Caine-Winterberger. This work has been funded by the Jimmy Dahlstens Fond, CONACYT, VINNOVA (IFH 2010-00482), and Integrum AB.

#### **SUPPLEMENTARY MATERIAL**

The Supplementary Material for this article can be found online at: http://www.frontiersin.org/journal/10.3389/fnins.2014.00024/abstract

#### **Additional File 1 | Video examples of the patient in the augmented reality environment, gaming, motion and TAC test.**

#### **REFERENCES**

Bach, F., Schmitz, B., Maaß, H., Cakmak, H., Diers, M., Bodmann, R., et al. (2010). "Using interactive immersive VR/AR for the therapy of phantom limb pain," in *Proceedings of the 13th International Conference on Humans Computer* (Aizu-Wakamatsu), 183–187. Available online at: http://dl.acm.org/citation.cfm?id=1994529 [Accessed September 16, 2013].


of multifunctional upper-limb prostheses. *J. Rehabil. Res. Dev.* 48, 619–628. doi: 10.1682/JRRD.2010.08.0149


**Conflict of Interest Statement:** In addition to governmental institutions, this work was partially funded by Integrum AB, which is currently investing in the advanced control of robotic prostheses. The technology for motion prediction was originally developed for prosthetic control and it is open source.

*Received: 04 October 2013; accepted: 27 January 2014; published online: 25 February 2014.*

*Citation: Ortiz-Catalan M, Sander N, Kristoffersen MB, Håkansson B and Brånemark R (2014) Treatment of phantom limb pain (PLP) based on augmented reality and gaming controlled by myoelectric pattern recognition: a case study of a chronic PLP patient. Front. Neurosci. 8:24. doi: 10.3389/fnins.2014.00024*

*This article was submitted to Consciousness Research, a section of the journal Frontiers in Neuroscience.*

*Copyright © 2014 Ortiz-Catalan, Sander, Kristoffersen, Håkansson and Brånemark. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## A neurochemical closed-loop controller for deep brain stimulation: toward individualized smart neuromodulation therapies

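*Peter J. Grahn<sup>1</sup>, Grant W. Mallory<sup>2</sup>, Obaid U. Khurram<sup>1</sup>, B. Michael Berry<sup>1</sup>, Jan T. Hachmann<sup>2</sup>, Allan J. Bieber<sup>2,3</sup>, Kevin E. Bennet<sup>2,4</sup>, Hoon-Ki Min<sup>2,5</sup>, Su-Youne Chang<sup>2</sup>, Kendall H. Lee<sup>2,5</sup> and J. L. Lujan<sup>2,5</sup>\**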

*<sup>1</sup> Mayo Clinic College of Medicine, Mayo Clinic, Rochester, MN, USA*

*<sup>2</sup> Department of Neurologic Surgery, Mayo Clinic, Rochester, MN, USA*

*<sup>3</sup> Department of Neurology, Mayo Clinic, Rochester, MN, USA*

*<sup>4</sup> Division of Engineering, Mayo Clinic, Rochester, MN, USA*

*<sup>5</sup> Department of Physiology and Biomedical Engineering, Mayo Clinic, Rochester, MN, USA*

#### *Edited by:*

*Mitsuhiro Hayashibe, University of Montpellier, France*

#### *Reviewed by:*

*Matthew Johnson, University of Minnesota, USA; Christian J. Hartmann, Heinrich Heine University Duesseldorf, Germany*

#### *\*Correspondence:*

*J. L. Lujan, Departments of Neurologic Surgery and Physiology and Biomedical Engineering, Mayo Clinic, 200 First Street SW, Rochester, MN 55905, USA e-mail: lujan.luis@mayo.edu*

Current strategies for optimizing deep brain stimulation (DBS) therapy involve multiple postoperative visits. During each visit, stimulation parameters are adjusted until desired therapeutic effects are achieved and adverse effects are minimized. However, the efficacy of these therapeutic parameters may decline with time due at least in part to disease progression, interactions between the host environment and the electrode, and lead migration. As such, development of closed-loop control systems that can respond to changing neurochemical environments, tailoring DBS therapy to individual patients, is paramount for improving the therapeutic efficacy of DBS. Evidence obtained using electrophysiology and imaging techniques in both animals and humans suggests that DBS works by modulating neural network activity. Recently, animal studies have shown that stimulation-evoked changes in neurotransmitter release that mirror normal physiology are associated with the therapeutic benefits of DBS. Therefore, to fully understand the neurophysiology of DBS and optimize its efficacy, it may be necessary to look beyond conventional electrophysiological analyses and characterize the neurochemical effects of therapeutic and non-therapeutic stimulation. By combining electrochemical monitoring and mathematical modeling techniques, we can potentially replace the trial-and-error process used in clinical programming with deterministic approaches that help attain optimal and stable neurochemical profiles. In this manuscript, we summarize the current understanding of electrophysiological and electrochemical processing for control of neuromodulation therapies. Additionally, we describe a proof-of-principle closed-loop controller that characterizes DBS-evoked dopamine changes to adjust stimulation parameters in a rodent model of DBS. The work described herein represents the initial steps toward achieving a "smart" neuroprosthetic system for treatment of neurologic and psychiatric disorders.

**Keywords: deep brain stimulation (DBS), feedback control systems, local field potentials (LFP), fast scan cyclic voltammetry (FSCV), machine learning, individualized medicine**

#### **INTRODUCTION**

Neurologic and psychiatric disorders can be characterized by motor, behavioral, cognitive, affective, or perceptual traits that affect how individuals move, feel, think, and behave (Benabid et al., 2005; Nemeroff, 2007; Williams and Okun, 2013). These disorders affect over 94 million people in the United States alone with health-care related costs exceeding \$648 billion (Logothetis, 2003b; Benabid et al., 2005; Speert et al., 2012; Williams and Okun, 2013). Although most individuals suffering from neurologic and psychiatric disorders are successfully treated with a combination of medications and therapy, up to 30% of patients do not respond to standard therapeutic interventions (Olanow et al., 2000; Benabid et al., 2005; Hamani et al., 2006; Nemeroff, 2007; Williams and Okun, 2013). For these treatment-resistant patients, high-frequency electrical stimulation of subcortical brain structures, known as deep brain stimulation (DBS), presents a highly successful therapeutic alternative (Benabid et al., 2005; Williams and Okun, 2013). DBS is FDA-approved for the treatment of Parkinson's disease (PD) and essential tremor (ET) (Benabid et al., 1987, 1991; Burchiel et al., 1999; Koller et al., 2001; Obeso and Guridi, 2001; Simuni et al., 2002; Rehncrona et al., 2003; Germano et al., 2004; Rodriguez-Oroz, 2004; Blomstedt and Hariz, 2010; Moro et al., 2010; Weiss et al., 2013). Additionally, DBS has received humanitarian device exemptions for dystonia and obsessive-compulsive disorder, and there are multiple studies underway for the treatment of other neurologic and psychiatric disorders (Benabid et al., 1987, 1991; Burchiel et al., 1999; Obeso and Guridi, 2001; Simuni et al., 2002; Velasco et al., 2005; Lim et al., 2007; Mueller et al., 2008; Blomstedt and Hariz, 2010; Denys et al., 2010; Fisher et al., 2010; Ramasubbu et al., 2013).

Brain stimulation has been an important tool in the field of neurosurgery pioneered by Spiegel and Wycis (Blomstedt and Hariz, 2010). Intra-operative electrical stimulation of neural tissue has been used since the early days of human stereotaxis to identify surgical targets (Gildenberg, 2003, 2005). Application of brain stimulation in modern-day neurosurgery was revolutionized by Benabid and colleagues, who used high frequency stimulation (typically 100–130 Hz) delivered directly into specific brain regions to mimic the effects of surgical lesions without performing any tissue resection (Benabid et al., 1987, 1991; Blomstedt and Hariz, 2010). DBS achieves therapeutic benefits by delivering electrical currents to specific anatomical targets within the brain via multi-contact electrodes connected to implanted pulse generators. In DBS therapy, a balance between maximal clinical improvement and minimal stimulation-induced side effects is typically achieved by adjusting active electrode contacts, stimulus frequency, amplitude, and pulse duration.

Clinical DBS programming is an iterative process in which stimulation parameters are adjusted in order to maximize therapeutic benefits while minimizing side effects (Morishita et al., 2013). Although many DBS patients require minimal stimulation adjustment following surgery, many more require several months of regular parameter adjustments before optimal therapeutic results can be achieved (Okun et al., 2005; Bronstein et al., 2011; Kluger et al., 2011). However, sustaining these therapeutic benefits requires subsequent adjustment of stimulation parameters every few months (Mayberg et al., 2000, 2005; Deuschl et al., 2006; Moro et al., 2006; Frankemolle et al., 2010; Mure et al., 2011). Therefore, existing clinical programming and stimulation paradigms are poorly suited to cope with the dynamic and comorbid nature of most neurologic disorders. This, in turn, highlights the need for dynamic feedback systems that can continually and automatically adjust stimulation parameters in response to changes within the environment of the brain.

## **THERAPEUTIC STIMULATION PARADIGMS**

The therapeutic success of DBS depends not only on accurate surgical targeting and electrode implantation, but also on the ability to optimize stimulation parameters to maximize therapeutic benefits while minimizing side effects. Clinical strategies for therapeutic DBS programming require multiple post-operative visits during which experienced clinicians perform clinical evaluations and corresponding device programming. In each visit, a series of inputs (active contacts, stimulus amplitude, pulse width, and frequency) are adjusted in an attempt to minimize adverse effects while maximizing clinical benefits. Although this strategy has provided significant patient benefit, the results are far from optimal. First, this open loop strategy relies on the subjective experiences of both the patient and clinician, without providing objective feedback to support parameter optimization. Second, the therapeutic response observed in this acute setting does not guarantee sustained therapeutic effects. Disease progression, environmental factors, and behaviorally induced changes in network activity can all render therapeutic stimulation ineffective, requiring additional programming sessions (Obeso and Guridi, 2001; Hunka et al., 2005; Kupsch et al., 2011). Third, the procedure is costly and time consuming. As such, only a fraction of the stimulation parameter space can be practically explored during each session. Fourth, DBS device programming can differ according to the target chosen, the orientation of the electrode relative to the target, the disorder being treated, and the symptoms being treated for a given disorder (Velasco et al., 2007; Ricchi et al., 2012; Min et al., 2013; Miocinovic et al., 2013). 
Additionally, the timing of programming as well as the waiting time between adjustments can influence when different therapeutic responses can be observed, and these responses also vary between disorders (e.g., the effect on tremor is nearly immediate, whereas the effect on depression may take several weeks to observe) (Velasco et al., 2007; Ricchi et al., 2012; Min et al., 2013; Miocinovic et al., 2013). Therefore, it is necessary to implement DBS control strategies that can adjust stimulation parameters in real-time according to quantifiable and objective neurochemical, physiological, and behavioral changes while reducing the frequency of clinical interventions. However, before such control strategies can be implemented, it is necessary to improve the understanding of the cellular mechanisms responsible for the network effects of DBS.

The cellular response of single neurons to extracellular electrical fields has been well characterized over short time scales (Smith and Grace, 1992; Benazzouz et al., 2000; Hashimoto et al., 2003; Maurice et al., 2003; Kita et al., 2005; Miocinovic et al., 2006). It is known that excitation of efferent axons or fibers of passage near the site of stimulation results in network changes in neurotransmission and electrical activity (Grill et al., 2004; McIntyre et al., 2004a,b; Johnson et al., 2008; McIntyre and Hahn, 2010; Shah et al., 2010). Furthermore, functional and metabolic imaging studies have shown that successful treatment of neurologic and psychiatric disorders is associated with metabolic normalization in proximal and distal regions of the brain (Mayberg et al., 2000, 2005; Mure et al., 2011). The precise relationships between therapeutic improvement and changes in metabolic patterns remain unknown. As such, current research efforts focus on the use of electrophysiology and electrochemistry to elucidate the network effects of DBS (Bledsoe et al., 2009; Lee et al., 2011; Vitek et al., 2012).

## **REAL-TIME MONITORING OF NEURAL ACTIVITY**

Signaling within the brain occurs both electrically and chemically. Technological advances in neural activity monitoring have enabled real-time investigation of cellular and molecular dynamics using electrophysiological and neurochemical probes. While the most widely used technique involves electrophysiological monitoring of extracellular neuronal activity (Smith and Grace, 1992; Benazzouz et al., 2000; Hashimoto et al., 2003; Maurice et al., 2003; Johnson et al., 2005; Kita et al., 2005; Miocinovic et al., 2006), recent advances in electrode technology allow *in vivo* monitoring of synaptic neurotransmitter activity (Roham et al., 2007; van Gompel et al., 2010).

Electrophysiological analysis has been widely used to study stimulation-evoked changes in brain activity, such as increased pallidal (Hashimoto et al., 2003; Kita et al., 2005; Miocinovic et al., 2006) and nigral activity (Smith and Grace, 1992; Benazzouz et al., 2000; Maurice et al., 2003) during subthalamic nucleus (STN) DBS. This has been accomplished by recording single neuron activity (single unit recordings), activity from local groups of neurons (multi unit activity, local field potentials), and distributed signals representing global brain activity [electrocorticograms (ECoGs), electroencephalograms (EEGs)]. Alternatively, neurochemical analysis techniques such as microdialysis, amperometry, and voltammetry, can detect local changes in neurotransmitter concentration evoked by internal and external mechanical, electrical, and chemical stimuli (Dale et al., 2005; Wightman, 2006). Neurochemical recordings have been used to monitor *in vivo* release of analytes such as oxygen, dopamine, adenosine, serotonin, and glutamate in small and large animal models of DBS (Agnesi et al., 2009; Bledsoe et al., 2009; Chang et al., 2009; Kimble et al., 2009; Griessenauer et al., 2010; Shon et al., 2010a,b).

#### **SINGLE-UNIT RECORDINGS**

Single unit recordings capture the activity of distinct neurons *in vivo* by placing a high-impedance microelectrode within the extracellular space surrounding the cell body. These electrodes, having surface areas under 2 × 10<sup>−5</sup> cm<sup>2</sup> (Loffler, 2012), record extracellular potentials representative of intracellular action potentials from neurons adjacent to the electrode tip. The high spatial and temporal resolution provided by single unit recordings allows for precise measurements of neuronal spikes (Buzsáki et al., 2012). However, activity from single units can be difficult to isolate due to crosstalk from neighboring cells (Bai and Wise, 2001). Additionally, single unit recordings can be biased toward activity from larger (e.g., pyramidal) cells (Buzsáki et al., 1983). Furthermore, electrode migration, immune responses (e.g., glial scarring), and disruption of surrounding neural tissue interfere with signal quality and limit reliable single unit activity to acute recording conditions (Carter and Houk, 1993; Polikov et al., 2005).

#### **MULTI-UNIT RECORDINGS**

Multi unit recordings capture fast spiking activity from groups of neurons using high-impedance microelectrode arrays. Similar to single unit recordings, this technique provides good spatial and temporal resolution reflecting synaptic events occurring at high frequencies (>800 Hz) (Logothetis, 2003a,b; Mattia et al., 2010). Unfortunately, multi-unit recording arrays suffer from stiff form factors that result in shear-induced inflammation of the surrounding tissue (Cheung, 2007). Furthermore, recording can only occur from the tips of the electrodes, limiting recording selectivity (Maynard et al., 1997).

#### **LOCAL FIELD POTENTIALS**

Local field potential (LFP) analysis is an electrophysiological technique for detecting changes in brain activity that offers great potential for understanding the network effects of DBS (Tsang et al., 2012; Priori et al., 2013). This technique is capable of recording chronic electrical activity directly from single and multiple neural units using micro and macro electrodes implanted within the nucleus of interest (Bronte-Stewart et al., 2009; Giannicola et al., 2012). LFPs are typically used to record low-frequency changes in activity across groups of neurons within a volume of interest (Andersen et al., 2004; Buzsáki et al., 2012; Rosa et al., 2012). These activity changes reflect a weighted average of integrative processes and associations between cells that can be detected over longer distances through extracellular space (Logothetis, 2003a,b; Bronte-Stewart et al., 2009). Unfortunately, the longer recording range of LFP techniques is associated with decreased spatial resolution. Despite this limitation, LFP recordings can be performed in real-time using the same DBS electrode, which eliminates the need for additional electrode penetrations (Rossi et al., 2007). Therefore, local field potentials present a good starting point for establishing closed-loop neurostimulation control systems (Rosin et al., 2011; Santaniello et al., 2011; Berényi et al., 2012; Little et al., 2013).

#### **GLOBAL FIELD POTENTIALS**

Analysis of global brain activity can be used to identify both spontaneous and event-related responses from large groups of neurons. Whole-brain electrophysiological activity (i.e., global field potentials) is typically measured using far-field sensors located on the scalp (EEG) or directly on the brain surface (ECoG). These global field potentials can be used to identify information regarding high-level sensory processing, perception, and locomotor activity (Issa and Wang, 2013). For example, EEG signals with low spatial resolution can be recorded non-invasively by attaching recording electrodes to the scalp. Alternatively, ECoG signals offer increased spatial resolution, but recording electrodes must be surgically attached at the cortical surface (Buzsáki et al., 2012). Despite the advantages of global field potentials, these signals do not provide insight into activity changes within specific subcortical structures. As such, a system that combines activity analysis within cortical (e.g., ECoG) and subcortical (e.g., LFP) networks should provide a better depiction of network dynamics which, in turn, will be required to develop optimal closed-loop stimulation paradigms (Rosa et al., 2012).

#### **NEUROCHEMICAL RECORDINGS**

Neurochemical sensing allows real-time characterization of neural activity with high spatial resolution and signal specificity (Lee et al., 2004). Microdialysis, amperometry, and voltammetry are three widely used techniques for neurochemical monitoring (Blaha and Phillips, 1996).

Microdialysis is a technique for sampling different analytes and determining their concentration in extracellular fluid (Chefer et al., 2009). This technique offers excellent specificity, selectivity, and sensitivity for quantifying neurotransmitter release in a laboratory setting (Watson et al., 2006). However, it suffers from limited temporal resolution (Smolders et al., 1997; Khan et al., 1999). Therefore, microdialysis is not suitable for real-time clinical application in closed-loop systems.

Amperometry is an alternative technique for measuring analytes in the extracellular space. Amperometric recordings involve the application of a fixed electric potential through a carbon fiber microelectrode (CFM) placed in close proximity to the target cells (Gale et al., 2013; Tye et al., 2013). These CFM are coated with specific enzymes known to react with non-electrolytic analytes of interest, resulting in electroactive products that can be electrically measured (Oldenziel et al., 2004). This allows continuous monitoring of changes in electrical currents within the surrounding extracellular fluid. The detected changes in current are caused by oxidative reactions between the applied potential and analyte molecules within the extracellular space (van Gompel et al., 2010). The downfall of this technique is the high complexity associated with chronic *in vivo* measurements, which require continuous enzyme delivery to detect the breakdown products of the neurotransmitter of interest (Jacobs et al., 2010).

**FIGURE 1 | Stimulation-evoked dopamine responses. (A)** Dopamine redox reactions at the tip of a carbon fiber microelectrode during fast scan cyclic voltammetry. As the potential applied to the electrode increases from −0.4 to 0.0 V, extracellular dopamine is reduced (reduction peak at −3.5 nA). As the applied potential is further increased from 0.0 to 1.0 V, dopamine is oxidized (oxidation peak at 3.5 nA). Measured current background is shown in red. **(B)** Pseudo-color representation of dopamine oxidation current at +0.6 V at DBS onset (100 Hz, 2 ms, 300 µA).

Analogous to amperometry, voltammetry provides real-time high-resolution analyte measurements (Blaha et al., 1990). Specifically, fast scan cyclic voltammetry (FSCV) is a voltammetry technique in which a linearly varying potential is applied to a carbon fiber electrode, allowing for oxidation and reduction of surrounding electroactive molecules to take place (Robinson et al., 2003; Lee et al., 2007). The magnitudes of the analyte oxidation and reduction current peaks are directly proportional to the concentration of analyte oxidized and reduced at the electrode surface (Atcherley et al., 2013). Furthermore, the resulting electrical current vs. applied potential relationships (**Figure 1**) provide a chemical signature (i.e., voltammogram) that allows identification of specific neurotransmitters or other electroactive analytes (Robinson et al., 2003). FSCV detection of analytes is limited to electroactive molecules such as dopamine, adenosine, and oxygen (van Gompel et al., 2010). Furthermore, the lifetime of CFM is limited to a few months (Kim et al., under review), restricting clinical application of FSCV detection methods to intraoperative approaches.
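The proportionality between peak current and analyte concentration is what allows a measured current to be converted into a concentration once the electrode has been calibrated. The following is a minimal sketch of that conversion, assuming a linear response through the origin. The function names and all calibration values are hypothetical illustrations and are not taken from the study.

```python
def fit_sensitivity(concentrations_um, peak_currents_na):
    """Least-squares slope (nA per uM) of a calibration line through the origin."""
    num = sum(c * i for c, i in zip(concentrations_um, peak_currents_na))
    den = sum(c * c for c in concentrations_um)
    return num / den


def current_to_concentration(peak_current_na, sensitivity_na_per_um):
    """Convert a measured peak oxidation current (nA) to concentration (uM)."""
    return peak_current_na / sensitivity_na_per_um


# Hypothetical calibration points: known concentrations and the peak
# oxidation currents recorded for each of them.
cal_conc_um = [0.5, 1.0, 2.0]
cal_current_na = [1.75, 3.5, 7.0]

sensitivity = fit_sensitivity(cal_conc_um, cal_current_na)
estimated_um = current_to_concentration(5.25, sensitivity)
```

A line through the origin is the simplest model consistent with direct proportionality; a real calibration may need an offset term or a restricted linear range.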

#### **SMART DBS CONTROL**

Clinical DBS systems follow an open-loop paradigm. That is, stimulation parameters are pre-programmed into the DBS device and held constant until the next programming session, regardless of the internal state of the system or environmental factors (Foltynie and Hariz, 2010). In contrast, closed-loop DBS systems rely on sensor feedback to monitor the environment and internal state of the system in order to adjust stimulation parameters accordingly (Abbott, 2006; Fagg et al., 2007). That is, stimulation parameters (e.g., stimulation frequency, stimulus amplitude, etc.) are automatically adjusted to maintain specific therapeutic outputs such as tremor suppression in the presence of disturbances, environmental perturbations, and internal network changes (**Figure 2**). To date, development of closed-loop neuroprosthetic devices has largely focused on using electrophysiological activity as feedback signals (Avestruz et al., 2008; Skarpaas and Morrell, 2009; Rosin et al., 2011; Basu et al., 2013; Grant and Lowery, 2013). Neurochemical-based feedback, however, offers the prospect of finer control of stimulation-induced effects, as it allows activity monitoring from individual types of neurons by virtue of their neurotransmitters. The ability to use neurochemical feedback to control DBS has been demonstrated by characterizing glutamate release using mathematical models linking electrical stimulation to glutamate release in a rat model of DBS (Behrend et al., 2009). Thus, chemical sensing presents a unique opportunity for developing closed-loop smart neurocontrol systems that are optimized for specific disorders and targets, and which can account for intra- and inter-patient variability.
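The difference between the two paradigms can be illustrated with a toy simulation. The sketch below is not the controller described in this article. It assumes a hypothetical linear "plant" whose response is the stimulus amplitude multiplied by a gain, lets that gain drop halfway through the run to mimic a change in the tissue environment, and compares a fixed amplitude (open loop) with a simple integral update (closed loop). All names, units, and values are illustrative assumptions.

```python
def simulate(target, gains, k_i=None, amp0=200.0, amp_max=500.0):
    """Return the measured response at each time step.

    k_i=None holds the amplitude fixed (open loop). Otherwise an integral
    update, amplitude += k_i * error, is applied after every measurement.
    """
    amplitude = amp0
    measured = []
    for gain in gains:
        response = gain * amplitude  # toy linear plant, not a physiological model
        measured.append(response)
        if k_i is not None:
            amplitude += k_i * (target - response)
            amplitude = min(max(amplitude, 0.0), amp_max)  # hypothetical safety limits
    return measured


# The plant gain halves after 40 steps (a stand-in for any slow disturbance).
gains = [0.01] * 40 + [0.005] * 40

open_loop = simulate(2.0, gains)              # amplitude never changes
closed_loop = simulate(2.0, gains, k_i=50.0)  # amplitude tracks the target
```

In this toy setting the open-loop response settles at half the target after the disturbance, whereas the closed-loop response returns to the target because the amplitude is raised until the error vanishes. The update is stable here only because `k_i` times the gain stays below 1; a practical controller would need that margin established experimentally.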

## **NEUROCHEMISTRY OF DBS**

Studies using small and large animal models suggest that therapeutic DBS coincides with changes in neurotransmitter release (Lee et al., 2004; Shon et al., 2010a,b). It has been established that dopaminergic cell loss in the substantia nigra leads to striatal dopamine deficiency and movement abnormalities in PD patients (MacDonald et al., 2013). It has also been shown that therapeutic STN DBS for treatment of PD decreases the need for exogenous levodopa (Moro et al., 1999; Molinuevo et al., 2000) and has been hypothesized to increase striatal dopamine release (Lee et al., 2009). Complementary findings in electrophysiological and neurochemical sensing studies have shown that STN DBS evokes dopamine release in the striatum of parkinsonian rats (Blaha and Phillips, 1996; Lee et al., 2006). Similarly, stimulation-evoked adenosine release has been recorded intraoperatively in the ventral intermediate nucleus of the thalamus in human patients undergoing DBS for treatment of ET (Chang et al., 2012). However, the specific relationship between DBS and neurochemical activity changes remains unknown. Therefore, understanding the relationships between stimulation parameters and neurotransmitter concentration levels is paramount for developing closed-loop DBS control strategies.

In the following paragraphs, we describe a proof-of-principle approach to closed-loop DBS that automatically adjusts stimulation parameters in order to sustain stable dopamine levels in a rodent model of DBS. The paradigm proposed herein uses FSCV to quantify striatal dopamine release evoked by medial forebrain bundle (MFB) DBS. Additionally, this paradigm relies on nonlinear regression, computational modeling, and constrained optimization techniques to parameterize stimulation-evoked dopamine responses. The inverse dynamics of stimulation-evoked dopaminergic responses are modeled using artificial neural networks (ANN), which also predict stimulation parameters required for sustaining target dopaminergic concentration levels. The performance of this closed-loop paradigm was evaluated by comparing target dopaminergic responses to *in vivo* dopaminergic responses achieved using ANN-predicted stimulation parameters (**Figure 4**). While focused on DBS of ascending dopaminergic fibers in the MFB for evoking dopamine release in the rat striatum (Agnesi et al., 2009), this closed-loop paradigm is applicable to a variety of analytes, targets, and neurologic disorders.

#### **EXPERIMENTAL PARADIGM**

To quantify the dynamics of stimulation-evoked dopamine release, recording FSCV CFM and bipolar DBS macroelectrodes were implanted into the striatum and MFB, respectively, in four anesthetized rats. All animal procedures were performed according to the guidelines of the Mayo Clinic Institutional Animal Care and Use Committee (IACUC). Animals were kept on a standard 12 h light-dark cycle with access to food and water *ad libitum* in conventional housing in accordance with National Institutes of Health (NIH) and US Department of Agriculture guidelines.

Animals were anesthetized and the head was fixed in a Kopf stereotactic frame (David Kopf Instruments, California) for electrode targeting. Following brain exposure, one bipolar stimulating electrode, one FSCV recording electrode, and one silver-chloride reference electrode were inserted into the left MFB, striatum, and contralateral cortex, respectively. Recording electrodes were allowed to stabilize within the tissue environment for 20 min. Finally, the electrodes were connected to a wireless stimulator and neurotransmitter sensor for real-time detection of stimulation-evoked dopamine release (Kimble et al., 2009; Chang et al., 2013).

Following electrode implantation, a comprehensive range of stimulation parameters (**Table 1**) was used to determine the magnitudes and temporal patterns of stimulation-evoked dopamine release. Stimulation was divided into 65 20-s bins. Each bin corresponded to one combination of stimulation parameters delivered through the active electrode contact. Each stimulation bin was followed by a stabilization and washout period of 180 s.
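As a quick check on the length of this protocol, the bin and washout durations given above can be summed. The short calculation below assumes that every one of the 65 bins, including the last, is followed by a full washout period; the variable names are illustrative.

```python
N_BINS = 65      # stimulation parameter combinations
STIM_S = 20      # seconds of stimulation per bin
WASHOUT_S = 180  # seconds of stabilization and washout after each bin

total_s = N_BINS * (STIM_S + WASHOUT_S)
total_h = total_s / 3600

print(f"{total_s} s, about {total_h:.1f} h per animal")
```

Under that assumption the sweep takes 13,000 s, or roughly 3.6 h per animal. Only 10% of that time is spent stimulating, which illustrates why only a fraction of the parameter space can be explored in a single session.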

#### **STIMULATION-EVOKED NEUROCHEMICAL MONITORING**

Stimulation-evoked dopamine measurements were obtained by changing the CFM potential from a resting potential of −0.4 V to 1.3 V and back, at a rate of 400 V/s. This triangular waveform was repeated at a frequency of 10 Hz (Chang et al., 2012). The CFM was held at the resting potential between scans. We converted the measured oxidation and reduction current peaks to dopamine concentration using post-operative *in vitro* flow injection analysis calibration of each CFM (Griessenauer et al., 2010). Our preliminary results showed that as MFB DBS amplitude increases, extracellular dopamine levels within the striatum also increase (**Figure 3A**). A similar response is also observed as pulse duration is increased from 0.1 to 2.0 ms (**Figure 3B**). Changes in frequency, however, give rise to a different dopaminergic response. Maximum response was observed at 100 Hz, followed by a decrease in dopamine oxidation currents at higher frequencies (**Figure 3C**).
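The timing of the scan waveform follows directly from the stated parameters (resting potential −0.4 V, peak potential 1.3 V, scan rate 400 V/s, repetition rate 10 Hz). The sketch below works through that arithmetic and reconstructs the applied potential as a function of time. The function names are illustrative and do not correspond to any acquisition software.

```python
V_REST = -0.4      # V, resting (holding) potential
V_PEAK = 1.3       # V, switching potential
SCAN_RATE = 400.0  # V/s
REPEAT_HZ = 10.0   # scans per second


def scan_timing(v_rest=V_REST, v_peak=V_PEAK, scan_rate=SCAN_RATE, repeat_hz=REPEAT_HZ):
    """Return (ramp duration, full scan duration, hold time between scans) in seconds."""
    ramp_s = (v_peak - v_rest) / scan_rate
    scan_s = 2.0 * ramp_s
    hold_s = 1.0 / repeat_hz - scan_s
    return ramp_s, scan_s, hold_s


def potential_at(t, v_rest=V_REST, v_peak=V_PEAK, scan_rate=SCAN_RATE, repeat_hz=REPEAT_HZ):
    """Applied potential (V) at time t (s), with a scan starting at t = 0."""
    ramp_s, scan_s, _ = scan_timing(v_rest, v_peak, scan_rate, repeat_hz)
    tau = t % (1.0 / repeat_hz)
    if tau < ramp_s:   # rising ramp
        return v_rest + scan_rate * tau
    if tau < scan_s:   # falling ramp
        return v_peak - scan_rate * (tau - ramp_s)
    return v_rest      # held at rest until the next scan


ramp_s, scan_s, hold_s = scan_timing()
```

Each scan therefore lasts 8.5 ms (4.25 ms per ramp), and the electrode rests at −0.4 V for the remaining 91.5 ms of every 100 ms cycle.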

#### **Table 1 | Stimulation parameters.**


#### **NEUROCHEMICAL RESPONSE MODELING**

Implementation of neurochemically-driven closed-loop DBS control strategies requires characterization of the relationship between electrical stimulation and neurochemical responses. To characterize this relationship, stimulation-evoked FSCV dopamine signals were low-pass filtered (5th-order Butterworth filter, 100 Hz cutoff frequency) to remove signal noise. Additionally, the responses to individual stimuli were characterized using a combination of 7th-degree polynomial and 2nd-order exponential mathematical models. The mathematical model parameters (eight for the polynomial fit and four for the exponential fit) and corresponding stimuli were presented to a double-layer feedforward ANN with sigmoidal and linear transfer functions (Lujan and Crago, 2009). The hidden layer contained 150 hidden neurons. The inputs to the ANN consisted of the stimulation frequency, pulse width, and stimulus amplitude, while system outputs corresponded to the 12 model parameters. Initial weights and biases were selected at random for 10 different initial conditions. Ten corresponding ANNs were trained on 80% of the data (selected at random) using the Levenberg-Marquardt algorithm. The trained ANN with the lowest generalization error, calculated using the remaining 20% of data, was selected as a system model. The resulting system model, when combined with constrained optimization for minimization of stimulation energy, can identify and eliminate mathematical redundancies for the optimal design of the closed-loop controller (Lujan and Crago, 2009).
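The training-and-selection procedure described above (several random initializations, an 80/20 split, and selection by held-out error) can be sketched independently of the network itself. The example below is an illustration under stated assumptions and is not the authors' implementation. It substitutes a one-weight linear model trained by gradient descent for the ANN and the Levenberg-Marquardt algorithm, uses synthetic data, and applies one shared split to all candidates, a detail the text does not specify. It also counts the weights and biases of a fully connected 3-150-12 network, assuming one bias per hidden and output unit.

```python
import random


def count_parameters(n_in, n_hidden, n_out):
    """Weights plus biases of a fully connected network with one hidden layer."""
    return (n_in * n_hidden + n_hidden) + (n_hidden * n_out + n_out)


def split_80_20(samples, rng):
    """Randomly assign 80% of the samples to training and 20% to testing."""
    shuffled = samples[:]
    rng.shuffle(shuffled)
    cut = int(0.8 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def train_linear(train, w0, lr=0.1, epochs=200):
    """Stand-in trainer: fit y = w * x by gradient descent starting from w0."""
    w = w0
    for _ in range(epochs):
        grad = sum(2.0 * (w * x - y) * x for x, y in train) / len(train)
        w -= lr * grad
    return w


def mse(w, data):
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


def select_best(samples, n_inits=10, seed=0):
    """Train n_inits candidates and keep the one with the lowest held-out error."""
    rng = random.Random(seed)
    train, held_out = split_80_20(samples, rng)
    candidates = []
    for _ in range(n_inits):
        w = train_linear(train, w0=rng.uniform(-1.0, 1.0))
        candidates.append((mse(w, held_out), w))
    return min(candidates)  # (generalization error, weight)


# Synthetic data: y = 3x plus small Gaussian noise (hypothetical values).
data_rng = random.Random(1)
samples = [(i / 10, 3.0 * (i / 10) + data_rng.gauss(0.0, 0.1)) for i in range(1, 21)]

best_error, best_w = select_best(samples)
n_params = count_parameters(3, 150, 12)
```

Because the stand-in model is convex, all ten initializations converge to nearly the same weight. With a real ANN the candidates would differ, which is the reason for selecting among them. Under the stated assumptions the 3-150-12 network has 2,412 adjustable parameters, which gives a sense of how much data the fit requires.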

### **STIMULATION PREDICTION**

In order to provide optimal stimulation, a predictive model that characterizes the inverse relationship between stimulation parameters and dopamine levels was created. Similar to the system model, the predictive model was created using a double-layer ANN with 600 hidden neurons, as well as sigmoidal and linear transfer functions (Lujan and Crago, 2009). The inputs to the predictive model corresponded to the sets of 12 model parameters, while the outputs corresponded to the three stimulation parameters. This inverse model was then used to predict the stimulation parameters required to sustain specific extracellular dopamine levels within the striatum, thus allowing for feedback control. This was followed by stimulation of the MFB using the predicted parameters, and simultaneous recording of extracellular dopamine levels. Root mean squared (RMS) errors between experimentally measured and desired stimulation-evoked dopamine responses were used to determine controller efficacy. Least-squares regression analysis of the dependencies of actual dopamine levels on target levels was used in an effort to identify systematic (e.g., slope, offset) sources of error (Lujan and Crago, 2009).
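The two error measures named above can be illustrated with a short sketch. It computes the RMS error between target and measured responses, then fits a least-squares line to expose systematic slope and offset errors. The data values are hypothetical, in arbitrary units, and the function names are illustrative.

```python
import math


def rms_error(target, actual):
    """Root mean squared error between desired and measured responses."""
    return math.sqrt(sum((a - t) ** 2 for t, a in zip(target, actual)) / len(target))


def fit_line(target, actual):
    """Least-squares fit actual = slope * target + offset; also returns R^2."""
    n = len(target)
    mean_t = sum(target) / n
    mean_a = sum(actual) / n
    sxx = sum((t - mean_t) ** 2 for t in target)
    syy = sum((a - mean_a) ** 2 for a in actual)
    sxy = sum((t - mean_t) * (a - mean_a) for t, a in zip(target, actual))
    slope = sxy / sxx
    offset = mean_a - slope * mean_t
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, offset, r_squared


# Hypothetical target levels and a controller that undershoots by 10% with a small offset.
target = [0.2, 0.4, 0.6, 0.8, 1.0]
actual = [0.9 * t + 0.05 for t in target]

slope, offset, r_squared = fit_line(target, actual)
print(f"RMS error {rms_error(target, actual):.3f}, slope {slope:.2f}, offset {offset:.2f}")
```

A slope below 1 or a nonzero offset points to a systematic error that could be corrected, whereas scatter around the fitted line reflects unsystematic error.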

#### **CLOSED-LOOP CONTROL**

Our preliminary results in four anesthetized rats suggest that mathematical models can be used to describe the relationships between stimulation-evoked extracellular dopamine responses and DBS parameters (*R*<sup>2</sup> = 0.8). Furthermore, these results show that adjusting stimulation parameter intensity can modulate dopamine concentration, and that we can use ANN to dynamically predict stimulation parameters required to adjust stimulation-evoked dopamine levels (**Figure 4**). However, to further understand the network effects of DBS and optimize the therapeutic efficacy of stimulation, it may be necessary to combine electrophysiological (e.g., LFP, ECoG) and neurochemical feedback signals.

### **DISCUSSION**

Frequent adjustment of stimulation settings has been shown to improve the efficacy of DBS therapy (Rosin et al., 2011), which highlights the nature of the changing brain environment. Thus, a smart, automated system capable of dynamically adjusting stimulation parameters in response to a changing environment becomes critical for improving the therapeutic efficacy of DBS therapy. The proof-of-principle closed-loop DBS system proposed above offers the potential for maintaining therapeutic responses during disease progression. By taking advantage of mathematical models, the paradigm presented here can potentially replace the trial-and-error process currently used in clinical programming with deterministic approaches, thereby achieving optimal therapeutic outcomes while minimizing the number of clinical interventions. In turn, this will ultimately reduce required hospital visits and associated healthcare costs (Fraix et al., 2006).

Before automated adjustment of stimulation parameters can be clinically implemented, however, several key clinical questions need to be investigated. Specifically, the relationship between neurotransmitter levels and symptoms of neurologic disease needs further elucidation. For example, there is indirect evidence to suggest that dopamine depletion plays a role in the symptoms of PD and that dopaminergic medications have a therapeutic response. However, precise concentration changes that occur with symptom exacerbation and amelioration are unknown. Additionally, multiple neurotransmitters may play a critical role in the disease (Fitzgerald, 2014). Thus, optimal neurotransmitters and optimal recording locations should be identified for each disorder. Future work should be directed toward validating closed-loop algorithms, correlating neurotransmitter release to clinical benefit in a large animal disease model of Parkinsonism or ET.

**[Figure caption fragment]** … lines) and actual (solid lines) dopaminergic responses evoked by stimulation parameters predicted by the artificial neural network. … responses were compared using linear regression and Pearson's correlation (*R*<sup>2</sup> = 0.8538).

Similarly, an important technical barrier that needs to be addressed is that chronic recordings are not possible using current electrode technology. CFMs are subject to electrode fouling due to the charge imbalance of the waveforms required for FSCV. Efforts are underway to develop electrochemical-sensing techniques capable of extending electrode longevity by renewing the electrochemically active surface following adsorption of chemical species (Takmakov et al., 2010). Additionally, it has been reported that diamond coating may potentially prolong the life of recording electrodes (Roham et al., 2007). Once these technologies have been developed, they will need to undergo extensive safety and efficacy testing and validation in pathological animal models before advancing to clinical trials.

## **CONCLUSIONS**

Conventional neuromodulation systems have been successful at achieving therapeutic outcomes in patients with neurologic and psychiatric disorders. However, limitations in existing technology make ensuring optimal benefits a difficult and expensive endeavor. Correlation of multi-modal electrophysiological and neurochemical recordings may provide new insight into the cellular and molecular mechanisms of therapeutic neuromodulation. Therefore, development of smart DBS controllers that rely on the relationships between neurochemical and electrophysiological recordings with the clinical effects of DBS offers the potential of replacing the trial-and-error process used in clinical programming with a deterministic approach. Furthermore, the versatility and adaptability of such controllers will allow expansion of the clinical indications that can be treated with DBS while tailoring its application to individual patients and symptoms. In turn, these will likely improve clinical outcomes, reduce the time and frequency of patient visits, and lower overall health care costs.

## **ACKNOWLEDGMENTS**

This work is supported by The Grainger Foundation, NIH grants R01 NS084975 (J. L. Lujan) and R01 NS070872 (Kendall H. Lee). The authors thank Brian Paek, James Baek, and Megan Settel for their assistance with FSCV electrode manufacture and animal surgery.

## **REFERENCES**


of refractory epilepsy. *Epilepsia* 51, 899–908. doi: 10.1111/j.1528-1167.2010.02536.x


serial changes and relationship to clinical response. *Biol. Psychiatry* 48, 830–843. doi: 10.1016/S0006-3223(00)01036-2


**Conflict of Interest Statement:** J. L. Lujan has intellectual property licensed to Boston Scientific. The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 04 April 2014; accepted: 02 June 2014; published online: 25 June 2014.*

*Citation: Grahn PJ, Mallory GW, Khurram OU, Berry BM, Hachmann JT, Bieber AJ, Bennet KE, Min H-K, Chang S-Y, Lee KH and Lujan JL (2014) A neurochemical closed-loop controller for deep brain stimulation: toward individualized smart neuromodulation therapies. Front. Neurosci. 8:169. doi: 10.3389/fnins.2014.00169*

*This article was submitted to Neuroprosthetics, a section of the journal Frontiers in Neuroscience.*

*Copyright © 2014 Grahn, Mallory, Khurram, Berry, Hachmann, Bieber, Bennet, Min, Chang, Lee and Lujan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Post-stroke balance rehabilitation under multi-level electrotherapy: a conceptual review

## *Anirban Dutta1,2\*, Uttama Lahiri 3, Abhijit Das 4, Michael A. Nitsche5 and David Guiraud1,2*

*<sup>1</sup> DEMAR (INRIA Sophia Antipolis), INRIA, CNRS: UMR5506, Université Montpellier II - Sciences et Techniques, Université Montpellier I, Montpellier, France*

*<sup>2</sup> Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier, CNRS: UMR5506, Université Montpellier II - Sciences et Techniques, Montpellier, France*

*<sup>3</sup> Electrical Engineering, Indian Institute of Technology, Gandhinagar, India*

*<sup>4</sup> Department of Neurorehabilitation, Institute of Neurosciences, Kolkata, India*

*<sup>5</sup> Department of Clinical Neurophysiology, Göttingen University Medical School, Göttingen, Germany*

#### *Edited by:*

*Jose L. Pons, Consejo Superior de Investigaciones Científicas, Spain*

#### *Reviewed by:*

*Ricardo Chavarriaga, Ecole Polytechnique Fédérale de Lausanne, Switzerland*

*Friedhelm C. Hummel, Universitätsklinikum Hamburg-Eppendorf, Germany*

#### *\*Correspondence:*

*Anirban Dutta, Inria Sophia Antipolis Méditerranée, INRIA - Bt 5 - CC05 017, Université Montpellier II, 860 Rue Saint Priest - 34095 Montpellier cedex 5, Montpellier, France e-mail: adutta@ieee.org*

Stroke is caused when an artery carrying blood from the heart to an area in the brain bursts or a clot obstructs the blood flow, thereby preventing delivery of oxygen and nutrients. About half of stroke survivors are left with some degree of disability. Innovative methodologies for restorative neurorehabilitation are urgently required to reduce long-term disability. The ability of the nervous system to respond to intrinsic or extrinsic stimuli by reorganizing its structure, function, and connections is called neuroplasticity. Neuroplasticity is involved in post-stroke functional disturbances, but also in rehabilitation. It has been shown that active cortical participation in a closed-loop brain machine interface (BMI) can induce neuroplasticity in cortical networks where the brain acts as a controller, e.g., during a visuomotor task. Here, the motor task can be assisted with neuromuscular electrical stimulation (NMES) where the BMI will act as a real-time decoder. However, the cortical control and induction of neuroplasticity in a closed-loop BMI is also dependent on the state of the brain, e.g., visuospatial attention during visuomotor task performance. In fact, spatial neglect is a hidden disability that is a common complication of stroke and is associated with prolonged hospital stays, accidents, falls, safety problems, and chronic functional disability. This hypothesis and theory article presents a multi-level electrotherapy paradigm toward motor rehabilitation in virtual reality. It postulates that while the brain acts as a controller in a closed-loop BMI to drive NMES, the state of the brain can be altered toward improvement of visuomotor task performance with non-invasive brain stimulation (NIBS). This leads to a multi-level electrotherapy paradigm where a virtual reality-based adaptive response technology is proposed for post-stroke balance rehabilitation. In this article, we present a conceptual review of the related experimental findings.

**Keywords: virtual reality, eye tracking, neuromuscular electrical stimulation, stroke, neurorehabilitation, non-invasive brain stimulation**

## **INTRODUCTION**

Stroke, defined as an episode of neurological dysfunction caused by focal cerebral, spinal, or retinal infarction, is a global health problem and the fourth leading cause of disability worldwide (Strong et al., 2007; Sacco et al., 2013). Falls are among the most common medical complications after stroke, with a reported incidence of up to 73% in the first year post-stroke (Verheyden et al., 2013). Preliminary results from Marigold et al. (2005) suggest that agility training programs challenging dynamic balance may be more effective than static stretching/weight-shifting exercise programs in preventing falls in the chronic stroke population. Stroke-related ankle impairments, which enhance the probability of falls, include weakness of the ankle dorsiflexor muscles and increased spasticity of the ankle plantarflexor muscles. This leads to the foot drop syndrome that is clinically described as poor ankle dorsiflexion during the swing phase along with a forefoot or flat-foot initial contact in the stance phase. Here, the impact of standing balance on activities of daily living is critical, since balance is associated with ambulatory ability (Patterson et al., 2007) and recovery of gross motor function (Tyson et al., 2007). Toward improving muscle strength and reducing muscle spasticity, we leverage recent advances in rehabilitation technology, particularly Neuromuscular Electrical Stimulation (NMES), for post-stroke standing balance rehabilitation. NMES involves coordinated electrical stimulation of nerves and muscles by continuous short pulses of electrical current and has been shown to improve gait speed in subjects post-stroke (Robbins et al., 2006). This hypothesis and theory article first proposes a volitionally controlled NMES system for ankle muscles, which acts as a muscle amplifier to improve ankle movement for upright stance during postural perturbations (Hwang et al., 2009). The proposed NMES approach is based on recent state-of-the-art work in humans that postulated that the neural control of muscles may be modular, organized in functional groups often referred to as muscle synergies (Piazza et al., 2012; Chvatal and Ting, 2013).

During postural perturbations, the body acts as a single segment pendulum centered about the ankle joint where the ankle muscles provide the torque needed to retain upright posture (Hwang et al., 2009). Gatev et al. (1999) presented a feedforward ankle strategy based on the fact that a moderate negative zero-phased correlation exists between the antero-posterior motion of the center of pressure (CoP) and ankle angular motion. The antero-posterior (A-P) displacements of the center of mass (CoM) are performed by ankle plantarflexors (such as medial gastrocnemius and soleus muscles) and dorsiflexors (such as the anterior tibial muscle), while medio-lateral (M-L) displacements are performed by ankle invertors (such as the anterior tibial muscle) and evertors (such as the peroneus longus and brevis muscles) (Winter et al., 1996). Therefore, stroke-related ankle impairments, including weakness of the ankle dorsiflexor muscles and increased spasticity of the ankle plantarflexor muscles, lead to impaired postural control. Respective changes in reflex excitability with respect to postural sway have been shown during standing (Tokuno et al., 2009). For post-stroke standing balance rehabilitation, we thus might be able to ameliorate these stroke-related ankle impairments via an improved modulation of ankle stiffness by modulating muscle tone (Winter et al., 2001) via NMES. We further hypothesize that a coordinated increase in corticospinal excitability of the representation of specific ankle muscles can result in an improved modulation of ankle stiffness. In this connection, prior work has shown that NMES elicits lasting changes in corticospinal excitability, possibly as a result of co-activating motor and sensory fibers (Knash et al., 2003). Moreover, Khaslavskaia and Sinkjaer (2005) showed in humans that concurrent motor cortical drive present at the time of NMES goes along with enhanced motor cortical excitability. 
Furthermore, at the spinal level, volitionally-driven NMES under visual feedback may induce short-term neuroplasticity in spinal reflexes (e.g., reciprocal Ia inhibition; Perez et al., 2003). Also, corticospinal neurons that project via descending pathways to a given motoneuron pool can inhibit the antagonistic motoneuron pool via Ia-inhibitory interneurons in humans (Pierrot-Deseilligny and Burke, 2005). Consequently, post-stroke impaired reciprocal inhibition between antagonistic muscles may be strengthened via increased presynaptic inhibition of group Ia-afferents under operant conditioning with visual feedback. In this operant conditioning paradigm with visual feedback (Dutta et al., 2013a), the brain acts as the controller during the visuomotor task, where the center of pressure (CoP) is volitionally moved across a display monitor and this movement is assisted with volitionally-driven NMES, as illustrated in **Figure 1**.

However, prior work suggests that active supraspinal control mechanisms are relevant for balance and their adaptation is important in balance training (Taube et al., 2008). Indeed, supraspinal control mechanisms help to counteract internal perturbations caused by self-initiated movements during activities of daily living to maintain standing balance (Geurts et al., 2005). Balance measures reveal underlying limb-specific control such as between-limb CoP synchronization for standing balance that appears to be a unique index of balance control, independent of postural sway and load symmetry during stance (Mansfield et al., 2012). A review of standing balance recovery from stroke by Geurts et al. (2005) shows that brain lesions involving particularly the parieto-temporal junction are associated with poor postural control, and suggests that normal multisensory integration in addition to muscle strength is critical for balance recovery. Tokuno et al. (2009) concluded that the sensory feedback mechanisms contribute relevantly, as the excitability of the respective cortical area was modulated as a function of postural sway, where stroke-related sensorimotor impairment potentially contributes to impaired balance (Mansfield et al., 2012). Indeed, active cortical control based on sensory feedback is relevant for maintaining balance during activities of daily living (Qu and Nussbaum, 2009). In this connection, unilateral spatial neglect, i.e., failure or slowness to respond, orient, or initiate action toward contralesional stimuli, is a common neurological syndrome following predominantly right hemisphere injuries to ventral fronto-parietal cortex (Corbetta and Shulman, 2011). Spatial neglect is associated with prolonged hospital stays, accidents, falls, safety problems, and chronic functional disability (Goedert et al., 2012), probably caused to a relevant degree by compromised cortical control of balance. Here, amelioration of spatial neglect may be possible with non-invasive brain stimulation (NIBS) (Hesse et al., 2011). NIBS, namely transcranial direct current stimulation (tDCS), over the posterior parietal cortex (PPC) has been shown to modulate visuospatial localization (Wright and Krekelberg, 2014) and to alter perceived position (Wright and Krekelberg, 2013). 
Moreover, modulation of sensorimotor cortical excitability by tDCS is feasible (Nitsche and Paulus, 2000), and may facilitate post-stroke rehabilitation (Hallett, 2005; Flöel, 2014) by enhancing sensory feedback mechanisms for brain machine interface (BMI) control (Dutta et al., 2014b). Matsunaga et al. (2004) have shown that anodal tDCS over the sensorimotor cortex induces a long-lasting increase of the size of ipsilateral cortical components of somatosensory evoked potentials. Moreover, anodal tDCS enhances corticospinal excitability (Nitsche and Paulus, 2000), including long-term changes of synaptic strength (Nitsche et al., 2008), and anodal tDCS over the primary motor cortex has an impact on spinal network excitability in humans (Roche et al., 2009). Roche and colleagues describe an increase of disynaptic inhibition at the spinal level reflex pathways during anodal tDCS that was caused by an increase in disynaptic interneuron excitability (Roche et al., 2009).

The computational neuroanatomy for motor control (Shadmehr and Krakauer, 2008) is shown in **Figure 1**. Shadmehr and Krakauer (2008) suggested specific functions of different parts of the brain in motor control. The main function of the


Here, during operant conditioning with visual feedback (Dutta et al., 2013a), the brain acts as the controller for trial-by-trial error correction during the visuomotor task, which is assisted with volitionally-driven NMES (**Figure 1**). The real-time decoder for NMES (see **Figure 2**) acts as an intent detector to assist residual muscle function with electrical stimulation-evoked muscle action. However, stroke survivors often suffer from heterogeneous deficits in cortical control, e.g., delay in initiation and termination of muscle contraction (Chae et al., 2002) as well as deficits in the visuomotor attention networks (Corbetta and Shulman, 2011) conducive for motor learning. Therefore, our hypothesis is that the cortical control of NMES during the visuomotor task and motor learning during balance rehabilitation may be facilitated with NIBS. The underlying concept of NIBS approaches is that NIBS can modulate excitability of a targeted cortical region. The sensor fusion for NIBS (see **Figure 2**) includes a NIBS controller that tries to maintain a more balanced brain state (Schlaug and Renga, 2008). The sensor fusion also includes gaze-interaction with CoP visual feedback (Sailer et al., 2005) to objectively quantify the engagement and stage of motor learning for the affected and unaffected sides, such that the quality of error feedback can be titrated to balance bilateral performance during operant conditioning. The human-machine interface (HMI) integrating biosignal sensors and motion capture with an NMES system for post-stroke balance rehabilitation is based on a point-of-care testing system (Dutta et al., 2013b) that has been shown feasible for EMG-triggered NMES therapy (Banerjee et al., 2014).

## **HYPOTHESIS 1: BRAIN ACTS AS A CONTROLLER FOR TRIAL-BY-TRIAL ERROR CORRECTION DURING VISUOMOTOR BALANCE THERAPY**

As shown in **Figure 1**, coordinated movement depends on interactions between multiple brain areas leading to transient functional connectivity networks (Shafi et al., 2012) where the brain acts as a controller (viz. state estimation, optimization, prediction, cost, and reward). Active participation of the motor cortex (and other cortical areas) may be facilitated by modulating NMES with volitional effort, where state-of-the-art prior works show that stimulation envelopes may be controlled (Yeom and Chang, 2010) or triggered (Banerjee et al., 2014) with volitional electromyogram (EMG). During operant conditioning, the post-stroke subject volitionally drives NMES during visuomotor task performance for balance rehabilitation, where the goal is to reduce error while steering a computer cursor to a peripheral target using volitionally generated CoP excursions, as illustrated in **Figure 3**. The human machine interface (HMI) integrating biosignal sensors and motion capture for volitionally driven NMES toward operant conditioning with visual feedback was evaluated in a community setting (Banerjee et al., 2014). We present a real-time decoder in Subsection Proposed Method: Volitionally-Driven NMES-Assisted Visuomotor Balance Therapy for volitionally driven NMES that combines physical sensor signals with biopotentials from the HMI to facilitate erect posture recovery following internal postural perturbations caused by self-initiated movements.

A proof-of-concept study (without NMES) of the HMI was successfully conducted on 10 able-bodied subjects (5 right-leg dominant males and 5 right-leg dominant females aged between 22 and 46 years) (unpublished material). All subjects gave their written informed consent for the experiments in compliance with the Declaration of Helsinki. They had no known neurological disorder at the time of the study. Here, stroke presents with heterogeneous deficits in motor control where the recovery of erect posture in stroke survivors following CoP excursions is proposed (Subsection Proposed Method: Volitionally-Driven NMES-Assisted Visuomotor Balance Therapy) to be assisted with NMES. Geurts et al. (2005) reviewed cross-sectional studies of voluntary weight-shifting capacity in patients after stroke compared to age-matched healthy control subjects and provided evidence of the following deficits: (1) multi-directionally impaired maximal weight shifting during bipedal standing, in particular toward the paretic leg; (2) slow speed, directional imprecision and small amplitudes of single and cyclic sub-maximal frontal plane weight shifts, most prominently toward the paretic side. An increased magnitude of postural sway has been described for individuals after stroke (Mansfield et al., 2012). Post-stroke sensory deficits may be responsible for these symptoms, because upright standing must be stabilized by active control strategies against instability induced by a large neural feedback transmission delay. Mansfield and colleagues proposed that measures of between-limb synchronization, overall postural sway, and weight-bearing symmetry are each independently important measures of post-stroke standing balance control and can reveal discernible control problems (Mansfield et al., 2012).

Prior work suggests that visual CoP feedback during a weight-shifting task may improve performance (Ustinova et al., 2001). Moreover, patients in the post-acute phase of stroke tend to rely more on visual information for postural control in both antero-posterior (A-P) and medio-lateral (M-L) planes than healthy age-matched controls (Geurts et al., 2005). Indeed, excessive reliance on vision for erect standing may decrease during rehabilitation, but can still be found in the chronic phase under more challenging conditions. Such abnormal reliance on vision may be related to a higher-level inability to select the pertinent sensory input. There is evidence that even in the chronic phase of stroke, visual deprivation training can reduce the degree of visual dependence for postural control (Geurts et al., 2005). Accordingly, we present an operant conditioning paradigm where CoP excursions steer the cursor on a screen and the visual feedback of the cursor is corrupted by noise, thereby effecting visual deprivation. We propose to vary the quality of visual feedback using different noise levels for different locations on the screen according to the visuospatial attention during the visuomotor task for uniform learning of the affected and unaffected sides, and therefore present the subject with a virtual reality toward constraint-induced movement therapy (Morris et al., 1997), as discussed in Section Proposed Method: Operant Conditioning Based on Gaze-Interaction in Virtual Reality. In Section Preliminary Evidence: Trial-by-Trial Error Correction during Operant Conditioning, we present evidence from our proof-of-concept study on healthy subjects for trial-by-trial error correction during visuomotor balance therapy under an operant conditioning paradigm.

## **PROPOSED METHOD: VOLITIONALLY-DRIVEN NMES-ASSISTED VISUOMOTOR BALANCE THERAPY**

The capacity to voluntarily transfer body weight while maintaining standing balance over a fixed base of support is a prerequisite for safe mobility (Geurts et al., 2005). During balance training, the stroke survivors will voluntarily shift their CoP location to steer the cursor as fast as possible under visual feedback. State-of-the-art prior works show that NMES stimulation envelopes may be controlled (Yeom and Chang, 2010; Zhang et al., 2013) or triggered (Dutta, 2009) with volitional electromyogram (EMG) or electroencephalogram (EEG) (Niazi et al., 2012; Takahashi et al., 2012). However, post-stroke biopotentials often suffer from deficits, e.g., EMG suffers from delays in initiation/termination (Chae et al., 2002) as well as fatigue. Control of an NMES-assisted dynamic balance task based solely on biopotentials is therefore challenging, since such activation delays may result in falls. Such faults may be alleviated through sensor fusion with physical sensor signals (Dutta et al., 2011). Here, able-bodied muscle activation profiles from EMG can be used to define the NMES templates (Kobetic and Marsolais, 1994) for erect posture recovery, where (optimal) muscle synergies (Chvatal and Ting, 2013) can be extracted from the EMG recorded bilaterally from healthy ankle muscles approximately coincident with those targeted for NMES (Piazza et al., 2012; Li et al., 2014) in post-stroke subjects right after presentation of the visual cue. The muscle synergy specifies the coordinated activation of several muscles, and each muscle synergy is expected to be activated during specific perturbation directions (A-P or M-L) and time bins following the visual cue (Torres-Oviedo and Ting, 2007). Recent work in humans demonstrates that the neural control of muscles may be modular, organized in functional groups often referred to as muscle synergies (Chvatal and Ting, 2013). 
Moreover, Torres-Oviedo and Ting (2007) showed that muscle synergies, i.e., patterns of task-specific co-activation of muscles, represent a general neural strategy underlying muscle coordination in postural tasks. In fact, the composition and temporal activation of several muscle synergies identified across subjects are consistent with "ankle" and "hip" strategies in human postural responses (Torres-Oviedo and Ting, 2007). Although several studies show how the motor system elegantly circumvents the need to control its large number of degrees of freedom through a flexible combination of motor synergies (Chvatal and Ting, 2013), such a framework has not yet been leveraged for the generation of NMES stimulation patterns. Here, Alessandro et al. (2012) discussed the synthesis and adaptation of effective motor synergies for the solution of reaching tasks, which can be leveraged with a reduced-order biped model for NMES template generation (Piazza et al., 2012; Li et al., 2014). To model the performance of a dynamic balance task such as volitional CoP excursions while maintaining standing balance over a fixed Base of Support (BoS), we will apply the "extrapolated center of mass" (xCoM) concept to define the Margin of Stability (MoS) (Hof, 2008). Here, bipedal standing is approximated as an inverted pendulum centered about the ankle joint. Its eigenfrequency (*ω*<sub>0</sub>) can be computed from the leg length (*l*), i.e., the height of the upper margin of the greater trochanter above the floor,

$$
\omega_0 = \sqrt{\frac{g}{l}} \tag{1}
$$

where *g* is the acceleration due to gravity. Therefore, the xCoM location, [*x*, *y*]<sub>xCoM</sub>, can be defined from the CoM projection on the ground, [*x*, *y*]<sub>CoM</sub>, and its velocity,

$$
\begin{bmatrix} x \\ y \end{bmatrix}_{\text{xCoM}} = \begin{bmatrix} x \\ y \end{bmatrix}_{\text{CoM}} + \begin{bmatrix} \dot{x} / \omega_0 \\ \dot{y} / \omega_0 \end{bmatrix}_{\text{CoM}} \tag{2}
$$
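Equations (1) and (2) translate directly into a few lines of code. The sketch below also estimates a margin of stability as the smallest distance from the xCoM to the edge of the base of support. The rectangular base of support, the example numbers, and all function names are assumptions made for illustration and are not taken from the original study.

```python
import math

G = 9.81  # m/s^2, acceleration due to gravity


def eigenfrequency(leg_length_m):
    """Equation (1): eigenfrequency of the inverted pendulum, in rad/s."""
    return math.sqrt(G / leg_length_m)


def xcom(com_xy, com_vel_xy, leg_length_m):
    """Equation (2): extrapolated center of mass from CoM position and velocity."""
    w0 = eigenfrequency(leg_length_m)
    return tuple(p + v / w0 for p, v in zip(com_xy, com_vel_xy))


def margin_of_stability(xcom_xy, bos_min_xy, bos_max_xy):
    """Smallest distance from the xCoM to the edges of an ASSUMED rectangular
    base of support; a negative value means the xCoM lies outside it."""
    return min(min(p - lo, hi - p) for p, lo, hi in zip(xcom_xy, bos_min_xy, bos_max_xy))


# Hypothetical example: leg length 0.9 m, CoM at the origin moving forward at 0.1 m/s.
position = xcom((0.0, 0.0), (0.1, 0.0), 0.9)
mos = margin_of_stability(position, (-0.10, -0.15), (0.15, 0.15))
print(f"xCoM = ({position[0]:.3f}, {position[1]:.3f}) m, MoS = {mos:.3f} m")
```

Because the velocity term is divided by the eigenfrequency, a faster CoM pushes the xCoM closer to the edge of the base of support even when the CoM position itself is unchanged.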

During performance of the bipedal reaching task, the maximum excursion of the xCoM location which does not result in a stepping response can be monitored. This will provide an estimate of the MoS within the BoS for standing balance control. Animal studies have shown that perturbations evoke coordinated long-latency responses that help to return the body to its postural equilibrium (Macpherson and Fung, 1999; Deliagina et al., 2008). A real-time decoder can detect these long-latency responses to control and/or trigger NMES to assist the post-stroke subjects to recover to the erect posture following internal perturbations. NMES is based on the observation of intermittent, ballistic-type corrective movements in healthy humans (Loram et al., 2005), where NMES of the ankle muscles will provide the assistive torque not only to generate basic support (i.e., adequate ankle stiffness) (Hwang et al., 2009) for upright standing but also to provide frequent, ballistic bias impulses for regaining balance from micro falls (Loram et al., 2005).

In our proof-of-concept study on healthy subjects (no NMES), the augmented HMI system (**Figure 4**) was used to record CoP-CoM trajectories while the subjects were asked to keep their body rigid and to maintain full feet contact with the Wii Balance Board (Wii BB). The subjects were asked to lean as far as possible forward, backward, to the right, and to the left using visual feedback of the CoP location to provide calibration values for *α* and *β* (in Equation 3) such that the cursor does not go off the screen during performance of the visuomotor task when the subject uses full functional reach to steer the computer cursor. During the Central Hold task (CHT), the subjects were asked to keep the cursor close to its origin with CoP excursions, [*x*, *y*]<sub>CoP</sub>, which accelerated the computer cursor, [*ẍ*, *ÿ*]<sub>Cur</sub>, according to Equation (3) (discretized time, *t*, with time-step, *dt*)

$$
\begin{aligned}
\begin{bmatrix} \ddot{x} \\ \ddot{y} \end{bmatrix}_{Cur}^{t} &= \varepsilon \begin{bmatrix} x \\ y \end{bmatrix}_{CoP}^{t} - \alpha \begin{bmatrix} x \\ y \end{bmatrix}_{Cur}^{t-1} - \beta \begin{bmatrix} \dot{x} \\ \dot{y} \end{bmatrix}_{Cur}^{t-1} + \eta \\
\begin{bmatrix} \dot{x} \\ \dot{y} \end{bmatrix}_{Cur}^{t} &= \begin{bmatrix} \dot{x} \\ \dot{y} \end{bmatrix}_{Cur}^{t-1} + \begin{bmatrix} \ddot{x} \\ \ddot{y} \end{bmatrix}_{Cur}^{t-1} dt \\
\begin{bmatrix} x \\ y \end{bmatrix}_{Cur}^{t} &= \begin{bmatrix} x \\ y \end{bmatrix}_{Cur}^{t-1} + \begin{bmatrix} \dot{x} \\ \dot{y} \end{bmatrix}_{Cur}^{t} dt
\end{aligned} \tag{3}
$$

where *ε* = 0.01 s<sup>2</sup>/cm, *α* = 0.2 s<sup>−2</sup>, *β* = 0.1 s<sup>−1</sup>, and *η* = *N*(0, *σ* = 0.1 s<sup>−2</sup>). The visuomotor task was divided into 100 trials, each lasting for a random duration uniformly distributed between 11.5 and 15 s, based on prior work of Stevenson et al. (2009). Every 20 ms a new dot was shown on the screen with a position drawn from a radially isotropic Gaussian distribution [*N*(0, 3.5 cm)] centered on the true position of the cursor. The subjects learned to modulate CoP excursion to keep the cursor close to the origin, and the mean squared errors (MSEs) were monitored. It was hypothesized that the MSE would stay steady during the exploratory stage and show a quick improvement during the skill acquisition stage, followed by a slow improvement during the skill refinement stage. Then, under a modified functional reach task (mFRT) paradigm (Dutta et al., 2014c), the subject starts in upright standing (the "Central hold" phase) and needs to steer the cursor as fast as possible toward a randomly presented peripheral target as cued by visual feedback (the "Move" phase; see **Figure 3**). Following this "Move" phase, the subject will have to hold the cursor at the target location for 1 s during the "Peripheral hold" phase. Following the "Peripheral hold" phase, the cursor will "Reset" back to the center. Following CoP excursion to steer the cursor during the "Move" phase, the recovery of erect posture will be assisted with NMES.
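The update order of Equation (3) can be made concrete with a short simulation. The following Python sketch is only an illustration: the parameter values are those quoted above, but the time step `DT`, the function names, and the passive-subject scenario are assumptions introduced here, since the text does not state *dt* or how the simulation was coded.

```python
import random

EPS, ALPHA, BETA, SIGMA = 0.01, 0.2, 0.1, 0.1  # parameter values quoted in the text
DT = 0.02  # assumed time step in seconds (the text does not state dt)

def step_cursor(pos, vel, acc_prev, cop, rng, sigma=SIGMA, dt=DT):
    """Advance the cursor by one step of Equation (3) on both axes.

    pos, vel and acc_prev describe the cursor at t-1; cop is the CoP at t.
    Returns the cursor position, velocity and acceleration at t.
    """
    acc = tuple(EPS * c - ALPHA * p - BETA * v + rng.gauss(0.0, sigma)
                for c, p, v in zip(cop, pos, vel))
    vel_new = tuple(v + a * dt for v, a in zip(vel, acc_prev))
    pos_new = tuple(p + v * dt for p, v in zip(pos, vel_new))
    return pos_new, vel_new, acc

def simulate(cop_trace, sigma=SIGMA, seed=0):
    """Run Equation (3) over a CoP trace and return the cursor positions."""
    rng = random.Random(seed)
    pos, vel, acc = (0.0, 0.0), (0.0, 0.0), (0.0, 0.0)
    path = []
    for cop in cop_trace:
        pos, vel, acc = step_cursor(pos, vel, acc, cop, rng, sigma)
        path.append(pos)
    return path

def mean_squared_error(path):
    """Mean squared distance of the cursor from the origin."""
    return sum(x * x + y * y for x, y in path) / len(path)

# A passive subject (CoP held at zero) over a 12 s trial: the process noise
# alone makes the cursor drift, which is the error the subject must correct.
passive = simulate([(0.0, 0.0)] * 600)
print(mean_squared_error(passive))
```

Replacing the zero CoP trace with recorded CoP samples would give the cursor path, and hence the MSE, for a real trial under the same assumptions.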

In our proof-of-concept study on healthy subjects (no NMES), EEG and electrooculogram (EOG) recordings were investigated to detect motor intent during CoP excursions. EEG recordings were conducted using the Emotiv neuroheadset (Emotiv, Australia), which wirelessly relayed EEG data to the PC from 14 channels (saline-soaked sponges of the Emotiv neuroheadset were replaced with Ag/AgCl electrodes) of the EEG cap (International 10–20 system): Fp1, Fp2, F3, Fz, F4, C3, Cz, C4, P3, Pz, P4, O1, O2 (plus Common Mode Sense/Driven Right Leg references). EEG electrode impedance was kept below 5 kOhm by scratching the scalp and using conductive gel (Ten20, Weaver, USA). The EEG data were analyzed using the EEGLAB toolbox (Delorme and Makeig, 2004) for Matlab (Mathworks, USA). Additionally, a four-electrode EOG was acquired, with one electrode positioned at the outer edge of each eye to monitor horizontal eye movements, and one electrode positioned above and one below the right eye to monitor vertical eye movements. Eye-blinks and saccades were identified using the EOGUI software for Matlab (Mathworks, USA) ["Eogui - a Software to Analyze Electro-Oculogram (EOG) Recordings - File Exchange - MATLAB Central" 2014<sup>1</sup>], which provides the duration (milliseconds), amplitude (angular degrees), and viewing direction of the saccades (in nautical degrees: 0 for upward gaze, 90 for gaze to the right, 180 for downward gaze, 270 for gaze to the left). Then, eye-blink artifacts identified from EOG were rejected using EEGLAB functions, and the artifact-free EEG was analyzed for each trial in 4.096 s epochs using a Hanning time window (epochs were overlapped by 50%). The power spectrum was estimated for the absolute alpha (7.5–14 Hz) band via fast Fourier transform using the Welch technique ("pwelch" in Matlab, MathWorks, USA) to detect alpha event-related desynchronization (aERD) (Pfurtscheller and Lopes da Silva, 1999). aERD was considered present when the power during the task was below the resting-state value, thereby reflecting cognitive attention during the CoP visuomotor task, i.e.,

$$aERD\% = \left(\frac{P_{task} - P_{resting\text{-}state}}{P_{resting\text{-}state}}\right) \times 100 \tag{4}$$

where *P<sub>task</sub>* is the power spectral density estimate during the visuomotor task and *P<sub>resting-state</sub>* is the power spectral density estimate during resting state. The mFRT is proposed to quantify the subjects' ability to volitionally shift their CoP position as quickly as possible without losing balance while cued with CoP visual biofeedback. During CHT and mFRT, alpha event-related desynchronization (*aERD*%) was found primarily at the parietal and occipital EEG electrodes (unpublished material), as shown by an illustrative plot in **Figure 5**.
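The spectral estimate and Equation (4) can be sketched in Python as follows. The study used Matlab's "pwelch"; this NumPy version is a simplified stand-in written for illustration, not the authors' code. The sampling rate `FS` (chosen so that a 4.096 s epoch is 1024 samples), the function names, and the synthetic signals are assumptions.

```python
import numpy as np

FS = 250.0                        # assumed sampling rate in Hz (not stated in the text)
NPERSEG = int(round(4.096 * FS))  # 4.096 s epochs as in the text (1024 samples here)

def welch_psd(x, fs=FS, nperseg=NPERSEG):
    """Welch estimate: Hann-windowed segments with 50% overlap, averaged."""
    x = np.asarray(x, dtype=float)
    if len(x) < nperseg:
        raise ValueError("signal is shorter than one epoch")
    win = np.hanning(nperseg)
    scale = fs * np.sum(win ** 2)
    spectra = []
    for start in range(0, len(x) - nperseg + 1, nperseg // 2):
        seg = x[start:start + nperseg]
        spectra.append(np.abs(np.fft.rfft(win * (seg - seg.mean()))) ** 2 / scale)
    psd = np.mean(spectra, axis=0)
    psd[1:] *= 2.0                # fold negative frequencies into a one-sided PSD
    if nperseg % 2 == 0:
        psd[-1] /= 2.0            # the Nyquist bin has no negative-frequency twin
    return np.fft.rfftfreq(nperseg, d=1.0 / fs), psd

def alpha_power(x, band=(7.5, 14.0)):
    """Power in the alpha band, integrated from the Welch PSD."""
    freqs, psd = welch_psd(x)
    mask = (freqs >= band[0]) & (freqs <= band[1])
    return float(np.sum(psd[mask]) * (freqs[1] - freqs[0]))

def aerd_percent(p_task, p_rest):
    """Equation (4): negative values indicate desynchronization."""
    return (p_task - p_rest) / p_rest * 100.0

# Synthetic example: a 10 Hz rhythm whose amplitude halves during the task.
rng = np.random.default_rng(0)
t = np.arange(int(20 * FS)) / FS
rest = 2.0 * np.sin(2 * np.pi * 10 * t) + 0.1 * rng.standard_normal(t.size)
task = 1.0 * np.sin(2 * np.pi * 10 * t) + 0.1 * rng.standard_normal(t.size)
print(aerd_percent(alpha_power(task), alpha_power(rest)))
```

Because Equation (4) is a ratio, the absolute scaling of the PSD cancels; only the band limits and windowing choices affect the result.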

#### **PROPOSED METHOD: OPERANT CONDITIONING BASED ON GAZE-INTERACTION IN VIRTUAL REALITY**

The capacity to voluntarily transfer body weight while maintaining standing balance over a fixed base of support is a prerequisite for activities of daily living. Stroke survivors use only a small part of their base of support for voluntary weight displacements. Also, during standing and antero-posterior (A-P) weight-shifting, stroke patients deviate from the mid-line of the base of support more than healthy control subjects (Goldie et al., 1996). Moreover, compared to control subjects, stroke patients have significant deficits in the ability to weight-shift in the medio-lateral (M-L) direction (Goldie et al., 1996). Furthermore, there is strong evidence that physiological markers such as blink rate can be used as an effective indicator of one's mental workload (Marshall, 2007). In our augmented HMI, two Wii Balance Board™ (Wii BB) (Nintendo, USA) (Clark et al., 2010) units were positioned side by side without touching (i.e., <1 mm apart). Following the experimental protocol of Mansfield and colleagues (Mansfield et al., 2012), the subjects could stand with one foot on each Wii BB in a standard position (feet oriented at 14° with 7° rotation of each foot with an inter-malleoli distance equal to 8% of the height), with each foot equidistant from the midline between both Wii BBs. In our integrated system, we augment the operant conditioning paradigm with a gaze-sensitive virtual reality-based adaptive response technology (Lahiri et al., 2013) that evaluates motor learning during the performance of the visuomotor task with regard to visuomotor coordination by applying the principles of engagement. Specifically, making the visuomotor task easier for the affected side in virtual reality may yield greater neuroplastic changes and functional outcomes in neurorehabilitation (Danzl et al., 2012).

<sup>1</sup>http://www.mathworks.com/matlabcentral/fileexchange/file\_infos/32493-eogui-a-software-to-analyze-electro-oculogram-eog-recordings. Accessed April 2.

The post-stroke subject will stand with a minimum baseline stimulation level necessary to generate basic support for upright standing according to clinical observation. From this upright stance, the patient needs to steer the cursor as fast as possible toward a randomly presented peripheral target as cued by visual feedback (see **Figure 3**) under a modified functional reach task (mFRT) paradigm, as discussed in Section Proposed Method: Operant Conditioning Based on Gaze-Interaction in Virtual Reality. During the bipedal reaching task using visual feedback, the acceleration of the cursor can be controlled with CoP excursions measured by two Wii BBs (one for each limb) according to the following dynamics (Stevenson et al., 2009):

$$
\begin{bmatrix} \ddot{x} \\ \ddot{y} \end{bmatrix}_{Cur} = \varepsilon_1 \begin{bmatrix} x \\ y \end{bmatrix}_{CoP_1} + \varepsilon_2 \begin{bmatrix} x \\ y \end{bmatrix}_{CoP_2} - \alpha \begin{bmatrix} x \\ y \end{bmatrix}_{Cur} - \beta \begin{bmatrix} \dot{x} \\ \dot{y} \end{bmatrix}_{Cur} + \eta \tag{5}
$$

where *ε*<sub>1</sub>, *ε*<sub>2</sub> parameterize the effect of the recorded *CoP*<sub>1</sub>, *CoP*<sub>2</sub> excursions, [*x*, *y*]<sub>*CoP*1</sub> and [*x*, *y*]<sub>*CoP*2</sub>, on the cursor acceleration, [*ẍ*, *ÿ*]<sub>*Cur*</sub>; the *α* and *β* parameters prevent the cursor from going off-screen; and *η* ∼ *N*(0, *σ<sub>p</sub>*) represents the process noise with variance *σ<sub>p</sub>*. An increase in the gains *ε*<sub>1</sub>, *ε*<sub>2</sub> reduces the range of CoP excursion that the task requires, while a decrease in the variance, *σ<sub>p</sub>*, reduces the uncertainty. Task difficulty can therefore be increased by decreasing the gains *ε*<sub>1</sub>, *ε*<sub>2</sub> and increasing the variance, *σ<sub>p</sub>*, where we present the subject with a virtual reality environment geared toward constraint-induced movement therapy (Morris et al., 1997). Furthermore, toward constraint-induced movement therapy (Morris et al., 1997), visual deprivation will be effected by introducing observer noise in the visual feedback: a low-contrast dot is flashed on the screen with a position drawn from a radially isotropic Gaussian distribution centered on the true position of the cursor. The variance of this Gaussian cloud of points, *N*(0, *σ<sub>o</sub>*), will introduce observer noise as shown by prior work (Stevenson et al., 2009). Therefore, task difficulty can be modulated with the parameters *ε*<sub>1</sub>, *ε*<sub>2</sub>, *σ<sub>p</sub>*, and *σ<sub>o</sub>* for the affected and unaffected limbs during operant conditioning. For example, the gains, *ε*<sub>1</sub>, *ε*<sub>2</sub>, can be set individually for the affected and unaffected limbs for each peripheral target such that they present similar reaching errors during the exploratory stage of motor learning for the unipedal reaching task, which may lead to comparable reward expectations. During performance of the bipedal reaching task, the subject can learn to volitionally modulate CoP excursions using coordinated bipedal muscle activity to generate cued cursor movement under visual feedback. Here, identification of visuospatial attention and motor learning is critical for constraint-induced movement therapy (Morris et al., 1997), and a Bayesian framework addresses the problem of updating beliefs and making inferences based on observed data.
We present a standard Kalman filter to compute the estimated state of the cursor from observations while capturing user behavior during the reaching task, i.e., the "Central hold," "Move," and then "Peripheral hold" phases of the task. The peripheral targets are at the subject-specific limits (position and velocity) of CoP excursion, which are mapped using the *α* and *β* parameters of Equation (5) for each target. An important feature of the Kalman filter is how estimation changes as a function of feedback uncertainty (Stevenson et al., 2009). For example, increasing the observation noise by increasing the variance, *σ<sub>o</sub>*, for a certain peripheral target while keeping the process dynamics and process noise identical (*ε*<sub>1</sub>, *ε*<sub>2</sub>, *σ<sub>p</sub>*) may have different effects on its state updates (i.e., Kalman update) based on post-stroke residual function. Hence, the Kalman filter model allows interrogation of the post-stroke control mechanisms by capturing the effects of observation noise (or visual feedback uncertainty) on the control of cursor state (and reaching errors) toward multidirectional peripheral targets. The Kalman filter assumes that the cursor state, *X* = [*x*, *y*, *ẋ*, *ẏ*], at time *t* evolves from the state at time *t* − 1 according to linear dynamics and control:

$$X_t = AX_{t-1} + Bu_{t-1} + W_t \tag{6}$$

where *u<sub>t</sub>* is the control signal and *W<sub>t</sub>* is the process noise drawn from a Gaussian distribution. Here, *A* and *B* follow from Equation (5) for an ideal observer and *W<sub>t</sub>* reflects the effects of the process noise *η* ∼ *N*(0, *σ<sub>p</sub>*). For the control signal, Stevenson et al. (2009) found a bang-bang controller to be more similar to human control mechanisms than a linear-quadratic regulator (LQR) during bipedal reaching tasks:

$$
u_t = \lambda_1 \operatorname{sign}\left(\left[\cos\theta \;\; \sin\theta\right] X\right) + \lambda_0 \tag{7}
$$

Here, *θ* parameterizes the decision rule for a given state of the cursor, *X*, and *λ*<sub>0</sub>, *λ*<sub>1</sub> parameterize the magnitude of the two states for bang-bang control for each peripheral target, to capture the "Move" phase toward that target. Moreover, Stevenson et al. (2009) found that healthy humans readily dampen the cursor oscillations induced by the process noise, *η* ∼ *N*(0, *σ<sub>p</sub>*), an ability that may be deficient post-stroke depending on residual function. Here, the cross-correlation between the fluctuations of cursor dynamics (process noise, *η* ∼ *N*(0, *σ<sub>p</sub>*)) and CoP during the "Central hold" and "Peripheral hold" phases of the task should be consistent with ideal observer models in normal cases of residual function, i.e., subjects should respond more slowly and with lower amplitudes when the feedback is more uncertain (increased variance, *σ<sub>o</sub>*). Also, the gains, *ε*<sub>1</sub>, *ε*<sub>2</sub>, i.e., the range of CoP excursion required for the reaching task, should not affect the subject's control policy in normal cases of residual function (Stevenson et al., 2009). Therefore, the post-stroke subject's postural control policies can be evaluated for each peripheral target by changing the gain to investigate whether hemiparesis affects control strategies for reaching certain targets, e.g., at the paretic side. Especially at low gains, the visuomotor task becomes challenging, and the subject may use compensatory mechanisms to map between CoP actions and their visual sensory consequences. Motor learning will progress from the exploratory stage through skill acquisition to skill refinement. The reach errors will stay steady during the exploratory stage and will show a quick improvement during the skill acquisition stage followed by slow improvement during the skill refinement stage. In fact, this can be detected with gaze behavior, where active sensing with eye movements during exploratory actions may contribute to coupling of perception and action (Vickers, 2009). For example, the quiet eye (QE) period can be defined as the elapsed time between the last visual fixation to the target and the initiation of the motor response, and it has emerged as a characteristic of higher levels of performance (Vickers, 2009). In fact, Mann et al. (2007) presented a meta-analysis that supported the critical role of visual attention in the expert advantage, revealing that experts consistently exhibit fewer fixations of longer duration than non-expert comparison groups. Moreover, during visuomotor tasks, Mann et al. (2011) found that experts exhibit a prolonged QE period and greater cortical activation in the right-central region compared with non-experts.
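Returning to the cursor model, the state-space form of Equation (6), the bang-bang rule of Equation (7), and the dependence of the Kalman update on feedback uncertainty can be sketched in Python. Several choices below are assumptions made for illustration and are not taken from the text: the forward-Euler discretization used to obtain *A* and *B* from Equation (5), the time step, the reading of Equation (7) as acting per axis on position and velocity, the position-only observation matrix `H`, and all function names.

```python
import numpy as np

def cursor_model(eps1, eps2, alpha, beta, dt):
    """A and B of Equation (6) from a forward-Euler step of Equation (5).

    State X = [x, y, vx, vy]; control u = [x1, y1, x2, y2] (CoP of each limb).
    """
    A = np.array([[1.0, 0.0, dt, 0.0],
                  [0.0, 1.0, 0.0, dt],
                  [-alpha * dt, 0.0, 1.0 - beta * dt, 0.0],
                  [0.0, -alpha * dt, 0.0, 1.0 - beta * dt]])
    B = dt * np.array([[0.0, 0.0, 0.0, 0.0],
                       [0.0, 0.0, 0.0, 0.0],
                       [eps1, 0.0, eps2, 0.0],
                       [0.0, eps1, 0.0, eps2]])
    return A, B

def bang_bang(pos, vel, theta, lam0, lam1):
    """Equation (7) on one axis: switch on the sign of cos(theta)*pos + sin(theta)*vel."""
    return lam1 * np.sign(np.cos(theta) * pos + np.sin(theta) * vel) + lam0

def kalman_step(x_hat, P, u, z, A, B, H, Q, R):
    """One predict/update cycle of the standard Kalman filter."""
    x_pred = A @ x_hat + B @ u
    P_pred = A @ P @ A.T + Q
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = (np.eye(len(x_hat)) - K @ H) @ P_pred
    return x_new, P_new, K

def position_gain(sigma_o, sigma_p=0.1, dt=0.02, n_steps=500):
    """Kalman gain on the x position after n_steps for observer noise sigma_o."""
    A, B = cursor_model(eps1=0.01, eps2=0.01, alpha=0.2, beta=0.1, dt=dt)
    H = np.hstack([np.eye(2), np.zeros((2, 2))])  # assumed: only position is displayed
    Q = np.diag([0.0, 0.0, (sigma_p * dt) ** 2, (sigma_p * dt) ** 2])
    R = sigma_o ** 2 * np.eye(2)
    x_hat, P = np.zeros(4), np.eye(4)
    for _ in range(n_steps):
        # The gain does not depend on the data, so zero inputs suffice here.
        x_hat, P, K = kalman_step(x_hat, P, np.zeros(4), np.zeros(2), A, B, H, Q, R)
    return K[0, 0]

for sigma_o in (1.0, 3.5, 7.0):
    print(sigma_o, position_gain(sigma_o))
```

Under these assumptions the gain shrinks as *σ<sub>o</sub>* grows, which is the ideal-observer behavior described above: the noisier the visual feedback, the less each new dot should move the estimate.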

Prior work on gaze behavior during eye-hand coordination (Sailer et al., 2005) suggests that gaze interaction can provide an evaluative feedback of motor learning. Gaze starts by pursuing the cursor during the exploratory stage, continues by predictively marking the desired cursor positions during the skill acquisition stage, and ends with direct shifts toward the target during the skill refinement stage. Therefore, the time delay, *τ<sub>delay</sub>*, between the gaze and cursor position signals, as determined by the argument of the maximal cross-correlation, should indicate the stage of motor learning. Moreover, during skill acquisition the desired cursor trajectory can be decoded from gaze activity to see if the desired cursor positions are successfully reached under visuomotor control during the "Move" phase. Here, Bayesian learning involves computing the posterior probability distribution of the stage of motor learning during the "Move" phase from the observed gaze-interaction (i.e., *τ<sub>delay</sub>*), where a coarse estimate of the stage of motor learning is based on the reach error at the end of the "Move" phase. In this center-out bipedal visuomotor reaching task with zero process and observer noise (*σ<sub>p</sub>* = 0 and *σ<sub>o</sub>* = 0), two modes of performance, skilled and unskilled, are possible during the "Move" phase. These two modes of performance (or hypotheses) are considered to be mutually exclusive and exhaustive, H = {*h<sub>skilled</sub>*, *h<sub>unskilled</sub>*}, and can be formulated for each cued peripheral target, *T<sub>i</sub>*, during the visuomotor reaching task. In the Bayesian framework, we denote the degree of belief in a hypothesis by a probability and determine the updated belief, called the posterior probability, from the product of the data likelihood and the prior probability (Bayes' rule):

$$P(h_i|d, T_i) = \frac{P(d|h_i, T_i)P(h_i)}{\sum_{j=1}^{N} P(d|h_j, T_i)P(h_j)}\tag{8}$$

where the prior probabilities, *P*(*h<sub>i</sub>*), represent the belief before observing the data, *d* (e.g., *τ<sub>delay</sub>*, etc.), and the likelihoods, *P*(*d*|*h<sub>i</sub>*, *T<sub>i</sub>*), for each peripheral target, *T<sub>i</sub>*, denote the probability with which we would expect to observe the data if the hypothesis is true. To estimate the best peripheral targets for motor learning (i.e., for distinguishing *h<sub>skilled</sub>* from *h<sub>unskilled</sub>*) with subject-specific gaze interaction data, *d*, the confusion probability matrix for each possible peripheral target, *T<sub>i</sub>*, can be found as

$$C_{T_i} = \int \sqrt{p(d|h_{skilled}, T_i)\,p(d|h_{unskilled}, T_i)}\,\mathrm{d}d.\tag{9}$$
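Equations (8) and (9) can be evaluated numerically for a single target. The Python sketch below assumes unit-variance Gaussian likelihoods with hypothetical means of −1 (skilled) and +1 (unskilled) in standardized delay units and equal priors; these numbers and the function names are placeholders, not values from the study. The integral in Equation (9) has the form of the Bhattacharyya coefficient between the two likelihoods and is approximated here with a midpoint rule.

```python
import math

def gaussian_pdf(d, mean, sd=1.0):
    """Gaussian density, used here as an assumed likelihood model."""
    z = (d - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def posterior(d, means, priors):
    """Equation (8): posterior over hypotheses for one observation d."""
    joint = {h: gaussian_pdf(d, means[h]) * priors[h] for h in means}
    evidence = sum(joint.values())
    return {h: joint[h] / evidence for h in joint}

def overlap(means, lo=-20.0, hi=20.0, n=4000):
    """Equation (9) by the midpoint rule: overlap of the two likelihoods."""
    width = (hi - lo) / n
    total = 0.0
    for k in range(n):
        d = lo + (k + 0.5) * width
        total += math.sqrt(gaussian_pdf(d, means["skilled"]) *
                           gaussian_pdf(d, means["unskilled"]))
    return total * width

# Hypothetical likelihood means and priors for one peripheral target.
MEANS = {"skilled": -1.0, "unskilled": 1.0}
PRIORS = {"skilled": 0.5, "unskilled": 0.5}

print(posterior(-1.0, MEANS, PRIORS))
print(overlap(MEANS))
```

A value of the overlap near 1 means the two hypotheses are hard to tell apart from gaze data at that target, and a value near 0 means they separate well, so targets with a small overlap would be the more informative ones.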

Here, we present a modular neural network implementing Bayesian learning and inference for each possible peripheral target, *T<sub>i</sub>*, as described in prior work by Kharratzadeh and Shultz (2013).

The first module, Module 1, implements Bayes' rule, assuming that the values of the prior and likelihood probabilities are given as input. Its output is the posterior probability. This module needs to be run for each hypothesis.

Module 2 computes the likelihood probabilities based on observed data. The role of Module 2 is to learn these distributions as the underlying mechanisms generating the data. For example, a positive *τ<sub>delay</sub>* (i.e., gaze pursuing the cursor positions) represents unskilled performance, i.e., the exploratory stage, and a negative *τ<sub>delay</sub>* (i.e., gaze predictively marking the desired cursor positions) indicates skilled performance, i.e., the end of the skill acquisition stage. Here, the generative process can be described as a Gaussian with mean *h<sub>i</sub>* and standard deviation 1, with a positive mean for *i* = *unskilled* and a negative mean for *i* = *skilled* (Kharratzadeh and Shultz, 2013),

$$f(d, h_i) = \frac{1}{\sqrt{2\pi}} e^{-\frac{(d - h_i)^2}{2}} \tag{10}$$
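The delay that serves as the data for this likelihood is described in the text as the argument of the maximal cross-correlation between the gaze and cursor signals. One way this could be estimated is sketched below in Python; the normalized (Pearson) correlation per lag, the sign convention (positive when gaze follows the cursor), the sampling interval, and the synthetic traces are assumptions made for illustration.

```python
import math

def _pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is flat)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)

def estimate_delay(gaze, cursor, dt, max_lag):
    """Lag (in seconds) at which gaze best matches the cursor.

    Positive: gaze follows (pursues) the cursor. Negative: gaze leads it.
    """
    n = len(cursor)
    best_lag, best_r = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        lo, hi = max(0, -lag), min(n, n - lag)
        # Pair gaze[i + lag] with cursor[i] over the overlapping samples.
        r = _pearson(gaze[lo + lag:hi + lag], cursor[lo:hi])
        if r > best_r:
            best_lag, best_r = lag, r
    return best_lag * dt

# Synthetic one-axis traces sampled every 20 ms (assumed interval).
n = 400
cursor = [math.sin(0.07 * i) + 0.5 * math.sin(0.23 * i) for i in range(n)]
pursuing = [cursor[max(i - 5, 0)] for i in range(n)]      # gaze 5 samples behind
leading = [cursor[min(i + 5, n - 1)] for i in range(n)]   # gaze 5 samples ahead

print(estimate_delay(pursuing, cursor, dt=0.02, max_lag=20))
print(estimate_delay(leading, cursor, dt=0.02, max_lag=20))
```

With real recordings the two axes could be treated separately or combined, and the resulting delay would then be passed to the likelihood of Equation (10) after any rescaling the chosen means require.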

Module 3 computes the hypotheses' prior probabilities by learning their generative discrete distribution function. For example, the generative function can be of the form (Kharratzadeh and Shultz, 2013),

$$P(h_i) = \upsilon e^{\frac{h_i^2}{2}} \tag{11}$$

where *υ* is chosen such that the sum of the prior probabilities equals 1.

For the hypotheses presented for each peripheral reach target, we first need to learn Modules 1 and 3 once, based on the gaze behavior with respect to the cursor position during the "Move" phase toward a peripheral reach target. During the exploratory and skill acquisition stages of operant conditioning, multiple "Move" phases will be performed for each peripheral reach target, where the stage of motor learning can be estimated from the reach error following each "Move" phase (Stevenson et al., 2009). A variant of the cascade-correlation method, called sibling-descendant cascade-correlation (SDCC), can be used to train the Modules using input-output training pairs as shown earlier (Kharratzadeh and Shultz, 2013). After learning Modules 1 and 3, we will use them twice, once per hypothesis, to compute the posterior and the prior; the likelihood, which needs to be learned for each hypothesis, is computed using Module 2. After sufficient training on gaze-interaction, the modular neural network will provide online feedback of the subject's stage of motor learning during the "Move" phase toward the cued peripheral reach target, *P*(*h<sub>i</sub>*|*d*, *T<sub>i</sub>*), based on the observed saccades and gaze fixations with respect to the cursor position (Equation 8). Here, the confusion matrix, *C<sub>Ti</sub>*, will provide an estimate of the subject-specific performance of such a classifier (Equation 9). Therefore, a comparable reach error can be maintained across peripheral reach targets by online adaptation of *σ<sub>p</sub>* for operant conditioning of volitional multi-directional CoP excursions. For example, increasing the variance, *σ<sub>p</sub>*, will increase the difficulty of the visuomotor task during the "Move" phase, leading to an increase in the reach error at the end of the "Move" phase. Such performance-based adaptive schedules have been shown to enhance motor learning when compared to random scheduling (Choi et al., 2008).
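One simple way such an online adaptation of *σ<sub>p</sub>* could be organized is a proportional update per target, sketched below in Python. The update rule, its gain, the bounds, the target names, and the error values are hypothetical; the text states only that *σ<sub>p</sub>* is adapted online so that reach errors stay comparable across targets.

```python
def adapt_sigma_p(sigma_p, reach_error, target_error, gain=0.05, bounds=(0.0, 1.0)):
    """Nudge sigma_p so that the reach error moves toward target_error.

    A reach error below the target means the task is too easy, so the
    process noise is raised; an error above the target lowers it.
    """
    updated = sigma_p + gain * (target_error - reach_error)
    return min(max(updated, bounds[0]), bounds[1])

def adapt_all_targets(sigma_by_target, error_by_target, target_error):
    """Apply the update independently to every peripheral target."""
    return {name: adapt_sigma_p(sigma, error_by_target[name], target_error)
            for name, sigma in sigma_by_target.items()}

# Hypothetical state after one block of "Move" phases (errors in cm).
sigmas = {"left": 0.10, "right": 0.10}
errors = {"left": 3.0, "right": 1.0}
print(adapt_all_targets(sigmas, errors, target_error=2.0))
```

In this example the noise is lowered for the target with the larger error and raised for the target with the smaller error, so that subsequent reach errors are pulled toward the same level for both.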

Based on these prior investigations, and specifically on prior work on gaze behavior during eye-hand coordination (Sailer et al., 2005), we postulate that gaze interaction may provide evaluative feedback of motor learning during the "Move" phase that can be used to adapt cursor dynamics such that compensatory mechanisms of the unaffected side can be constrained (Taub and Morris, 2001) toward constraint-induced movement therapy (Morris et al., 1997).

#### **PRELIMINARY EVIDENCE: TRIAL-BY-TRIAL ERROR CORRECTION DURING OPERANT CONDITIONING**

The mean square error (MSE) and gaze-interaction (Sailer et al., 2005) with the visual feedback can be continuously monitored during the visuomotor task, and post-stroke subjective learning in the affected and unaffected sides may be modulated by changing the respective error feedback in an operant conditioning framework (Dutta et al., 2013a), i.e., in principle, constraint-induced movement therapy (Morris et al., 1997) in a virtual reality. A conceptual review of this operant conditioning framework for balance training is presented in the preceding Subsection Proposed Method: Operant Conditioning Based on Gaze-Interaction in Virtual Reality. Additionally, modulation of event-related desynchronization (ERD) with motor cortex tDCS has been shown to be feasible in healthy volunteers (Matsumoto et al., 2010) and in patients with chronic severe hemiparetic stroke (Kasashima et al., 2012). Based on our proof-of-concept study on healthy subjects (no NMES), the MSE normalized by the baseline value [without process noise, i.e., *η* = *N*(0, *σ* = 0 s<sup>−2</sup>)] trended toward a decrease (see **Figure 6**), the blink rate trended toward an increase (see **Figure 6**), and the saccadic direction relative to the cursor acceleration trended toward zero (see **Figure 6**) during the visuomotor task under an operant conditioning paradigm. Moreover, **Figure 7** shows that the *aERD*% at the position Pz correlated with the normalized mean square error (MSE<sub>norm</sub>) during the visuomotor task performance in CHT. The 95% prediction bounds are also shown for the linear fit; they indicate a 95% chance that a new observation falls within the lower and upper prediction bounds. The coefficients (with 95% confidence bounds) of the linear fit, *aERD*% = *a* × *MSE<sub>norm</sub>* + *b*, are *a* = 10.97 (10.19, 11.76) and *b* = −18.16 (−18.8, −17.52). The *R*<sup>2</sup> value of the fit was 0.4316. Moreover, during mFRT, we could correctly classify roughly 76% of the movement directions as left or right based on pair-wise *aERD*% asymmetry in the P3, P4 and O1, O2 electrodes from epochs lasting 0 to 700 ms following peripheral cue presentation (unpublished material).
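As a worked reading of the fitted line, using only the point estimates quoted above and ignoring the confidence and prediction bounds:

$$
aERD\% = 10.97 \times MSE_{norm} - 18.16, \qquad aERD\%\big|_{MSE_{norm}=1} = -7.19, \qquad aERD\% = 0 \iff MSE_{norm} = \frac{18.16}{10.97} \approx 1.66
$$

Under this fit, performance at the baseline error level corresponds to alpha power about 7% below rest, and the predicted desynchronization deepens as the normalized error falls. Given the *R*<sup>2</sup> of 0.4316, these values should be read as a coarse trend rather than as predictions for single trials.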

Therefore, MSE and gaze-interaction (Sailer et al., 2005) can be continuously monitored during the visuomotor task, and post-stroke subjective learning in the affected and unaffected sides may be modulated by changing the respective error feedback in an operant conditioning framework (Dutta et al., 2013a), i.e., in principle, constraint-induced movement therapy (Morris et al., 1997) in a virtual reality. Additionally, modulation of event-related desynchronization (ERD) with motor cortex tDCS, a NIBS modality, has been shown to be feasible in healthy volunteers (Matsumoto et al., 2010) and in patients with chronic severe hemiparetic stroke (Kasashima et al., 2012). Also, tDCS over PPC has been shown to modulate visuospatial localization (Wright and Krekelberg, 2014), and lesions in human PPC can lead to complex syndromes consisting of an inability to attend, perceive, and react to stimuli in the visual field contralateral to the lesion, an inability to voluntarily control the gaze, and an inability to coordinate visually elicited hand movements (Caminiti et al., 2010; Lindner et al., 2010; Hwang et al., 2012). Based on these prior works, we postulate that NIBS can facilitate the trial-by-trial error correction process during visuomotor balance therapy under operant conditioning.

## **HYPOTHESIS 2: NIBS CAN FACILITATE TRIAL-BY-TRIAL ERROR CORRECTION AND ITS RETENTION DURING OPERANT CONDITIONING**

In Subsection Preliminary Evidence: Trial-by-Trial Error Correction during Operant Conditioning, evidence for trial-by-trial error correction during operant conditioning was presented. The brain, acting as a controller, needs not only to perform trial-by-trial error correction but also to adapt in response to prior error information via retention, which is called motor adaptation. Here, active participation of cortical areas may be facilitated with NIBS of motor and premotor cortex, cerebellum, and/or PPC (see **Figure 1**). Hierarchical error processing in the brain acting as a controller was proposed (Krigolson and Holroyd, 2007), in which error processing during visuomotor control involves the evaluation of "high-level" errors (i.e., failures to meet a system goal) by a frontal system involving the anterior cingulate cortex and the basal ganglia (Krigolson and Holroyd, 2006; Holroyd and Coles, 2008), and the evaluation of "low-level" errors (i.e., discrepancies between actual and desired motor commands) by a posterior system involving the PPC and/or the cerebellum (Desmurget and Grafton, 2000; Pisella et al., 2000; Miall et al., 2001; Gréa et al., 2002). Here, the PPC is an important interface between sensory and motor cortices, integrating multimodal sensory and motor signals to process spatial information for a variety of functions including guiding attention and planning movements (Andersen and Gnadt, 1989; Snyder et al., 1997).

In our single-blind, sham-controlled study (Dutta et al., 2014c), five healthy right-leg dominant subjects (age: 26.4 ± 5.3 yrs) were evaluated using the HMI system under two conditions: with anodal tDCS of the primary motor representations of the right tibialis anterior muscle and with sham tDCS. A paired *t*-test (Matlab "ttest" function, The Mathworks, Inc., USA) was performed on the differences in % change of stabilogram metrics from baseline values after administering the tDCS/sham session, with all subjects pooled together. The results showed that anodal tDCS of the primary motor representations of the right tibialis anterior muscle strongly (*p* < 0.0001) affected maximum CoP excursions but not return reaction time in healthy volunteers. Also, anodal tDCS had a strong (*p* < 0.0001) effect on the % change (decrease) in sway area from baseline values when compared to sham at 45 and 60 min post-tDCS. Anodal tDCS had only a moderate effect (*p* = 0.0113) on the change (decrease) in the path length of the CoP trajectory from the respective baseline value when compared with sham 60 min post-tDCS. Moreover, the results showed that anodal tDCS strongly (*p* < 0.0001) affected the change in the centroid of CoP data-points from the baseline value during quiet standing in the medio-lateral direction when compared to sham tDCS. The reason for this change in the centroid of CoP data-points during quiet standing (Dutta et al., 2014c) following motor cortex tDCS is postulated to be inadvertent parietal tDCS due to the active electrode position roughly 1 cm left lateral and 2 cm posterior to Cz (International 10–20 EEG system), i.e., close to P3, and the relatively high current density (0.06 mA/cm<sup>2</sup>). Indeed, the PPC is an important interface between sensory and motor cortices, integrating multimodal sensory and motor signals to process spatial information for a variety of functions including guiding attention and planning movements (Andersen, 1997). Therefore, a tDCS protocol targeting the PPC is presented in the Subsection Proposed Method: NIBS Protocol to Facilitate Trial-by-Trial Cortical Control and Adaptation During Visuomotor Task. Here, in order to test successful trial-by-trial error correction and its retention during visuomotor balance therapy under operant conditioning, we propose in Subsection Proposed Method: Using Aftereffects to Evaluate Successful Trial-by-Trial Adaptation During Operant Conditioning the use of aftereffects that occur in motor control when the visual or mechanical variables of the targets are perturbed in a systematic manner. This is based on our prior work on using aftereffects to evaluate successful adaptation during EMG-driven NMES-assisted locomotor exploration activity for post-stroke gait training (Dutta et al., 2014a), where we found that only those stroke subjects who showed aftereffects during systematic perturbation of the "EMG to NMES mapping" parameters at random catch-trials during the locomotor exploration activity showed post-intervention changes in the EMG pattern during volitional (no NMES) treadmill walking.

#### **PROPOSED METHOD: NIBS PROTOCOL TO FACILITATE TRIAL-BY-TRIAL CORTICAL CONTROL AND ADAPTATION DURING VISUOMOTOR TASK**

Analysis of simultaneously acquired EEG/EMG and gazeinteraction data can be used to assess potential mechanisms underlying skill acquisition during visuomotor task (Mann et al., 2011). During volitionally generated CoP excursions based on visual feedback (**Figure 3**), the visual system must orient to and process the relevant visual (target) cues to ascertain both distance and direction of the required CoP excursion, while the working memory is called upon for the required joint torques to match the cursor with the visual (target) cues. Recent investigations lend support to the motor programming/preparation function of the QE period based on simultaneous EEG recordings (Mann et al., 2011) where slow cortical potential (SCP) negative shifts in EEG preceding voluntary movement, called bereitschaftspotential (BP) (Shibasaki and Hallett, 2006), lends itself well to the study of the preparatory period preceding task execution. Indeed, Mann et al. (2011) found: (1) greater BP negativity (particularly in central recording locations) for the expert compared with non-experts, and (2) QE duration was associated with BP negativity in central cortical regions. Therefore, it was postulated that the QE is a temporal period when task-relevant environmental cues are processed and motor plans are coordinated for the successful completion of an upcoming task. In our preliminary study (Dutta, 2014), we found that motor cortex anodal tDCS: (1) increased the frequency of negative epochs of the early (2.5 s–300 ms) phase of SCP before movement initiation, and (2) the slope of negative epoch for the late (300 ms–0 s) phase of SCP before movement initiation. 
Our NIBS protocol to facilitate cortical control and adaptation is based on the hypothesis that throughout the preparation and movement phases of skill execution, the visual attention centers (i.e., occipital and parietal cortex) disseminate requisite commands to motor regions of the cortex (i.e., motor cortex, premotor cortex, supplementary motor area, basal ganglia, and cerebellum), each of which is reflected in BP components (Mann et al., 2011). Here, our preliminary results from healthy subjects on facilitating myoelectric control with tDCS (Dutta et al., 2014b) showed specific, and at least partially antagonistic, effects of motor cortex and cerebellar anodal tDCS on motor performance during myoelectric control, where the cerebellum may play a critical role in both the formation of motor memory and its retention (Herzfeld et al., 2014). Moreover, during visuomotor task performance, visual search to orient to and process the relevant visual (target) cues requires contributions of the human frontal eye fields (FEF) and PPC, where PPC seems to be involved only when a manual motor response to a stimulus is required (Muggleton et al., 2011). Therefore, PPC may play a critical role in the preparatory activity in the general context of sensorimotor transformations linking perception to action, where the SCP (e.g., BP) reflects activation of subcortical and cortical generators (cortico-basal ganglia-thalamo-cortical circuitry) necessary not only in motor execution but also in its preparation (Jahanshahi and Hallett, 2003).

Wright and Krekelberg (2014) hypothesized that each hemisphere biases processing to the contralateral hemifield and that the balance of activation between the hemispheres contributes to position perception. They presented a bihemispheric tDCS protocol for PPC and hypothesized that excitability is reduced beneath the cathode and increased beneath the anode. Closed-loop feedback control of bihemispheric tDCS for PPC using the MatNIC and StarStim (Neuroelectrics, Spain) NIBS interface is presented in **Figure 8**. Indeed, when Wright and Krekelberg (2013) applied tDCS bilaterally, e.g., cathodal stimulation over right PPC concurrent with anodal stimulation over left PPC (right-cathodal) or vice versa (left-cathodal), they found that both tDCS conditions altered perceived position to the left relative to a sham stimulation baseline condition. This effect was stronger for right-anodal than for right-cathodal tDCS, and lasted for about 15 min after stimulation. Based on these prior works, we postulate that bihemispheric application of tDCS (Wright and Krekelberg, 2013) at P3 and P4 (International 10–20 system) will facilitate cortical control during the visuomotor task, while cerebellar tDCS (Herzfeld et al., 2014) will facilitate up- or downregulation of error-dependent motor learning and retention in a polarity-dependent manner. Bihemispheric application of tDCS (Wright and Krekelberg, 2013) at P3 and P4 (International 10–20 system) in conjunction with cerebellar tDCS (Herzfeld et al., 2014) is postulated to facilitate cortical control and adaptation during visuomotor task performance, especially on the affected side, since the evaluation of "low-level" errors (i.e., discrepancies between actual and desired motor commands) is hypothesized to be performed by a posterior system involving the PPC and/or the cerebellum (Desmurget and Grafton, 2000; Pisella et al., 2000; Miall et al., 2001; Gréa et al., 2002).

## **PROPOSED METHOD: USING AFTEREFFECTS TO EVALUATE SUCCESSFUL TRIAL-BY-TRIAL ADAPTATION DURING OPERANT CONDITIONING**

Trial-by-trial error correction during the visuomotor task may be facilitated with bihemispheric application of tDCS for PPC (Wright and Krekelberg, 2013). However, it is also important to evaluate trial-by-trial motor adaptation during the visuomotor task under an operant conditioning paradigm, which may be facilitated with cerebellar tDCS (Herzfeld et al., 2014). Here, Held and colleagues (Held and Gottlieb, 1958; Held and Freedman, 1963) have found aftereffects only with sensorimotor integration, which may then lead to motor adaptation. In accordance with this principle, aftereffects that occur in motor control when the visual or mechanical variables of the targets are perturbed in a systematic manner can be used to test successful motor adaptation (Dutta et al., 2014a). Therefore, controlled variability can be introduced in the form of pseudorandomly interspersed catch trials in the otherwise predictable visuomotor task, where the parameter *εaffected* that maps the effect of recorded *CoPaffected* excursions of the affected side on the cursor acceleration (Equation 5) can be perturbed. Thus, catch trials are proposed to be a reasonable method of exaggerating performance errors during the visuomotor task without disrupting the predictive process. Therefore, the subjects should correct both their own prediction errors and the artificially induced errors resulting from the catch trials in the same manner. It is postulated that in case of successful trial-by-trial adaptation during operant conditioning, the subject should greatly change their CoP excursion *CoPaffected* (Equation 5) on the trials next to the catch trial in response to the unusually large error in the catch trial.
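A catch-trial schedule of this kind could be generated along the lines of the following minimal sketch. The function names, the number of catch trials, the minimum spacing, and the gain-scaling factor are illustrative assumptions and are not taken from the protocol; only the idea of pseudorandomly interspersing trials with a perturbed *εaffected* comes from the text.

```python
import random


def catch_trial_schedule(n_trials, n_catch, min_gap=3, seed=0):
    """Pick pseudorandom catch-trial indices that are at least min_gap apart.

    All parameter values are hypothetical; the seed makes the schedule
    repeatable. The caller must choose values for which the spacing rule
    can be satisfied, otherwise the loop below never terminates.
    """
    rng = random.Random(seed)
    candidates = range(min_gap, n_trials)  # keep the first trials unperturbed
    while True:  # rejection sampling: redraw until the spacing rule holds
        picks = sorted(rng.sample(candidates, n_catch))
        if all(b - a >= min_gap for a, b in zip(picks, picks[1:])):
            return picks


def gain_per_trial(n_trials, catch_trials, eps_nominal, catch_scale):
    """Return the mapping gain for every trial; catch trials get a scaled gain."""
    catch = set(catch_trials)
    return [eps_nominal * catch_scale if i in catch else eps_nominal
            for i in range(n_trials)]


schedule = catch_trial_schedule(n_trials=40, n_catch=4)
gains = gain_per_trial(40, schedule, eps_nominal=1.0, catch_scale=2.0)
```

Comparing the CoP excursion on the trial after each index in `schedule` with the excursion on ordinary trials would then give a simple measure of the aftereffect.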

#### **PRELIMINARY EVIDENCE: EFFECTS OF BIHEMISPHERIC tDCS FOR THE POSTERIOR PARIETAL CORTEX**

The PPC may play a critical role in sensorimotor transformations linking perception to action during quiet standing in terms of CoP trajectory (and stabilogram) (Dutta et al., 2014c). The proof-of-concept pilot study was based on our prior work (Dutta et al., 2014c), where five healthy right-leg dominant male subjects aged between 24 and 46 years were evaluated under two conditions, right-cathodal vs. left-cathodal tDCS, with a pair of 6.7 × 6.7 cm saline-soaked sponge-rubber electrodes (see **Figure 9**). The current was 1 mA applied for 15 min such that the current density (0.02 mA/cm<sup>2</sup>) was in agreement with Wright and Krekelberg (2013) but lower than in our prior work (0.06 mA/cm<sup>2</sup>) (Dutta et al., 2014c). The CoP measurements were made during rest periods of quiet standing for 3 min, just before and immediately after the completion of the tDCS sessions. The study design was repeated-measure, randomized-order with sufficient (1 week) "wash-out" time in between the sessions. Paired *t*-tests (Matlab "ttest" function, The Mathworks, Inc., USA) were performed to compare the impact of right-cathodal vs. left-cathodal tDCS on the % post-tDCS change in the centroid of the CoP from baseline (pre-tDCS) values. Indeed, right-cathodal tDCS (P4 cathodal, P3 anodal) shifted the CoP centroid toward the left by 14 ± 8% and left-cathodal tDCS (P4 anodal, P3 cathodal) shifted the CoP centroid toward the right by 11 ± 9%. Consequently, a statistically significant (*p* < 0.1) difference was found between right-cathodal and left-cathodal tDCS. Since weight bearing on the paretic lower extremity and transfer of weight from one lower extremity to the other are important goals of stroke rehabilitation (De Nunzio et al., 2014), tDCS-facilitated amelioration of post-stroke limb loading asymmetry during biofeedback rehabilitation may improve performance of many functional activities.
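The reported current density follows directly from the current and the electrode size. The short check below assumes that the current spreads uniformly over the full sponge area; the function name is illustrative.

```python
def current_density(current_mA, side_a_cm, side_b_cm):
    """Current density in mA/cm^2 for a rectangular electrode."""
    return current_mA / (side_a_cm * side_b_cm)


density = current_density(1.0, 6.7, 6.7)
print(f"{density:.3f} mA/cm^2")  # prints "0.022 mA/cm^2"
```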

## **DISCUSSION**

The degree to which voluntary guided reaching movements are planned in advance or adapted online is still under investigation. Most well-known models such as the "feedforward models" assume that when motor commands are planned, the outcome of the movement is predicted by the current position of the limbs (Desmurget and Grafton, 2000). According to the "feedforward models" for the visuomotor task, the predicted position of the cursor is compared with the actual position of the cursor with respect to the reaching goal and then online-corrected if the parameters deviate due to noise (e.g., process and observation noise). Thus, a subject's internal model of the visuomotor task has to be able to adapt to the new dynamics of the environment (Shadmehr and Mussa-Ivaldi, 1994). In fact, it has been proposed that the P300, an ERP component with a parietal scalp distribution, reflects the updating of an internal model of the movement environment that is used to help to plan and execute future motor output (Krigolson et al., 2008). Correspondingly, lesions in the human PPC can lead to complex syndromes consisting of an inability to attend, perceive and react to stimuli in the visual field contralateral to the lesion, an inability to voluntarily control the eye gaze, and an inability to coordinate visually elicited movements (Hyvärinen, 1982; Caminiti et al., 2010; Hwang et al., 2012; Wilke et al., 2012). A recent work demonstrated that in the resting brain, monocephalic anodal tDCS over PPC areas altered ongoing brain activity, specifically in the alpha band rhythm (Spitoni et al., 2013), which may facilitate updating of a deficient internal model during post-stroke
rehabilitation. Here, timing of tDCS with respect to the rehabilitation task is critical (Stagg et al., 2011) since regulatory metaplastic mechanisms exist to modulate the effects of a stimulation intervention in a manner dependent on prior cortical excitability, thereby preventing destabilization of existing cortical networks. In our study, the strongest change occurred in the first 2 min after the stimulation ended. Spitoni et al. (2013) found that the tDCS aftereffects diminished systematically and suggested that tDCS affects EEGs immediately after stimulation. Our preliminary study (Dutta and Nitsche, 2013) supported this notion that tDCS affects EEGs immediately after stimulation where Stagg et al. (2011) showed that the application of tDCS during an explicit sequence-learning task led to modulation of behavior in a polarity specific manner: relative to sham stimulation, anodal tDCS was associated with faster learning and cathodal tDCS with slower learning. However, application of tDCS prior to performance of the sequence-learning task led to slower learning after both anodal and cathodal tDCS (Stagg et al., 2011). Based on these prior works that showed that anodal tDCS interacts with subsequent motor learning in a metaplastic manner and suggested that anodal stimulation modulates cortical excitability in a manner similar to motor learning (Stagg et al., 2011), a closed-loop feedback control of bihemispheric tDCS for PPC is proposed during visuomotor task performance, as illustrated in **Figure 8**.

The goal of this hypothesis and theory paper was to examine prior works for a conceptual review to make a case for multi-level electrotherapy toward post-stroke balance rehabilitation. Under this multi-level electrotherapy concept, both the cortical control of NMES assisted visuomotor task and the motor adaptation toward balance rehabilitation are facilitated with an adjuvant treatment with NIBS. Such a reconceptualization of electrotherapy approaches, where one (NIBS) is facilitating the other (NMES) toward a common goal (motor learning), could help to push forward electrotherapy for neurorehabilitation.

## **ACKNOWLEDGMENTS**

This material is based upon work supported by the Franco-German Hubert Curien Partnership (PHC)—PROCOPE Programme 2014, and Franco-Indian INRIA-DST Associate Team support 2014-2017. The technical help received from the interns—R. Sehgal, G. Aggarwal, A. Jacob, and D. Kumar—in the development of the experimental setup is gratefully acknowledged. We would also like to thank the reviewers for their insightful comments which helped us to significantly improve the manuscript.

## **REFERENCES**


activity for post-stroke gait training. *Clin. Neurophysiol.* 125, S98–S99. doi: 10.1016/S1388-2457(14)50326-4


Shadmehr, R., and Mussa-Ivaldi, F. A. (1994). Adaptive representation of dynamics during learning of a motor task. *J. Neurosci.* 14(5 Pt 2), 3208–3224.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 08 April 2014; accepted: 19 November 2014; published online: 15 December 2014.*

*Citation: Dutta A, Lahiri U, Das A, Nitsche MA and Guiraud D (2014) Post-stroke balance rehabilitation under multi-level electrotherapy: a conceptual review. Front. Neurosci. 8:403. doi: 10.3389/fnins.2014.00403*

*This article was submitted to Neuroprosthetics, a section of the journal Frontiers in Neuroscience.*

*Copyright © 2014 Dutta, Lahiri, Das, Nitsche and Guiraud. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Reconstructing for joint angles on the shoulder and elbow from non-invasive electroencephalographic signals through electromyography

## *Kyuwan Choi\**

*Psychology Department, Computational Biomedicine Imaging and Modeling, Computer Science, Rutgers University, Piscataway, NJ, USA*

#### *Edited by:*

*Mitsuhiro Hayashibe, Institut National de la Recherche en Informatique et en Automatique, France*

#### *Reviewed by:*

*Anirban Dutta, Georg-August-University, Germany Eduardo Iáñez, Miguel Hernández University of Elche, Spain*

#### *\*Correspondence:*

*Kyuwan Choi, Psychology Department, Computational Biomedicine Imaging and Modeling, Computer Science, Rutgers University, 152 Frelinghuysen Rd., Piscataway, NJ 08854, USA e-mail: kyuwanchoi@gmail.com*

In this study, first the cortical activities over 2240 vertexes on the brain were estimated from 64-channel electroencephalography (EEG) signals using hierarchical Bayesian estimation while 5 subjects performed continuous arm reaching movements. From the estimated cortical activities, a sparse linear regression method selected only the features useful for reconstructing the electromyography (EMG) signals and estimated the EMG signals of 9 arm muscles. Then, a modular artificial neural network with two expert networks, one for movement control and the other for posture control, was used to estimate four joint angles from the estimated EMG signals of the 9 muscles. The joint angles estimated using this method have a correlation coefficient (CC) of 0.807 (±0.10) and a normalized root-mean-square error (nRMSE) of 0.176 (±0.29) with the actual joint angles.

**Keywords: EEG, EMG, neural activity, primary motor cortex**

## **INTRODUCTION**

The field of Brain Machine Interface (BMI) has engaged in active research to help paralyzed patients regain some independence and better integrate within societal activities. Brain-machine interfaces can be broadly divided into invasive and non-invasive modalities depending on how brain signals are harnessed. Invasive BMI mainly targets motor-related cortical areas. Non-human primates are often used, for example, to harness the spikes and local field potentials from the primary motor cortex, which is known to interface with the spinal cord and to contain signals useful for controlling arm movement. Such neural signals have been used to control external devices such as a robotic arm or a mouse cursor by reconstructing hand trajectories from the measured neural activities (Chapin et al., 1999; Wessberg et al., 2000; Serruya et al., 2002; Taylor et al., 2002; Carmena et al., 2003). More recently, invasive BMI has been approved for use in humans (Hochberg et al., 2006; Chadwick et al., 2011), including electrocorticography (ECoG) (Schalk et al., 2007; Sanchez et al., 2008).

In the case of non-invasive BMI, the state-of-the-art research uses motor imagery, a paradigm that classifies whether the subject performs left or right hand motor imagery using electroencephalography (EEG) signals (Ramoser et al., 2000; Wolpaw and McFarland, 2004). It was believed that the noisy EEG signal in non-invasive BMI would be insufficient to estimate three-dimensional hand movement (Lebedev and Nicolelis, 2006). However, recently Bradberry et al. (2010) succeeded in reconstructing three-dimensional hand movements from the EEG signals while the subjects performed natural and self-initiated reaching actions.

In the present study, a new method using electromyography (EMG) signals is proposed that first reconstructs the EMG signals of the arm muscles from the source currents estimated from EEG signals, and then estimates joint angles of the shoulder and the elbow from the reconstructed EMG signals. By reconstructing the EMG signals of the arm muscles from EEG signals, it is possible to reconstruct not only kinematics-based but also dynamics-based information involving force generation. For example, impedance and joint torque can be obtained to build more realistic brain-machine interfaces, compatible with human motion execution. Furthermore, the EMG signals reconstructed from the EEG signals could be used to drive a functional electrical stimulation (FES) system that electrically stimulates the arm muscles of a paralyzed person, engaging the person in self-adaptive control of his/her arm.

In this study, source currents over 2240 vertexes were estimated from 64-channel EEG signals through a hierarchical Bayesian method introducing a hierarchical prior (Sato et al., 2004). This method can effectively incorporate both structural and functional MRI data. In this hierarchical Bayesian method, the variance of the source current at each source location is considered an unknown parameter and is estimated from the observed EEG data and prior information by using the Variational Bayesian (VB) method. The fMRI information was imposed as prior information on the variance distribution rather than on the variance itself, so that it gives a soft constraint on the variance. From the estimated source currents over 2240 vertexes, only the 33 vertexes located in the left primary motor cortex, contralateral to the moving arm, were selected to estimate the filtered EMG signals of 9 muscles by using a sparse linear regression method, which can automatically select only the features useful for estimating the filtered EMG signals. A modular artificial neural network, which trains movement data and posture data in two different networks, was then used to reconstruct 4 joint angles of the shoulder and elbow from the estimated filtered EMG signals. This modular structure improves the accuracy of the estimation.

## **MATERIALS AND METHODS**

#### **EXPERIMENTAL TASK**

Five healthy right-handed subjects (five men, mean age = 22.51 years, age range: 20–29 years) participated in the experiment. None of the five subjects had previously participated in a brain-machine interface experiment. All subjects gave written consent before starting the experiment. The subjects performed a continuous arm-reaching task as shown in **Figure 1A**. The task consisted of pushing buttons in the following sequences: Hold-C-A-B, Hold-C-D-B, Hold-D-B-A, and Hold-D-C-A. These sequences are explained in greater detail below.

Here only the Hold-C-A-B sequence is explained (**Figure 1B**), since the others have similar patterns. First, the subject pushes the hold button for 1 s when the hold signal turns on. If the subject succeeds in pushing the hold button for 1 s, the C button turns on, and the hand of the subject has to move to the C button within 1 s. If the subject is successful in keeping the C button pressed for 1 s, the A button turns on. The hand of the subject is then supposed to push the A button within 1 s and keep it pressed for 1 s. If the A button is successfully pressed for 1 s, the B button turns on. The hand of the subject then has to push the B button within 1 s and keep it pressed for 1 s.

When the subject succeeded in pressing all three buttons correctly, the trial was considered a success, and only successful trials were analyzed in this study. After pressing the three buttons, the subject rested for 3–4 s before the next trial, whose task was chosen randomly. Each set comprised 40 trials, i.e., 10 trials for each of the four tasks in random order. A total of seven sets were conducted for each subject. The leave-one-out cross-validation method was used to analyze the measured data, using six sets as training data and one set as test data.
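The cross-validation split can be sketched as follows, assuming that each of the seven sets is held out once in turn (the text does not state the rotation explicitly). The function name is illustrative.

```python
def leave_one_set_out(n_sets):
    """Yield (training_sets, test_set) pairs, holding out each set once."""
    for test_set in range(n_sets):
        training_sets = [s for s in range(n_sets) if s != test_set]
        yield training_sets, test_set


folds = list(leave_one_set_out(7))
for training_sets, test_set in folds:
    print(f"train on sets {training_sets}, test on set {test_set}")
```

Splitting by set rather than by trial keeps all trials recorded in the same set on the same side of the split.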

#### **fMRI EXPERIMENT**

**Figure 2** shows the fMRI task used to collect fMRI data as prior information to estimate cortical activity. One trial consisted of the execution task in which the participant moves the right index finger (e.g., up or down) every 1 s. This is followed by a resting period in which the participant takes a break for 15 s. Each participant conducted 24 trials of the fMRI task. The fMRI activity when participants take a rest (rest periods) was subtracted from the fMRI activity when participants moved their fingers (execution periods). All five participants conducted the fMRI task to get their individual prior information.

#### **ESTIMATION OF CORTICAL ACTIVITIES FROM EEG SIGNALS**

EEG signals were measured at a 1 kHz sampling rate on 64 channels using a BioSemi system (Amsterdam, Netherlands). The measured EEG signals were baseline corrected (baseline data from −1 to 0 s) and band-pass filtered between 8 and 30 Hz using a fifth-order Butterworth filter.

To estimate cortical activities from EEG signals, an inverse filter *L* (a matrix of dimensions 2240 × 64) was used, as given in Equation 1. Multiplying the real-time EEG signals by the inverse filter, as in Equation 1, makes it possible to estimate the cortical activity quickly.

$$L\left(\Sigma_{\alpha}^{-1}\right) = \Sigma_{\alpha}^{-1} \cdot G' \cdot \left(G \cdot \Sigma_{\alpha}^{-1} \cdot G' + \beta^{-1} I_M\right)^{-1},$$

$$J(t) = L\left(\Sigma_{\alpha}^{-1}\right) \cdot E(t) \tag{1}$$

Here, *E*(*t*) represents the measured real-time EEG signals, given by 64 × 256 Hz (sampling rate) entries. *J*(*t*) denotes the estimated cortical activities over 2240 vertexes every second and is given by 2240 × 256 Hz entries. *G* (64 × 2240) is a lead field matrix which represents the impulse response of each source vector component at every measurement site (Baillet et al., 2001) and *G*′ denotes its transpose. The boundaries between brain, skull, and scalp were generated by using the Curry 5 software (Compumedics, USA). Here, the relative conductivities of the brain, skull, and scalp are 1, 0.0125, and 1. *I<sub>M</sub>* represents an identity matrix of M-by-M (M: number of sensors), and β<sup>−1</sup>*I<sub>M</sub>* (64 × 64) is the noise covariance of the observed EEG signals, where β is the inverse of the noise variance. Σ<sub>α</sub><sup>−1</sup> denotes the source covariance matrix, and is calculated as Σ<sub>α</sub><sup>−1</sup> = diag(α<sup>−1</sup>). Here, α<sup>−1</sup> (2240 × 2240) represents the source current variance, which is treated as an unknown parameter in this study and estimated from the measured EEG data by applying a hierarchical prior on the current variance.
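To make explicit how the estimated variance shapes the solution, Equation 1 can be expanded row by row. This is a direct consequence of the diagonal form of the source covariance matrix; the symbol *g<sub>i</sub>* for the *i*-th column of *G* is introduced here for illustration only.

$$J_i(t) = \alpha_i^{-1} \, g_i' \left(G \cdot \Sigma_{\alpha}^{-1} \cdot G' + \beta^{-1} I_M\right)^{-1} E(t), \qquad i = 1, \dots, 2240$$

Each estimated current is therefore scaled by its own variance α<sub>*i*</sub><sup>−1</sup>: with the other variances held fixed, a vertex whose variance is driven toward zero contributes a vanishing current, which is how the hierarchical prior produces a sparse, fMRI-informed solution.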

Artifact dipoles were also incorporated in the estimation according to previous studies (Fujiwara et al., 2009; Morishige et al., 2009). Artifact dipoles were located at the center of the heart, the right shoulder and wrist joints, the left and right eyeballs, and the carotid arteries, and their currents were estimated as well.

#### *Estimation of current variance*

In this study, the current variance α<sup>−1</sup> was estimated by using the automatic relevance determination (ARD) hierarchical prior (Neal, 1996).

$$P(J(t)|\alpha, \beta) \propto \exp\left[-\frac{\beta}{2} J'(t) \cdot A \cdot J(t)\right]$$

$$P(\alpha_i) = \Gamma(\alpha_i|\alpha_{0i}, r_0)$$

$$P(\beta) = \frac{1}{\beta} \tag{2}$$

where β is the inverse noise variance of the observed EEG signals, *A* = diag(α), and α is an I-by-1 vector whose component α<sub>*i*</sub> is the inverse current variance corresponding to the *i*-th current dipole. Γ represents the Gamma distribution with mean α<sub>0*i*</sub> and degree of freedom *r*<sub>0</sub>. Intuitively, the hyper-parameter *r*<sub>0</sub> represents the confidence in the hierarchical prior information. A prior current variance *v*<sub>0*i*</sub> = α<sub>0*i*</sub><sup>−1</sup> represents the prior information on current intensity. For large and small *v*<sub>0*i*</sub>, the estimated current *J<sub>i</sub>*(*t*) tends to be large and small, respectively. These values were determined from the fMRI information:

$$v_{0i} = v_{\text{base}} + (m_0 - 1) \cdot v_{\text{base}} \cdot \left(\hat{t}_i\right)^2, \tag{3}$$

where *t̂*<sub>*i*</sub> is a normalized *T*-value on the *i*-th vertex. Normalized *T*-values are computed by dividing the original *T*-values by the maximum of those *T*-values (thus ranging from 0 to 1).

*v*<sub>base</sub> is a baseline of the current variance, which is estimated from the pre-movement interval (1.0–0.5 s before movement initiation) of the EEG data by a Bayesian minimum norm estimation. A variance magnification parameter *m*<sub>0</sub>, which is the other hyper-parameter, specifies the scaling between the current variances in the baseline and task periods. *m*<sub>0</sub> = 100 and *r*<sub>0</sub> = 10 were used.
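Equation 3 can be illustrated with a few lines of code. The sketch below assumes non-negative *T*-values with a positive maximum, as implied by the stated 0 to 1 range; the function name and the example values are hypothetical.

```python
def prior_variance(t_values, v_base, m0=100.0):
    """Equation 3: scale the baseline variance by the squared normalized T-value."""
    t_max = max(t_values)  # normalization constant, so t / t_max lies in [0, 1]
    return [v_base + (m0 - 1.0) * v_base * (t / t_max) ** 2 for t in t_values]


v0 = prior_variance([0.0, 2.0, 4.0], v_base=1.0)
print(v0)  # prints [1.0, 25.75, 100.0]
```

A vertex with no fMRI activation keeps the baseline variance, while the most strongly activated vertex receives *m*<sub>0</sub> times the baseline.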

Due to the hierarchical prior, the estimation problem becomes non-linear and cannot be solved analytically. Therefore, the VB method (Attias, 1999; Sato, 2001) is employed. In the VB method, *J*(*t*), α, and β are iteratively updated until convergence.

**Figure 3A** depicts the fMRI activity while subject 1 conducts the Hold-C-A-B sequence task with the right arm. The left primary motor area is strongly activated. The fMRI information was used as the prior information to estimate cortical activities. **Figure 3B** shows the cortical activities of subject 1 estimated from the EEG signals for the Hold-C-A-B task. As expected, strong cortical activities are estimated in the left motor cortex. Several parts of the visual cortex are also activated in the figure, because the subject looks at the target buttons, which emit high-intensity light, while performing the task.

#### **EMG SIGNAL PROCESSING**

For all trials in this study, EEG signals, EMG signals, and the positions of the shoulder, the elbow, and the wrist of the subject were measured simultaneously. EMG signals were collected from the nine muscles involved in the four degrees of freedom (see **Figure 4** and **Table 1**).

In order to measure the EMG signals, a silver/silver chloride surface electrode (NE-102, Nihon Kohden) was used. After differential amplification, each signal was sampled at 1 kHz with a 12-bit resolution. The signals were digitally rectified, averaged over 5 ms, and then filtered through a second-order low-pass filter with a cut-off frequency of ∼3 Hz (Koike and Kawato, 1995).

$$f\text{EMG}(t) = \sum_{j=1}^{n} h_j \cdot \text{EMG}(t - j + 1), \tag{4}$$

$$h(t) = 6.44 \times \left( e^{-10.80t} - e^{-16.52t} \right). \tag{5}$$

The coefficient *hj* in Equation 4 can be acquired by sampling *h*(*t*) in Equation 5 discretely. The resulting signal is very similar to the actual tension; consequently, it is called quasi-tension (Basmajian and DeLuca, 1985).
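The filtering step of Equations 4 and 5 can be sketched as a causal FIR filter. The sampling step of 5 ms (matching the averaging window), the number of taps, and the function names are assumptions made for illustration; the text does not specify how *h*(*t*) was discretized or whether the coefficients were rescaled.

```python
import math


def impulse_response(n_taps, dt):
    """Sample h(t) of Equation 5 at t = 0, dt, 2*dt, ... (t in seconds)."""
    return [6.44 * (math.exp(-10.80 * k * dt) - math.exp(-16.52 * k * dt))
            for k in range(n_taps)]


def quasi_tension(emg, h):
    """Equation 4 as a causal FIR filter; samples before the start count as 0.

    The list index k used here corresponds to j - 1 in Equation 4.
    """
    out = []
    for t in range(len(emg)):
        acc = 0.0
        for k, h_k in enumerate(h):
            if t - k < 0:
                break
            acc += h_k * emg[t - k]
        out.append(acc)
    return out


h = impulse_response(n_taps=200, dt=0.005)  # 1 s of taps at an assumed 5 ms step
rectified = [abs(x) for x in [0.0, -1.0, 0.5, -0.2] + [0.0] * 50]
tension = quasi_tension(rectified, h)
```

Because *h*(0) = 0 and *h*(*t*) is non-negative for *t* ≥ 0, the filtered signal rises smoothly after each burst of activity instead of following the rectified EMG sample by sample.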

The method that uses a low-pass filter to estimate muscle tension performs well when the velocity of muscle contraction is slow. However, it cannot estimate muscle tension precisely when the velocity of contraction is very high, and it does not consider the non-linear characteristics of muscles, such as their dependence on length and velocity. Nevertheless, it is reasonable to assume that the output of the low-pass filter is similar to the actual tension (Mannard and Stein, 1973).

#### **KINEMATICS**

In order to measure the positions of the shoulder, the elbow, and the wrist of the subjects, infrared markers were attached to their arms and each position was measured by using a 3D position measurement system (MacReflex, Qualisys). The sampling rate was 120 Hz. In order to calculate the joint angles of the four degrees of freedom in the shoulder and elbow from the measured positions, the inverse kinematics equations (Koike and Kawato, 1994) were used.

#### **Table 1 | Muscles measured for EMG signals.**

In **Figure 5**, if we set the transition matrices of θ<sub>1</sub>, θ<sub>2</sub>, ···, θ<sub>7</sub> to *A<sub>x</sub>*(θ<sub>1</sub>), *A<sub>y</sub>*(θ<sub>2</sub>), ···, *A<sub>z</sub>*(θ<sub>7</sub>) and the transition matrices of *l*<sub>1</sub> (the length of the upper arm), *l*<sub>2</sub> (the length of the forearm), and *l*<sub>3</sub> (the length of the hand) to *L<sub>z</sub>*(*l*<sub>1</sub>), *L<sub>z</sub>*(*l*<sub>2</sub>), and *L<sub>z</sub>*(*l*<sub>3</sub>), we can represent the transition matrices *A<sub>E</sub>*, *A<sub>W</sub>*, and *A<sub>H</sub>*, which represent the relation from the elbow position *E* to the hand position *H*, as follows:

$$A_E = A_x\left(\theta_1\right) A_y\left(\theta_2\right) L_z(l_1) = \begin{bmatrix} C_2 & C_2 l_{z1} \\ 0^T & 1 \end{bmatrix}, \tag{6}$$

Here,

$$C_2 = C_x\left(\theta_1\right) C_y\left(\theta_2\right) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & c_1 & -s_1 \\ 0 & s_1 & c_1 \end{bmatrix} \times \begin{bmatrix} c_2 & 0 & s_2 \\ 0 & 1 & 0 \\ -s_2 & 0 & c_2 \end{bmatrix}, \tag{7}$$

$$l_{z1} = \begin{bmatrix} 0 & 0 & -l_1 \end{bmatrix}^T, \tag{8}$$

Equation 6 becomes

$$A\_E = \begin{bmatrix} c\_2 & 0 & s\_2 & -s\_2 l\_1 \\ s\_1 s\_2 & c\_1 & -s\_1 c\_2 & s\_1 c\_2 l\_1 \\ -c\_1 s\_2 & s\_1 & c\_1 c\_2 & -c\_1 c\_2 l\_1 \\ 0 & 0 & 0 & 1 \end{bmatrix},\tag{9}$$

The coordinates of the elbow *E* (*x<sub>E</sub>*, *y<sub>E</sub>*, *z<sub>E</sub>*) are given by the 4th column, that is,

$$
\begin{bmatrix} x_E \\ y_E \\ z_E \end{bmatrix} = -l_1 \begin{bmatrix} s_2 \\ -s_1 c_2 \\ c_1 c_2 \end{bmatrix}, \tag{10}
$$

$$
\begin{bmatrix} x_W \\ y_W \\ z_W \end{bmatrix} = \begin{bmatrix} x_E \\ y_E \\ z_E \end{bmatrix} + l_2 \begin{bmatrix} -s_2 c_4 - c_2 s_3 s_4 \\ s_1 c_2 c_4 + (c_1 c_3 - s_1 s_2 s_3) s_4 \\ -c_1 c_2 c_4 + (s_1 c_3 + c_1 s_2 s_3) s_4 \end{bmatrix}, \tag{11}
$$

Finally, we can get the following Equations (12–15).

$$
\tan \theta_1 = -\frac{y_E}{z_E},
\tag{12}
$$

$$
\sin \theta_2 = -\frac{x_E}{l_1},
\tag{13}
$$

$$\sin\theta\_3 = \frac{(\mathbf{x}\_E - \mathbf{x}\_W)/l\_2 - \sin\theta\_2\cos\theta\_4}{\cos\theta\_2\sin\theta\_4},\tag{14}$$

$$\cos\theta\_4 = \frac{l\_1^2 + l\_2^2 - R^2}{2l\_1l\_2},\tag{15}$$
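As a consistency check, the shoulder angles can be recovered from positions generated with the forward relations. The sketch below implements Equation 10, the *x*-component of Equation 11, and Equations 12–14; θ<sub>4</sub> is passed in as a known value rather than computed from Equation 15. The segment lengths, the test angles, and the function names are arbitrary assumptions, and the inversion is only valid for cos θ<sub>2</sub> > 0 and sin θ<sub>4</sub> ≠ 0.

```python
import math


def elbow_position(theta1, theta2, l1):
    """Equation 10: elbow coordinates from the first two shoulder angles."""
    s1, c1 = math.sin(theta1), math.cos(theta1)
    s2, c2 = math.sin(theta2), math.cos(theta2)
    return (-l1 * s2, l1 * s1 * c2, -l1 * c1 * c2)


def wrist_x(x_e, theta2, theta3, theta4, l2):
    """x-component of Equation 11."""
    s2, c2 = math.sin(theta2), math.cos(theta2)
    s3 = math.sin(theta3)
    s4, c4 = math.sin(theta4), math.cos(theta4)
    return x_e + l2 * (-s2 * c4 - c2 * s3 * s4)


def shoulder_angles(elbow, x_w, theta4, l1, l2):
    """Equations 12-14; assumes cos(theta2) > 0 and sin(theta4) != 0."""
    x_e, y_e, z_e = elbow
    theta1 = math.atan2(y_e, -z_e)   # tan(theta1) = -y_E / z_E
    theta2 = math.asin(-x_e / l1)    # sin(theta2) = -x_E / l1
    s3 = ((x_e - x_w) / l2 - math.sin(theta2) * math.cos(theta4)) / (
        math.cos(theta2) * math.sin(theta4))
    theta3 = math.asin(max(-1.0, min(1.0, s3)))  # clamp against rounding
    return theta1, theta2, theta3


L1, L2 = 0.30, 0.25                  # assumed segment lengths in meters
true_angles = (0.4, -0.3, 0.2)       # theta1..theta3 in radians
THETA4 = 1.1
elbow = elbow_position(true_angles[0], true_angles[1], L1)
x_w = wrist_x(elbow[0], true_angles[1], true_angles[2], THETA4, L2)
recovered = shoulder_angles(elbow, x_w, THETA4, L1, L2)
```

Using `atan2` instead of a plain arctangent keeps the quadrant of θ<sub>1</sub> unambiguous when *z<sub>E</sub>* changes sign.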

#### **ESTIMATION OF EMG SIGNALS FROM ESTIMATED CORTICAL ACTIVITIES**

A sparse linear regression method (Toda et al., 2011) was used to estimate filtered EMG signals from the cortical activities estimated over 2240 vertexes.

$$f\text{EMG}_i(t+\delta t) = \sum_{j=1}^{N_{\text{source}}} w_{ij} \times J_j(t) + \text{bias}, \tag{16}$$

Here, *f*EMG<sub>*i*</sub> denotes the *i*-th filtered EMG signal, estimated from the cortical activity *J<sub>j</sub>* on the *j*-th vertex. *N*<sub>source</sub> denotes the number of vertexes used in estimating the filtered EMG signals. In this study, since all subjects are right-handed, the cortical activities over 33 vertexes in the left primary motor cortex were used to estimate the filtered EMG signals. The weighting factor *w<sub>ij</sub>* represents the strength of the influence of the cortical activity on the *j*-th vertex on the *i*-th muscle. δ*t* is the delay between the cortical activity of the primary motor cortex and the EMG signals.
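Once the weights are known, applying Equation 16 is a weighted sum with a time shift. The sketch below covers only this prediction step, not the sparse fitting itself; the weights, currents, and the expression of δ*t* as a whole number of samples are hypothetical choices for illustration.

```python
def decode_femg(weights, bias, currents, delay):
    """Equation 16 for one muscle.

    weights  : one weight per vertex (zeros for vertices pruned by the sparse fit)
    currents : currents[t][j] is the cortical current of vertex j at sample t
    delay    : delta-t expressed in samples (an assumed discretization)
    Returns a list of (sample index of the prediction, predicted filtered EMG).
    """
    return [(t + delay, bias + sum(w * j for w, j in zip(weights, frame)))
            for t, frame in enumerate(currents)]


weights = [0.5, 0.0, -1.0]            # hypothetical sparse weights for 3 vertices
currents = [[1.0, 9.0, 0.5], [2.0, 9.0, 0.5]]
print(decode_femg(weights, bias=0.25, currents=currents, delay=2))
# prints [(2, 0.25), (3, 0.75)]
```

The second vertex has a zero weight, so its large current has no effect on the prediction, which is the practical meaning of the feature selection performed by the sparse regression.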

#### **MODULAR ARTIFICIAL NEURAL NETWORK MODEL**

In order to estimate joint angles from the filtered EMG signals, a modular artificial neural network (Jacobs et al., 1991) was used, as shown in **Figure 6**. Training the data of posture and movement in different networks will improve the accuracy of estimating joint angles compared to training the entire set of data in the same network, since the muscle tension is different in these two cases. Here, posture is defined as the state where the arm of the subject is in contact with a button on the screen, and movement is defined as the condition where the arm of the subject moves from one button to another. If training is done well, a gating network will select one of the two expert networks by its input signal. In this case, one of the two expert networks is used for posture control and the other is used for movement control. Since the gating network determines the output ratio for each expert network depending on its input signal, the sum of the outputs of the gating network should always be equal to 1.

To achieve this, as shown in Equation 17, the output *g<sub>j</sub>* of the gating network, which corresponds to the *j*-th expert network, is normalized by using the softmax activation function.

$$\mathbf{g}\_{j} = \frac{e^{\mathbf{x}\_{j}}}{\sum\_{i=1}^{N} e^{\mathbf{x}\_{i}}},\tag{17}$$

Here, *x<sub>i</sub>* is the value determined by the input signal of the gating network and *N* is the total number of outputs of the gating network. The total output is calculated by multiplying the output of the gating network by the output of each expert network and summing the results, as given in Equation 18.

$$\Theta = \sum\_{i=1}^{N} \mathbf{g}\_i \hat{\Theta}\_i,\tag{18}$$
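Equations 17 and 18 can be sketched in a few lines of Python. The function names and the array shapes are assumptions made for illustration; the max-subtraction inside the softmax is a common numerical safeguard and is not mentioned in the text.

```python
import numpy as np

def softmax(x):
    """Normalize gating activations so that they are positive and sum to 1 (Equation 17)."""
    x = np.asarray(x, dtype=float)
    z = np.exp(x - x.max())  # subtracting the maximum avoids overflow, result is unchanged
    return z / z.sum()

def mixture_output(x, expert_outputs):
    """Blend the expert outputs with the gating weights (Equation 18).

    x              : (N,) gating activations
    expert_outputs : (N, n_joints) joint angles proposed by each expert
    Returns the blended joint angles and the gating weights.
    """
    g = softmax(x)
    blended = g @ np.asarray(expert_outputs, dtype=float)
    return blended, g
```

When one gating activation is much larger than the other, the blend is dominated by the corresponding expert, which is the selection behaviour described above for the posture and movement networks.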

The gating network and each expert network are trained to maximize the log-likelihood function ln *L* (Equation 19) by the backpropagation algorithm (Rumelhart et al., 1986).

$$\ln L = \ln \sum\_{i=1}^{N} \mathbf{g}\_i e^{\frac{-\left\|\theta - \hat{\theta}\_i\right\|^2}{2\sigma\_i^2}},\tag{19}$$

The update of the weights of the gating network is calculated by the chain rule, as in Equation 20.

$$\frac{\partial \ln L}{\partial \mathbf{x}\_i} = \sum\_{i=1}^{N} \left( \mathbf{g}(i|X, \hat{\theta}\_i) - \mathbf{g}\_i \right), \tag{20}$$

Here, *X* is the input of the gating network, and the posterior probability *g*(*i*|*X*, θ̂*<sub>i</sub>*) is

$$g\left(i|X,\hat{\theta}\_{i}\right) = \frac{g\_{i}\, e^{\frac{-\left\|\theta-\hat{\theta}\_{i}\right\|^{2}}{2\sigma\_{i}^{2}}}}{\sum\_{j=1}^{N} g\_{j}\, e^{\frac{-\left\|\theta-\hat{\theta}\_{j}\right\|^{2}}{2\sigma\_{j}^{2}}}},\tag{21}$$

The update of the weights of each expert network is calculated by the chain rule, as in Equation 22.

$$\frac{\partial \ln L}{\partial \hat{\theta}\_i} = \sum\_{i=1}^{N} g\left(i|X, \hat{\theta}\_i\right) \frac{\theta - \hat{\theta}\_i}{\sigma\_i^2},\tag{22}$$
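The quantities in Equations 19 to 22 can be checked numerically with a short Python sketch. All function names are hypothetical, and the inputs are assumed to be NumPy arrays for a single training sample. The gradient functions return the per-expert terms (posterior minus gate output, and posterior-weighted error divided by the variance), which is how these gradients are usually written for a mixture of experts; reading Equations 20 and 22 this way is an assumption of the example.

```python
import numpy as np

def gate_outputs(x):
    """Softmax of the gating activations x (Equation 17)."""
    z = np.exp(x - np.max(x))
    return z / z.sum()

def expert_kernels(theta, theta_hat, sigma):
    """exp(-||theta - theta_hat_i||^2 / (2 sigma_i^2)) for every expert i."""
    sq_err = np.sum((theta - theta_hat) ** 2, axis=1)
    return np.exp(-sq_err / (2.0 * sigma ** 2))

def log_likelihood(x, theta, theta_hat, sigma):
    """ln L of Equation 19 for one sample."""
    return np.log(np.sum(gate_outputs(x) * expert_kernels(theta, theta_hat, sigma)))

def posterior(x, theta, theta_hat, sigma):
    """Posterior probability of each expert (Equation 21)."""
    weighted = gate_outputs(x) * expert_kernels(theta, theta_hat, sigma)
    return weighted / weighted.sum()

def gating_gradient(x, theta, theta_hat, sigma):
    """Per-expert derivative of ln L with respect to x_i: posterior minus gate output."""
    return posterior(x, theta, theta_hat, sigma) - gate_outputs(x)

def expert_gradient(x, theta, theta_hat, sigma):
    """Per-expert derivative of ln L with respect to theta_hat_i."""
    h = posterior(x, theta, theta_hat, sigma)
    return h[:, None] * (theta - theta_hat) / sigma[:, None] ** 2
```

Here `theta` has shape (n_joints,), `theta_hat` has shape (N, n_joints), and `x` and `sigma` have shape (N,). Comparing both gradient functions with finite differences of `log_likelihood` is a quick way to confirm that the signs and indices are consistent.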

Each network is trained by using the kick-out method (Ochiai and Usui, 1993).

The filtered EMG signals of the nine muscles were used as the input of each expert network. The summed squared velocity of the four joint angles was used as the input of the gating network: when the cortical activities in the primary motor cortex were used directly as the input, the gating network could not distinguish between posture and movement, whereas with the summed squared velocity of the four joint angles as the input it distinguished posture and movement correctly.
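The gating input described above can be computed from joint angle trajectories as in the following sketch. The function name, the array layout, and the use of `numpy.gradient` (central differences) to obtain angular velocity are assumptions; the text does not specify how the velocity was computed.

```python
import numpy as np

def summed_squared_velocity(angles, fs):
    """Sum over joints of the squared angular velocity at each time step.

    angles : (n_samples, n_joints) joint angles
    fs     : sampling rate in Hz
    Returns an (n_samples,) array that is close to zero while a posture is held
    and positive during movement.
    """
    angles = np.asarray(angles, dtype=float)
    velocity = np.gradient(angles, 1.0 / fs, axis=0)  # angle units per second
    return np.sum(velocity ** 2, axis=1)
```

Because the value is near zero during posture and clearly positive during movement, a single scalar of this kind is enough for the gating network to separate the two conditions.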

#### **ANALYSIS**

The correlation coefficient (CC) was used to evaluate the similarity between actual and predicted signals. Accuracy was also evaluated using normalized root-mean-square error (nRMSE) between actual and predicted signals, defined as

$$\text{nRMSE} = \sqrt{\frac{\sum\_{i=1}^{n} \left(y\_i^{\text{predicted}} - y\_i^{\text{actual}}\right)^2}{n}} \Big/ \left(y\_{\text{max}}^{\text{actual}} - y\_{\text{min}}^{\text{actual}}\right),\tag{23}$$

where, for each time step *i* (*i* = 1, 2, ..., *n*), *y<sub>i</sub>*<sup>predicted</sup> is the predicted signal and *y<sub>i</sub>*<sup>actual</sup> is the actual signal, and *y*<sub>max</sub><sup>actual</sup> and *y*<sub>min</sub><sup>actual</sup> are the maximum and minimum of the actual signal, respectively.
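Both evaluation measures are straightforward to compute. The sketch below is a minimal Python version with assumed function names; it uses the Pearson correlation for the CC, which is the usual reading of "correlation coefficient" but is not stated explicitly in the text.

```python
import numpy as np

def correlation_coefficient(actual, predicted):
    """Pearson correlation between the actual and predicted signals."""
    return float(np.corrcoef(actual, predicted)[0, 1])

def nrmse(actual, predicted):
    """Root-mean-square error divided by the range of the actual signal (Equation 23)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    rmse = np.sqrt(np.mean((predicted - actual) ** 2))
    signal_range = actual.max() - actual.min()
    if signal_range == 0:
        raise ValueError("actual signal is constant, nRMSE is undefined")
    return float(rmse / signal_range)
```

The two measures are complementary: a prediction with a constant offset keeps a CC of 1 but has a non-zero nRMSE, so reporting both captures shape and amplitude errors.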

## **RESULTS**

#### **ESTIMATION RESULT OF FILTERED EMG SIGNALS FROM THE CORTICAL ACTIVITIES OF THE PRIMARY MOTOR CORTEX**

The filtered EMG signals were estimated from the cortical activities in the primary motor cortex by using Equation 16. To determine the delay-time parameter, the intracortical microstimulation (ICMS) method (Heusler et al., 2000) was used, giving a delay of 17 ms between the cortical activities of the primary motor cortex and the filtered EMG signals.

Of the 70 trials measured for each task (Hold-C-A-B, etc.), 60 trials were used as training data and 10 trials as test data. The sparse linear regression method automatically selects, from all extracted features, only those that are useful for the estimation, which makes it robust against overfitting. **Figure 7A** shows the weights of the selected features in estimating filtered EMG signals from the cortical activities in the left primary motor cortex while subject 3 performs the experimental task. In the case of **Figure 7A**, 20 of the 33 vertexes located in the left primary motor cortex are selected to estimate the filtered EMG signals.

**Figure 7B** shows the filtered EMG signals of subject 1 estimated from the cortical activities over the 20 selected vertexes in the left primary motor cortex. The estimated filtered EMG signals had a CC of 0.827 (±0.10) and an nRMSE of 0.142 (±0.38) with the actual EMG signals. **Table 2** shows the CC between the actual EMG signals and the reconstructed EMG signals of all 5 subjects who participated in the experiment. The averaged CC and nRMSE of the 5 subjects were 0.851 (±0.11) and 0.233 (±0.17), respectively.

#### **ESTIMATION RESULT OF JOINT ANGLES FROM FILTERED EMG SIGNALS**

After measuring 70 trials of the EMG signals and movement trajectories of the subject's arm, 60 trials were used as training data and 10 trials as test data. The number of training data samples was 1,080,720 (60 trials × 1 kHz × 4.503 s × 4 cases) and the number of test data samples was 180,120 (10 trials × 1 kHz × 4.503 s × 4 cases). The gating network was trained with the summed squared velocity of the four joint angles. However, since this measured value cannot be used for the test data, the velocity values were estimated from the filtered EMG signals. **Figure 8** shows the four joint angles of subject 1 estimated from the cortical activities of the primary motor cortex. The CC and nRMSE between the estimated joint angles and the actual joint angles were about 0.817 (±0.10) and 0.212 (±0.04). **Table 3** shows the CC between the actual joint angles of the 5 subjects and the joint angles reconstructed by the modular artificial neural network model. The averages of the CC and nRMSE of the reconstructed joint angles were 0.807 (±0.10) and 0.176 (±0.29).

**FIGURE 7 | Reconstruction of the filtered EMG signals from the cortical activities estimated on 33 vertexes in the left primary motor cortex. (A)** The weights of the important features selected by the sparse linear regression to estimate filtered EMG signals from the cortical activities while subject 3 performs four tasks. **(B)** The filtered EMG signals estimated from the selected 20 features. Dotted lines (blue) represent the actual filtered EMG signals, and solid lines (red) show the reconstructed filtered EMG signals (normalized scale).

## **DISCUSSION**

**Table 2 | The correlation coefficient (CC) and normalized root-mean-square error (nRMSE) between the actual EMG signals and the estimated EMG signals.**

**Table 3 | The correlation coefficient (CC) and normalized root-mean-square error (nRMSE) between the actual joint angles and the joint angles estimated by the modular artificial neural network model.**

In this study, the cortical activities on 2240 vertexes were estimated from the EEG signals of 64 channels using the hierarchical Bayesian method. Then, of the estimated cortical activities, only those in the left primary motor cortex were used to reconstruct the EMG signals of nine muscles through the sparse linear regression method. When reconstructing EMG signals from the cortical activities, the delay time between the cortical activities and the EMG signals could in principle be determined by searching for the lag at which the two signals are most strongly correlated. However, the EMG signals have a simple waveform with one or two peaks, and the cortical activities have a similar pattern, so such a search may not identify the delay reliably. Thus, in this study, the ICMS method was used to decide the delay time. The delay time of 17 ms found with the ICMS method was applied to all subjects when estimating EMG signals from the cortical activities. In the future, we will study whether this delay time is also valid for individuals in a disease state. A modular artificial neural network model was used to estimate four joint angles of the elbow and the shoulder from the estimated EMG signals.
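The correlation search mentioned in this paragraph could look like the following Python sketch. The helper `best_lag` and its arguments are hypothetical, and this is not the procedure used in the study, which relied on the ICMS-derived delay instead.

```python
import numpy as np

def best_lag(cortical, emg, max_lag):
    """Find the lag (in samples) at which emg[t + lag] best correlates with cortical[t].

    cortical, emg : 1-D arrays of equal length
    max_lag       : largest lag to try; must leave at least two overlapping samples
    Returns the best lag and the Pearson correlation at that lag.
    """
    cortical = np.asarray(cortical, dtype=float)
    emg = np.asarray(emg, dtype=float)
    n = len(cortical)
    best, best_cc = 0, -np.inf
    for lag in range(max_lag + 1):
        # Pair each cortical sample with the EMG sample recorded lag steps later.
        cc = np.corrcoef(cortical[: n - lag], emg[lag:])[0, 1]
        if cc > best_cc:
            best, best_cc = lag, cc
    return best, best_cc
```

For broadband signals the correlation has a sharp maximum at the true delay. For smooth waveforms with one or two peaks, the correlation changes little between neighbouring lags, which is consistent with the reason given above for not using this approach.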

#### **WHY IS IT IMPORTANT TO RECONSTRUCT EMG SIGNALS FROM BRAIN SIGNALS?**

Morrow and Miller (2003) succeeded in reconstructing the EMG signals of the distal forelimb muscles from 50 M1 neurons of a non-human primate performing a stereotyped precision grip task. Furthermore, Koike et al. (2006) estimated the EMG signals of seven arm muscles from the neural activities of 18 neurons of a non-human primate during an arm reaching task. Then, three joint angles (two at the shoulder and one at the elbow) were reconstructed from the estimated EMG signals. Similarly, most existing brain-machine interface studies reconstruct EMG signals from the neural activities of the primary motor cortex of non-human primates by using invasive needle electrodes. In such cases, it is possible to obtain relatively clean brain signals.

When reconstructing EMG signals with non-invasive BMI technologies, however, there are several difficulties because the skull, which is an insulator, is located between the brain and the sensors, thus introducing noise. Ganesh et al. (2008) succeeded in reconstructing the EMG signals of two antagonist muscles from fMRI signals measured in the primary motor cortex and premotor cortex. EEG signals have good temporal resolution, but their spatial resolution is poor. Consequently, it is difficult to estimate EMG signals from EEG signals. In this study, spatial resolution is improved by estimating cortical activities over 2240 vertexes from the EEG signals measured over 64 channels through the hierarchical Bayesian method. From this abundant set of features, the sparse linear regression method automatically selects only those that are useful for reconstructing EMG signals. The proposed method is therefore robust against overfitting.

When EMG signals are reconstructed from the brain signals, there are several advantages. First, we can reconstruct not only position-related information such as hand position but also force-related information such as joint torque and stiffness from the estimated EMG signals (Koike and Kawato, 1993, 1994, 1995). For example, when we pick up an object, the brain stabilizes the posture of the arm by controlling muscle tensions. The stiffness is controlled by the co-contraction of the muscles. It is difficult to model this phenomenon by directly estimating the hand position because co-contraction causes different muscle patterns for the same posture. If, in addition to reconstructing the kinematics of hand motion, we obtain force information such as joint torque and stiffness from the brain signals, it becomes possible to control a robotic arm based on this information. In such cases it could also be possible to implement a brain-machine interface more compatible with the features of the human arm.

Second, by using the estimated EMG signals as the command signals of the FES, we raise the possibility that a paralyzed person could in principle control his arm once we electrically stimulate his paralyzed muscles (Degnan et al., 2002; Uechi et al., 2004). Fagg et al. (2007), without modeling the characteristics of the musculoskeletal system, controlled arm movement by electrical stimulation of arm muscles through FES after reconstructing EMG signals from the neural activities in the primary motor cortex. Furthermore, Moritz et al. (2008), by facilitating the direct control of the stimulation of muscles from the neural activities of the primary motor cortex, made it possible for non-human primates to control bidirectional wrist torques from cortical cells. This research suggests that it may be possible to create more realistic neuro-prostheses. By modeling the musculoskeletal system, we may be able to extend non-invasive brain-machine interfaces to control anthropomorphic robotic devices.

#### **WHICH BRAIN PART IS MEASURED FOR RECONSTRUCTING EMG SIGNALS?**

The choice of cortical region from which to obtain the neural control signal appears to be important. In studies using non-human primates, the neural activities of the primary motor cortex are mainly measured to reconstruct EMG signals (Nicolelis et al., 1998; Shoham et al., 2005; Wu et al., 2006). According to Fried et al. (1991), motor-related information is processed in the brain as follows: the urge to move the arm first arises in the premotor cortex, and the resulting signals then travel to the primary motor cortex via the supplementary motor area. The primary motor cortex is the final output stage of motor-related signals in the brain. The signal is transmitted to the arm muscles through the alpha motor neurons of the spinal cord, and finally it generates arm movement. Anatomically, since the primary motor cortex is linked to the muscles via one or more intermediate neurons, the neural activities of the primary motor cortex are highly correlated with muscle activities. **Figure 9** shows the CCs when reconstructing EMG signals from the cortical activities in several brain areas. In this study, the highest CC was obtained when reconstructing EMG signals from the cortical activities estimated in the primary motor cortex.

**FIGURE 9 | The correlation coefficients when reconstructing EMG signals from the cortical activities estimated in different brain areas (M1, primary motor cortex; PMd, dorsal premotor cortex; PP, posterior parietal cortex; and All, using all brain areas).**

#### **LIMITATIONS AND FUTURE WORK**

There are some limitations with the use of modular neural networks for joint angle estimation. The estimated joint angles have a CC of 0.81 with the actual joint angles. The modular artificial neural network model was used for estimating joint angles because, in isotonic movement, where force is produced while the muscle length changes, the tension depends on the velocity at which the muscle flexes or extends. During muscle flexion, the tension decreases as the flexion velocity increases. During muscle extension, the tension increases as the extension velocity increases. The performance of estimating joint angles could therefore be improved by training two networks on tension values that change with velocity, rather than training all the data in the same network. One network was used for zero velocity and the other for movement velocity. When joint angles are estimated from muscle tensions, the muscle tensions for posture have low values. In comparison, the muscle tensions for movement have significantly higher values. If we trained these data in the same network, the network would determine that the error of posture data is much lower than that of movement data. Consequently, the estimated results for posture data would be poor. In future work we will use different neural network structures for joint angle estimation. Recent advances in machine learning point to deep learning algorithms and neural networks (Salakhutdinov and Hinton, 2009, 2012) as a possibility for improving feature extraction to reconstruct the joint angles. We plan to explore these new avenues of research.

In this study, the joint angles of five healthy subjects were estimated from EEG signals through EMG signals. For individuals with spinal cord injury, in whom the pathway between the primary motor cortex and the muscles is disconnected, the relationship between EMG signals and joint angles must first be identified in a healthy subject. The EEG signals of the individual with spinal cord injury are then connected to the EMG signals of the healthy subject. In the future, we will study this topic further with individuals with spinal cord injury. Furthermore, the proposed method could possibly be used in studies of post-stroke individuals whose primary motor cortex is not damaged.

#### **REFERENCES**

*J. Neural Eng.* 8, 1741–1750. doi: 10.1088/1741-2560/8/3/034003

reconstructing muscle activity from human cortical fMRI. *Neuroimage* 42, 1463–1472. doi: 10.1016/j.neuroimage.2008.06.018


*Biol. Cybern.* 73, 291–300. doi: 10.1007/BF00199465


Extraction and localization of mesoscopic motor control signals for human ECoG neuroprosthetics. *J. Neurosci. Methods* 167, 63–81. doi: 10.1016/j.jneumeth.2007.04.019


activity using a Kalman filter. *Neural Comput.* 18, 80–118. doi: 10.1162/089976606774841585

**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 18 June 2013; paper pending published: 14 August 2013; accepted: 03 October 2013; published online: 24 October 2013.*

*Citation: Choi K (2013) Reconstructing for joint angles on the shoulder and elbow from non-invasive electroencephalographic signals through electromyography. Front. Neurosci. 7:190. doi: 10.3389/fnins.2013.00190 This article was submitted to Neuroprosthetics, a section of the journal Frontiers in Neuroscience. Copyright © 2013 Choi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Brain-machine interface to control a prosthetic arm with monkey ECoGs during periodic movements

#### *Soichiro Morishita<sup>1</sup>\*, Keita Sato<sup>2</sup>, Hidenori Watanabe<sup>3</sup>, Yukio Nishimura<sup>3,4,5</sup>, Tadashi Isa<sup>3,4</sup>, Ryu Kato<sup>6</sup>, Tatsuhiro Nakamura<sup>7</sup> and Hiroshi Yokoi<sup>2</sup>*

*<sup>1</sup> Brain Science Inspired Life Support Research Center, The University of Electro-Communications, Chofu, Japan*

*<sup>2</sup> Department of Mechanical Engineering and Intelligent Systems, The University of Electro-Communications, Chofu, Japan*

*<sup>3</sup> Division of Behavioral Development, Department of Developmental Physiology, National Institute for Physiological Sciences, Okazaki, Japan*

*<sup>4</sup> Department of Physiological Sciences, School of Life Science, The Graduate University for Advanced Studies (SOKENDAI), Hayama, Japan*

*<sup>5</sup> PRESTO, Japan Science and Technology Agency, Kawaguchi, Japan*

*<sup>6</sup> Division of Systems Research, Department of Systems Design, Faculty of Engineering, The Yokohama National University, Yokohama, Japan*

*<sup>7</sup> Integrative Brain Imaging Center, National Center of Neurology and Psychiatry, Kodaira, Japan*

#### *Edited by:*

*Mitsuhiro Hayashibe, University of Montpellier, France*

#### *Reviewed by:*

*Antonio Novellino, ETT s.r.l., Italy; Abhishek Prasad, University of Miami, USA; Fady Alnajjar, Brain Science Institute, RIKEN, Japan*

#### *\*Correspondence:*

*Soichiro Morishita, Brain Science Inspired Life Support Research Center, The University of Electro-Communications, 1-5-1 Chofugaoka, Chofu 182-8585, Japan e-mail: smori@hi.mce.uec.ac.jp*

Brain–machine interfaces (BMIs) are promising technologies for rehabilitation of upper limb functions in patients with severe paralysis. We previously developed a BMI prosthetic arm for a monkey implanted with electrocorticography (ECoG) electrodes, and trained it in a reaching task. The stability of the BMI prevented incorrect movements due to misclassification of ECoG patterns. As a trade-off for the stability, however, the latency (the time gap between the monkey's actual motion and the prosthetic arm movement) was about 200 ms. Therefore, in this study, we aimed to improve the response time of the BMI prosthetic arm. We focused on the generation of a trigger event by decoding muscle activity in order to predict integrated electromyograms (iEMGs) from the ECoGs. We verified the achievability of our method by conducting a performance test of the proposed method with actually measured iEMGs instead of predicted iEMGs. Our results confirmed that the proposed method with predicted iEMGs eliminated the time delay. In addition, we found that motor intention is better reflected by muscle activity estimated from brain activity than by actual muscle activity. Therefore, we propose that using predicted iEMGs to guide prosthetic arm movement results in minimal delay and excellent performance.

**Keywords: brain-machine interfaces, electrocorticography, electromyography, prosthetic arm, reaching task**

## **INTRODUCTION**

Brain-machine interfaces (BMIs), which are a type of man-machine interface that provides a direct connection between the brain and external devices, can be divided into 2 types: input-type and output-type. An input-type BMI is used for the recovery of central nervous system function with an external device (Yokoi et al., 2012), while an output-type BMI is used for the intuitive control of an external device instead of the limbs. For patients with severe paralysis, such as those with amyotrophic lateral sclerosis, output-type BMIs offer a promising technology for the rehabilitation of upper limb function (Lebedev and Nicolelis, 2009).

In an output-type BMI, brain activities are measured from the sensory motor area in the cerebral cortex; these signals can be detected invasively or noninvasively. Invasive approaches usually include the use of a multichannel needle-shaped sensor that is inserted into the cerebral cortex. Noninvasive approaches include the use of electroencephalography, functional near-infrared spectroscopy, or functional magnetic resonance imaging. Noninvasive approaches are ideal because they have no clinical risk; however, their spatial resolution and signal-to-noise ratio are not suitable for practical control. As a result, many studies continue to focus on invasive approaches. The initial studies on BMI focused on invasive signal detection of brain activity, and they achieved highly successful control of a prosthetic hand (Velliste et al., 2008) with good spatial resolution and signal-to-noise ratios. However, degeneration and necrosis limit the long-term use of these invasive signal detection methods (Szarowski et al., 2003; Biran et al., 2005). To overcome this problem, an electrocorticography (ECoG) electrode was developed. This is an invasive signal detection method involving the use of a surface electrode on the cerebral cortex under the dura mater. Importantly, it has long-term stability with low clinical risk. Moreover, it shows precise spatial resolution with a good signal-to-noise ratio. ECoGs have been used to develop output-type BMI systems for two-dimensional cursor control and motion prediction of the upper arm (Schalk et al., 2007; Pistohl et al., 2008; Uejima et al., 2009; Yanagisawa et al., 2009; Chao et al., 2010; Yanagisawa et al., 2011).

We also developed a prosthetic arm that is controlled by a BMI with ECoGs (Sato et al., 2012). The subject was a monkey (*Macaca fuscata*) implanted with ECoG electrodes and then trained in a reaching task. The reaching task was performed periodically. Therefore, decoding could be achieved by phase estimation of the periodic movements. A decoder was constructed by machine learning to map between the ECoGs and motion states, which corresponded to the phases of periodic movement. We then tested whether the response delay of the prosthetic arm was controlled by the proposed method. We found that the latency (the time that elapsed between the monkey's actual motion and the prosthetic arm movement) was about 200 ms. Considering the primary delay that the prosthetic arm has as a robotic arm, it is desirable that the trigger event generated precedes the monkey's actual motion by about 200 ms.

Since muscle activity precedes changes in motion, and motor intentions can therefore be detected more quickly, one potential way of improving the response of the BMI prosthetic arm could lie in decoding muscle activity. In other words, as the musculoskeletal system is the best "device" for achieving the brain's motor intentions, using the musculoskeletal system may be advantageous in optimizing BMIs. In fact, myoelectric prosthetic hands are already commercially available (Naidu et al., 2008), while BMI prosthetic hands are not in practical use. Unfortunately, the body image that is presented by a BMI prosthetic arm to the brain differs considerably from that presented by a natural arm because the former cannot reflect motor intentions as faithfully as the musculoskeletal system. An electromyogram (EMG) prosthetic arm estimates motor intentions from the activities of a patient's residual muscles, and it typically accomplishes more sophisticated motions than a BMI prosthetic arm. However, the results of our latest study (Yokoi et al., in press), in which we compared the muscle and brain activities of monkeys, suggested that EMG prosthetic arms might not always be superior to BMI robotic arms in the estimation of the brain's motor intentions. Specifically, during periodic movements, muscle activity predicted from brain activity maintains the periodicity better than actual muscle activity does. Moreover, it is difficult to estimate motor intention directly from brain activities, as mentioned above. Estimating motor intentions from muscle activity predicted from brain activity is therefore likely a better method than estimating them directly from brain activity or from actual muscle activity. We thus devised a method of controlling a BMI prosthetic arm based on the above ideas, and sought to experimentally confirm the validity of this method.

## **MATERIALS AND METHODS**

#### **ABSTRACT LEVEL OF MOTOR INTENTION**

Motor intentions are divided into different types depending on their abstract level. As an example, consider a reaching motion, such as that in self-feeding in monkeys. This motion consists of the following movement sequences: reaching forearm to an object, grasping the object, and returning forearm while grasping the object.

At first, various types of physical measures, such as the EMGs of each muscle, the grip force, angular velocities of the joints, three-dimensional wrist positions, and hand postures can be determined. These are motor intentions of the lower abstract level. Next, based on the interpretations of these physical values, the movement phase (e.g., waiting, reaching, grasping, or resting) can be considered as the motor intention of the higher abstract level. Of course, the monkey's intention in performing the reaching movement is one of the motor intentions of a higher abstract level. In this study, we considered the motor intentions of this abstract level as task-oriented motor intentions. According to the theory of localization of brain functions, information from different abstract levels is processed in different parts of the cerebral cortex. Following this, the planning, control, and execution of voluntary motions are processed in the motor cortex. Moreover, a preceding study confirmed the correlation between the modulation of neurons in the primary motor cortex and muscle activity (Morrow and Miller, 2002). In this paper, the abstract level of motion intention is discussed based on the brain and muscle activity that was measured in a monkey's motor cortex.

#### **EXPERIMENTAL SUBJECT**

A monkey (*M. fuscata*) implanted with EMG and ECoG electrodes was used as the experimental subject. EMG and ECoG signals were recorded simultaneously with a Neural Data Acquisition System MAP system (Plexon Inc., Dallas, TX, USA). EMG signals were recorded as auxiliary analog inputs on an OmniPlex system. Signals were low-pass filtered (250-Hz cutoff) and recorded with a 500-Hz sampling rate. The target muscles from which EMGs were measured, which are related to the locomotion of the upper limb and hand, are listed in **Table 1**.

**Figure 1** shows the placements of the ECoG electrodes. The target area was around the left motor cortex, including the frontal eye field, the premotor area, the primary motor cortex, and the primary somatosensory cortex.

#### **A PROSTHETIC ARM WITH AN INTERFERENCE-BASED WIRE-DRIVEN MECHANISM**

An interference-based wire-driven mechanism was applied to the prosthetic arm to balance a high grip force and a high degree of motion against low weight. This mechanism uses wires to transmit the driving force from the actuators. Because it allows the power sources to be separated from the prosthetic hand, it reduces the weight of the prosthetic hand attached to the patient's stump. **Figure 2** shows the interference-based wire-driven mechanism of the maniphalanx joints that are designed for the thumb and fingers of the prosthetic hand. When the palm-side wire is pulled and the back-side wire is allowed to relax, the hand performs flexion. With the opposite wire operation, it performs extension.

A joint mechanism with 2 degrees of freedom in mutually orthogonal directions is required for the wrist and upper arm joints. We thus devised an interference-based parallel-wire-driven mechanism, hereafter referred to as a parallel wire mechanism. **Figure 3** is a schematic diagram of the structure. It has two rotation mechanisms, for *x*-axis and *z*-axis rotations. The cylindrical wire guide leads the wires such that they are parallel to each other. A rotating torque is generated around the *x*-axis by the synchronous traction of the wires, and a rotating torque is generated around the *z*-axis by the asynchronous traction of the wires in the same manner. To connect the wire symmetrically to the pulleys of the two motors, the interference power of the two motors is assigned for each degree of freedom.

These two types of interference-based wire-driven mechanisms were applied to develop a prosthetic hand and arm as shown in **Figure 4**. The shoulder joint of this arm has 2 degrees of freedom in motion, flexion/extension and adduction/abduction, and the elbow joint has 2 degrees of freedom, flexion/extension and internal rotation/external rotation.

It is important to consider the latency caused by power transmission through the wire when controlling prosthetic arms with a wire-driven mechanism. Because the wire is not rigid, power transmission latency is inevitable. As mentioned above, the latency of the prosthetic arm adopted in this study was about 200 ms. Here, the control operation delay was eliminated due to the brain activities preceding the appearance of motion.

#### **MODELING OF THE REACHING TASK**

We designed a lever operation task as a reaching task based on the self-feeding motion of monkeys. **Figure 5** shows an outline of the task, and **Table 2** describes the monkey's different movement states during the task. The monkey was kept under restraint in a chair. First, a push button (home button) was set up under the monkey's right hand, and a lever was placed in front of the monkey. A tube was introduced into the mouth of the monkey, and liquid reward was given through a pump. The pump was triggered when the monkey pulled the lever after the home button was pushed. The monkey was adequately trained in performing this task.

**FIGURE 4 | The prosthetic arm with an interference-based wire-driven mechanism.**

#### **PREPROCESSING OF THE EMG SIGNALS**

The measured EMG signals were transformed into integrated EMGs (iEMGs) as follows:

$$M\_m(t) = \frac{1}{T\_W} \sum\_{\tau=0}^{T\_W - 1} |S\_m(t - \tau)|,\tag{1}$$

where *S<sub>m</sub>*(*t*) is the signal measured by the *m*-th EMG electrode at time step *t*, *M<sub>m</sub>*(*t*) is the iEMG of the *m*-th EMG channel, and *T*<sub>W</sub> is the length of the window over which the rectified signal is averaged. Because iEMGs strongly correlate with the exerted muscular force and are robust to white noise, they serve as appropriate indices of muscle activity.
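Equation 1 is a causal moving average of the rectified EMG. A minimal single-channel Python sketch is given below; the function name is an assumption, and returning only the time steps for which the full window is available is a choice made for this example, since the text does not say how the first samples were handled.

```python
import numpy as np

def integrated_emg(signal, window):
    """Moving average of the rectified EMG over the last `window` samples (Equation 1).

    signal : 1-D raw EMG of one channel
    window : averaging length T_W in samples
    Returns len(signal) - window + 1 values; output k corresponds to time
    step k + window - 1, the first step with a complete window.
    """
    rectified = np.abs(np.asarray(signal, dtype=float))
    kernel = np.full(window, 1.0 / window)
    # mode="valid" keeps only positions where the kernel fully overlaps the signal.
    return np.convolve(rectified, kernel, mode="valid")
```

Rectification before averaging matters: the raw EMG fluctuates around zero, so averaging it directly would cancel most of the activity that the iEMG is meant to capture.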

#### **PREPROCESSING OF ECoG SIGNALS**

Some frequency bands are effective for determining the locomotive state of a subject (Sato et al., 2012). **Table 3** shows the range of each frequency band.

Previous studies have shown that high-gamma power strongly correlates with locomotive events in the same way as electroencephalography or local field potentials (LFPs). However, the range of the high-gamma band differed among previous studies. For example, 60–200 Hz was used in a study assessing macaque LFPs and their potential implications in ECoG (Ray et al., 2008), whereas another study used 80–150 Hz (Yanagisawa et al., 2011). To cover these different definitions, we separated the high-gamma band (80–250 Hz) into 2 ranges: *γ*<sub>L</sub> (80–150 Hz) and *γ*<sub>H</sub> (150–250 Hz). In our experimental setting (Western Japan), power-line hum was superimposed at 60 Hz. Therefore, the frequency band was trimmed to exclude the region around 60 Hz. Additionally, the upper limit was decided according to the Nyquist frequency of our data acquisition system. The power of each band was determined by calculating the power spectrum with a short-time Fourier transform with a window size of *L* = 128.
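The band-power computation can be sketched as follows. This is an illustration only: the sampling rate, the Hann taper, and the 50% overlap are assumptions that the text does not specify, and only the 128-sample window and the two high-gamma ranges are taken from the description above.

```python
import numpy as np

FS = 1000.0      # hypothetical sampling rate in Hz (not stated in the text)
WINDOW = 128     # window size L from the text
BANDS = {"gamma_L": (80.0, 150.0), "gamma_H": (150.0, 250.0)}

def band_powers(signal, fs=FS, window=WINDOW, hop=WINDOW // 2, bands=BANDS):
    """Return {band name: power per frame} from a Hann-windowed short-time Fourier transform."""
    signal = np.asarray(signal, dtype=float)
    taper = np.hanning(window)
    freqs = np.fft.rfftfreq(window, d=1.0 / fs)
    starts = range(0, len(signal) - window + 1, hop)
    frames = np.array([signal[s:s + window] * taper for s in starts])
    spectrum = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    # Sum the power of all frequency bins that fall inside each band.
    return {name: spectrum[:, (freqs >= lo) & (freqs < hi)].sum(axis=1)
            for name, (lo, hi) in bands.items()}

t = np.arange(2000) / FS
powers = band_powers(np.sin(2 * np.pi * 100.0 * t))  # 100 Hz test tone
```

For the 100 Hz test tone, almost all of the power falls in `gamma_L`, which is a quick sanity check of the band edges.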



#### **Table 3 | Range of each frequency band.**


#### **ESTIMATION OF EMGs FROM ECoGs BY A PARTIAL LEAST SQUARES REGRESSION**

We estimated the EMGs from ECoG signals by a partial least squares (PLS) regression (Wold, 1975). Because of the relationship between the spatial resolution of the ECoG electrodes and the distances between the adjacent electrodes, the signals obtained from the electrodes were collinear. In the regression analysis, the collinearity made it difficult to determine the values of the regression coefficients and reduced the prediction accuracy. However, the PLS regression served to remove the collinearity and improved the precision of the regression analysis.

In this study, a PLS regression was performed with the following procedure. First, the feature vectors of the ECoGs were constructed as follows:

$$\mathbf{x}_i(t) = \begin{pmatrix} \alpha(i, t) \\ \beta(i, t) \\ \gamma_{\mathrm{L}}(i, t) \\ \gamma_{\mathrm{H}}(i, t) \end{pmatrix},\tag{2}$$

$$\mathbf{x}(t) = \begin{pmatrix} \mathbf{x}_1(t) \\ \mathbf{x}_2(t) \\ \vdots \\ \mathbf{x}_N(t) \end{pmatrix},\tag{3}$$

where *x<sub>i</sub>*(*t*) (*i* = 1, 2, ..., *N*) is the subvector of the feature vector *x*(*t*) for channel *i*. Moreover, *α*(*i, t*), *β*(*i, t*), *γ*<sub>L</sub>(*i, t*), and *γ*<sub>H</sub>(*i, t*) are the powers of the frequency bands defined in **Table 3** for channel *i* at time *t*. Namely, each element of *x<sub>i</sub>*(*t*) indicates the power of the corresponding frequency band. These elements are considered explanatory variables in the PLS regression. The regression model is as follows:

$$y(t) = \beta_0 + \sum_{k=1}^{r} \beta_k x'_k(t) + E(t), \tag{4}$$

$$\mathbf{x}'(t) = A\mathbf{x}(t),\tag{5}$$

where *y*(*t*) is the iEMG of the target muscle at time step *t*, *x′<sub>k</sub>*(*t*) is the *k*-th element of the latent variable vector *x′*(*t*) corresponding to *x*(*t*), *β<sub>k</sub>* (*k* = 0, ..., *r*) is the *k*-th regression coefficient, and *E*(*t*) is the error term. The PLS regression determines the coefficient matrix *A* that maximizes the covariance of *y* and *x′*, and the vector *x′*(*t*) is calculated with Equation (5). Namely, the latent variables that express the relationship between *y* and *x<sub>i</sub>* are obtained as the vector *x′*.
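As an illustration of Equations (4) and (5), the following sketch fits a single-response PLS regression with the standard NIPALS-style PLS1 iteration. It is not the authors' code. The function names, the synthetic collinear data, and the choice of two components are assumptions made for the example. In this sketch the matrix `R` plays the role of the transpose of *A* (applied to mean-centered features), and `q` holds the coefficients *β*<sub>1</sub>, ..., *β<sub>r</sub>*.

```python
import numpy as np

def fit_pls1(X, y, n_components):
    """PLS regression for one response (PLS1).

    Returns (x_mean, y_mean, R, q). Latent variables are (X - x_mean) @ R,
    and predictions are y_mean + (X - x_mean) @ R @ q.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk                  # direction of maximal covariance with y
        w /= np.linalg.norm(w)
        t = Xk @ w                     # scores of the new latent variable
        tt = t @ t
        p = Xk.T @ t / tt              # loading of X on the scores
        qk = yk @ t / tt               # regression of y on the scores
        Xk = Xk - np.outer(t, p)       # deflate X and y before the next component
        yk = yk - qk * t
        W.append(w)
        P.append(p)
        q.append(qk)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    R = W @ np.linalg.inv(P.T @ W)     # maps centered features to latent variables
    return x_mean, y_mean, R, q

def predict_pls1(model, X):
    x_mean, y_mean, R, q = model
    return y_mean + (np.asarray(X, dtype=float) - x_mean) @ R @ q

# Synthetic, strongly collinear features driven by two hidden sources.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
X = latent @ rng.normal(size=(2, 6)) + 0.01 * rng.normal(size=(200, 6))
y = latent[:, 0] - 2.0 * latent[:, 1] + 0.01 * rng.normal(size=200)
model = fit_pls1(X, y, n_components=2)
y_hat = predict_pls1(model, X)
```

Because the six features are driven by only two sources, ordinary least squares would be poorly conditioned here, whereas two latent variables are enough to recover the response.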

#### **PATTERN CLASSIFICATION WITH A LINEAR DISCRIMINANT ANALYSIS**

The linear discriminant analysis (LDA), developed by Fisher (1936), was applied to classify the ECoGs into the four motions defined in **Table 2**. The scatter matrix *Sc* (*c* = 1*,* 2*,* 3*,* 4) was defined as follows:

$$S_c = \sum_{\mathbf{x} \in X_c} (\mathbf{x} - \bar{\mathbf{x}}_c)(\mathbf{x} - \bar{\mathbf{x}}_c)^{\mathrm{T}},\tag{6}$$

where *X<sub>c</sub>* is the dataset of *x*(*t*) in the class *ω<sub>c</sub>*, *x̄<sub>c</sub>* is the average vector of the dataset *X<sub>c</sub>*, and *N<sub>c</sub>* is the size of *X<sub>c</sub>*. For the 2 classes *ω<sub>i</sub>* and *ω<sub>j</sub>*, the within-class scatter matrix *W<sub>ij</sub>* is defined by Equation (7), and the between-class covariance matrix *B<sub>ij</sub>* is defined by Equation (8), where *x̄* is the average vector over both classes.

$$W_{ij} = S_i + S_j = \sum_{n=i,j} \sum_{\mathbf{x} \in X_n} (\mathbf{x} - \bar{\mathbf{x}}_n)(\mathbf{x} - \bar{\mathbf{x}}_n)^{\mathrm{T}},\tag{7}$$

$$B_{ij} = \sum_{n=i,j} N_n (\bar{\mathbf{x}}_n - \bar{\mathbf{x}})(\bar{\mathbf{x}}_n - \bar{\mathbf{x}})^{\mathrm{T}}.\tag{8}$$

Then, the evaluation function *J*(*wij*), which indicates the separation performance, is defined by Equation (9).

$$J(\mathbf{w}_{ij}) = \frac{\mathbf{w}_{ij}^{\mathrm{T}} B_{ij} \mathbf{w}_{ij}}{\mathbf{w}_{ij}^{\mathrm{T}} W_{ij} \mathbf{w}_{ij}}.\tag{9}$$

The LDA yields the transform coefficient vector *w<sub>ij</sub>* by maximizing the evaluation function *J*(*w<sub>ij</sub>*). The discrimination function *g<sub>ij</sub>*(*x*(*t*)), which discriminates class *ω<sub>i</sub>* from *ω<sub>j</sub>* by using the transform coefficient *w<sub>ij</sub>*, is defined as follows:

$$\begin{cases} \mathbf{x}(t) \in \omega_i \Rightarrow g_{ij}(\mathbf{x}(t)) > 0 \\ \mathbf{x}(t) \in \omega_j \Rightarrow g_{ij}(\mathbf{x}(t)) < 0 \end{cases},\tag{10}$$

$$g_{ij}(\mathbf{x}(t)) = \mathbf{w}_{ij}^{\mathrm{T}}\mathbf{x}(t) + w_{ij}.\tag{11}$$

For multiclass classification, the class *ĉ*(*t*) at time step *t* is determined with Equation (12).

$$\hat{c}(t) = \underset{i=1,\ldots,4}{\operatorname{argmax}} \sum_{j \neq i} g_{ij}(\mathbf{x}(t)).\tag{12}$$
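The pairwise discriminants and the summation rule of Equation (12) can be sketched as below. This is a hedged illustration rather than the study's code. For two classes, the maximizer of Equation (9) is proportional to the inverse within-class scatter matrix applied to the difference of the class means, which is what the sketch uses. The bias is placed at the midpoint of the projected class means, which is an assumption, and the function names and synthetic clusters are hypothetical.

```python
import numpy as np
from itertools import combinations

def fit_pairwise_lda(X, labels):
    """Fit one Fisher discriminant per class pair; returns {(i, j): (w, bias)}."""
    X, labels = np.asarray(X, dtype=float), np.asarray(labels)
    models = {}
    for i, j in combinations(np.unique(labels), 2):
        Xi, Xj = X[labels == i], X[labels == j]
        mi, mj = Xi.mean(axis=0), Xj.mean(axis=0)
        # Within-class scatter of the pair, as in Equation (7).
        W = (Xi - mi).T @ (Xi - mi) + (Xj - mj).T @ (Xj - mj)
        w = np.linalg.pinv(W) @ (mi - mj)
        # Assumption: decision boundary at the midpoint of the two class means.
        bias = -w @ (mi + mj) / 2.0
        models[(i, j)] = (w, bias)
    return models

def classify(models, x):
    """Sum the pairwise discriminant values per class and return the best class."""
    scores = {}
    for (i, j), (w, bias) in models.items():
        g = w @ x + bias               # g_ij(x); by construction g_ji(x) = -g_ij(x)
        scores[i] = scores.get(i, 0.0) + g
        scores[j] = scores.get(j, 0.0) - g
    return max(scores, key=scores.get)

# Four well-separated synthetic classes in two dimensions.
rng = np.random.default_rng(1)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
labels = np.repeat(np.arange(1, 5), 50)
X = centers[labels - 1] + 0.5 * rng.normal(size=(200, 2))
models = fit_pairwise_lda(X, labels)
predicted = [classify(models, c) for c in centers]
```

With four classes, six pairwise discriminants are trained, and each class score in `classify` sums three of them.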

#### **MOVEMENT DECISION WITH THE ACCUMULATED DISCRIMINATION RESULTS**

The movement of the prosthetic arm was determined with the results of the ECoG pattern discrimination. The discrimination results usually include misdiscrimination. Therefore, if the discrimination results directly reflect the control of a prosthetic arm, it can overdrive the arm. To avoid this problem, we performed movement decisions with the accumulated discrimination results. A schematic diagram of the algorithm is shown in **Figure 6**.

In **Figure 6**, Freq. is the abbreviation of frequency, and Acc. Results is the abbreviation of Accumulated Results, which is defined as the running total of the frequency of discrimination results. In the first row, the subject's actual states changed deterministically; that is, the frequency of each state is always 100%. However, the discrimination results were probabilistic when considering misdiscrimination, and these often occurred around the point of state transition because the reaching task is a continuous motion. Finally, the accumulated discrimination results increased monotonically, and the upward trend began at the start of the subject's state transition.

With a proper threshold, the point of state transition, indicated by arrows, was estimated. The thresholds were determined by considering the difference in the start time and the speed of the prosthetic arm. The waiting state was treated differently. When the state of the prosthetic arm was determined to be waiting, the accumulated discrimination results of all states except waiting were reset. Conversely, the accumulated discrimination results of waiting were reset when the state was determined not to be waiting. The prosthetic arm was required to follow the proper order of state transitions. Otherwise, it was assumed that the subject was performing irregular motions, such as grasping the bar of the cage. In such a case, the prosthetic arm stopped until the state changed to waiting.
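One possible reading of this accumulation scheme is sketched below. The class name, the state label "reach", and the threshold values are hypothetical, and the check on the order of state transitions is omitted for brevity. The sketch only shows how counting discrimination results before switching suppresses isolated misdiscriminations.

```python
from collections import Counter

class AccumulatedDecision:
    """Switch state only after a label has accumulated `threshold` discrimination results."""

    def __init__(self, thresholds, waiting="waiting"):
        self.thresholds = thresholds   # hypothetical per-state counts needed to switch
        self.waiting = waiting
        self.state = waiting
        self.counts = Counter()

    def update(self, label):
        self.counts[label] += 1
        if label != self.state and self.counts[label] >= self.thresholds[label]:
            self.state = label
            if label == self.waiting:
                # Entering waiting: clear the accumulated results of all other states.
                self.counts = Counter()
            else:
                # Leaving waiting: clear the accumulated results of waiting.
                self.counts[self.waiting] = 0
        return self.state

decider = AccumulatedDecision({"waiting": 2, "reach": 3})
stream = ["reach", "waiting", "reach", "reach", "waiting", "waiting"]
decided = [decider.update(label) for label in stream]
print(decided)
```

Raising a threshold makes the arm less likely to react to misdiscrimination but delays the detected transition, which is the trade-off addressed in the next section.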

#### **TRIGGER EVENT GENERATION ACCORDING TO THE ESTIMATED EMGs**

With the algorithm described in the preceding section, stable control of the prosthetic arm was achieved with a latency of about 200 ms. Although the prosthetic arm started moving at almost the same time as the monkey's own arm, it completed the movement later; the monkey's own arm outperformed the prosthetic arm. Because changes in the EMGs appear before changes in the motion, the motion can be determined in advance according to the estimated EMGs. Reconstructing motion from EMGs is usually difficult. However, generating a trigger from the EMGs is simple under the presupposition that the subject is in the waiting state. The trigger was generated by thresholding the EMG of a particular muscle, which was selected according to anatomical knowledge. Because changes in muscle activity precede the appearance of movement, it is possible to calculate a threshold that generates a trigger before the movement appears. In this study, we used a threshold that canceled the 200-ms latency of the stable control described above.
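A minimal sketch of such a trigger is an upward threshold crossing that is accepted only while the decided state is waiting. The function name, the state labels, and the sample values are hypothetical; the threshold of 0.25 is the example value mentioned later in the results.

```python
def first_trigger(iemg, threshold, states=None, waiting="waiting"):
    """Index of the first upward threshold crossing during waiting, or None."""
    for k in range(1, len(iemg)):
        in_waiting = states is None or states[k] == waiting
        if in_waiting and iemg[k - 1] < threshold <= iemg[k]:
            return k
    return None

predicted_iemg = [0.0, 0.1, 0.3, 0.2, 0.4]
states = ["moving", "moving", "moving", "waiting", "waiting"]
print(first_trigger(predicted_iemg, 0.25))          # crossing at index 2
print(first_trigger(predicted_iemg, 0.25, states))  # first crossing during waiting
```

Lowering the threshold moves the trigger earlier in time, which is how a fixed actuation latency can be offset, at the risk of more false triggers.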

#### **ETHICAL APPROVAL**

All experimental procedures were performed in accordance with the Guidelines for Proper Conduct of Animal Experiments of the Science Council of Japan and approved by the Committee for Animal Experiment at the National Institutes of Natural Sciences (Approved No.: 11A157). The data presented for all experimental sessions were obtained from a female Japanese monkey (*M. fuscata*; body weight = 5.4 kg).

## **RESULTS**

To confirm the usefulness of our proposed methods, we performed a number of experiments. First, the EMGs predicted with a PLS regression were compared with the actual values. Next, to compare regular EMG patterns with irregular ones, we confirmed the stability of the EMGs predicted from the ECoGs. Finally, the results of the motion decision by using trigger event generation according to the predicted EMGs are shown.

## **COMPARISON BETWEEN THE ACTUAL VALUES AND THE PREDICTED VALUES OF EMGs**

In this experiment, we acquired a data sequence that included 100 regular trials. When the state transition of the monkey occurred in the sequence shown in **Figure 6**, the series of movements was counted as one trial. The coefficient matrix *A* was determined with data from 90 trials, and the prediction accuracy was evaluated with sequential data from the remaining 10 trials, which were not used to determine *A*. The 100 regular trials were divided into 10 groups of 10 trials each.
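The 90/10 split can be illustrated with a short generator that holds out one group of 10 consecutive trials at a time. The function name is hypothetical, and the assumption that groups consist of consecutive trials is ours.

```python
def ten_fold_groups(n_trials=100, group_size=10):
    """Yield (training trials, held-out trials), holding out one group at a time."""
    trials = list(range(n_trials))
    groups = [trials[s:s + group_size] for s in range(0, n_trials, group_size)]
    for held_out in groups:
        held = set(held_out)
        training = [k for k in trials if k not in held]
        yield training, held_out

folds = list(ten_fold_groups())
print(len(folds), len(folds[0][0]), len(folds[0][1]))
```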

An example of a prediction over 2 s is shown in **Figure 7**. The solid line represents the actual values, and the dashed line represents the predicted values. In most cases, the trends and peak values were well matched. Although a peak time shift was seen in some cases, such as for the Flexor Carpi Ulnaris (FCU), rise times mostly matched.

For the quantitative evaluation, correlation factors and root-mean-square errors for each muscle are shown in **Table 4**. Correlation factors were calculated between actual values and predicted values. Student's *t*-test was performed under the null hypothesis that the correlation factor equals 0, and all correlation factors were confirmed to be significant at the 95% confidence level. Nine correlation factors exceeded 0.7, the value above which variables are considered highly correlated. In the case of the Triceps Lateral Head (TLaH), Biceps Long Head (BLH), and Extensor Carpi Radialis (ECR), the correlation values were not very high. However, their root-mean-square errors differed little from those of the other muscles. We calculated them by applying leave-one-out cross-validation in one group selected from the 10 groups.
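The three quantities used in this evaluation can be computed as in the sketch below, which assumes Pearson correlation and the usual *t* statistic for the null hypothesis of zero correlation with *n* − 2 degrees of freedom. The function name and the sample values are illustrative only and are not data from the study.

```python
import numpy as np

def evaluate_prediction(actual, predicted):
    """Return (correlation, RMSE, t statistic for H0: correlation = 0)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    r = np.corrcoef(actual, predicted)[0, 1]
    rmse = np.sqrt(np.mean((actual - predicted) ** 2))
    n = actual.size
    # Compare t_stat with the critical value of Student's t with n - 2 degrees of freedom.
    t_stat = r * np.sqrt((n - 2) / (1.0 - r ** 2))
    return r, rmse, t_stat

# Made-up values for illustration.
r, rmse, t_stat = evaluate_prediction([1, 2, 3, 4, 5], [1.1, 1.9, 3.2, 3.9, 5.1])
```

Correlation captures agreement in waveform shape, whereas the root-mean-square error also penalizes offsets and scale differences, so the two measures can rank muscles differently.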

#### **EXAMPLES OF IRREGULAR EMG PATTERNS**

As mentioned above, the iEMG prediction by PLS regression seemed to work well. However, some irregular patterns were found during the sequence that had periodic stability. The iEMG of the pectoralis major provides an illustrative example. **Figure 8** shows the typical regular iEMG pattern and the irregular iEMG pattern during the 10 trials of continuous reaching motion. In 9 of the 10 trials, a regular pattern of the actual iEMGs was observed. However, in 1 trial, an irregular pattern was observed, as shown in **Figure 8**. Nevertheless, the reaching task was performed correctly in both of these cases. Specifically, it had periodic stability from a task-oriented point of view. Similarly, the brain activity also had stability. The waveforms of the predicted values were more similar to the regular actual iEMG patterns than to the irregular ones. Although the activity of the motor cortex was regular, muscle activity that differed from the typical pattern was produced because of kinematic redundancy, also known as the degrees of freedom problem formulated by Bernstein (1967, 1996).

#### **Table 4 | Comparative tables of correlation factors and root-mean-square errors for each muscle.**

*Table 1 contains the legend for all abbreviations.*

## **COMPARISON AMONG MOVEMENT DECISION METHODS FOR PROSTHETIC ARMS**

We confirmed the performance of the proposed movement decision method. To detect the start time of the upper arm movement, the deltoid posterior was selected because it increased monotonically with the upper arm movement. **Figure 9** shows the difference between the actual start time of the subject and the start times that were determined with each method. In this section, the movement decision method that used accumulated discrimination results is treated as the conventional method (**Figure 9A**). Moreover, the result of the trigger generation with the predicted iEMG is shown as the proposed method (**Figure 9B**). The results obtained with the actual iEMGs are shown for comparison (**Figure 9C**).

For the conventional method, differences in start times were almost 0. However, for the proposed method, each start time was earlier than that with the conventional method, that is, the response of the prosthetic arm was improved by the proposed method. In addition, the same method was adopted with the actual value of the iEMG instead of the predicted value. In almost all trials, the start time was earlier than that in the conventional method and was the same as the case in which the predicted value was used. However, a lengthy delay occurred in the 8th trial. To attempt to explain this phenomenon, the actual and predicted values of the iEMG in the 8th trial and the others are shown in **Figure 10**.

In the regular pattern, both the actual and predicted values had a diphasic trend, and the heights of the 2 peaks nearly aligned. However, in the 8th trial, the trends of the actual values and the predicted values differed from each other. Namely, the predicted values had trends that were the same as the regular pattern. The trigger event can be generated with a proper threshold (e.g., 0.25 as shown with dashed lines). However, the height of the first peak of the trend of the actual values was too low to generate a trigger event. In this case, a trigger event was generated at the second peak, resulting in a lengthy delay.

## **DISCUSSION**

To construct a BMI prosthetic arm that performs a reaching task, it is preferable for the response to use information generated by muscle activity and not just the movement. Specifically, a trigger event is generated according to the iEMGs predicted from ECoGs by using a PLS regression. Additionally, motor intention can be correctly estimated by using the predicted values rather than the actual values for the control of a prosthetic arm. It is usually easier to estimate motor intention with muscle activities than with brain activities. At present, BMI prosthetic hands are not in practical use, while myoelectric prosthetic hands are already commercially available. This is because myoelectric prosthetic hands typically accomplish more sophisticated motions than BMI prosthetic hands do. However, in our current study, the converse was observed. Our results indicate that during periodic movements, muscle activity predicted from brain activity maintains its periodicity better than the actual muscle activity does. To interpret this counterintuitive phenomenon, we describe the contribution of the cerebellum to motor function, which was clarified by Domen et al. (1998), as follows:


The brain modifies these two aforementioned modes correctly and achieves a task. Feedback control is executed to correct the error between target position and actual position. Because feedback delay can be several tens or hundreds of milliseconds, feedback control is applicable only to slow and primary motions. On the other hand, feed forward control is executed without feedback information from the sensory organs; it is performed according to the internal model constructed in the cerebellum. The monkey that was used as the experimental subject was well trained in the lever operation task. In other words, the monkey performed the motion that was "programmed" in its cerebellum. Then, the task-oriented motor intentions were decoded by the cerebral cortex. However, EMGs appeared due to information processing in the central nervous system, which was slower than in the cerebellum.

In conclusion, we found superior estimation of task-oriented motor intentions by constructing a BMI prosthetic arm. This was confirmed by comparing the periodicity of actual muscle activity with the estimated activity taken from brain ECoGs during the periodic movements of a monkey. Interestingly, if actual muscle activity became disordered, the estimated muscle activity maintained periodicity. Moreover, by comparing the time delay between the prosthetic arm control method based on actual muscle activity and the method based on estimated muscle activity, we found that the method using estimated muscle activity maintained greater stability than that using actual muscle activity.

## **ACKNOWLEDGMENTS**

A part of this study was the result of the "Brain Machine Interface Development" that was conducted under the Strategic Research Program for Brain Sciences by the Ministry of Education, Culture, Sports, Science and Technology of Japan and supported by the Strategic Information and Communication R&D Promotion Program (SCOPE) that was conducted by the Ministry of Internal Affairs and Communication of Japan.

## **REFERENCES**


the monkey ECoGs data associated with self-feeding motions," in *TriSAI2012, SJ19* (Chofu).


interface," in *Advances in Ther. Engineering*, eds W. Yu, S. Chattopadhyay, T.-C. Lim, and U. R. Acharya (Boca Raton, FL: CRC Press), 219–249.

Yokoi, H., Sato, Y., Suzuki, M., Nakamura, T., Mori, T., Morishita, S., et al. (in press). "Engineering approach for functional recovery based on body image adjustment by using biofeedback of electrical stimulation," in *Clinical Systems Neuroscience, Part 2: Body Image Adjustment and Neuro-prosthetics* (New York, NY: Springer).

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 01 February 2014; accepted: 26 November 2014; published online: 12 December 2014.*

*Citation: Morishita S, Sato K, Watanabe H, Nishimura Y, Isa T, Kato R, Nakamura T and Yokoi H (2014) Brain-machine interface to control a prosthetic arm with monkey ECoGs during periodic movements. Front. Neurosci. 8:417. doi: 10.3389/fnins.2014.00417*

*This article was submitted to Neuroprosthetics, a section of the journal Frontiers in Neuroscience.*

*Copyright © 2014 Morishita, Sato, Watanabe, Nishimura, Isa, Kato, Nakamura and Yokoi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Single trial prediction of self-paced reaching directions from EEG signals

#### *Eileen Y. L. Lew1,2, Ricardo Chavarriaga1, Stefano Silvoni <sup>3</sup> and José del R. Millán1 \**

*<sup>1</sup> Defitech Chair in Non-Invasive Brain-Machine Interface, Center for Neuroprosthetics, School of Engineering, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland*

*<sup>2</sup> Laboratory for Experimental Research on Behavior, Institute of Psychology, University of Lausanne, Lausanne, Switzerland*

*<sup>3</sup> Laboratory of Robotics and Kinematics, I.R.C.C.S. S. Camillo Hospital Foundation, Venice, Italy*

#### *Edited by:*

*Jose L. Pons, Consejo Superior de Investigaciones Científicas, Spain*

#### *Reviewed by:*

*Jose Luis Contreras-Vidal, University of Houston, USA Surjo R. Soekadar, University Hospital of Tübingen, Germany*

#### *\*Correspondence:*

*José del R. Millán, Defitech Chair in Non-Invasive Brain-Machine Interface, Center for Neuroprosthetics, School of Engineering, Ecole Polytechnique Fédérale de Lausanne, EPFL STI-CNBI, ELB 138, Station 11, 1015 Lausanne, Switzerland e-mail: jose.millan@epfl.ch*

Early detection of movement intention could possibly minimize the delays in the activation of neuroprosthetic devices. As yet, single trial analysis using non-invasive approaches for understanding such movement preparation remains a challenging task. We studied the feasibility of predicting movement directions in self-paced upper limb center-out reaching tasks, i.e., spontaneous movements executed without an external cue that can better reflect natural motor behavior in humans. We reported results of non-invasive electroencephalography (EEG) recorded from mild stroke patients and able-bodied participants. Previous studies have shown that low frequency EEG oscillations are modulated by the intent to move and therefore, can be decoded prior to the movement execution. Motivated by these results, we investigated whether slow cortical potentials (SCPs) preceding movement onset can be used to classify reaching directions and evaluated the performance using 5-fold cross-validation. For able-bodied subjects, we obtained an average decoding accuracy of 76% (chance level of 25%) at 62.5 ms before onset using the amplitude of on-going SCPs with above chance level performances between 875 to 437.5 ms prior to onset. The decoding accuracy for the stroke patients was on average 47% with their paretic arms. Comparison of the decoding accuracy across different frequency ranges (i.e., SCPs, delta, theta, alpha, and gamma) yielded the best accuracy using SCPs filtered between 0.1 to 1 Hz. Across all the subjects, including stroke subjects, the best selected features were obtained mostly from the fronto-parietal regions, hence consistent with previous neurophysiological studies on arm reaching tasks. In summary, we concluded that SCPs allow the possibility of single trial decoding of reaching directions at least 312.5 ms before onset of reach.

**Keywords: stroke, self-paced voluntary movement, movement-related potentials, EEG, movement direction, brain-machine interface**

## **1. INTRODUCTION**

Brain machine interfaces (BMI) have been recently used for direct control of neuroprostheses by patients with different levels of motor disabilities (Hochberg et al., 2012; Collinger et al., 2013; Courtine et al., 2013; Leeb et al., 2013). In addition, BMI could also be used to improve the efficiency of post-stroke functional training through the use of brain signals to complement impaired muscle control in movement-assisted rehabilitation therapy (Daly and Wolpaw, 2008; Ang et al., 2011; Niazi et al., 2012; Biasiucci et al., 2013; Ramos-Murguialday et al., 2013). Earlier detection of movement intention could possibly minimize the delays in device activation, which may result in a more natural coupling between the motor planning activity in the cortex and the movement-assisted devices (Krebs et al., 2003; Muralidharan et al., 2011). This form of therapy has the potential of speeding up recovery by enhancing the regeneration and reorganization of brain neuronal structures (i.e., brain plasticity) after stroke (Schaechter, 2004; Dobkin, 2007; Kwakkel et al., 2008). For this reason, we are motivated to study how early before the actual movement the intention to reach toward a target (in the form of discrete direction planning) can be decoded from brain activity. The primary focus of this paper is on single trial decoding of self-paced reaching movements by stroke patients and able-bodied subjects.

Different studies in human and non-human primates have shown the possibility to decode movement parameters from single unit neural activity—such as hand position, velocity, gripping force and muscular activity—for the control of computer cursors and robot arms (Wessberg et al., 2000; Serruya et al., 2002; Taylor et al., 2002; Carmena et al., 2003; Schwartz, 2007; Ganguly and Carmena, 2009; O'Doherty et al., 2011; Hochberg et al., 2012; Collinger et al., 2013). A number of recent studies have proposed the use of non-invasive methods, in particular the electroencephalography (EEG) signal, for decoding reaching directions (Mehring et al., 2003; Waldert et al., 2008; Ince et al., 2010) and continuous trajectories (Wolpaw and McFarland, 2004; Bradberry et al., 2010). Nevertheless most of these studies, in particular those focused on decoding movement direction (Connolly et al., 2003; Mehring et al., 2003; Musallam et al., 2004; Rickert et al., 2005; Rizzuto et al., 2005; Hammon et al., 2008; Waldert et al., 2008; Robinson et al., 2013), rely on cue-based protocols (i.e., where a "go" cue is used to instruct the subject to perform the movement at a fixed time). In contrast, we focus on self-paced reaching, where movements are initiated by the subject in a spontaneous manner without any external cue. This form of reaching movement can better reflect natural motor behavior in humans. Throughout this paper, we define the state prior to movement onset as the *intention* to reach. Intention can be defined as an early plan to move (Andersen and Buneo, 2002) and represents a high level state which specifies the goals of movements rather than the exact muscle activations required for execution. Decoding of intention offers the capability to predict the timing (Niazi et al., 2011; Lew et al., 2012a,b; Xu et al., 2014) and, as studied in this work, the desired target.

Reaching is a complex spatial problem where different reference systems are involved in coding the hand positions directed toward the target location (Philipona et al., 2003; Beurze et al., 2006). Information about the upper limb position, eye position, and target location is combined, coordinated and integrated into a common distributed spatial representation in order to perform a successful goal-directed reach. The posterior parietal cortex (PPC) plays an important role in such coordinate transformation between different reference frames for planning a movement (Cohen and Andersen, 2002). The role of integration is played by a network involving the frontal and parietal cortices for the control and execution of reaching movements, as shown by studies with non-human primates performing visually guided movements (Burnod et al., 1999; Battaglia-Mayer et al., 2003; Gottlieb, 2007). More recently, studies with human subjects using fMRI have shown a similar frontal-parietal network (Culham and Valyear, 2006; Filimon, 2010). These studies suggest that brain signals in the frontal and parietal regions carry the necessary information for decoding visually guided reaching movements (Blohm et al., 2009; Andersen et al., 2010).

We have previously followed a data-driven approach to investigate the contribution of EEG slow cortical potentials (SCPs) in decoding self-paced movement intention (intent to move vs. intent not to move) of both able-bodied subjects and stroke patients (Lew et al., 2012a). Similar conclusions were also obtained by using intracortical recordings (Lew et al., 2012b). Interestingly, it has also been shown that the amplitude of motor cortical local field potentials (LFPs) in lower frequencies (*<*13 Hz) is modulated with the direction of movement (Rickert et al., 2005). In this work, we evaluate whether the same approach based on SCP allows decoding movement directions prior to actual execution of reaching. We also compare the decoding performance of the EEG activity in different frequency bands. To the best of our knowledge, there is no previous attempt to decode directions of self-paced movements from non-invasive signals before actual movement onset.

#### **2. MATERIALS AND METHODS**

We analyzed scalp EEG data recorded from three stroke patients and two able-bodied subjects. Participants were instructed to perform a center-out upper limb reaching task. All procedures were approved by the Ethics Committee of the San Camillo Hospital before the experiment. Subjects were informed about the procedures and gave their consent.

**Table 1** summarizes the subjects' particulars, including the Fugl-Meyer Motor Assessment score for upper extremity (FMA-UE)—maximum score of 66—for stroke subjects. Patient P1 suffered from a left cerebellar hemorrhagic stroke, also commonly known as intracerebral bleed, where the ipsilateral body part is affected. The second patient P2 suffered from a left nucleocapsular stroke caused by lesion in a deeper brain structure, thus affecting the contralateral limb. The third patient P3 has had an ischemic stroke caused by lesion in his frontal and left parietal area, thus affecting his right limb. In general, all patients had preserved tactile and proprioceptive sensibility of the arm with normal cognitive abilities at the time of admission to the hospital. All stroke subjects were able to achieve the reaching task without much difficulty, but with significantly longer average reaching time in comparison with the able-bodied subjects (c.f., **Table 6**).

#### **2.1. EXPERIMENTAL PROTOCOLS**

Subjects were seated in front of a computer screen holding on to a haptic manipulandum (PHANTOM Premium 3.0/6DOF, Sensable Technologies) with their arm resting comfortably on the table as shown in **Figure 1**. The reaching task was performed with both arms and subjects were instructed to move the manipulandum that controls the position of a cursor (a green circle) on a computer screen [c.f. **Figure 1**(Bottom)]. The resting position is the condition in which the green circle remains inside the white box located in the middle of the screen. The task was to bring the cursor to one of the 4 center-out target locations (up, down, left, right, projected as white-frame boxes). The distance from the home position to each target position was approximately 15 cm. When the target location was cued, the subject was asked to wait at least 2 s before initiating the movement at their own pace in order to induce a self-paced movement. The role of the visual cue was to ensure equal distribution of target locations during the recordings. Accordingly, when subjects moved before 2 s (an immediate reaction, as in cue-based reaction tasks), the trial was stopped and discarded.

Subjects were asked to stay in a relaxed position during this idle period before initiating a reach whenever they wished. For each subject, there were 3 recordings of 80 trials each (target locations were randomly selected), thus resulting in a total of 240 trials for each arm movement. After removing early starts and artifacts, an average of 230 trials remained for both hands and subject groups. This is the same experimental protocol with which we have demonstrated the detection of movement onset (Lew et al., 2012a). The design of this experiment allows voluntary initiation of movements, in contrast with most cue-based reaction time task protocols where there is a go cue that instructs the subject when to start the movement. It has been reported that there are neurophysiological differences between internally driven and externally cued movement (Thut et al., 2000). A similar protocol has been used to investigate self-paced arm movements with electrocorticography (ECoG) signals (Ball et al., 2009).

#### **2.2. METHODS**

We simultaneously recorded the EEG and electrooculography (EOG) signals with a portable BioSemi ActiveTwo system using 64 electrodes arranged using an extended 10/20 montage at a sampling rate of 2048 Hz, then downsampled to 256 Hz. EOG channels were placed above nasion and below the outer canthi of both eyes in order to capture horizontal and vertical EOG components. To reduce noise contamination, particularly from eye movement artifacts, we performed our analysis using a selection of 34 channels that excluded the peripheral channels and those that exhibited high correlation with the EOG activity (Lew et al., 2012a). The signals recorded from these 34 electrodes were spatially filtered using the common average referencing (CAR) procedure to remove the global background activity (Offner, 1950; Osselton, 1965; Bertrand et al., 1985).
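The CAR step can be illustrated with a minimal sketch, assuming the EEG is stored as a NumPy array of shape (channels, samples). The function name `common_average_reference` and the toy data are illustrative and are not taken from the original analysis code.

```python
import numpy as np

def common_average_reference(eeg):
    """Subtract the instantaneous mean across channels from every channel.

    eeg: array of shape (n_channels, n_samples).
    """
    eeg = np.asarray(eeg, dtype=float)
    return eeg - eeg.mean(axis=0, keepdims=True)

# Toy example: 34 channels, 1 s at 256 Hz, with activity shared by all channels.
rng = np.random.default_rng(0)
common = rng.standard_normal(256)          # global background activity
local = rng.standard_normal((34, 256))     # channel-specific activity
car = common_average_reference(local + common)
```

In this toy case the shared component is removed exactly, because it contributes equally to every channel and therefore to the channel average.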

The EEG signals were pre-processed by applying a zero-phase (non-causal) low-pass Butterworth filter with a cutoff frequency of 120 Hz. The signals were further downsampled to 128 Hz. In order to evaluate direction-related information in different frequency bands, we applied narrow-band filters between [0.1–1] Hz for extracting SCP, [1–4] Hz for the delta band, [4–8] Hz for the theta band, and [7–13] Hz for the alpha band, as well as the ranges [13–20] Hz, [20–30] Hz and [30–45] Hz covering beta and gamma activity. For signals below 7 Hz, we directly used time-domain features (EEG amplitude). In particular, for the SCP, Garipelli et al. (2013) compared various spatial and spectral filtering methods to enhance the signal-to-noise ratio (SNR) of the slow potentials. Their results showed a higher separability index with the use of narrow pass-band filters between [0.1–1] Hz. They also reported that the CAR filter seems to be a better choice than Laplacian filters. For frequency bands above 7 Hz, we extracted the envelope of the filtered signal by taking the absolute value of the real part of the analytic signal, computed using the Hilbert transform. The Hilbert transform is commonly used to calculate the instantaneous amplitude and phase at each time point of narrow-band signals and non-stationary time series such as the scalp EEG signal (Huang et al., 1998; Marple, 1999).
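The band-pass and envelope steps can be sketched as follows with SciPy. Several details are assumptions made only for illustration: the filter order, the helper names `bandpass` and `envelope`, and the toy signal. The sketch also uses the conventional definition of the envelope (the modulus of the analytic signal), which may differ from the exact operation described above.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

FS = 128.0  # Hz, sampling rate after the second downsampling step

def bandpass(x, low, high, fs=FS, order=2):
    """Zero-phase Butterworth band-pass along the last axis (order is assumed)."""
    sos = butter(order, [low, high], btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, x, axis=-1)

def envelope(x):
    """Instantaneous amplitude: modulus of the analytic signal."""
    return np.abs(hilbert(x, axis=-1))

# Toy signal: a 10 Hz oscillation whose amplitude ramps from 1 to 2 over 8 s.
t = np.arange(0, 8, 1 / FS)
amp = 1.0 + t / 8.0
x = amp * np.sin(2 * np.pi * 10 * t)

alpha = bandpass(x, 7.0, 13.0)   # alpha band as defined in the text
alpha_env = envelope(alpha)      # tracks the slow amplitude ramp
```

Away from the edges of the record, `alpha_env` follows the amplitude ramp `amp`. The same `bandpass` helper could be called with `(0.1, 1.0)` for the SCP band, in which case the filtered amplitude itself would be used as the feature.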

To study the temporal characteristics of brain activity preceding movement onset, referred to as the *intention period*, we analyzed sliding windows of 250 ms shifted in steps of 62.5 ms in the period from 2 s before the movement onset to 1 s after. In this paper, the time reported always corresponds to the endpoint of these sliding windows. For each of these windows, we applied Canonical Variate Analysis (CVA), a feature selection technique that identifies the features that best discriminate among classes, thus significantly reducing the dimensionality of the input vector for the classifier. This technique has previously proven advantageous for BCI (Galán et al., 2007). CVA extracts subject-specific discriminant spatial patterns that maximize the difference in variance between the 4 center-out direction classes. As the exact time at which the intention to reach is formed in a self-paced movement remains unclear, this method can yield information about movement-related modulations in different brain regions during planning and how they evolve over time. We used the features selected from the training dataset (obtained from 5-fold cross-validation) to build a classifier (see below). The feature vector consisted of temporal amplitudes from the 10 channels with the highest discriminant power (DP) for each sliding window. We further reduced the data dimensionality by subsampling to 16 Hz for classification, thus forming a vector of 40 features (10 channels × 4 points) within each 250 ms window.
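The window arithmetic can be checked with a short sketch. It assumes that consecutive windows are shifted by 62.5 ms and that subsampling to 16 Hz is done by plain decimation. The helper `window_features` and the use of the first 10 channels as the selected set are hypothetical stand-ins for the CVA-based channel selection.

```python
import numpy as np

FS = 128                      # Hz, sampling rate of the pre-processed EEG
WIN = 0.250                   # s, window length
STEP = 0.0625                 # s, shift between consecutive windows
T_START, T_END = -2.0, 1.0    # s, epoch limits relative to movement onset

win_len = int(round(WIN * FS))                  # 32 samples
step_len = int(round(STEP * FS))                # 8 samples
n_epoch = int(round((T_END - T_START) * FS))    # 384 samples

# First sample index of every window that fits entirely inside the epoch.
starts = np.arange(0, n_epoch - win_len + 1, step_len)
# Endpoint of each window relative to onset (the time convention of the text).
end_times = T_START + (starts + win_len) / FS

def window_features(epoch, start, channels, fs=FS, target_fs=16):
    """Feature vector of one window: selected channels, subsampled amplitudes."""
    seg = epoch[np.asarray(channels), start:start + win_len]
    seg = seg[:, :: fs // target_fs]    # 32 samples -> 4 points per channel
    return seg.reshape(-1)              # 10 channels x 4 points = 40 features

epoch = np.random.default_rng(1).standard_normal((34, n_epoch))
feat = window_features(epoch, starts[0], channels=range(10))
```

Under these assumptions there are 45 windows, whose endpoints run from −1.75 s to 1 s in steps of 62.5 ms, and each window yields a 40-dimensional feature vector.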

For classification of movement directions, we relied on Linear Discriminant Analysis (LDA). We built an LDA classifier for each time window. LDA is a simple approach to classification where the samples from each class are modeled with a normal distribution and it is assumed that they have the same covariance matrix (Duda et al., 2001). The probability that the correct class is *y* given a sample *x* can be defined using Bayes' rule:

$$P(C=y|\mathbf{x}) = \frac{P(\mathbf{x}|C=y)P(C=y)}{P(\mathbf{x})} \tag{1}$$

The classification of a sample *x* is given by argmax<sub>*y*</sub> *P*(*C* = *y*|*x*) over all classes. The data distribution *P*(*x*|*C* = *y*) is assumed to be normal for each class and is modeled using the same covariance matrix, **Σ**, for all classes:

$$P(\mathbf{x}|C=y) = \frac{1}{(2\pi)^{\frac{d}{2}}|\boldsymbol{\Sigma}|^{\frac{1}{2}}}e^{-\frac{1}{2}(\mathbf{x}-\boldsymbol{\mu}_{y})^{T}\boldsymbol{\Sigma}^{-1}(\mathbf{x}-\boldsymbol{\mu}_{y})}\tag{2}$$

where *d* is the dimensionality of the feature vector *x* and **μ**<sub>*y*</sub> is the mean of class *y*.
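A minimal NumPy sketch of such a classifier is given below, assuming class priors estimated from the training labels. The class name `SharedCovarianceLDA`, the small ridge term `reg`, and the toy data are assumptions introduced here for illustration. This is not the implementation used in the study.

```python
import numpy as np

class SharedCovarianceLDA:
    """Gaussian classifier with one mean per class and a pooled covariance."""

    def fit(self, X, y, reg=1e-6):
        X, y = np.asarray(X, dtype=float), np.asarray(y)
        self.classes_ = np.unique(y)
        n, d = X.shape
        self.means_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        self.priors_ = np.array([np.mean(y == c) for c in self.classes_])
        # Pooled within-class covariance. The ridge term `reg` is an assumption
        # added for numerical stability only.
        centered = X - self.means_[np.searchsorted(self.classes_, y)]
        cov = centered.T @ centered / (n - len(self.classes_))
        self.cov_inv_ = np.linalg.inv(cov + reg * np.eye(d))
        return self

    def scores(self, X):
        """log P(x|C=y) + log P(C=y), dropping terms shared by all classes."""
        X = np.asarray(X, dtype=float)
        diff = X[:, None, :] - self.means_[None, :, :]      # (n, classes, d)
        maha = np.einsum("ncd,de,nce->nc", diff, self.cov_inv_, diff)
        return -0.5 * maha + np.log(self.priors_)

    def predict(self, X):
        return self.classes_[np.argmax(self.scores(X), axis=1)]

# Toy data: 4 classes (directions) in a 40-dimensional feature space.
rng = np.random.default_rng(0)
centers = 3.0 * rng.standard_normal((4, 40))
labels = np.repeat(np.arange(4), 60)
X = centers[labels] + rng.standard_normal((240, 40))
clf = SharedCovarianceLDA().fit(X[::2], labels[::2])
pred = clf.predict(X[1::2])
```

Because the covariance matrix is shared, the normalization factor of Equation (2) and the denominator of Equation (1) are identical for all classes, so only the Mahalanobis term and the log prior need to be compared.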

Finally, the performance of our method is evaluated using a 5-fold cross-validation procedure that maintains the chronological order of trials when partitioning the training and testing data (Lemm et al., 2006; Bourdaud et al., 2008). This method yields a more realistic estimate of accuracy than random splitting of trials from the entire recording session.
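One common way to realize such a partition is to use contiguous blocks of trials as test folds, as sketched below. This is an assumed reading of the procedure, and the helper name `chronological_folds` is hypothetical.

```python
import numpy as np

def chronological_folds(n_trials, n_folds=5):
    """Yield (train_idx, test_idx); each test fold is a contiguous trial block."""
    indices = np.arange(n_trials)     # trials assumed sorted by recording time
    for test_idx in np.array_split(indices, n_folds):
        train_idx = np.setdiff1d(indices, test_idx)
        yield train_idx, test_idx

folds = list(chronological_folds(230, n_folds=5))
```

With 230 trials, each of the 5 test folds contains 46 consecutive trials, and no trial appears in both the training and the testing set of the same fold.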

The movement direction decoding accuracy (DA) used in this paper is derived from the confusion matrix, which relates the actual class labels (i.e., the 4 target directions) to the classified labels (predicted output), and where the sum of the diagonal elements *n<sub>ii</sub>* gives the number of correctly classified trials. DA is defined as the ratio of correct predictions to the total number of trials and measures the sensitivity rate. A value of 1 denotes perfect separation between the movement directions.
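As a small worked example of this definition, the sketch below builds a confusion matrix from hypothetical labels and computes DA as the trace divided by the total count. The label values are made up for illustration.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes=4):
    """n[i, j] = number of trials of actual class i classified as class j."""
    n = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        n[t, p] += 1
    return n

def decoding_accuracy(n):
    """DA = sum of the diagonal elements n_ii divided by the number of trials."""
    return np.trace(n) / n.sum()

# Hypothetical labels: 0 = up, 1 = down, 2 = left, 3 = right.
y_true = [0, 0, 1, 1, 2, 2, 3, 3]
y_pred = [0, 1, 1, 1, 2, 0, 3, 3]
cm = confusion_matrix(y_true, y_pred)
da = decoding_accuracy(cm)   # 6 correct out of 8 trials
```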

As mentioned above, we want to evaluate how early before movement onset the movement direction can be predicted. We calculated the chance level by training several classifiers on randomly permuted labels of the training set (10 × 5-fold cross-validation). The chance level is derived from the average performance of these classifiers and is always shown as a red horizontal dotted line in the Results Section.
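The permutation procedure can be sketched as follows. For brevity, a nearest-class-mean rule stands in for the LDA classifier, contiguous blocks are used as test folds, and the data are synthetic. All function names are hypothetical and these choices are assumptions of this sketch.

```python
import numpy as np

def nearest_mean_predict(X_train, y_train, X_test):
    """Stand-in classifier (assumption): assign to the closest class mean."""
    classes = np.unique(y_train)
    means = np.array([X_train[y_train == c].mean(axis=0) for c in classes])
    dist = ((X_test[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
    return classes[np.argmin(dist, axis=1)]

def permutation_chance_level(X, y, n_repeats=10, n_folds=5, seed=0):
    """Average accuracy of classifiers trained on shuffled training labels."""
    rng = np.random.default_rng(seed)
    blocks = np.array_split(np.arange(len(y)), n_folds)   # contiguous test folds
    accuracies = []
    for _ in range(n_repeats):
        for test_idx in blocks:
            train_idx = np.setdiff1d(np.arange(len(y)), test_idx)
            shuffled = rng.permutation(y[train_idx])   # break feature-label link
            pred = nearest_mean_predict(X[train_idx], shuffled, X[test_idx])
            accuracies.append(np.mean(pred == y[test_idx]))
    return float(np.mean(accuracies))

# Synthetic data: 4 balanced classes, so the estimate should be near 0.25.
rng = np.random.default_rng(42)
y = rng.permutation(np.repeat(np.arange(4), 60))
X = rng.standard_normal((240, 40)) + 2.0 * np.eye(4, 40)[y]
chance = permutation_chance_level(X, y)
```

Only the training labels are shuffled, so the test labels remain the true ones. Estimating chance empirically in this way accounts for class imbalance and finite sample size, which a theoretical value of 0.25 would ignore.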

In addition to assessing the sensitivity rate of our classifiers during the intention period, [−2, 1] s around movement onset, we also evaluated their specificity during the idle period, in which subjects are not supposed to prepare for the reaching movement, namely from 1 s before the visual target cue to 2 s after the cue.

## **3. RESULTS**

#### **3.1. PREDICTING MOVEMENT DIRECTIONS: ABLE-BODIED SUBJECTS**

**Figure 2** shows a summary of results obtained from single-trial classification of movement directions from SCPs ([0.1–1] Hz) for the able-bodied subjects, C1 and C2, when using their dominant arm (right in both cases). The topographic plots in **Figure 2A** depict the selected channels based on the ranking of the discriminability power at 500, 250, 125, and 0 ms preceding the onset of movement (channels marked in red are the 10 highest-ranked channels). These topographic maps show the brain regions that carried the most directional information. **Figure 2B** shows the average and standard deviation (gray shaded area) of single-trial DA of movement direction from time −1.750 to 1 s. We tested whether DA was significantly above chance level (shown as a red dotted line) at the 95% confidence level using the non-parametric Wilcoxon rank-sum test. The green vertical line in this graph marks the first time at which the DA values of a group of five consecutive samples are significantly above chance level (*p* < 0.05).
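The early-detection criterion can be sketched as a search for the first run of five consecutive significant windows. The sketch assumes that a p-value is already available for each window (for instance from the rank-sum test) and that the reported time is that of the first window in the run. The p-values below are hypothetical.

```python
def first_sustained_onset(p_values, times, alpha=0.05, run_length=5):
    """Time of the first window that starts `run_length` consecutive windows
    with p < alpha, or None if no such run exists."""
    run = 0
    for i, p in enumerate(p_values):
        run = run + 1 if p < alpha else 0
        if run == run_length:
            return times[i - run_length + 1]
    return None

# Hypothetical p-values for window endpoints from -1000 ms to -375 ms.
times_ms = [-1000.0 + 62.5 * k for k in range(11)]
p_vals = [0.40, 0.03, 0.20, 0.04, 0.01, 0.30, 0.02, 0.01, 0.004, 0.001, 0.001]
onset = first_sustained_onset(p_vals, times_ms)   # -625.0
```

Requiring a sustained run prevents isolated significant windows, such as the one at −937.5 ms in this example, from being reported as early detection.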

For subject C1, DA rose above chance level at 687.5 ms before onset (*t* = 0 s) using amplitudes of on-going SCPs, where performance consistently increased until onset of movement and remained at high values afterwards indicating that directional discriminant neural signatures are continuously decoded during movement. The most discriminant channels for this subject are located in the frontal, parietal and ipsilateral regions starting from time 375 to 0 ms prior to onset.

At this stage of the analysis, we have built classifiers tuned to each time window. This allows us to pinpoint the most pertinent features for decoding movement direction over time. This approach is quite challenging because the onset of self-paced movements has a higher variability than that of cue-based movements. In order to explore an online implementation of SCP-based approaches, we identified that the time window at 62.5 ms *before movement onset* yielded the highest DA (0.83 ± 0.05) and contains the most discriminant features for decoding directions (see **Table 2**). In this paper, windows after onset are not taken into consideration as they represent movement execution rather than movement intention. We then used the features and classifier associated with the window with the peak DA to test the decoding performance during the intention period. **Figure 2C** illustrates the corresponding DA, which climbed above chance level as early as 500 ms before onset. DA increases until 62.5 ms before movement onset, and then decreases after movement onset.

In addition to aiming for high sensitivity (high DA) during the intention period, it is also desirable to achieve high specificity (low false positive rate) during the idle period. **Figure 2D** shows that the selected SCP-based classifier performs at random level during the idle period.

**Table 2** summarizes the results of the SCP-based direction decoding for the able-bodied subjects. Regardless of which arm was used, the best decoding performance for both subjects (considering only time before onset) occurred at 62.5 ms before movement onset, with a maximal DA of 0.83 for subject C1. The average DA across folds for each subject was slightly lower for the non-dominant arm. Performances with time-specific classifiers first exceeded chance level (early detection) between 875 and 437.5 ms before onset and reached DA values above 0.8 after movement onset. For both subjects, performance was significantly above random level during the intention period and had a rather low variance. Importantly, performances are quite similar when using the selected time-specific classifier with the best DA (see also **Figure 2C**). In this case, direction was decoded slightly later, between 500 and 312.5 ms before onset. Also, as shown in **Figure 2D**, for both subjects and arms DA was at random level during the idle period.

**Figure 3** depicts the channels selected at the window with the highest DA. They represent the brain regions with the highest discriminability power to classify the 4 targets, for the left and right arms of the two able-bodied subjects. Results from both subjects displayed an evident fronto-parietal network, especially when reaching with the right arm. For subject C1, this network is shifted toward the frontal and bilateral central regions when reaching with the left (non-dominant) arm. For subject C2, the ipsilateral central-parietal areas are more discriminant for decoding reaching directions than the contralateral region for the non-dominant arm. The localization of brain areas will be further analyzed in the Discussion Section.

#### **3.2. PREDICTING MOVEMENT DIRECTIONS: STROKE PATIENTS**

We evaluated our SCP-based method to decode movement direction when stroke patients performed the reaching task, notably with their paretic arm (see **Figure 4**). As for able-bodied subjects, we first built time-specific classifiers and then selected the best one (highest DA before movement onset) to test sensitivity and specificity. For patient P1 (first panel), the channels selected from SCPs preceding onset were strongly focused on the centroparietal regions (**Figure 4A**), with bilateral activation of motor areas toward the time of movement execution. DA of time-specific classifiers (**Figure 4B**) started to exceed chance level at 1000 ms before onset of movement. The maximum DA was 0.51 at 250 ms before onset. Using the selected classifier during the intention period (**Figure 4C**), DA crossed chance level at 1475 ms before onset. However, DA decreased to random level shortly after, exceeded chance level again at 550 ms before onset, and steadily increased until onset of movement. Thereafter, DA remained above chance until 500 ms after onset. This selected classifier performed at random level during the idle period (**Figure 4D**).

**Table 2 | Summary of decoding performances before movement onset for able-bodied subjects.**

*Time corresponds to the endpoint of the sample window with respect to movement onset (t = 0 ms).*

For patient P2, second panel, CVA selected channels located mainly in the lateral parietal region starting from 500 ms before onset (**Figure 4A**). Time-specific classifiers reached a peak DA of 0.45 at time 62.5 ms before movement onset, while DA exceeded chance level at 1750 ms before onset (**Figure 4B**). However, when using the selected classifier, DA crossed chance level at 625 ms before onset (**Figure 4C**). This selected classifier also performed at random level during the idle period (**Figure 4D**).

Patient P3 (third panel) exhibits DA trends similar to those of the other patients, although discriminant features were found mostly in the central and frontal areas (**Figure 4A**). Note that P3 had a frontal and left parietal area lesion. Using time-specific classifiers, DA rose above chance level at 1062.5 ms before onset, peaking at 125 ms before onset with a value of 0.46 (**Figure 4B**). The selected classifier climbed over chance level at 312.5 ms before movement onset (**Figure 4C**), while never exceeding random performance during the idle period (**Figure 4D**).

**Table 3** summarizes the results of the SCP-based direction decoding for the stroke patients. The DA values obtained for the paretic arm were above 0.45 for all subjects. This performance was reached between 250 and 62.5 ms before onset. Once movement started, DA reached values between 0.51 and 0.73. Regarding early detection (i.e., when DA exceeded chance level), time-specific classifiers achieved it earlier than the selected fixed classifiers (between −1750 and −1000 ms vs. between −625.0 and −312.5 ms, respectively). As for able-bodied subjects, performance was significantly above random level during the intention period and had a rather low variance, for both time-specific and selected classifiers. Also, DA was at random level during the idle period. In summary, patients achieved a lower performance than able-bodied subjects, but early detection happened at similar times.

#### **3.3. DIRECTION-RELATED SPECTRAL AND PHASIC MODULATIONS OF EEG ACTIVITY**

We evaluated direction-specific modulations in several EEG frequency bands, comprising SCPs (0.1–1 Hz), delta (1–4 Hz), theta (4–8 Hz), alpha (7–13 Hz), beta (13–20 Hz), high beta (20–30 Hz) and low gamma (30–45 Hz). **Figure 5** shows the DAs for both left and right arm movements of the able-bodied subjects. The x-axis of each plot corresponds to the endpoint of each decoding window with respect to the movement onset (time = 0 s) and the y-axis shows the frequency bands. We observed direction-specific modulations in both SCPs and delta band activity. In all cases, SCPs showed DAs above chance level before movement onset, although DAs were higher during movement execution. Moreover, narrow-band filtered SCPs ([0.1–1] Hz) seem to provide information that may allow earlier decoding of movement directions as compared to the wider band SCPs. We also observed performances exceeding chance level when signals were filtered in the delta band, which has been studied by Waldert et al. (2008) using signals below 4 Hz.

**Table 3 | Summary of decoding performances before movement onset for stroke patients, paretic arm.**

*Time corresponds to the endpoint of the sample window with respect to movement onset (t = 0 ms).*

In the case of the stroke group (see **Figure 6**), SCPs yield higher DAs than other frequency ranges. As with the able-bodied subjects, discriminant modulations were observed in SCPs, although the delta band displayed random performance. During the intention period, DA for the stroke group rose above chance earlier than for the able-bodied group.

The use of phase-based features has remained unexplored for decoding movement direction. However, this information has been used in classifying motor imagery-based BCI (Wang et al., 2006; Hamner et al., 2011), auditory target selection (Ng et al., 2013) and decoding continuous movement trajectories (Hammer et al., 2013). We used the instantaneous phase computed using the Hilbert transform to explore the decoding power of these features. **Figure 7** shows that the DA of phase-based decoding exceeds chance level approximately 250 ms before movement onset for both able-bodied subjects. However, the maximal DA values were lower than when using SCP amplitudes. Analysis of data from stroke patients showed random decoding performances. Despite these less promising results, the use of phase information can be further explored by studying amplitudes and frequency coupling, as well as entrainment. Indeed, high gamma amplitudes coupled with the phase of low-frequency alpha and theta during waiting (pre-movement) periods in a cued grasping task have previously been used to predict movement types (Yanagisawa et al., 2012). Similarly, Miller et al. (2012) found increased beta-phase entrainment from ECoG signals recorded during the no-movement periods in finger flexion.
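The instantaneous phase mentioned above can be obtained from the analytic signal, as in the following sketch based on SciPy. The helper name `instantaneous_phase` and the toy oscillation are illustrative assumptions. How the phase values were turned into classifier features is not specified in the text and is not reproduced here.

```python
import numpy as np
from scipy.signal import hilbert

def instantaneous_phase(x):
    """Phase angle (radians) of the analytic signal along the last axis."""
    return np.angle(hilbert(x, axis=-1))

fs = 128.0
t = np.arange(0, 4, 1 / fs)
x = np.cos(2 * np.pi * 2.0 * t + 0.5)    # 2 Hz oscillation with a 0.5 rad offset
phase = instantaneous_phase(x)           # wrapped version of 2*pi*2*t + 0.5
```

Since phase is a circular quantity, one possible design choice (an assumption, not the authors' stated method) is to feed a linear classifier with the sine and cosine of the phase rather than the wrapped angle itself.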

## **4. DISCUSSION**

This preliminary study demonstrates the feasibility of decoding directions of self-paced arm reaching before movement execution from EEG slow cortical potentials. This is also the first time that such a possibility is shown in stroke patients.

Results show good sensitivity (decoding accuracy, DA, significantly above chance level during the intention period) and good specificity (DA at chance level during the idle period), which are key requirements for potential use in real-time rehabilitation interventions. This level of specificity indicates that decoding is due to direction-related features from SCPs appearing before onset and not generated by the visual cue. Although promising, the results achieved with stroke patients must be replicated with a larger population. Based on the comparison of different EEG frequency bands (i.e., SCP, delta, theta, alpha, beta, and gamma), we have observed that movement directions can be decoded significantly above chance level using signals filtered at low frequencies (<4 Hz), with SCP yielding the best performance in terms of accuracy and early detection. We have used a systematic approach (previously tested for detecting onset of self-paced movements from invasive and non-invasive recordings) to select discriminant features for decoding reaching movements to four directions at different times before movement onset. The outcome of this feature selection process is used to identify the relevant neural signatures associated with the intention to reach targets in different directions. Consistent with existing literature, we observed a fronto-parietal pattern of activity in the able-bodied group, and a more parietal pattern for the stroke group.

#### **4.1. ON THE ROLE OF FRONTO-PARIETAL NETWORKS IN THE PREPARATION OF REACHING MOVEMENTS**

For both able-bodied subjects, the time-resolved channel selection (**Figure 2A**) presented a change from an initial bilateral pattern to a dominant ipsilateral activation between 500 and 250 ms prior to movement onset, coinciding with the increase in decoding performance. Activation of the ipsilateral primary motor area seems to be required for the execution of challenging unimanual motor tasks in normal subjects (Roland et al., 1980; Kim et al., 1993; Kobayashi et al., 2003). The selected classifiers contain such ipsilateral primary motor features (**Figure 3**).

More prominently, our results showed that in the period preceding the movement onset, there is a discriminative pattern involving frontal and/or parietal areas for both able-bodied subjects and stroke patients. Our findings are in agreement with a number of studies, with both humans and non-human primates, where the fronto-parietal brain region seems to play a critical role in planning a reach movement. In a center-out task, Musallam et al. (2004) and Quian Quiroga et al. (2006) studied neural signals related to the goals (direction) of movement from electrodes implanted in the parietal reach region (PRR) of monkeys. Using the memory period activity in a cued paradigm (reflecting monkeys' intent before the "go" signal) from eight PRR neurons, four targets were correctly decoded with 64.4% accuracy.

With respect to human studies, ventral areas of the prefrontal cortex seem to encode spatial information. Intracranial EEG recordings during a memory task allowed decoding left vs. right target in single-trial movements using either temporal evoked activity or spectral activity with performance between 70 and 80% (Rizzuto et al., 2005). On the other hand, neuroimaging studies showed activation in the PRR, potentially encoding information related to the subject's intention to make a movement toward a particular spatial location (Connolly et al., 2003). Furthermore, using fMRI, Naranjo et al. (2007) showed an evolution of the cortex activation during movement preparation starting from frontal and parietal areas, slowly becoming more focused on the frontal cortex 500 ms before movement. Gallivan et al. (2011) decoded grasping top or bottom direction (2-class task) from the BOLD signal with an accuracy of 55%, which suggested brain activation of the parietal and frontal regions during planning. Most of these studies employed a cue-based paradigm. A similar self-paced study with human ECoG signals has shown a steep rise in decoding accuracy starting from 200 ms before movement onset, peaking at 500 ms post movement (67%), based on spectral amplitude modulations in low frequencies and the high gamma band from M1 and pre-motor cortex (Ball et al., 2009). An EEG study on visuomotor adaptation during self-initiated center-out hand movements has shown the involvement of the fronto-parietal regions in healthy subjects (Contreras-Vidal and Kerick, 2004). In apparent contrast with our observations, based on the findings from Nenadic et al. (2007) with human intracranial EEG from supplementary motor and parietal areas, the signal in the period 500 ms after the appearance of the target stimulus can be decoded with accuracies 20% higher than the period before onset in a cue-based protocol.

#### **4.2. PERFORMANCE COMPARISON WITH PREVIOUS EEG STUDIES IN DECODING MOVEMENT DIRECTION**

To the best of our knowledge, all previous works aiming at decoding movement directions from non-invasive brain signals (EEG mainly) utilized cue-based protocols. Also, these studies were performed with able-bodied subjects (c.f., **Table 4**). Column *Type* refers to when decoding was attempted, either before onset (*Intention*) and/or during movement (*Execution*).

In relation to the brain regions involved in the execution of reaching movements, Waldert et al. (2008) showed with magnetoencephalography (MEG) that the motor-related areas are responsible for the execution of reaching. The other works listed in **Table 4** emphasize that decoding of movement direction before onset is correlated with activity in the frontal and parietal areas. Reaching direction planning was decoded using the first 500 ms right after the visual stimulus presentation with performances between 57 and 59% for 4 directions (Hammon et al., 2008) and 80.25% for 2 directions (Wang and Makeig, 2009). Recently, Robinson et al. (2013) reported a maximum decoding accuracy of 80% for 4 directions using features extracted from low-frequency components of EEG taken from the entire [−1, 1] s window with respect to onset (i.e., including the signal during movement execution). As in our case, they also pointed to the contribution of SCPs (in particular, motor-related potentials or MRPs, already known to capture preparation-related modulations) for such decoding.

#### **4.3. ROLE OF SCPs IN UNDERSTANDING SPATIAL INTENTION**

Our results showed that narrow-band SCPs contain information that may allow earlier decoding of movement directions as compared to broad-band SCPs. Such a level of decoding requires proper pre-processing techniques (Garipelli et al., 2013) to enhance the SNR of SCPs through the use of spectral and spatial filters. This finding is consistent with previous works on detection of movement intention (Lew et al., 2012a) and movement execution (Niazi et al., 2012; Robinson et al., 2013; Xu et al., 2014), which showed the advantage of low frequency EEG components.

Surface EEG consists of electrical activity generated by different sources in the active intracranial tissue, and negative SCPs reflect the unspecific thalamo-cortical activation of a cortical area (Birbaumer, 1999). The first use of SCPs in BCI was through self-regulation of cortical excitability by a completely paralyzed patient (Kuebler et al., 1998). In recent years, a growing number of studies have used slow potentials, also known as the low frequency component (LFC) of measured neuronal population signals, for example for decoding movement trajectories (Bradberry et al., 2010) and for detecting movement intention (Niazi et al., 2011; Xu et al., 2014). LFCs have also been used in invasive studies exploiting LFPs (Rickert et al., 2005) and ECoG for decoding movement parameters (Milekovic et al., 2012; Hammer et al., 2013). As thalamic activation can be regarded as the allocation of attentional or processing resources toward a specific cortical region, this view has instigated the use of SCPs for studying retention of working memory from the viewpoint of attention (Bosch et al., 2001). The authors reported that retention of spatial locations in working memory was associated with a combination of slow waves over frontal and parietal-occipital sites. Within the framework of our experimental protocol, it seems likely that there was a form of memory retention after the presentation of the visual cue, in the form of spatial memory. A debatable question that follows is the role of working memory in our task and how to better elicit internally driven intention to prevent the confound of decoding memory retention of the spatial location. This can be accomplished by modifying the experimental design to a self-initiated target selection in order to exclude any form of memory retention. On the other hand, we could explore the feasibility of decoding spatial memory retention from EEG signals.

A potential limitation of SCPs for real-time implementation is the significant group delay introduced by filtering, which may be a problem for applications requiring prompt response. Xu et al. (2014) have successfully shown their use in a closed-loop online implementation with a true positive rate of 79% at a latency of 315 ms. Further studies could be done to assess this speed-accuracy trade-off for real-time implementation.

#### **4.4. COMPARING DETECTION OF MOVEMENT INTENTION AND PREDICTION OF MOVEMENT DIRECTION**

**Table 5** compares the early detection times (when DA exceeds chance level with performance evaluated using the best classifier) of both the intention to initiate the self-paced movement (Lew et al., 2012a) and the predicted direction. Which component should be detected earlier remains an open question. For all subjects but C1, movement intention was detected earlier than direction. For subject C1, however, detection only differs by 25 ms. Subject P1 deserves an additional comment. Although early detection of direction appeared at −1437.5 ms, decoding was not stable and rapidly decreased to random level. Only at −550 ms direction decoding remained above chance level until movement onset. It seems then that discriminant information about movement onset shortly precedes direction-related information.

As a future direction, we will explore the use of both decoders, either in parallel or sequentially, to enhance the reliability of upper-limb neuroprostheses. Fusion of both kinds of decoders can also be applied during online operation of the neuroprosthesis so as to achieve continuous control. Another extension is to incorporate eye movement tracking into the trajectory model for decoding directional reaches (Corbett et al., 2012).

**Table 5 | Comparison between detection of movement intention and prediction of movement direction.**

**Table 4 | Non-invasive methods used for decoding movement directions.**

**Table 6 | Time to complete reaching movements and onset time (interval between target cue presentation to start of movement).**

*Able-bodied subjects: dominant arm; stroke patients: paretic arm.*

#### **4.5. STROKE PATIENTS AND ABLE-BODIED SUBJECTS**

Our results show differences in decoding performance between able-bodied subjects and stroke patients. A reason that could explain this difference is the time required to complete the reaching movement (see **Table 6**). Able-bodied subjects completed the reaching movement in less than 700 ms, while stroke patients took more than 1000 ms when using their affected limb. These differences were statistically significant (*p* < 0.001, two-tailed Student's *t*-test). These behavioral differences are in line with other studies comparing motor deficits after stroke (Cirstea and Levin, 2000). Despite marked differences in execution times, further analysis showed that trajectories were similarly smooth for able-bodied subjects and stroke patients.

Besides the difference in reaching speed, the age difference between the able-bodied and patient groups could also be a reason for the approximately 30% lower decoding performance of the stroke patients. The issue of age-related differences in BCI performance has been investigated in some studies (Vesco et al., 1993; Friedman et al., 1997; Allison et al., 2010), which reported differences in amplitudes, latencies and scalp topography. Therefore, a fair comparison between able-bodied subjects and stroke patients is only possible using age-matched groups in order to avoid potential confounds. In addition, there are many reports of EEG abnormalities caused by cerebrovascular disease (CVD) (Niedermeyer, 1982; Pfurtscheller et al., 1984). These differences depend on the location and size of the damage and the time elapsed between stroke and EEG recording, thus resulting in distinct motor-related potentials as compared to able-bodied people (Colebatch, 2007).

Robot-assisted therapy for stroke patients with moderate-to-severe upper-limb deficits has shown promising results in terms of improving motor functional recovery compared to traditional therapy (Kwakkel et al., 2008). Ang et al. (2014) have shown that motor gains obtained with BCI-based therapy were comparable to those attained with intensive robotic therapy. The method proposed in this paper can be further verified in an online implementation to control a robotic arm and, later, in combination with rehabilitation robotics (Krebs et al., 2003) for motor recovery of spinal cord injury and stroke patients<sup>1</sup>. For this purpose, it is important to assess the stability of brain patterns across days (i.e., to determine how MRPs change during the process of functional recovery). Furthermore, in a realistic scenario, it is also important to study the effect of feedback generated by the robot-assisted passive movement on the stability of the brain patterns.

## **FUNDING**

This work is supported by the European ICT Programme Project FP7-224631, Swiss NCCR "Robotics," S. Camillo Hospital Foundation, and Italian Ministry of Health.

#### **ACKNOWLEDGMENTS**

Authors warmly thank L. Tonin, S. Degallier, G. Cisotto, and C. Genna for their precious help with recordings. This paper only reflects the authors' views and funding agencies are not liable for any use that may be made of the information contained herein.

## **REFERENCES**


<sup>1</sup>The University of Texas Health Science Center, Houston, "Brain Machine Interface Control of an Robotic Exoskeleton in Training Upper Extremity Functions in Stroke," *ClinicalTrials.gov*, April 2014, http://clinicaltrials.gov/ct2/show/NCT01948739




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 15 March 2014; accepted: 07 July 2014; published online: 01 August 2014. Citation: Lew EYL, Chavarriaga R, Silvoni S and Millán JdR (2014) Single trial prediction of self-paced reaching directions from EEG signals. Front. Neurosci. 8:222. doi: 10.3389/fnins.2014.00222*

*This article was submitted to Neuroprosthetics, a section of the journal Frontiers in Neuroscience.*

*Copyright © 2014 Lew, Chavarriaga, Silvoni and Millán. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Sitting and standing intention can be decoded from scalp EEG recorded prior to movement execution

## *Thomas C. Bulea<sup>1,2</sup>\*, Saurabh Prasad<sup>2</sup>, Atilla Kilicarslan<sup>2</sup> and Jose L. Contreras-Vidal<sup>2</sup>*

*<sup>1</sup> Functional and Applied Biomechanics Section, Rehabilitation Medicine Department, National Institutes of Health, Bethesda, MD, USA*

*<sup>2</sup> Laboratory for Non-invasive Brain-Machine Interface Systems, Department of Electrical and Computer Engineering, University of Houston, Houston, TX, USA*

#### *Edited by:*

*Jose L. Pons, Consejo Superior de Investigaciones Científicas, Spain*

#### *Reviewed by:*

*Reinhold Scherer, Graz University of Technology, Austria Alireza Mousavi, Brunel University, UK*

#### *\*Correspondence:*

*Thomas C. Bulea, Functional and Applied Biomechanics Section, Rehabilitation Medicine Department, National Institutes of Health, Building 10, CRC, Room 1-1469, 10 Center Dr., MSC-1604, Bethesda, MD 20892, USA e-mail: thomas.bulea@nih.gov*

Low frequency signals recorded from non-invasive electroencephalography (EEG), in particular movement-related cortical potentials (MRPs), are associated with preparation and execution of movement and thus present a target for use in brain-machine interfaces. We investigated the ability to decode movement intent from delta-band (0.1–4 Hz) EEG recorded immediately before movement execution in healthy volunteers. We used data from epochs starting 1.5 s before movement onset to classify future movements into one of three classes: stand-up, sit-down, or quiet. We assessed classification accuracy in both externally triggered and self-paced paradigms. Movement onset was determined from electromyography (EMG) recordings synchronized with EEG signals. We employed an artifact subspace reconstruction (ASR) algorithm to eliminate high amplitude noise before building our time-embedded EEG features. We applied local Fisher's discriminant analysis to reduce the dimensionality of our spatio-temporal features and subsequently used a Gaussian mixture model classifier for our three class problem. Our results demonstrate significantly better than chance classification accuracy (chance level = 33.3%) for the self-initiated (78.0 ± 2.6%) and triggered (74.7 ± 5.7%) paradigms. Surprisingly, we found no significant difference in classification accuracy between the self-paced and cued paradigms when using the full set of non-peripheral electrodes. However, accuracy was significantly increased for self-paced movements when only electrodes over the primary motor area were used. Overall, this study demonstrates that delta-band EEG recorded immediately before movement carries discriminative information regarding movement type. Our results suggest that EEG-based classifiers could improve lower-limb neuroprostheses and neurorehabilitation techniques by providing earlier detection of movement intent, which could be used in robot-assisted strategies for motor training and recovery of function.

**Keywords: EEG, electroencephalography, movement-related cortical potentials, classification, brain-machine interface, mobile neuroimaging, lower extremity**

## **INTRODUCTION**

Robot-assisted therapies have shown promising results, compared to traditional therapy, for functional recovery of movement after injury in the upper and lower extremities (Winchester et al., 2005; Hogan and Krebs, 2011). These neurorehabilitation paradigms could be improved by faster and more robust detection of movement intent where it originates in the brain. Incorporation of a brain machine interface (BMI) can reduce the latency between motor planning in the cortex and activation of a device to execute (or assist) the movement, thereby enhancing the opportunity for brain plasticity and motor recovery (Daly and Wolpaw, 2008). The intuitive nature of a BMI based on signals directly related to intended movement could be advantageous for rehabilitation by expediting adaptation of the brain to the BMI algorithm and the robotic device. Electroencephalography (EEG) provides a non-invasive method for imaging brain activity with enough time resolution to exert control over an assistive device. Many strategies for deploying EEG in a BMI by detecting movement intent (imagined and real) have been reported (Pfurtscheller et al., 1996, 2006; Wolpaw et al., 2002; Millán et al., 2004; Qin et al., 2004; Hung et al., 2005; Morash et al., 2008). These systems typically leverage one of two phenomena to detect movement intent: event related (de)synchronization (ERD/ERS) and movement related slow cortical potentials (MRPs). ERD, a decrease of power in alpha and beta bands, is typically localized to the contralateral sensorimotor areas before movement while ERS, a power increase, has been observed after movement (Pfurtscheller and Lopes da Silva, 1999). Modulation of these sensorimotor rhythms has been employed for classification of imagined (Pfurtscheller et al., 2006; Pfurtscheller and Neuper, 2006) and executed (Morash et al., 2008) movements with some success. 
ERD has also shown capacity to categorize gross lower extremity tasks, including differentiation of right and left leg motor imagery (Boord et al., 2010) and identification of imagined standing (Zhong et al., 2007).

MRPs are slow negative potentials observed in EEG preceding movement. MRPs can be divided into two segments: the first begins as early as 2 s before movement onset and has been observed over the entire pre-supplementary motor area (SMA), and over the SMA and lateral premotor cortex according to somatotopic organization (Ikeda et al., 1992; Hallett, 1993; Shibasaki and Hallett, 2006; Bai et al., 2011). The second, or late, segment typically has a steeper negative slope and is observed in the contralateral primary motor cortex (M1) and lateral premotor cortex according to precise somatotopic arrangement. These potentials are well established in upper and lower extremity movements both real and imagined (Boschert and Deecke, 1986; Shibasaki and Hallett, 2006). Interestingly, MRPs recorded from EEG preceding toe, foot, and ankle movements tend to be larger on the ipsilateral side of the brain, which is the opposite of upper extremity movements that create larger MRPs on the contralateral side (Brunia and Van Den Bosch, 1984; Boschert and Deecke, 1986). This paradoxical lateralization of the MRP during foot movements may be explained by its localization along the midline deep within the precentral gyrus of the motor cortex, thereby directing electrical current from activation of these cell columns to the opposite hemisphere.

The type and sequence of movement affects MRPs recorded from EEG. MRPs appear to be more pronounced during self-initiated movements compared to triggered movements (Jahanshahi et al., 1995; Cui and MacKinnon, 2009); the difference appears to be further enhanced if the timing of the triggered movements is variable (Jankelowitz and Colebatch, 2002). In the case of finger movements, force level (Slobounov et al., 2002), finger sequence (Bortoletto et al., 2011), and task complexity (Shibasaki and Hallett, 2006) all appear to modulate the MRP. MRP amplitude was found to be highly correlated to joint torque and electromyography (EMG) amplitude during isolated elbow flexion (Siemionow et al., 2000). In the lower extremity, the rate of torque development appears to influence the late MRPs preceding isolated ankle movements (do Nascimento et al., 2006). Slow negative shifts in EEG similar to MRPs have been observed during coordinated movements of the lower extremity, including rising onto the toes (Saito et al., 1996) and self-paced forward postural sway (Slobounov et al., 2005). The direction of gait initiation and stepping has been reported to influence both the slope and magnitude of MRPs (do Nascimento et al., 2005). These previously published studies suggest that slow developing, movement related potentials observed prior to movement may contain discriminative information regarding the movement that is being performed. Further, MRPs appear to provide an appropriate measure for timing of afferent feedback to induce long term potentiation of cortical projections. As demonstrated in the tibialis anterior muscle, only peripheral stimulation delivered at the peak of the MRP increased motor evoked potentials from transcranial magnetic stimulation (TMS) targeting the ankle area of the motor cortex (Mrachacz-Kersting et al., 2012).

Because of their small amplitude and low frequency content, the best way to extract MRPs from EEG recording is to average across many trials of the same movement. Single trial classification of movement intention from MRPs is possible, but achieving high accuracy can be difficult. Classification typically involves several steps, including signal pre-processing, feature extraction, dimensionality reduction, and finally feature classification (Bashashati et al., 2007). Numerous approaches to these steps have resulted in application of many machine learning, feature selection, and pattern recognition techniques for classification of movement intent and direction based on EEG signals (Garrett et al., 2003; Peterson et al., 2005; Bai et al., 2007; Lotte et al., 2007). The first example of a BMI-based spelling device utilized slow cortical potentials derived from a motor imagery task to provide individuals with amyotrophic lateral sclerosis control of a cursor on a screen (Birbaumer et al., 1999). Two individuals were able to achieve accuracies greater than 75% after 327 and 288 training sessions. Recent studies have demonstrated success in utilizing MRPs extracted via low frequency or delta band EEG, including classification of finger movement (Liao et al., 2007), joystick direction (Waldert et al., 2008), wrist movement direction (Vuckovic and Sepulveda, 2008), direction of a center out reaching task (Robinson et al., 2013), and movement intention in a self-paced reaching task (Lew et al., 2012). The latter study showed higher detection accuracy using the lower delta band than alpha (7–13 Hz) or beta (13–20 Hz) bands. MRPs have also been successfully deployed for classification of lower extremity movements. At the ankle, MRPs have been used to detect movement intention in healthy subjects with average accuracy of 82.5% for movement execution, and with slightly lower accuracy for motor imagery (64.5%) and attempted movement in stroke patients (55%) (Niazi et al., 2011). 
Similar accuracies were reported in a study that did not incorporate an individual-specific training phase (Niazi et al., 2013), further supporting the robustness of MRP as a BMI target. In addition to movement intention, MRPs recorded during imagined plantar flexion have also been used to distinguish between two different rates of torque development (do Nascimento and Farina, 2008). Recent studies have demonstrated that MRPs recorded from EEG can be deployed in real-time BMIs. In one, MRPs preceding imagined ankle dorsiflexion were identified online to trigger electrical stimulation of the tibialis anterior (Niazi et al., 2012). Not only did this study show feasibility of MRPs for use in a BMI, but it also demonstrated the potential benefits of BMI-based neurorehabilitation, since motor evoked potentials from TMS were enhanced following the intervention in healthy individuals. Another study showed that delta band EEG could reliably ascertain ankle movement initiation in real time with a mean latency of 315 ms (Xu et al., 2014).

In addition to detecting and classifying movement type, sparse networks of low frequency EEG have also been successful in decoding kinematics and EMG activity during various movements, including decoding of hand grasping patterns (Agashe and Contreras-Vidal, 2013), hand and finger velocity (Bradberry et al., 2010; Liu et al., 2011; Paek et al., 2014), and muscle synergies during reaching (Beuchat et al., 2013). Additionally, peri-movement neural activity representative of movement direction has been observed in electrocorticographic (ECoG) signals over primary motor, premotor, posterior-parietal, and lateral prefrontal cortex (Ball et al., 2009). Action intention can also be decoded from fMRI data recorded from a wide cortical network, spanning from the parieto-occipital sulcus through the prefrontal cortex, both preceding and during movement execution (Gallivan et al., 2011). Taken together, these studies suggest that non-invasive EEG recorded from large areas of the scalp immediately prior to movement execution could carry useful information about movement.

EEG has been used to examine cortical activity during gait, including studies demonstrating that intra-stride changes in spectral power are coupled to gait cycle (Gwin et al., 2011) and that level of user-involvement in robotic-assisted walking alters gait-related patterns of electrocortical activity (Wagner et al., 2012). Low frequency EEG also appears to carry useful information regarding walking. A recent study showed that features corresponding to frequencies less than 2 Hz were the most heavily weighted during single trial classification of walking and pointing direction (Velu and de Sa, 2013). Delta-band EEG was used to classify walking intention in one individual with paraplegia using a robotic exoskeleton with accuracy greater than 98% (Kilicarslan et al., 2013) and to decode lower limb kinematics during walking in healthy individuals (Presacco et al., 2011, 2012). MRPs have also been used with a matched filtering technique to detect single-trial step initiation (Jiang et al., 2014). An important consideration for application of low frequency EEG to the study of whole-body movements such as walking or sit/stand transition is the presence of movement-related artifacts. A recent study showed similar power spectral density patterns from an accelerometer mounted on the head and from EEG electrodes (Castermans et al., 2014). Interestingly, the patterns were similar only at higher walking speeds, while differences between the accelerometer and EEG were observed at slower speeds. The study did not compare spectral patterns from EEG during walking without the rigid plate and linkage assembly used to mount the accelerometer on the head, so the effect of its mass and inertia remains unknown. Also, the study did not employ active EEG electrodes, which provide amplification at the electrode to minimize movement artifacts and increase signal-to-noise ratio. 
Spatial filtering techniques, such as independent component analysis (Delorme et al., 2007), may be used to isolate gait-related artifact, but the effectiveness of these techniques is still under investigation. In one study, gait-related artifact remained in many independent components of EEG, resulting in development of a template subtraction technique to clean EEG collected during walking (Gwin et al., 2010). This type of template regression would not be appropriate for studying cortical contribution to locomotion because all signals coupled to the gait cycle would likely be removed. Another technique utilizes principal component analysis to compare sliding windows of EEG to a baseline recording, thereby removing high amplitude artifacts (Mullen et al., 2013); this approach may be better suited for removing movement artifacts but has not yet been applied to gait. Thus, the feasibility of utilizing EEG to study cortical activations during whole-body movement tasks is an ongoing area of research. Nevertheless, an inherent advantage of MRPs is their presence in EEG recorded before movement, when motion artifacts are minimized.

In this study we examined the use of non-invasive EEG recorded prior to movement execution to discriminate a user's intent to perform two coordinated whole body movements (rising from a seated to a standing posture and lowering from a standing to a seated posture) in a three class problem, where the third class constituted no movement or "quiet"; this class included data collected during quiet standing and quiet sitting. Based on the previous body of evidence regarding the discriminative nature of MRPs with regards to movement, we utilized delta band EEG to build our features for classification. We trained and tested our classifier using time periods before executed movements, as opposed to cue-based imagery, so we could precisely align EEG recordings with movement onset detected from EMG recordings. We studied classification accuracy during two different paradigms: a self-initiated series of stand-to-sit and sit-to-stand transitions and transitions which were cued by an audio trigger. Because triggered movements are reported to produce less prominent MRPs (Jankelowitz and Colebatch, 2002; Cui and MacKinnon, 2009), this protocol allowed us to examine the effect of MRP signal to noise ratio on classification accuracy. We utilized time-embedding and concatenation of EEG channels from the time before movement execution to create a feature vector of high dimension to classify the intended movement (stand-up, sit-down, or quiet). Given the autoregressive nature of EEG signals (Muller et al., 2003) and the underlying neurophysiology (e.g., volume conduction), we assume that the recorded EEG originates from a system with fewer degrees of freedom than our feature vector dimensions, resulting in a manifold data structure. Recent advances in machine learning have resulted in algorithms which preserve the local structure of a manifold data set in a reduced dimensional subspace (Sugiyama, 2007; Li et al., 2012), thereby enhancing the discriminative power of the data set. 
Based on the observation that information pertinent to movement is contained in low frequency EEG, we hypothesized that applying a locality preserving dimensionality reduction technique to our high dimensional feature vector derived from time-embedded and spatially diverse delta band EEG would reveal its underlying discriminative structure. We coupled this supervised data reduction with a Gaussian mixture model classifier to test if we could reliably ascertain the intended movement of the user from offline analysis of EEG recordings. We believe such a classifier could eventually be deployed in a real-time BMI system to control an assistive device or as a component of a neurorehabilitation paradigm to restore motor control.

## **METHODS**

## **DATA COLLECTION**

Ten healthy adults (6 male, 4 female) with no history of neurological disease participated in the study after giving informed consent. This study protocol was approved by the Institutional Review Board at the University of Houston. Participants completed two trials, each comprising 10 stand-to-sit and 10 sit-to-stand transitions performed in alternation; one trial was self-paced and one trial was cued via audio trigger. Each trial began with the participant standing quietly in an upright posture for 15 s. In the triggered trial, an audio cue (beep) was given, after which the participant initiated a transition to a seated posture. The seated posture was held for a period ranging randomly from 3 to 10 s, after which a second audio cue was given to initiate the transition from sit-to-stand. The standing posture was held for another (random) 3–10 s interval, at which point the process was repeated until 20 transitions (10 of each) were completed. The procedure for the self-paced trial was similar. After 15 s of quiet standing, the participant was instructed via verbal cue to begin the self-initiated stand-to-sit and sit-to-stand transitions. The participant was instructed to wait for a random interval of 3–10 s before self-initiating the next transition. Finally, the participant was notified by verbal cue once he/she had completed 20 self-initiated transitions.

Time-locked EMG and EEG data were collected simultaneously using a previously developed data collection system (Bulea et al., 2013). Surface EMG (Biometrics, Ltd, Ladysmith, VA) was recorded at 1000 Hz bilaterally from the tibialis anterior, gastrocnemius, biceps femoris, and vastus lateralis. Whole scalp, active electrode, 64-channel EEG (Brain Products, GmbH, Morrisville, NC) was collected at 1000 Hz and labeled by the 10–20 international system. The impedance of each EEG electrode was maintained below 25 kΩ for the entire data collection.

## **DATA ANALYSIS FOR CLASSIFICATION OF MOVEMENT INTENT**

#### *Preprocessing*

All data analysis and classifier optimization and evaluation were performed off-line using custom software in Matlab (Mathworks, Natick, MA). The data processing and classification methodology is shown in **Figure 1**. Peripheral EEG channels susceptible to eye blinks and facial/cranial muscle activity were removed from offline analysis (all channels labeled Fp, AF, FT, T, TP, O, PO, and F5-8, P5-8), resulting in 28 channels being retained for classification. EEG signals were then high pass filtered at 0.05 Hz using a zero-phase 8th order Butterworth filter. Next, we removed transient, high-amplitude artifacts from stereotypical (e.g., eye blinks) and non-stereotypical (e.g., movement, muscle bursts) sources using an automated artifact rejection method termed Artifact Subspace Reconstruction (ASR) (Mullen et al., 2013), which is available as a plug-in for EEGLAB software (Delorme and Makeig, 2004). ASR uses a sliding window technique whereby each window of EEG data is decomposed via principal component analysis so it can be compared statistically with data from a clean baseline EEG recording, collected here as 1 min of EEG recorded during quiet standing. Within each sliding window the ASR algorithm identifies principal subspaces which significantly deviate from the baseline EEG and then reconstructs these subspaces using a mixing matrix computed from the baseline EEG recording. In this study, we used a sliding window of 500 ms and a threshold of 3 standard deviations to identify corrupted subspaces. After ASR, the cleaned EEG was band pass filtered with a zero phase, 3rd order Butterworth filter from 0.1 to 4 Hz to isolate the delta band activity. The EEG data were then standardized by channel by subtracting the mean and dividing by the standard deviation (z-score).
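The final standardization step can be illustrated with a minimal sketch. It assumes the EEG is held as a list of channels, each a list of samples, and it uses the sample standard deviation; the text does not state which estimator was used, and the function name `zscore_channels` is hypothetical. Filtering and ASR are not reproduced here.

```python
from statistics import mean, stdev

def zscore_channels(eeg):
    """Standardize each channel to zero mean and unit standard deviation.

    eeg: list of channels, each a list of samples (illustrative layout).
    """
    standardized = []
    for channel in eeg:
        mu = mean(channel)
        sigma = stdev(channel)  # assumption: sample (n - 1) standard deviation
        if sigma == 0:
            raise ValueError("a flat channel cannot be z-scored")
        standardized.append([(value - mu) / sigma for value in channel])
    return standardized

# Toy data: two channels with very different scales.
demo = zscore_channels([[1.0, 2.0, 3.0], [10.0, 10.0, 40.0]])
```

After this step every channel contributes on the same scale to the feature vector, regardless of its raw amplitude.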

**FIGURE 1 |** … quiet periods and periods of movement (sitting and standing). Only pre-movement epochs (1.5 s before movement to movement onset) and quiet epochs (1.5 s after movement completion to 1.5 s before next movement) were retained for analysis. As a control, a separate decoding analysis using movement epochs (movement onset to 1.5 s after onset) was also performed. The artifact subspace reconstruction (ASR) algorithm, available as a plug-in for EEGLAB software (Delorme and Makeig, 2004), was applied to eliminate artifacts from EEG data during pre-processing. Note that the optimization and evaluation data sets are mutually exclusive.

EMG recordings from the lower extremity muscles were used to determine movement onset of each stand-to-sit and sit-to-stand transition. First, the Teager-Kaiser energy operator was applied to each EMG channel to enhance the signal-to-noise ratio for onset detection (Li et al., 2007). Next, each EMG channel was detrended, band pass filtered (15–300 Hz), rectified, and low pass filtered at 3 Hz to compute the linear envelope. Then, the linear envelope of each muscle was thresholded into a binary signal which was equal to 1 when the envelope exceeded its mean baseline value during quiet standing and sitting by more than 3 standard deviations (Hodges and Bui, 1996) and zero when it was within 3 standard deviations of baseline. The baseline period of EMG activity before each movement was identified *a posteriori* by visual inspection starting with the initial 15 s of rest before the first movement. The baseline period between each successive sit-to-stand and stand-to-sit transition comprised at least 2 s. Movement onset for each transition was determined when any of the 8 thresholded EMG envelopes transitioned from rest (0) to active (1). Likewise, the end of each movement was determined when all 8 channels returned to rest (0). The algorithmically determined periods of activity were visually inspected for accuracy. Using prior knowledge of the experimental protocol (i.e., the order of the stand-to-sit and sit-to-stand transitions), the periods of muscle activity were labeled as stand-to-sit or sit-to-stand. Note that for some trials, gastrocnemius muscles were active during the quiet stance phase and/or biceps femoris EMG was contaminated by artifact from the leg during sitting, thereby increasing the standard deviation in these channels and limiting the ability to determine the true state using that muscle. When these periods of activity/artifact were observed visually, these muscles were removed from the trial; in this case the user activity was assessed using the remaining 6 muscles.
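The onset rule described above can be sketched as follows. This is an illustrative outline rather than the authors' Matlab code: the function names are hypothetical, the filtering that produces the linear envelope is omitted, and the toy envelopes stand in for real EMG. The Teager-Kaiser operator is shown in its standard discrete form, ψ[n] = x[n]² − x[n−1]·x[n+1].

```python
from statistics import mean, stdev

def tkeo(x):
    """Teager-Kaiser energy operator: psi[n] = x[n]^2 - x[n-1] * x[n+1]."""
    psi = [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]
    # Repeat the edge values so the output keeps the input length.
    return [psi[0]] + psi + [psi[-1]]

def binarize(envelope, baseline, n_sd=3.0):
    """Return 1 where the envelope exceeds baseline mean + n_sd * SD, else 0."""
    threshold = mean(baseline) + n_sd * stdev(baseline)
    return [1 if value > threshold else 0 for value in envelope]

def movement_periods(binary_channels):
    """Onset when any channel turns active; end when all channels are at rest.

    Returns (onset, end) sample indices, where end is the first rest sample.
    """
    any_active = [int(any(column)) for column in zip(*binary_channels)]
    periods, start = [], None
    for i, state in enumerate(any_active):
        if state and start is None:
            start = i
        elif not state and start is not None:
            periods.append((start, i))
            start = None
    if start is not None:  # activity still ongoing at the end of the record
        periods.append((start, len(any_active)))
    return periods

# Toy envelopes for two muscles; the first six samples serve as baseline.
env_a = [1.0, 1.2, 0.9, 1.1, 1.0, 0.8, 5.0, 6.0, 5.5, 1.0, 1.1]
env_b = [2.0, 2.1, 1.9, 2.0, 2.2, 1.8, 2.0, 9.0, 9.5, 9.0, 2.0]
binary = [binarize(env_a, env_a[:6]), binarize(env_b, env_b[:6])]
periods = movement_periods(binary)
print(periods)  # [(6, 10)]
```

In the toy example the first muscle activates at sample 6 and the second returns to rest at sample 10, so the combined period spans both, as the any/all rule requires.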

Next, the time-locked EEG and EMG data were downsampled to 200 Hz. EEG data were then epoched into pre-movement, post-movement and quiet periods based on the thresholded (binary) EMG signal. Each pre-movement epoch consisted of data from 1.5 s before movement onset up to movement onset. EEG data from 1.5 s after movement completion until 1.5 s before the next movement onset, with a maximum of 5 s, comprised the quiet epochs. These epochs were then concatenated into a single time series containing alternate periods of quiet and pre-movement. For control purposes, we also created a second time series of data containing concatenated quiet epochs and epochs of EEG from movement onset to 1.5 s after movement onset (post-movement epochs).
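A minimal sketch of this epoching logic is given below. It takes movement periods as (onset, end) sample indices at 200 Hz. Two details are assumptions, since the text does not specify them: which 5 s of a long rest period are retained (the earliest here), and that the first quiet epoch begins at the start of the recording. The function name and constants are hypothetical.

```python
FS = 200                 # Hz, sampling rate after downsampling
PRE = int(1.5 * FS)      # 300 samples kept before each movement onset
GUARD = int(1.5 * FS)    # 300 samples skipped after each movement ends
MAX_QUIET = 5 * FS       # quiet epochs are capped at 5 s

def make_epochs(periods, n_samples):
    """Return (start, stop, label) tuples with exclusive stop indices.

    periods: ordered (onset, end) sample indices of the movements.
    label: 0 = quiet, 1 = stand-to-sit, 2 = sit-to-stand.
    """
    epochs = []
    quiet_start = 0  # assumption: first quiet epoch starts with the recording
    for index, (onset, end) in enumerate(periods):
        pre_start = max(onset - PRE, 0)
        # Assumption: when a rest period exceeds 5 s, keep its earliest 5 s.
        quiet_stop = min(pre_start, quiet_start + MAX_QUIET)
        if quiet_stop > quiet_start:
            epochs.append((quiet_start, quiet_stop, 0))
        # Trials start from standing, so transitions alternate 1, 2, 1, 2, ...
        epochs.append((pre_start, onset, 1 if index % 2 == 0 else 2))
        quiet_start = end + GUARD
    quiet_stop = min(n_samples, quiet_start + MAX_QUIET)
    if quiet_stop > quiet_start:
        epochs.append((quiet_start, quiet_stop, 0))
    return epochs

# Two toy movements in a 30 s recording (6000 samples at 200 Hz).
epochs = make_epochs([(3000, 3400), (4600, 5000)], n_samples=6000)
for epoch in epochs:
    print(epoch)
```

With 20 movements per trial this scheme yields 20 pre-movement epochs and up to 21 quiet epochs, which matches the 41 epochs per trial used in the feature-matrix dimensions reported in the text.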

The concatenated EEG data sets comprised the three-class classification problem for each trial; each time point of the quiet epochs was labeled as class 0 (quiet) while each time point of each pre-movement epoch was labeled according to the type of movement it preceded: class 1 (stand-to-sit) or class 2 (sit-to-stand). Next, a time-embedded feature matrix was constructed for each trial. Each time point in the feature matrix was a vector composed of 10 lags, corresponding to 50 ms in the past, of EEG data. The number of lags and embedded time interval was chosen based on previous studies demonstrating accurate decoding of movement kinematics from low frequency EEG (Bradberry et al., 2010; Presacco et al., 2011). The feature vector for each time point was constructed by concatenating the 11 lags (the current time point plus the 10 prior) for each channel into a single vector of length 11 × *N*, where *N* is the number of EEG channels used for classification (for this study, *N* = 28). To avoid the problem of missing data, the feature matrix was buffered by starting at the 11th EEG sample of each epoch, resulting in a feature matrix of dimension [*M<sub>t</sub>* − *L*] × [11 × *N*] for each trial of self-initiated and triggered movements, where *M<sub>t</sub>* is the number of time points in each trial and *L* is the number of past time lags multiplied by the number of epochs in each trial (for this study, *L* = 10 × 41 = 410). On average, there were 18,442 ± 2110 time points in each feature matrix, with exactly 2900 time points for class 1 and 2900 time points for class 2, while the remaining time points represented class 0. For all subjects, the original dimensionality of the feature space was 308 (11 × *N*).
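The time-embedding step and the resulting dimensions can be checked with a short sketch. The ordering of lags within each channel (current sample first, then progressively older samples) and the function name `embed_epoch` are assumptions made for illustration; the text fixes only the number of lags and the overall vector length.

```python
N_LAGS = 10  # past samples appended to the current one (50 ms at 200 Hz)

def embed_epoch(epoch, n_lags=N_LAGS):
    """Time-embed one epoch given as a list of channels (lists of samples).

    Returns one row per usable time point. Each row concatenates, channel by
    channel, the current sample followed by the n_lags preceding samples.
    """
    n_samples = len(epoch[0])
    rows = []
    for t in range(n_lags, n_samples):  # start at the 11th sample of the epoch
        row = []
        for channel in epoch:
            row.extend(channel[t - k] for k in range(n_lags + 1))
        rows.append(row)
    return rows

# One toy pre-movement epoch: 28 channels, 1.5 s at 200 Hz = 300 samples.
# Sample value encodes channel and time so the layout is easy to inspect.
epoch = [[float(c * 1000 + t) for t in range(300)] for c in range(28)]
rows = embed_epoch(epoch)
print(len(rows), len(rows[0]))  # 290 308
```

Each 300-sample pre-movement epoch therefore contributes 290 feature vectors of length 308, and 10 epochs per movement class give the 2900 time points per class reported above.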

#### *Dimensionality reduction*

Since our EEG-based feature vectors were of relatively high dimension and were composed of time lagged and spatially distributed samples, we assumed our original dataset to represent a manifold which may contain multimodal within-class distributions. Furthermore, we sought to classify gross motor intention and therefore had a limited number of classes (in this case there were three: quiet, stand-to-sit, and sit-to-stand). Thus, we performed dimensionality reduction on our feature matrices to eliminate any redundant features, reduce computational complexity, prevent over-fitting during classifier training and increase classification performance. Many techniques have been reported for dimensionality reduction in EEG based classifiers, including principal component analysis (PCA), linear discriminant analysis (LDA), and genetic algorithm (GA) (Bashashati et al., 2007; Lotte et al., 2007). Consideration of the task, neurophysiology and EEG recording system suggests that a supervised dimensionality reduction technique could improve feature selection for classification purposes. EEG data generally have a low signal-to-noise ratio and unsupervised linear dimensionality reduction techniques may be affected by these signal distortions. PCA reduces dimensionality by maximizing data variance in the projected subspace via a linear transformation. The transformation, dictated by the eigenvectors that correspond to the largest eigenvalues of the data covariance matrix, is unsupervised and can discard useful information for classification that is contained in the lower energy dimensions of the original data (Prasad and Bruce, 2008). In contrast, LDA is a supervised dimensionality reduction technique since it attempts to maximize between-class scatter while minimizing within-class scatter in the projected subspace. However, LDA has difficulty doing this if the original data are heteroscedastic or multimodal. 
Furthermore, the size of the LDA-reduced subspace is limited to *c* − 1 (where *c* is the number of classes).
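The *c* − 1 limit follows from the standard Fisher criterion, which can be written as below. The notation is illustrative and not taken from the source: *n<sub>ℓ</sub>* is the number of samples in class ℓ, *μ<sub>ℓ</sub>* the class mean, *μ* the overall mean, and *T* the projection matrix.

$$S\_b = \sum\_{\ell=1}^{c} n\_\ell (\mu\_\ell - \mu)(\mu\_\ell - \mu)^T, \qquad S\_w = \sum\_{\ell=1}^{c} \sum\_{i:\, y\_i = \ell} (\mathbf{x}\_i - \mu\_\ell)(\mathbf{x}\_i - \mu\_\ell)^T$$

$$T\_{LDA} = \arg\max\_{T} \operatorname{tr}\left[ (T^T S\_w T)^{-1}\, T^T S\_b T \right]$$

Because the between-class scatter *S<sub>b</sub>* is a sum of *c* rank-one terms whose weighted mean deviations sum to zero, its rank is at most *c* − 1. For the three-class problem considered here, LDA can therefore supply at most two discriminant directions.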

Local Fisher's discriminant analysis (LFDA) combines the strategy of LDA with a locality-preserving projection to provide a linear manifold learning technique that preserves the within-class structure of the original space in the projected subspace; details of the LFDA algorithm applied in this study are provided in Sugiyama (2007). Briefly, LFDA seeks to find a transformation that preserves local neighborhood information, thereby ensuring that the underlying structure of the data distribution is preserved in the lower dimensional (size *r*) subspace. To accomplish this, the scatter matrices typical of LDA are scaled using an affinity matrix that measures the closeness of any two points relative to their *k<sub>nn</sub>*-th nearest neighbor. The parameters *k<sub>nn</sub>* and *r* must be optimized in concert with the classifier for each subject. LFDA has been previously deployed as a preprocessing step for classification of walking intention (Kilicarslan et al., 2013) and classification of expressive movement (Cruz-Garza et al., 2014) from EEG. A similar locality preserving projection was also employed for detection of ankle movement intention from low frequency EEG (Xu et al., 2014).
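The affinity scaling can be illustrated with a small sketch based on the local-scaling formulation in Sugiyama (2007): the affinity between two points is exp(−‖x<sub>i</sub> − x<sub>j</sub>‖² / (σ<sub>i</sub>σ<sub>j</sub>)), where σ<sub>i</sub> is the distance from x<sub>i</sub> to its *k<sub>nn</sub>*-th nearest neighbor. Function names are hypothetical, the toy points are not EEG features, and the generalized eigenproblem that yields the projection is not shown.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def local_scaling_affinity(points, knn):
    """A[i][j] = exp(-||xi - xj||^2 / (s_i * s_j)).

    s_i is the distance from point i to its knn-th nearest neighbor.
    Assumes distinct points, so that every s_i is non-zero.
    """
    n = len(points)
    d2 = [[sq_dist(points[i], points[j]) for j in range(n)] for i in range(n)]
    # sorted(d2[i])[0] is the zero self-distance, so index knn is the neighbor.
    scale = [math.sqrt(sorted(d2[i])[knn]) for i in range(n)]
    return [[math.exp(-d2[i][j] / (scale[i] * scale[j])) for j in range(n)]
            for i in range(n)]

def lfda_weights(affinity, labels):
    """Pairwise weights of the local within- and between-class scatter."""
    n = len(labels)
    counts = {c: labels.count(c) for c in set(labels)}
    w_within = [[0.0] * n for _ in range(n)]
    w_between = [[1.0 / n] * n for _ in range(n)]  # default: different classes
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:
                n_c = counts[labels[i]]
                w_within[i][j] = affinity[i][j] / n_c
                w_between[i][j] = affinity[i][j] * (1.0 / n - 1.0 / n_c)
    return w_within, w_between

# Two well separated toy classes in two dimensions.
points = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0], [5.0, 6.0]]
labels = [0, 0, 0, 1, 1, 1]
affinity = local_scaling_affinity(points, knn=1)
w_within, w_between = lfda_weights(affinity, labels)
```

Nearby same-class pairs receive large within-class weights while distant same-class pairs are effectively ignored, which is how multimodal within-class structure is preserved in the projection.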

#### *Classification algorithm*

Once a suitable algorithm for dimensionality reduction was determined, we next identified a classification scheme to decode movement intent from our EEG-based features. Gaussian mixture model (GMM) classifiers are common in the fields of biometrics and biomedical engineering because GMMs are capable of representing arbitrary statistical distributions as a weighted summation of multiple Gaussian distributions, termed components (Paalanen et al., 2006). Utilizing a GMM to compute the class-conditional probabilities in a maximum-likelihood classifier could improve performance over the traditional formulation, especially when the within-class feature set may be non-Gaussian, as could be the case for the temporally and spatially diverse EEG based features used in this study. The probability density function for a given training data set in the LFDA projected subspace, *X* = {*x<sub>i</sub>*}<sup>*n*</sup><sub>*i*=1</sub> ∈ ℝ<sup>*r*</sup>, is given by:

$$p(\mathbf{x}) = \sum_{k=1}^{K} \alpha_k \phi_k(\mathbf{x}) \tag{1}$$

$$\phi_k(\mathbf{x}) = \frac{\exp[-0.5(\mathbf{x} - \mu_k)^T \Sigma_k^{-1} (\mathbf{x} - \mu_k)]}{(2\pi)^{r/2} |\Sigma_k|^{1/2}} \tag{2}$$

where *K* is the number of components, *α<sub>k</sub>* is the mixing weight, *μ<sub>k</sub>* is the mean vector, and *Σ<sub>k</sub>* is the covariance matrix of the *k*-th component. The parameters of each GMM component *k*, including *α<sub>k</sub>*, *μ<sub>k</sub>*, and *Σ<sub>k</sub>*, are estimated as those which maximize the log-likelihood of the training set given by:

$$L_k = \sum_{i=1}^{n} \log p_k(\mathbf{x}_i) \tag{3}$$

where *p*(*x*) is given in (1). Maximization of (3) is carried out using an iterative expectation-maximization (EM) algorithm (Vlassis and Likas, 2002), which is initialized with estimates of the parameters *α<sub>k</sub>*, *μ<sub>k</sub>*, and *Σ<sub>k</sub>* obtained via k-means clustering (Su and Dy, 2007) and run until the log-likelihood reaches a predetermined threshold. The number of components *K* is a critical parameter for successful implementation of a GMM classifier. During training, we limited the maximum value of *K* to 10 and computed the maximum log-likelihood from (3) for each model with values of *K* ∈ {1, 2, ..., 10}. We estimated the optimal value of *K* as the model that minimized the Bayesian information criterion, which has been reported as an effective measure for optimizing the number of GMM components (Li et al., 2012). In this manner, GMMs representing each movement class were specified for use in a maximum-likelihood classifier.
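Equations (1) to (3) and the selection of *K* can be illustrated with a short sketch. Several simplifications are assumptions made here for brevity and are not taken from the study: covariances are diagonal, the candidate models are hand-specified rather than fitted by EM, and the data are one-dimensional toy values. The information criterion is the standard form, −2·*L* + (number of free parameters)·ln(*n*).

```python
import math

def gaussian_pdf(x, mu, var):
    """Eq. (2) specialised to a diagonal covariance with variances `var`."""
    r = len(x)
    quad = sum((xi - mi) ** 2 / vi for xi, mi, vi in zip(x, mu, var))
    det = math.prod(var)  # determinant of a diagonal covariance matrix
    return math.exp(-0.5 * quad) / ((2 * math.pi) ** (r / 2) * math.sqrt(det))

def gmm_pdf(x, weights, means, variances):
    """Eq. (1): weighted sum of the component densities."""
    return sum(a * gaussian_pdf(x, m, v)
               for a, m, v in zip(weights, means, variances))

def log_likelihood(X, weights, means, variances):
    """Eq. (3): sum of log densities over the training samples."""
    return sum(math.log(gmm_pdf(x, weights, means, variances)) for x in X)

def n_free_params_diag(K, r):
    """Free parameters of a K-component, diagonal-covariance GMM in r dimensions."""
    return (K - 1) + 2 * K * r

def bic(log_lik, n_params, n_samples):
    """Bayesian information criterion; lower values are preferred."""
    return -2.0 * log_lik + n_params * math.log(n_samples)

# Toy bimodal data and two hand-specified candidate models (not EM fits).
X = [[-1.1], [-0.9], [-1.0], [1.0], [0.9], [1.1]]
candidates = {
    1: ([1.0], [[0.0]], [[1.0]]),
    2: ([0.5, 0.5], [[-1.0], [1.0]], [[0.01], [0.01]]),
}
scores = {K: bic(log_likelihood(X, *params), n_free_params_diag(K, 1), len(X))
          for K, params in candidates.items()}
best_K = min(scores, key=scores.get)
```

On this toy data the two-component model wins despite its larger parameter penalty, because the data are clearly bimodal.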

The parameters for each class-conditional GMM were computed using an optimization data set for each participant (see Classifier optimization section). The parameter space which must be explored in order to fit these mixture models can be quite large, especially if the feature dimension is large. Given the limited time and training data available during EEG studies, this learning task may be impractical, but as indicated in the previous section, LFDA has been shown to effectively reduce data dimensionality while preserving the statistical information. Thus, we applied LFDA dimensionality reduction on our EEG feature set prior to training and testing a GMM model for use in a maximum-likelihood classifier of intended motion.

#### *Classifier optimization*

The EEG feature matrix from each trial was split into two mutually exclusive sets: one for LFDA-GMM classifier optimization and one for classifier evaluation (**Figure 1**). The optimization data set was selected randomly from the full data set, and it comprised 400 samples (2 s) of data from each class. The optimization data set was then split into two equally sized exclusive subsets, one for training and one for testing. The parameters for the LFDA-GMM classifier (the nearest neighbor (*knn*) used in the affinity matrix, the dimensionality (*r*) of the projected subspace, and the number of mixture components (*K*) in the mixture model) were optimized for each subject and trial type (self-initiated and triggered) using the optimization data set. Optimization involved three steps (**Figure 1**): (i) dimensionality reduction using LFDA for values of *knn* and *r* from 1 to 249 and 1 to 250, respectively, (ii) identification of the optimal value of *K* for each class at each grid point in (i) using the training data from the optimization set, and (iii) computation of the accuracy of the LFDA-GMM classifier at each grid point in (i) using the testing data from the optimization set. The optimal parameters {*knn*, *r*, *K*} for each subject were selected as those which produced the highest overall classification accuracy from the testing data.
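The three-step optimization amounts to an exhaustive grid search over (*knn*, *r*). The sketch below shows only the search skeleton. The `evaluate` callback is hypothetical: in practice it would perform the LFDA projection, select *K* per class on the training half, and return accuracy on the testing half. Here a stand-in scoring surface is used purely for illustration.

```python
from itertools import product

def grid_search(knn_values, r_values, evaluate):
    """Return (best_accuracy, best_knn, best_r) over the (knn, r) grid.

    `evaluate(knn, r)` is supplied by the caller and returns the accuracy
    obtained on the testing half of the optimization set."""
    best = None
    for knn, r in product(knn_values, r_values):
        acc = evaluate(knn, r)
        if best is None or acc > best[0]:  # ties keep the first grid point
            best = (acc, knn, r)
    return best

# Stand-in accuracy surface for illustration only: peaks at r = 80, flat in knn.
def toy_evaluate(knn, r):
    return 0.8 - abs(r - 80) / 1000.0

best_acc, best_knn, best_r = grid_search(range(1, 250), range(1, 251), toy_evaluate)
```

Because the grid has 249 × 250 points, each requiring a projection and several GMM fits, the cost of `evaluate` dominates the run time.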

#### *Classifier performance via cross validation*

The performance of the LFDA-GMM classifier with the optimal parameter set was analyzed for each subject and trial using repeated random sub-sampling cross validation (**Figure 1**). Repeated sub-sampling was chosen because the variable timing of the movements in each trial would result in an unequal number of samples from each class if a *k*-fold cross-validation scheme were used. The evaluation data set was randomly split into mutually exclusive training and testing data sets (**Figure 1**). Each of the three classes in the training set contained 600 data points representing 20% of the sit and stand classes. (Because the sit and stand classes were composed of ten 1.5 s long pre-movement epochs for each subject, their size was always equal.) After training, LFDA-GMM classifier performance was analyzed using the testing data set, which contained all remaining data from the sit and stand classes, and an equal number of data points randomly selected from the quiet class. Thus, each class in the testing set contained 1900 data points. This test set structure was used to control for effects of class population size by assuring an equal number of testing samples in each class. During testing, a classification decision was made for each data point, which represented a single time sample from the trial. The posterior probability of each data point was computed using the optimized GMM for each class, and the data point was then assigned to the class that returned the largest value. This process yielded a classification decision for 1900 data points per class in each trial. To avoid training bias, the random training and testing process was repeated 20 times and the average classification accuracies were reported for each subject under each condition (self-initiated and triggered movements). We performed *post-hoc* statistical comparisons between conditions using the non-parametric Kruskal-Wallis one-way analysis of variance.
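A minimal sketch of repeated random sub-sampling with class-balanced splits is given below. The class sizes and the `score` callback are illustrative assumptions. The real scoring step would train the LFDA-GMM classifier on the training indices and return accuracy on the testing indices.

```python
import random

def balanced_split(indices_by_class, n_train, n_test, rng):
    """Draw disjoint training and testing indices with equal counts per class."""
    train, test = {}, {}
    for label, idx in indices_by_class.items():
        shuffled = idx[:]
        rng.shuffle(shuffled)
        train[label] = shuffled[:n_train]
        test[label] = shuffled[n_train:n_train + n_test]
    return train, test

def repeated_subsampling(indices_by_class, n_train, n_test, n_repeats, score, seed=0):
    """Average `score(train, test)` over `n_repeats` independent random splits."""
    rng = random.Random(seed)
    accs = []
    for _ in range(n_repeats):
        train, test = balanced_split(indices_by_class, n_train, n_test, rng)
        accs.append(score(train, test))
    return sum(accs) / len(accs)

# Toy index sets; the quiet class is larger and is therefore sub-sampled.
classes = {
    "quiet": list(range(0, 5000)),
    "sit": list(range(5000, 7500)),
    "stand": list(range(7500, 10000)),
}

# Stand-in scorer: checks split sizes and disjointness instead of classifying.
def toy_score(train, test):
    ok = all(len(train[c]) == 600 and len(test[c]) == 1900
             and not set(train[c]) & set(test[c]) for c in train)
    return 1.0 if ok else 0.0

mean_acc = repeated_subsampling(classes, 600, 1900, 20, toy_score)
```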

To examine the effects of the ASR algorithm and the potential contribution of motion artifacts, we repeated the optimization and cross validation procedure using EEG data from pre-movement epochs pre-processed in the same manner as **Figure 1** except that the ASR process was omitted. We also examined the classification accuracy using EEG epoched from movement onset to 1.5 s after movement onset both with and without the ASR algorithm. Finally, we divided the scalp into four major regions of interest (ROI) to assess the classification ability of each area individually. The ROIs included the frontal cortex (F3, F1, Fz, F2, F4, FC3, FC1, FC2, and FC4), the motor strip (C5, C3, C1, Cz, C2, C4, and C6), the parietal cortex (CP5, CP3, CP1, CPz, CP2, CP4, CP6, P3, P1, Pz, P2, and P4) and the central midline (FC1, FC2, C1, Cz, C2, CP1, CPz, and CP2). For each condition, we assessed within-subject differences in accuracy across ROIs using the non-parametric Friedman test. The statistical sign test was used to assess whether the difference in accuracy between self-initiated and triggered movements for each participant and ROI was significantly different from a distribution with a median of zero.
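For reference, the sign test can be sketched as an exact two-sided binomial test on the signs of the paired differences. This is a generic textbook formulation, not code from the study, and the example differences are invented for illustration.

```python
from math import comb

def sign_test(differences):
    """Exact two-sided sign test of the hypothesis that the median difference is zero.

    Zero differences are discarded, as is conventional."""
    pos = sum(d > 0 for d in differences)
    neg = sum(d < 0 for d in differences)
    n = pos + neg
    if n == 0:
        return 1.0
    k = min(pos, neg)
    # Probability of observing k or fewer of the rarer sign under Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Illustrative paired accuracy differences (self-initiated minus triggered).
diffs = [2.1, 3.4, 0.8, 1.9, 4.2, 2.7, -0.5, 1.1, 3.0, 2.2]
p_value = sign_test(diffs)
```

With nine positive and one negative difference, the p-value is 2 × 11/1024, or about 0.021.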

#### *Demonstration of simulated real-time classification*

We implemented a two-fold approach to demonstrate LFDA-GMM classifier performance in a simulated real-time environment using EEG data from the self-paced trial. The classifier was trained using ASR-cleaned EEG data from the first half of the trial with the optimal parameter set for each subject. Unlike during the cross-validation procedure, the time periods immediately following the movement execution were not trimmed from the data set but instead were included in the quiet class. Data from the second half of the trial, containing five transitions each of stand-to-sit and sit-to-stand, was used to test the controller in a simulated real-time manner resulting in a continuous time series of classification decisions.

#### **OBSERVATIONAL EEG MEASURES**

In addition to classification of movement intent, we computed several observational measures to help assess differences in cortical activity across the experimental conditions. We computed the MRPs from each subject during both the self-initiated and triggered conditions. To compute MRPs, each EEG channel was band pass filtered between 0.1 and 50 Hz and epoched from 2.5 s before movement onset to 1 s after onset. Each channel and epoch was baseline corrected using the mean voltage from 2.5 to 2 s before onset. Each channel was then averaged over all 20 epochs for each condition.
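The baseline correction and epoch averaging for one channel can be sketched as follows. Filtering is omitted, the baseline window is treated as inclusive at both ends, and the sample values and coarse 0.5 s time step are illustrative assumptions only.

```python
def baseline_correct(epoch, times, t0=-2.5, t1=-2.0):
    """Subtract the mean voltage in the baseline window [t0, t1] (seconds)."""
    base = [v for v, t in zip(epoch, times) if t0 <= t <= t1]
    offset = sum(base) / len(base)
    return [v - offset for v in epoch]

def mrp(epochs, times):
    """Average the baseline-corrected epochs sample by sample for one channel."""
    corrected = [baseline_correct(e, times) for e in epochs]
    n = len(corrected)
    return [sum(col) / n for col in zip(*corrected)]

# Toy epochs spanning -2.5 s to 1.0 s relative to movement onset.
times = [-2.5 + 0.5 * i for i in range(8)]
epochs = [
    [1.0, 1.0, 0.0, -1.0, -2.0, -3.0, -1.0, 0.0],
    [3.0, 3.0, 2.0, 1.0, 0.0, -1.0, 1.0, 2.0],  # same shape, different DC offset
]
average = mrp(epochs, times)
```

Baseline correction removes the differing DC offsets, so both toy epochs contribute the same negative deflection before onset.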

To ascertain differences between periods of quiet (i.e., rest between movements), pre-movement, and post-movement epochs under each condition (self-initiated and triggered), we computed the power spectral density (PSD) for each EEG channel with a frequency resolution of 0.12 Hz using the Thomson multitaper method in Matlab with a time-bandwidth product of 4. The PSD was computed after artifact removal with ASR but before band-pass filtering and standardization. EEG was common average referenced for purposes of PSD computation. The spatial distribution of alpha band (8–13 Hz) ERD was computed for the pre-movement and post-movement epochs under both conditions, as was the change in power in the delta band (0.1–4 Hz). The change in power for both frequency bands was computed relative to the quiet epochs for each condition (self-initiated and triggered). We assessed statistical differences across conditions using the non-parametric Kruskal-Wallis one-way analysis of variance with a Bonferroni correction for multiple comparisons.
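One way to express a band-power change relative to the quiet epochs is as a percentage, with negative values corresponding to ERD. The sketch below assumes this percent-change definition and trapezoidal integration of the PSD. The exact formula used in the study is not stated here, and the flat PSD values are invented for illustration.

```python
def band_power(freqs, psd, f_lo, f_hi):
    """Approximate band power by trapezoidal integration of the PSD over [f_lo, f_hi]."""
    pts = [(f, p) for f, p in zip(freqs, psd) if f_lo <= f <= f_hi]
    return sum(0.5 * (p0 + p1) * (f1 - f0)
               for (f0, p0), (f1, p1) in zip(pts, pts[1:]))

def percent_change(p_epoch, p_quiet):
    """Change in power relative to quiet; negative values correspond to ERD."""
    return 100.0 * (p_epoch - p_quiet) / p_quiet

# Toy flat spectra from 0 to 20 Hz in 0.5 Hz steps.
freqs = [0.5 * i for i in range(41)]
psd_quiet = [2.0] * 41
psd_pre = [1.5] * 41

alpha_quiet = band_power(freqs, psd_quiet, 8.0, 13.0)
alpha_pre = band_power(freqs, psd_pre, 8.0, 13.0)
alpha_change = percent_change(alpha_pre, alpha_quiet)
```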

## **RESULTS**

#### **OBSERVATIONAL MEASURES**

Standardized EEG and the linear envelope of EMG recorded during a typical trial for one subject are shown in **Figure 2**. EEG with and without ASR is shown, demonstrating the removal of high amplitude artifacts, especially in the time periods following movement onset. Although all 64 channels of EEG are displayed, those channels marked with an asterisk (∗) were removed prior to classification of movement intention. The EEG PSD computed during rest (quiet standing) and the pre-movement epochs during the self-initiated and triggered trials is shown in **Figure 3**. The grand mean PSD across all participants and electrodes used for classification (lower inset, **Figure 3**) is shown. Two identifiable peaks are present in the rest condition, during which the subject was standing quietly: one in the theta band at approximately 7 Hz and one in the alpha band at approximately 11 Hz. Power in these bands was significantly greater at rest than during the pre-movement epochs under both conditions (*p* < 0.01 for both). Notably, the delta band power during the pre-movement epochs was greater than at rest, while the power in the theta and alpha bands was greater during rest (upper inset, **Figure 3**). In the pre-movement epochs, there was significantly less power in the theta band (4–8 Hz) during self-initiated transitions compared to triggered (*p* = 0.004), while power in the alpha band (8–13 Hz) was not statistically different between conditions (*p* = 0.107). Finally, power roll-off, indicated by the slope of the PSD, was diminished in theta and alpha bands compared to surrounding delta and beta bands for the self-initiated pre-movement epochs; however, roll-off was only decreased in the alpha band for the triggered condition.

The change in delta and alpha band power for the pre- and post-movement epochs, relative to the periods of quiet sitting and standing between movement executions, averaged over all participants is shown in **Figure 4**. In the delta band, we observed slightly increased power in the pre-movement epochs over all electrodes for both conditions, with slightly more delta power present in the self-initiated trials. In contrast, delta band power during the post-movement epochs was much larger, especially for the triggered trials, which showed nearly double the delta band power of the rest condition. The same level of increase was not observed over the full scalp in the self-initiated trials, although delta band power over the central midline electrodes increased by nearly 100%. Alpha band power was similar to quiet periods across most electrodes (note the difference in scale between alpha and delta power in **Figure 4**). Bilateral alpha band ERD was observed in both conditions; however for the triggered trials the ERD was less prominent and restricted to the central sensorimotor and parietal electrodes, while frontal and peripheral electrodes showed a slight increase in alpha power. Conversely, alpha ERD was stronger in the self-initiated condition, especially in the central-parietal areas of the scalp.

**FIGURE 4 |** Change in delta and alpha band power across all electrodes and subjects for the pre-movement and post-movement epoch (movement onset to 1.5 s after onset) relative to the quiet state for both the triggered and self-initiated conditions.

We found the presence of MRPs to be variable across subjects and conditions. In 3 subjects, MRPs were prominent across the scalp during the self-initiated movement epochs but not during the triggered movements (**Figure 5A**). For the remaining subjects, less prominent MRPs were present at some electrodes for both conditions (**Figure 5B**). We examine the relationship between MRP and classification accuracy in more detail below.

## **CLASSIFIER VALIDATION**

The LFDA-GMM classification accuracy surface followed a similar pattern for most subjects (**Figure 6**), rising sharply as the size of the reduced subspace (*r*) increased. Accuracy typically peaked for *r* values between 50 and 125 before decreasing slightly, and then reaching a plateau as the value of *r* was further increased. Classification accuracy was generally insensitive to the *knn* parameter with the exception of very low *r* values. The optimal parameter set for each subject and condition is provided in **Table 1**. Across subjects and conditions, the average dimension of the EEG-based feature space following LFDA was 88 (range 30–118), representing a significant reduction from the original size of 308. With few exceptions, the optimal accuracy was achieved using only one mixture component (*K* = 1) and thus, the LFDA-reduced EEG features were generally not strongly multimodal.

**FIGURE 5 | Example of movement related potentials (MRPs) recorded in two different subjects. (A)** MRP from S5 indicating a difference between triggered (black line) and self-initiated (red line) movements. **(B)** MRP from S9 indicating similar, less prominent MRPs for both the triggered and self-initiated trials. For each subject, MRPs were averaged across all 20 movements for each condition; movement onset is at 0 s.

*The table indicates the optimal parameter set for the triggered (white background) and self-initiated (shaded background) paradigms.*

The mean overall classification accuracy obtained from the 20-repetition cross-validation procedure for each subject and condition is shown in **Figure 7** along with the overall mean across all subjects for each condition. The mean accuracy across subjects was 74.1 ± 5.7% for the triggered condition and 78.0 ± 2.6% for self-initiated. Testing sample size was equal across the three classes (1900 samples per class for each subject and condition). Interestingly, there was no significant difference in overall accuracy between self-initiated and triggered movements across the entire group of subjects. For subjects S2, S4, S5, and S7, decoding accuracy was significantly greater (*p* < 0.01) for the self-initiated sit-to-stand and stand-to-sit transitions compared to the triggered paradigm. Two subjects, S1 and S3, showed significantly better classification accuracy for the triggered movements compared to self-initiated, though with less strength (*p* < 0.05). The normalized confusion matrix for each condition was computed by summing the total number of predicted samples for each class across all 10 subjects and then dividing each predicted sum by the actual class sample size (**Figure 8**). We also computed the overall kappa coefficient (Cohen, 1968; Carletta, 1996) for each condition, resulting in *κ* = 0.61 for triggered and *κ* = 0.67 for self-initiated. For both triggered and self-initiated conditions, the quiet class was decoded with the highest accuracy and misclassifications for the quiet class were evenly distributed between the two types of movement (sit and stand). Notably, classification accuracy for all three classes was slightly, though not significantly, higher during the self-initiated trials. The majority of misclassifications for sit and stand movements were in the quiet class regardless of condition. Classifier confusion between movement types was slightly larger for the triggered paradigm, with 10.2% of sit movements misclassified as stand (as opposed to 4.2% for self-initiated) and 7.6% of stand movements misclassified as sit (compared to 3.0% for self-initiated).
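The normalization of the confusion matrix and the kappa coefficient can be sketched as below, using the standard definition of Cohen's kappa. The counts are invented for illustration and do not reproduce **Figure 8**. When every class has the same number of test samples, the chance agreement for three classes is exactly 1/3, so kappa reduces to (accuracy − 1/3)/(2/3).

```python
def normalize_rows(counts):
    """Divide each row (actual class) by its total so that every row sums to 1."""
    return [[c / sum(row) for c in row] for row in counts]

def cohen_kappa(counts):
    """Cohen's kappa from a square confusion matrix of raw counts
    (rows: actual class, columns: predicted class)."""
    total = sum(sum(row) for row in counts)
    p_obs = sum(counts[i][i] for i in range(len(counts))) / total
    row_tot = [sum(row) for row in counts]
    col_tot = [sum(col) for col in zip(*counts)]
    p_exp = sum(r * c for r, c in zip(row_tot, col_tot)) / total ** 2
    return (p_obs - p_exp) / (1.0 - p_exp)

# Toy counts with 1900 test samples per class (quiet, sit, stand).
counts = [
    [1700, 100, 100],
    [300, 1500, 100],
    [300, 100, 1500],
]
normalized = normalize_rows(counts)
kappa = cohen_kappa(counts)
accuracy = sum(counts[i][i] for i in range(3)) / 5700
```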

**FIGURE 8 | Normalized confusion matrices for the three class decoding problem for (A) triggered and (B) self-initiated conditions.** The confusion matrices were computed by totaling the predicted number of samples from each class across all 10 subjects and dividing by the total number of samples from each. For each repetition of the sub-sampling cross-validation procedure there were 1900 samples included in each class. The overall kappa coefficient for each condition is included in parentheses.

To assess the relationship between classifier accuracy and MRPs, we computed the grand median area under the MRP curve for each condition and subject in a three-step process. We first computed the area under the MRP of each channel for each movement epoch; a negative number for this area indicated a larger MRP presence. Next, we computed the median area under the curve for each electrode, and then we took the grand median area across all electrodes. We plotted this value against the mean classification accuracy for both the self-initiated and triggered conditions (**Figure 9A**). Surprisingly, we did not find a strong correlation between area under the MRP curve and classification accuracy (*R*<sup>2</sup> = 0.09). Based on our prior observation that some subjects showed more prominent MRPs during the self-initiated movement compared to triggered, we computed the individual change in accuracy and the change in median area under the MRP curve across these conditions for each subject (**Figure 9B**). There was a slightly stronger, but still modest (*R*<sup>2</sup> = 0.27), correlation between individual change in accuracy and area under the MRP curve. Interestingly, the subject with the most visually prominent difference in MRP between conditions (S5, **Figure 5A**; blue arrow in **Figure 9B**) showed the second largest increase in accuracy between the self-initiated and triggered conditions. However, the subject with the largest increase in accuracy across conditions (S8, red arrow in **Figure 9B**) showed only a moderate increase in area under the MRP curve. The two subjects with significantly greater accuracy for the triggered condition also had larger areas under the MRP curve in that condition (**Figure 9B**).

**FIGURE 9 | (A)** Classification accuracy plotted against the grand median area under the MRP curve for each subject and condition; a negative area indicates the presence of larger MRPs. The coefficient of determination (*R*<sup>2</sup>) is indicated. **(B)** The change in decoding accuracy across conditions plotted against the change in area under the MRP curve for each subject. A large negative value for change in area indicates a stronger MRP presence during the self-initiated condition, while a large positive value indicates a stronger MRP presence during the triggered condition; values close to zero indicate similar MRPs for both conditions. The coefficient of determination (*R*<sup>2</sup>) is indicated. The two participants with the largest difference in accuracy across conditions are indicated by the arrows.

#### **CLASSIFICATION BY ROI**

The mean and subject-specific classification accuracy was lower for all four ROIs than with the full set of non-peripheral electrodes for both self-initiated and triggered movements (**Figure 10**), a result that was expected due to the lower number of electrodes used for classification. Of note, however, was that despite the differing number of electrodes within each ROI, we observed few significant within-subject differences in accuracy for each condition (**Figures 10B,C**). Similarly, when accuracy was averaged across the 10 subjects, there were no statistically significant differences between the ROIs for either condition. To assess the effect of self-initiated vs. triggered movements, we computed the within-subject difference in accuracy for each ROI between these conditions (**Figure 10D**). A majority of participants (8/10) showed similar or significantly greater accuracy for all four ROIs in the self-initiated condition. The two subjects (S1 and S3) who showed significantly greater accuracy for the triggered movements with the full set of electrodes also showed greater accuracy in several, but not all, ROIs in this condition. Interestingly, when the difference was averaged across subjects, only the motor strip ROI showed significantly increased classification accuracy for the self-initiated condition. Indeed, decoding accuracy of movement intent during self-initiated sitting and standing using the motor ROI was significantly greater than during triggered movement in 7/10 subjects, similar in 2/10 subjects, and decreased in only 1/10 subjects.

#### **EFFECTS OF ARTIFACT REMOVAL**

To examine the effect of the ASR artifact rejection algorithm, and the potential effect of motion or other artifacts on classification accuracy, we repeated the classifier optimization and cross-validation procedure for the self-initiated condition using three control data sets and compared those with the original preprocessing (**Figure 11**). The original data set is termed ASRpre in **Figure 11**. The first control data set was composed of the same pre-movement epochs consisting of 1.5 s of EEG data recorded immediately prior to movement onset; however, ASR was omitted from the pre-processing (**Figure 1**). This data set is termed Rawpre. We also decoded movement intent using an equally sized epoch encompassing the 1.5 s time period immediately after movement onset. We processed these data with (ASRmove) and without (Rawmove) the ASR artifact rejection algorithm. We found that the ASR algorithm had no statistically significant effect on accuracy when using the pre-movement epochs to decode movement intent (**Figure 11**). This result was consistent for every subject and when accuracy was averaged across all subjects. When movement type was classified with EEG from epochs immediately after movement onset, a statistically significant increase in accuracy was observed in every subject when the data were not cleaned with ASR (Rawmove). Application of the ASR algorithm (ASRmove) resulted in a statistically significant drop in accuracy for decoding with the post-movement epochs in 9/10 subjects. When averaged across participants, no significant difference in accuracy was observed between ASR cleaned pre- and post-movement epochs, while accuracy was significantly higher for decoding with raw post-movement data.

#### **SIMULATED REAL-TIME CLASSIFICATION**

The results of simulated real-time decoding using cleaned EEG data are shown in **Figure 12**. Class-wise accuracy in this demonstration differed from that observed in the cross-validation (**Figure 8**), an effect caused by the training sample bias inherent to the two-fold procedure used for the demonstration. The quiet class (0) contains a larger number of samples than either stand-to-sit (class 1) or sit-to-stand (class 2), resulting in very high accuracies during quiet periods. Confusion between classes 1 and 2 was present during most transitions; the low number of transitions used in this demonstration likely contributed to this confusion. Errors at the beginning and end of the movement periods skewed toward class 0 (quiet).

#### **DISCUSSION**

#### **CLASSIFICATION OF SELF-INITIATED AND TRIGGERED MOVEMENT FROM PRE-MOVEMENT EEG**

Our results demonstrate successful, high accuracy classification of movement intent in healthy individuals from delta-band EEG recorded before movement execution. We framed our experiment into a three-class problem where each time point was classified into one of three states: quiet, stand-to-sit transition, or sit-to-stand transition. It is important to note that we trimmed the time periods of actual movement execution, as determined from EMG activity, from our EEG recordings. Thus, our classifier was trained and tested using mutually exclusive EEG datasets recorded during either quiet standing or quiet sitting but when subjects presumably were preparing for the incoming action. We labeled each time point in the 1.5 s epoch before movement onset according to the type of movement that was executed in the future: stand-to-sit or sit-to-stand. All other time points were placed into a single quiet class. Classification ability was assessed in two different movement execution paradigms, one that was cued by an audio signal (triggered) and one that was self-paced (self-initiated). Interestingly, we observed no statistically significant difference in classification accuracy between these two conditions, though average accuracy across the 10 subjects was slightly higher for the self-initiated condition (78.0 ± 2.6%) compared to triggered (74.7 ± 5.7%) and both of these were significantly better than chance accuracy of 33.3%.

**FIGURE 10 | Classification accuracy by region of interest (ROI). (A)** Scalp map indicating the electrodes included in each ROI. **(B)** Average decoding accuracy ±1 standard deviation (*n* = 20) using the optimized LFDA-GMM algorithm for each ROI and subject during the self-initiated condition. **(C)** Average decoding accuracy ±1 standard deviation (*n* = 20) using the optimized LFDA-GMM algorithm for each ROI and subject during the triggered condition. Hash marks indicate statistically significant differences (*p* < 0.05) for a given subject and condition based on Friedman's test. **(D)** The mean difference in pre-movement decoding accuracy between the self-initiated and triggered conditions for each subject ±1 standard deviation. Asterisks (∗) indicate differences which were statistically significant (*p* < 0.05) from a distribution with a median of zero based on the sign test.

**FIGURE 11 |** The classifier was trained and tested for the self-initiated case using pre-movement epochs with the original pre-processing pipeline (ASRpre, green) and using pre-movement epochs omitting ASR from pre-processing (Rawpre, red). As a control, the classifier was also trained and tested using equally sized epochs (1.5 s) immediately following movement onset that were pre-processed with (ASRmove, gray) and without (Rawmove) ASR for artifact rejection.

Prominent MRPs were not visible in all subjects (**Figure 5**) and we found almost no correlation between median area under the MRP curve and classification accuracy (**Figure 9A**). For within-subject comparisons between conditions, we observed significantly better accuracy in four of ten subjects during the self-initiated compared to triggered paradigm, while two subjects had higher accuracy for triggered standing and sitting. When examining subject-specific changes in accuracy across the two different paradigms, we found a slightly stronger correlation between increased accuracy and area under the MRP curve. The two individuals who showed a decrease in accuracy in the self-initiated vs. triggered trials also showed an increased area under the MRP curve, indicating less prominent MRPs. These results appear to contradict previous examples which indicated that MRPs may be more prominent in self-paced vs. cued movement paradigms (Jahanshahi et al., 1995; Jankelowitz and Colebatch, 2002; Cui and MacKinnon, 2009). There are several possible explanations. First, our experimental paradigm included a relatively low number of epochs (*n* = 20) for each condition, compared to traditional studies of MRPs which typically utilize close to 100 (Shibasaki and Hallett, 2006). This low number of epochs may be the reason for the large variability in the presence of MRPs (**Figure 5**). Additionally, in the self-paced experiment, participants were instructed to pause 3–10 s between each movement, though they were also instructed not to count the seconds between each movement. As a result, participants rarely waited 10 s between self-paced movements; most periods of quiet lasted 5 s or less. Previous studies have observed trial-to-trial variation in timing and power of MRPs relating to self-paced left and right hand movements, making classification of those movements using low frequency features more difficult (Bai et al., 2007).
Another study found that while they were present for most, but not all, subjects and movements, low frequency features were less critical than ERD/ERS in classifying four different types of movement from EEG (Morash et al., 2008). The latter study utilized the contingent negative variation (CNV), which is a low frequency, event-related potential entailing a widespread negative shift in EEG observed in paradigms involving conditional and imperative stimuli (Walter et al., 1964). While our paradigm did not involve dual stimuli, it is possible that some participants experienced a similar effect due to the alternating nature of the movements. That is, completing the previous maneuver (sitting or standing) may have created a conditional response in which the subject then began to prepare for the next movement, which would be the opposite of the prior one. This conditional response may be another reason that we did not observe prominent MRPs in some subjects. Indeed, trial-to-trial variation in CNV amplitude has been described previously and this variation may be representative of anticipated events and/or fluctuations in attention to the task (Scheibe et al., 2010). The observed variation in MRPs may also be responsible for the skewed misclassification of sit and stand movement intentions as quiet (**Figure 8**). Note that while the full time series of EEG data contained more samples in the "quiet" class than in the "sit" and "stand" classes, an equal amount of data from each class was used for cross-validation, and thus, this pattern of misclassification was not a result of training bias.

**FIGURE 12 |** Time series of simulated real-time classification decisions.

Variable timing of movement execution and conditional response may have affected the prominence of MRPs, but it did not hinder classification accuracy. One reason for this may be the time-embedding of our classification features which encompassed information from up to 50 ms before the current time point, helping to alleviate previously reported MRP-based feature variability (Bai et al., 2007). Low frequency EEG has been shown to contain information regarding intention (Lew et al., 2012), direction (Liao et al., 2007; Vuckovic and Sepulveda, 2008; Waldert et al., 2008; Robinson et al., 2013), velocity (Bradberry et al., 2010), and type (Agashe and Contreras-Vidal, 2013) of hand movement. In the lower extremity, the ability to detect voluntary ankle dorsiflexion movement from MRPs with accuracies up to 80% has been reported (Niazi et al., 2011; Xu et al., 2014). During walking, intra-stride changes in electrocortical activity coupled to gait phase have been observed at frequencies as low as 3 Hz (Gwin et al., 2011) and inter-limb and intra-limb kinematics (Presacco et al., 2011, 2012) as well as the intention to start and stop walking (Kilicarslan et al., 2013) have been decoded using delta band EEG. In another recent study, features extracted from the delta band were the most heavily weighted for single trial classification of walking movement intention from EEG recorded prior to movement (Velu and de Sa, 2013). Our results, which classified lower extremity movement type using pre-movement EEG, corroborate these findings and provide further evidence that low frequency EEG contains discriminative information pertaining to lower extremity movement intent.

#### **CLASSIFICATION BY REGION OF INTEREST**

The results from our ROI analysis (**Figure 10**) support the hypothesis that stand-to-sit and sit-to-stand transitions are preceded by event-related activity across a distributed, sparse cortical network. As expected due to the reduced number of electrodes, no ROI reached the classification accuracy attained when all electrodes were included in the classifier. When averaged across subjects, there were no statistically significant differences in classification accuracy between the ROIs for either condition, despite the difference in number of electrodes. The ROI analysis also revealed a statistically significant increase in accuracy for within subject differences across conditions (self-initiated vs. triggered) when using only the electrodes over the motor area. A similar difference was not found for any other ROI or for the entire scalp. This result suggests that the primary motor cortex (M1) region contains more discriminative information for identification of standing and sitting intention when the movements are self-initiated compared to cued. This finding is supported by previous work indicating MRPs from this region differ when the motor task emphasized sequence initiation compared to rhythm (Bortoletto et al., 2011). EEG recorded from these electrodes has also been demonstrated to most accurately track movement initiation using other frequency bands such as mu/alpha ERD and beta ERS (Wolpaw et al., 2002).

#### **ARTIFACT SUBSPACE RECONSTRUCTION**

This study, along with previously mentioned work, establishes compelling evidence for neural correlates of movement within EEG signals recorded immediately prior to movement execution; however, it is important to address the possible role of artifacts, both physiological (such as muscle and eye activity) and non-physiological (such as movement). Our signal processing approach for classifier training and evaluation (**Figure 1**) was designed to minimize the effect of artifacts in several ways. First, we eliminated frontal, temporal, and occipital electrodes, which can be contaminated by EMG and/or EOG artifacts. Second, we trimmed from our data set all EEG that was recorded during periods of movement, as indicated by lower extremity EMG, leaving only EEG recorded during periods of quiet sitting and standing for classification. Third, we applied a PCA-based artifact rejection algorithm (ASR) that was designed to eliminate high amplitude and high variance artifacts, such as those from movement or muscle, from EEG (Mullen et al., 2013). Our pre-processing analysis demonstrated similar power spectral density between rest (quiet standing) and pre-movement periods under both conditions (**Figure 3**), suggesting that our pre-processing steps were effective in removing artifacts from EEG. We also observed alpha ERDs in the period immediately following movement onset (**Figure 4**), especially during self-initiated trials, an observation that would have been unlikely if muscle activity had remained in the cleaned EEG signals, since EMG tends to have power in this frequency band.

To further elucidate the possible role of artifacts and the steps taken to mitigate them, we compared LFDA-GMM classifier performance when it was trained and tested with three different control data sets against performance with our original processing pipeline (**Figure 11**). This analysis showed no statistically significant difference in accuracy, regardless of whether the pre-movement EEG was cleaned with ASR or not, suggesting that artifacts were not present and therefore did not affect classification using the pre-movement epochs. We did observe a significant increase in accuracy when the pre-movement epochs were replaced with equally sized epochs immediately following movement onset that had not been cleaned using ASR. After ASR cleaning, classification accuracy was commensurate with pre-movement epochs, although with a slightly larger standard deviation across subjects. The increased accuracy using post-movement epochs without ASR suggests that artifacts may have been present during this time and these artifacts may have enhanced decoding accuracy. The decreased accuracy following ASR suggests that this algorithm is effective at removing high amplitude artifacts from EEG data. This conclusion is further supported by the simulated real-time demonstration using ASR-cleaned data. The time periods after movement onset were included in the quiet class during training and were decoded with high accuracy during testing (**Figure 12**). However, caution should be exercised regarding the conclusion that ASR completely eliminates low frequency, high amplitude artifacts. We note that while we did observe alpha ERD in ASR-cleaned post-movement epochs, we also observed enhanced power in the delta band across the scalp, particularly in the triggered condition (**Figure 4**). One possible explanation for the post-movement increase in delta band power in the triggered trials could be residual head movement and/or muscle artifacts as the participant reacted to the audio cue to stand or sit. 
Further spectral, topographical, and temporal analysis should be undertaken to parse movement-related artifacts from true electrocortical sources recorded during the actual sitting and standing movements. In particular, the parameters of the ASR algorithm can be optimized to more aggressively remove artifacts at the expense of potentially removing true EEG. We emphasize that our primary analysis involved only EEG from pre-movement and quiet periods, thereby limiting the contribution of these potential artifactual components as indicated by the above analysis.

#### **EEG USE IN REHABILITATION AND RESTORATION OF MOVEMENT**

To our knowledge, this is the first study that classifies this type of gross, full lower extremity movement intention (sit-down, stand-up, or quiet) from non-invasive EEG signals. Previously, surface EMG from leg muscles has been used with an LDA classifier to identify standing and sitting transitions in amputees with accuracies greater than 99% (Zhang et al., 2012). Achievement of these high accuracies required the use of a post-processing majority voting step, which resulted in a decision delay of up to 400 ms. Another approach has deployed center of pressure to detect sitting and standing transitions in individuals with paraplegia (Quintero et al., 2011). Classification of sitting and standing using EEG offers advantages over these approaches. On average, we were able to achieve 78% accuracy using features extracted from the pre-movement epochs with no post-processing required, thereby minimizing delay between movement intention and classification. It should be noted that our classification accuracy was assessed using single time points that were randomly selected from each trial. This conservative approach was necessary to prevent model over-fitting during training and to assure an equal number of data points in each class during testing due to the relatively low number of movements executed (20 per condition) for each subject. An example of the LFDA-GMM algorithm in a simulated real-time environment is shown in **Figure 12**. We note that classifier training was not optimal for this demonstration; only 5 stand-to-sit and sit-to-stand transitions were employed. Further, clinical deployment of the classifier as a component of a BMI could be significantly improved by addition of an aggregate post-processing step (such as requiring a number of consecutive time points to be predicted as the same movement type, or a sliding window moving average with a threshold) to trigger a change in state. 
The parameters of this post-processing step need to be tuned for each subject and application to maximize accuracy and minimize false positives. Future studies will investigate this possibility and the tradeoff between gains in accuracy and increased classification latency from post-processing.
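The consecutive-decision rule mentioned above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `debounce_predictions`, the parameter `n_consecutive`, and the choice of "quiet" as the starting state are assumptions made for illustration.

```python
from typing import Iterable, List


def debounce_predictions(
    predictions: Iterable[str],
    n_consecutive: int,
    initial_state: str = "quiet",  # assumed starting state, not from the source
) -> List[str]:
    """Return the output state at every time point.

    The state changes only after `n_consecutive` identical raw predictions
    that differ from the current state (hypothetical post-processing rule).
    """
    if n_consecutive < 1:
        raise ValueError("n_consecutive must be >= 1")
    state = initial_state
    candidate = None  # label whose run is currently being counted
    run_length = 0
    states = []
    for label in predictions:
        if label == candidate:
            run_length += 1
        else:
            candidate = label
            run_length = 1
        # Commit to a new state only once the run is long enough.
        if candidate != state and run_length >= n_consecutive:
            state = candidate
        states.append(state)
    return states


raw = ["quiet", "stand", "quiet", "stand", "stand", "stand",
       "quiet", "quiet", "quiet"]
print(debounce_predictions(raw, n_consecutive=3))
```

In this sketch an isolated "stand" prediction is ignored, while a run of three triggers the state change; the price is a latency of `n_consecutive - 1` additional classifier decisions, which is the accuracy versus latency tradeoff noted in the text.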

One drawback of utilizing GMM-based classifiers is the size of the parameter space which must be learned, which is given by $K\,(1 + d\,(d - 1)/2) + K\,d$, where $K$ is the number of Gaussian components in the mixture and $d$ is the dimensionality of the data to be fit (Li et al., 2012). To fit a GMM to our time-embedded EEG-based feature data set, which includes data from 28 channels of EEG at 11 time points and a maximum of $K = 10$ components for a given class, requires learning a parameter space of dimension $4.76 \times 10^{5}$. Our results demonstrate that LFDA is a powerful dimensionality reduction technique; the median dimension of the reduced subspace was 96 (**Table 1**), representing a median reduction of 69% across subjects. LFDA reduced the size of the GMM parameter space by an order of magnitude, resulting in a large decrease of computation time to fit the models of the classifier. Classifier optimization and training were performed using custom software developed in Matlab®, including the parallel processing toolbox, run on a dual core PC (2.40 GHz, 24 GB RAM). On average, optimization across the full LFDA-GMM parameter space was complete in less than 15 min per subject, and training of the optimized LFDA-GMM classifier in less than 5 min. If deployed for control of an assistive device, LFDA-GMM classifier optimization and training may be required before each session of use; these results suggest this is feasible. Examination of the optimization surface (**Figure 6**) shows that gains in accuracy level off at moderate values of *r* while accuracy is relatively insensitive to *knn*. The same trend is observed in all subjects, with some showing decreases in accuracy for increasing *r*-values, while in others there is no difference in accuracy as the parameter values are increased. Thus, these parameters could be limited to smaller values, thereby reducing the parameter space to be searched during LFDA-GMM optimization. 
However, the optimal parameter set is expected to vary with the task and also with the ability of the subject to learn how to operate the BMI over time, and so caution should be exercised when determining the upper limits. Also, full covariance matrices ($\Sigma_k$) were deployed for each component of the GMMs; however, if the subspace of the data following LFDA dimensionality reduction was large, diagonal covariance matrices could be employed to speed classifier training.
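The parameter counts quoted above can be checked with a short worked calculation. The sketch below simply evaluates the expression as stated in the text for the full feature space (28 channels × 11 time points) and for the median reduced dimension of 96; the function name `gmm_param_count` is a label chosen for this illustration.

```python
def gmm_param_count(k: int, d: int) -> int:
    """Parameter count as stated in the text: K*(1 + d*(d-1)/2) + K*d."""
    # d*(d-1) is always even, so integer division is exact.
    return k * (1 + d * (d - 1) // 2) + k * d


full_dim = 28 * 11                      # 308 time-embedded features
full = gmm_param_count(10, full_dim)    # K = 10 components
reduced = gmm_param_count(10, 96)       # median LFDA subspace dimension

print(full)            # 475870, i.e. about 4.76e5
print(reduced)         # 46570
print(full / reduced)  # roughly a tenfold reduction
```

The ratio of about ten is consistent with the statement that LFDA reduced the size of the GMM parameter space by an order of magnitude.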

The LFDA-GMM classifier presented here could be incorporated into a closed loop BMI system with an exoskeleton to restore function to individuals with paralysis. Such a system would be comprised of a shared control paradigm, whereby the gross motor instruction (in this case, the intention to sit-down or stand-up) is extracted from the user's EEG and the commands to execute the movement are performed autonomously by the exoskeleton. In this setup, the exoskeleton would be triggered at the first time point in which the BMI detected a change in class; a process that would likely include a post-processing step requiring a sequence of consistent classifier decisions to trigger a change in state. The decoding algorithm would then be blanked so that no state changes could be triggered during the execution of a movement. Our observed accuracy of 78% in self-paced movements would need to be improved for clinical viability. However, the data used in this study were purely observational, while operation of a BMI is a learned skill that incorporates feedback to the user regarding performance; thus accuracy of the BMI may increase as the user gains additional experience with the device. In the future, EEG and EMG could be combined to create a comprehensive neural-machine interface for control of advanced prosthetics. The combined EEG-EMG interface could provide intuitive control of artificial limbs while minimizing delay between detection of voluntary movement intention and its execution. Our classification approach could also be used in an intervention to treat phantom limb pain, whereby a descending motor command is determined from EEG and a motorized prosthesis executes the movement providing afferent feedback which could obviate maladaptive cortical reorganization following amputation. EEG-based classification of movement intent could also be incorporated into a neurorehabilitation protocol to recover more normal motor function in individuals with neurologic impairments. 
For example, the EEG-based classifier would activate a device to assist movement, thereby creating more normal afferent feedback, which could enhance brain plasticity and speed motor recovery (Daly and Wolpaw, 2008). Such a strategy requires extraction of motion intent from the motor-impaired population; in this study only healthy able-bodied individuals were tested. Future studies will examine the ability to apply LFDA-GMM classification to individuals with central nervous system deficits with an aim toward neurorehabilitation strategies.

## **ACKNOWLEDGMENTS**

The authors thank Recep Ozdemir, Yongtian He and Shahriar Iqbal for assistance during the experiments. This research was funded in part by the intramural research program at the NIH Clinical Center, NINDS R01NS075889, and NSF IIS-1302339.

### **REFERENCES**


therapy in motor-incomplete spinal cord injury. *Neurorehabil. Neural Repair* 19, 313–324. doi: 10.1177/1545968305281515


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 19 June 2014; accepted: 04 November 2014; published online: 25 November 2014.*

*Citation: Bulea TC, Prasad S, Kilicarslan A and Contreras-Vidal JL (2014) Sitting and standing intention can be decoded from scalp EEG recorded prior to movement execution. Front. Neurosci. 8:376. doi: 10.3389/fnins.2014.00376*

*This article was submitted to Neuroprosthetics, a section of the journal Frontiers in Neuroscience.*

*Copyright © 2014 Bulea, Prasad, Kilicarslan and Contreras-Vidal. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Temporal alignment of electrocorticographic recordings for upper limb movement

## *Omid Talakoub<sup>1,2</sup>, Milos R. Popovic<sup>2,3</sup>, Jessie Navaro<sup>4</sup>, Clement Hamani<sup>4,5</sup>, Erich T. Fonoff<sup>4</sup> and Willy Wong<sup>1,2</sup>\**

*<sup>1</sup> Department of Electrical and Computer Engineering, University of Toronto, Toronto, ON, Canada*

*<sup>2</sup> Institute of Biomaterials and Biomedical Engineering, University of Toronto, Toronto, ON, Canada*

*<sup>3</sup> Rehabilitation Engineering Laboratory, Toronto Rehabilitation Institute, University Health Network, Toronto, ON, Canada*

*<sup>4</sup> Division of Functional Neurosurgery of Institute of Psychiatry, Department of Neurology, University of Sao Paulo Medical School, Sao Paulo, Brazil*

*<sup>5</sup> Division of Neurosurgery, Toronto Western Hospital, University of Toronto, Toronto, ON, Canada*

#### *Edited by:*

*Dario Farina, Georg-August University, Germany*

#### *Reviewed by:*

*Ricardo Chavarriaga, Ecole Polytechnique Fédérale de Lausanne, Switzerland*

*Friedhelm C. Hummel, University Medical Center Hamburg-Eppendorf, Germany*

#### *\*Correspondence:*

*Willy Wong, Department of Electrical and Computer Engineering, University of Toronto, Toronto, ON M5S 3G4, Canada; e-mail: willy.wong@utoronto.ca*

The detection of movement-related components of the brain activity is useful in the design of brain-machine interfaces. A common approach is to classify the brain activity into a number of templates or states. To find these templates, the neural responses are averaged over each movement task. For averaging to be effective, one must assume that the neural components occur at identical times over repeated trials. However, complex arm movements such as reaching and grasping are prone to cross-trial variability due to the way movements are performed. Typically initiation time, duration of movement and movement speed are variable even as a subject tries to reproduce the same task identically across trials. Therefore, movement-related neural activity will tend to occur at different times across the trials. Due to this mismatch, the averaging of neural activity will not bring into salience movement-related components. To address this problem, we present a method of alignment that accounts for the variabilities in the way the movements are conducted. In this study, arm speed was used to align neural activity. Four subjects had electrocorticographic (ECoG) electrodes implanted over their primary motor cortex and were asked to perform reaching and retrieving tasks using the upper limb contralateral to the site of electrode implantation. The arm speeds were aligned using a non-linear transformation of the temporal axes resulting in average spectrograms with superior visualization of movement-related neural activity when compared to averaging without alignment.

**Keywords: electrocorticography, ECoG, arm movement, dynamic time warping, kinematics, movement classification**

## **INTRODUCTION**

The challenge for a brain-machine interface (BMI) is to decode user intent and to transform neural signals into signals which drive an external device like a prosthetic arm. This technology holds enormous potential as an assistive device for individuals with limited ability to perform voluntary movements. Examples of populations that may benefit from this technology include individuals with stroke, advanced stages of amyotrophic lateral sclerosis (Kübler et al., 2001, 2005; Bai et al., 2008; Nijboer et al., 2008), severe cerebral palsy (Pfurtscheller et al., 2000), and high level cervical spinal cord injury (Wolpaw et al., 2000). However, the construction of a BMI platform is predicated first on the ability to identify the salient neural activity associated with upper limb movement (Pfurtscheller et al., 2003; Leuthardt et al., 2004; Foffani et al., 2005; Rickert, 2005; Chin et al., 2007; Schalk et al., 2007; Bai et al., 2008; Ball et al., 2008, 2009; Miller and Ojemann, 2009; Tzagarakis et al., 2010; Zhuang et al., 2010).

One challenge in uncovering the salient neural activity associated with movement is the variability of neural activities and the low signal-to-noise ratio (SNR) that is commonly found for electrophysiological recordings. While gamma power tends to be higher in ECoG recordings, not all frequency bands have high SNR. Moreover, the clarity of gamma activity may depend on individual factors including placement of electrodes. As such, developing techniques for applications with low SNR is crucial for uncovering movement-related activities. Typically, noise and variability are dealt with by averaging a large number of repeated trials. To do this, however, one must assume that the neural activity is time-locked to motor-specific events like movement onset. While the evoked brain activity from external stimuli or highly constrained motor tasks can be thought of as being identical on a trial-by-trial basis, this is certainly not true when a subject is performing a complex movement task like reaching. Due to the difficulty in constraining the movement of the arm, it would be unwise to simply take all trials and average them. Instead, we propose a method of realignment through a non-linear transformation in time. This transformation accounts for differences in movement initiation, arm speeds, and movement durations. After alignment, we expect the neural activities to occur at near identical times, so that the related neural activities can be more effectively brought into salience through averaging.

Other approaches seek to decode movement intention in real time without the use of template matching (e.g., Wang et al., 2012). Our study is more modest in that our primary focus is on movement classification. This paper concerns the initial application of time warping for movement co-registration. We will discuss the application to classification later in the Discussion Section.

## **BACKGROUND**

Averaging of neural activities is a standard practice in electrophysiology. For example, event-related potentials (ERPs) are time-averaged brain responses to a sensory or motor event. They are simple to calculate and are widely used in clinical neurology for diagnostic purposes. ERPs show mostly low frequency neural activity since the high frequency components tend to be out of phase from trial to trial, thereby canceling out through averaging. An alternative to averaging in the time domain is to average the time-frequency representations of the trials. The time-frequency representation of a signal (e.g., a spectrogram) details the spectral density of the signal as a function of time. Similar to averaging over time, averaging over time-frequency space can aid in highlighting the time-dependent spectral density of related neural activities. A triggered, synchronized decrease in band power is known as event-related desynchronization (ERD) and a corresponding increase is known as event-related synchronization (ERS). These events are measured with respect to a chosen baseline in activity. The baseline is typically set to the rest state.

The electrical activity of the brain can be recorded using a number of different methods including (1) electroencephalography (EEG) where electrodes are placed on the scalp, and (2) single neuron or neuronal ensemble recordings obtained through micro-electrodes placed intra-cortically in proximity of target neurons. Electrocorticography (ECoG) is a method of recording the electrical activities of the brain using macro-electrodes placed surgically over or under the dura. In this study, ECoG contacts were implanted over the dura. The signals obtained using these electrodes generally have higher SNR, a wider bandwidth, and higher spatial resolution when compared to electroencephalography recordings (Schalk et al., 2007; Schalk and Leuthardt, 2011). Additionally, this technology is less invasive than intracortical recordings since the electrodes do not penetrate the brain tissue. In this study, we processed the activity of the motor cortex recorded from two ECoG contacts in a bipolar arrangement.

Typically, neuromotor activities are aligned to a "go" signal (Ball et al., 2009; Reddy et al., 2009; Tzagarakis et al., 2010), to movement onset (Sergio and Kalaska, 1998; Moran and Schwartz, 1999; Rickert, 2005; Miller and Ojemann, 2009) or to movement termination (Jurkiewicz et al., 2006). The alignment strategy is determined in part by the experimental paradigm and by what questions the experimenters would like to answer from their data. For example, activities aligned to the "go" signal would allow for the study of movement preparation. The problem with event-based alignment is that it does not guarantee that the remainder of the trial is similarly aligned. If we were interested also in, say, movement termination, the trials would then need to be realigned to the end point of the movement cycle. To eliminate repeated analyses, we instead introduce a new method of alignment involving a non-linear transformation of time. We believe that this transformation can account for the temporal differences in the way a motor task is conducted.

Temporal alignment of biological signals is not new; for example, such techniques have been employed extensively as part of automatic speech recognition algorithms. Non-linear time warping has also been used to align physiological signals (Munhall et al., 2004) as well as neural signals (Picton et al., 1988; Wang et al., 2001; Casarotto et al., 2003; Cho et al., 2004; Karamzadeh et al., 2013). More recently, Pasley et al. (2012) explored the reconstruction of auditory speech features using ECoG recordings. Their work is quite similar to ours as they use dynamic time warping to realign a transformed representation of the neural signal. We have used warping in the context of aligning movement-related neural activity.

Dynamic time warping (Sakoe and Chiba, 1978) is a graph-based approach to calculate the time transform required to align two signals. The time transformation (or time registration path) is calculated by minimizing a cost function which measures the similarities between the time instances of two signals. Dynamic time warping has been used previously to align sensory evoked responses (Picton et al., 1988; Wang et al., 2001; Casarotto et al., 2003). Picton et al. used dynamic time warping to align the brain-stem auditory evoked response, showing improvement in the visualization of components over simple averaging (Picton et al., 1988). In a similar study, Wang et al. showed that the amplitude of the derived visual ERP can be increased by up to 76% after realignment using dynamic time warping (Wang et al., 2001). Casarotto et al. used dynamic time warping to quantify the latencies between the ERPs obtained from normal and dyslexic children (Casarotto et al., 2003). All of these studies show that a simple shift or linear scaling of time is not sufficient to align evoked components.
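As a rough illustration of how a time registration path can be computed, the following Python sketch implements a basic form of dynamic time warping. It assumes an absolute-difference local cost and the simplest step pattern (diagonal, vertical, horizontal) with no slope constraints or weighting; these choices, and the name `dtw_path`, are illustrative assumptions rather than the configuration used in any of the cited studies.

```python
from typing import List, Sequence, Tuple


def dtw_path(
    x: Sequence[float], y: Sequence[float]
) -> Tuple[float, List[Tuple[int, int]]]:
    """Return (total cost, path) aligning x to y.

    The path is a list of (index into x, index into y) pairs running from
    (0, 0) to (len(x) - 1, len(y) - 1).
    """
    n, m = len(x), len(y)
    if n == 0 or m == 0:
        raise ValueError("both signals must be non-empty")
    inf = float("inf")
    # acc[i][j]: minimal accumulated cost of aligning x[:i] with y[:j].
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])  # assumed local cost
            acc[i][j] = cost + min(
                acc[i - 1][j - 1], acc[i - 1][j], acc[i][j - 1]
            )
    # Backtrack from the end to recover the registration path.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        _, i, j = min(
            (acc[i - 1][j - 1], i - 1, j - 1),
            (acc[i - 1][j], i - 1, j),
            (acc[i][j - 1], i, j - 1),
        )
        path.append((i - 1, j - 1))
    path.reverse()
    return acc[n][m], path


# Two toy speed-like profiles; the second starts one sample later.
cost, path = dtw_path([0, 1, 2, 1, 0], [0, 0, 1, 2, 1, 0])
print(cost, path)
```

In the toy example the second profile is a delayed copy of the first, so the path repeats the first sample of `x` once and the alignment cost is zero. Note that here the delay could also be removed by a simple shift; the non-linear path becomes necessary when the speed or duration of the two profiles differs.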

Earlier studies of movement-related neural activity rely on simple or constrained movements to avoid problems with averaging of trials. However, this is not possible for a study involving complex movements due to the difficulty of constraining the movement of a participant to allow for careful control of arm kinematics. Instead, we introduce this new method of non-linear alignment to correct for the temporal mismatch.

## **MATERIALS AND METHODS**

#### **PARTICIPANTS**

Four male participants were recruited from the Functional Neurosurgery Clinic at the Hospital das Clínicas of the University of São Paulo. Subject 1 was 51 years old, Subject 2 was 48 years old, Subject 3 was 42 years old, and Subject 4 was 58 years old. All participants were implanted with unilateral epidural quadripolar electrodes over the motor cortex for the treatment of chronic pain. After the insertion of the electrodes, patients had their systems externalized for 6 days for the selection of optimal stimulation parameters (polarity, amplitude, frequency, duration, etc.). Once these were chosen, the electrodes were connected to an implantable pulse generator during a second surgical procedure. The experiment took place over the 6 days during which the electrode leads were externalized. The study was approved by the University of São Paulo research ethics board, and all participants signed a letter of consent prior to taking part in the experiments.

#### **ELECTRODES AND POST-OPERATIVE RECORDINGS**

The placement and number of ECoG contacts were dictated by clinical requirements unrelated to the purpose of this study.

All participants were implanted with two quadripolar epidural electrode strips (Lamitrode 3240; St. Jude Medical Inc., U.S.A.). Each strip consisted of a single row of four platinum discs, 4 mm in diameter with a center-to-center distance of 10 mm, embedded in a silicon membrane. The strips were placed over the premotor, primary motor, and sensory cortices associated with the upper extremity representations. Contacts of the first strip were numbered 0–3 from distal to proximal, and the strip was positioned such that its second contact (contact #1) lay over the primary motor cortex. The location of this contact was confirmed using electrical stimulation and by observing muscle contractions of the contralateral upper limb; specifically, stimulation of contact #1 induced finger or wrist movements. Stimulation parameters were: (i) pulse frequency 50 Hz, (ii) pulse duration 100 µs, (iii) monopolar pulses, and (iv) pulse amplitude 3–10 µA. Contacts of the second strip were numbered 4–7 from distal to proximal. The second strip was placed dorsal to the first such that contact #5 (the second contact of the second strip) was positioned over the primary motor cortex and dorsal to contact #1. **Figure 1** shows an exemplary illustration of the location of the implanted electrodes with respect to the head and the cortical area associated with the upper limb.

**FIGURE 1 | Location of implanted ECoG contacts with respect to head representation.** Contacts of the first strip of electrodes are labeled 0–3 from distal to proximal and contacts of the second strip are similarly indexed 4–7. Primary motor cortex is colored in velvet and the primary sensory cortex is colored in amber. The area associated with the hand representation is marked in green.

In addition to the ECoG measurements, electroencephalography (EEG) signals were recorded at the C3/C4, Cz, Fz, and FP1 locations according to the 10-20 electrode placement system. The EEG was recorded only to identify and reject trials contaminated with eye movement or muscle artifacts; it was not otherwise used in the analyses. Moreover, electromyography (EMG) signals were obtained from the wrist flexors, wrist extensors, biceps, and triceps. The EEG, ECoG, and EMG signals were recorded using a sampling frequency of 1200 Hz with a 16-channel g.USBamp biosignal acquisition device (g.tec, Graz, Austria). The recording device has a built-in anti-aliasing filter which is dependent on sampling frequency. The anti-aliasing filter is an 8th order digital Butterworth filter with a pass-band of 0.1–500 Hz (g.USBamp manual, 2011). The activity of the motor cortex was recorded from two ECoG contacts in a bipolar arrangement (contacts #0 and #1). Contact #1 was chosen because its placement was verified using stimulation of the contact to induce finger or wrist movement. Contact #0 was chosen because it was adjacent to contact #1 and also lay above the motor cortex, which was verified using MRI.

#### **EXPERIMENTAL SETUP**

The participants sat in a comfortable chair. The upper limb movements were recorded using a three-dimensional (X, Y, Z) electromagnetic tracker, Fastrak (Polhemus Inc, U.S.A.), and custom data acquisition software written in C. A motion sensor was placed over the dorsal aspect of the third metacarpal bone of the hand. The three-dimensional position of the sensor was recorded with a sampling frequency of 40 Hz and was time stamped. The upper limb kinematics were recorded using the same computer that captured the ECoG, EEG, and EMG data. Thus the kinematic recordings were synchronized with the electrophysiological recordings.

#### **EXPERIMENTAL PROTOCOL**

ECoG, EEG, and EMG signals were recorded while the participants performed a reaching task. The task was carried out with the arm contralateral to the site of electrode implantation. The task was to reach a target placed 40 cm away from the chest, a distance the participant could reach comfortably. At the start of the task, the participant's hand was in a resting position on a pillow located on their lap. Under resting conditions, EMG muscle activity was not observed. The participants received an auditory cue ("go" signal) to start the reaching task. After completing the reach, the participants were instructed to wait for a few seconds before returning their hand to the initial resting place (retrieving task). The time between the end of the last trial and the cue signal of the next trial was randomized in the range of 8–10 s. Although the movements were voluntary, the reach task was triggered by an external "go" signal and was therefore not strictly self-paced. However, the return motion was self-paced in that the subject had control over when to initiate this movement.



Each participant performed at least 50 reaching tasks. The number of trials and the average length of the movements performed by each participant are shown in **Table 1**. Trials were extracted from the ECoG recordings for offline analysis. A trial was defined as the period beginning 4 s prior to and ending 8 s after the onset of the reaching task. Movement onset was defined as the instant at which arm speed exceeded a threshold of 0.5 cm/s and was designated as *t* = 0.
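
As an illustration only, the onset rule and the epoch window can be sketched in Python. The function names, the finite-difference speed estimate, and the convention of time-stamping each speed sample at the start of its sampling interval are assumptions made for this sketch; they are not taken from the acquisition software used in the study.

```python
import math


def arm_speed(positions, fs):
    """Speed (cm/s) from successive 3-D positions (cm) sampled at fs Hz.

    speed[i] is the finite difference between samples i and i + 1
    (assumed convention for this sketch).
    """
    return [math.dist(p0, p1) * fs for p0, p1 in zip(positions, positions[1:])]


def movement_onset(speed, threshold=0.5):
    """Index of the first speed sample exceeding the threshold, else None."""
    for i, s in enumerate(speed):
        if s > threshold:
            return i
    return None


def epoch_bounds(onset_time, fs, pre=4.0, post=8.0):
    """Sample range [start, stop) at rate fs covering pre s before to post s after onset."""
    start = round((onset_time - pre) * fs)
    stop = round((onset_time + post) * fs)
    return start, stop
```

With kinematics at 40 Hz and electrophysiology at 1200 Hz, the onset index found on the speed trace would be converted to seconds (index / 40) before calling `epoch_bounds(onset_time, 1200)`, which yields a 12 s epoch of 14400 samples.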

#### **ANALYSIS**

The data were analyzed in the time-frequency domain using a *spectrogram*. A spectrogram gives the windowed short-time Fourier transform of a signal by describing the frequency content of a signal and how it changes over time. Signals were windowed in segments of 100 ms using a Hamming window. A Fourier transform was then computed for the windowed signal resulting in a spectrum with resolution of 1 Hz. The window was then shifted forward by 10 ms, and the procedure was repeated until the end of the epoch was reached. The resulting spectrogram consists of a matrix where each row represents the power spectrum of a windowed signal. Each column of this matrix represents the time series of power at a particular frequency. Event-related changes (ERD/ERS) were calculated by normalizing each column (frequency) by the baseline power. The baseline was defined as the average power between 1 and 2 s prior to movement onset.
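
The procedure can be sketched as follows. This is a minimal, unoptimized illustration and not the authors' implementation: the power at each requested frequency is evaluated by a direct Fourier sum, which is equivalent to reading a zero-padded FFT at that frequency. Note that a 100 ms window has a native bin spacing of 10 Hz, so we assume the 1 Hz grid was obtained by such interpolation. The function names and the choice of normalizing by a ratio (rather than, say, a percentage change) are likewise assumptions.

```python
import math


def hamming(n):
    """Symmetric Hamming window of length n."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def power_at(segment, fs, freq):
    """Power of one windowed segment at a single frequency (direct Fourier sum)."""
    re = sum(x * math.cos(2 * math.pi * freq * k / fs) for k, x in enumerate(segment))
    im = sum(x * math.sin(2 * math.pi * freq * k / fs) for k, x in enumerate(segment))
    return re * re + im * im


def spectrogram(signal, fs, freqs, win_s=0.1, step_s=0.01):
    """Rows are time windows, columns are the requested frequencies."""
    n_win, n_step = round(win_s * fs), round(step_s * fs)
    window = hamming(n_win)
    rows = []
    for start in range(0, len(signal) - n_win + 1, n_step):
        seg = [x * w for x, w in zip(signal[start:start + n_win], window)]
        rows.append([power_at(seg, fs, f) for f in freqs])
    return rows


def erd_ers(spec, baseline_rows):
    """Divide each frequency column by its mean power over the baseline rows."""
    n_freq = len(spec[0])
    base = [sum(spec[t][j] for t in baseline_rows) / len(baseline_rows)
            for j in range(n_freq)]
    return [[row[j] / base[j] for j in range(n_freq)] for row in spec]
```

For the recordings described here one would call `spectrogram(ecog, 1200, range(1, 201))` and pass as `baseline_rows` the indices of the windows lying between 2 s and 1 s before movement onset. Values above 1 then indicate ERS and values below 1 indicate ERD.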

Spectrograms are typically averaged without realignment along the time axis. This is the usual method of finding the spectral density of neural activity and how it changes with respect to movement. We call this method *conventional averaging*. In the second method, the epochs were warped in time according to the arm kinematics prior to averaging. One trial is chosen as the "gold-standard" or reference trial—all the other trials are then warped to this reference trial using a *time-registration path*. Given that warping of the signal in the time-domain distorts its spectral content, the time transformations were not applied to the raw ECoG recordings. Instead, warping was performed over the spectral densities such that the neural activity corresponding to the same arm velocity is identical across all trials. In this case, not only will movement onset and offset be aligned, but the neural activity at each time-point will correspond to the same arm speed/position. We call this method *time-warped averaging*.

### **REALIGNMENT OF THE TRIALS USING DYNAMIC TIME WARPING ACCORDING TO ARM SPEED**

The dynamic time warping algorithm was implemented using a symmetric step pattern with no constraint on the slope, according to the method of Sakoe and Chiba (1978). Time-warping of the ECoG signals was carried out over arm velocity such that the velocity profiles would be identical after alignment was complete. For simplicity, we warped along the X-axis (orthogonal to the chest surface) since it is the axis along which the largest component of the movement took place. To determine the time-registration path, we chose the Euclidean distance as the cost function and found the corresponding time points across the two time axes such that the Euclidean distance in arm velocities was minimized. **Figure 2** shows an example of the registration path obtained by time-warping two trials. This process aligns the time course of the movements (and therefore the spectral density of the associated neural activity) to a "gold standard."
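
A minimal sketch of this alignment is given below, assuming the symmetric step pattern of Sakoe and Chiba (1978) with no slope constraint: weights of 1, 2, and 1 for the vertical, diagonal, and horizontal steps, and normalization by the sum of the two sequence lengths. For one-dimensional velocity the Euclidean distance reduces to an absolute difference. The helper `warp_rows` is hypothetical; it assumes that the spectrogram rows have already been resampled to the kinematic sample times, and it averages the trial rows that the path maps onto each reference time point.

```python
def dtw_path(ref, trial):
    """Align two 1-D velocity profiles; return (normalized cost, registration path)."""
    n, m = len(ref), len(trial)
    inf = float("inf")
    g = [[inf] * m for _ in range(n)]       # accumulated cost
    prev = [[None] * m for _ in range(n)]   # back-pointers for path recovery
    for i in range(n):
        for j in range(m):
            d = abs(ref[i] - trial[j])      # Euclidean distance in one dimension
            if i == 0 and j == 0:
                g[0][0] = 2 * d
                continue
            options = []
            if i > 0:
                options.append((g[i - 1][j] + d, (i - 1, j)))
            if j > 0:
                options.append((g[i][j - 1] + d, (i, j - 1)))
            if i > 0 and j > 0:
                options.append((g[i - 1][j - 1] + 2 * d, (i - 1, j - 1)))
            g[i][j], prev[i][j] = min(options)
    # Walk the back-pointers from the end of both sequences to the start.
    path = [(n - 1, m - 1)]
    while path[-1] != (0, 0):
        i, j = path[-1]
        path.append(prev[i][j])
    path.reverse()
    return g[n - 1][m - 1] / (n + m), path


def warp_rows(path, rows, n_ref):
    """Average the trial rows paired with each reference index along the path."""
    groups = [[] for _ in range(n_ref)]
    for i, j in path:
        groups[i].append(rows[j])
    return [[sum(col) / len(col) for col in zip(*grp)] for grp in groups]
```

The path is computed from kinematics only and is then reused to warp the spectrogram of the same trial, so that the raw ECoG signal is never stretched in the time domain.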

We performed statistical analyses to evaluate the efficacy of our approach. In this case, the reference or standard trial was permuted across all available trials. We compared the variability of the spectrograms when they were warped versus when they were not. The error obtained when the spectrograms are not warped comprises the null distribution. Error is defined as the RMS value calculated by a sum of square errors across all time-frequency cells. The error values resulted in two distributions and the distance between these two distributions provides a measure of the improvement due to warping. While non-parametric methods can be used in the analyses, we have found that in practice the log transform of the error resembles a normal-like distribution. Hence the *t*-test was used instead.
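
The error measure and the comparison of the two error distributions can be sketched as below. Several details are assumptions for illustration: the error is taken as the root of the mean (rather than the sum) of squared differences, the two spectrograms are assumed to have the same shape, and the unequal-variance (Welch) form of the *t* statistic is used because the text does not state which variant was applied. Converting the statistic to a *p*-value requires the *t* distribution, which is left to a statistics library.

```python
import math
from statistics import mean, variance


def rms_error(spec_a, spec_b):
    """Root-mean-square difference over all time-frequency cells."""
    squares = [(a - b) ** 2
               for row_a, row_b in zip(spec_a, spec_b)
               for a, b in zip(row_a, row_b)]
    return math.sqrt(sum(squares) / len(squares))


def welch_t(x, y):
    """Welch t statistic for two independent samples."""
    return (mean(x) - mean(y)) / math.sqrt(variance(x) / len(x) + variance(y) / len(y))


def compare_log_errors(errors_warped, errors_unwarped):
    """t statistic on log-transformed errors; negative values favor warping."""
    return welch_t([math.log(e) for e in errors_warped],
                   [math.log(e) for e in errors_unwarped])
```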

## **RESULTS**

## **CONVENTIONAL AVERAGING OF ECoG SIGNALS**

Overall, the time-frequency representation of the ECoG response shows a very distinct pattern of activity over the course of the movement. The power in the beta band (12–30 Hz) is attenuated during the course of movement execution while the power in the gamma band (65–140 Hz) is increased. These changes in power are statistically significant (*p* < 0.05, Kolmogorov–Smirnov test).

However, the variability in the time course of the velocity profile is significant from trial to trial. **Figure 3** shows 25 profiles of arm speed for Subject 1. The movement duration deviated from the average by as much as 600 ms across movement durations that last only 2 s. Spectral components after averaging are visible, but are either blurred or distorted due to temporal misalignment. To illustrate the effects of averaging without non-linear alignment, we carried out our analysis both by aligning to the movement onset of the reach task (**Figures 4A,D** and **Figures 5A,D**) as well as aligning to the movement onset of the retrieval task (**Figures 4B,E** and **Figures 5B,E**). In each case, gamma activity, for example, is scattered over the time-course of the movement and localized into multiple components. The breakup of the components is due to temporal misalignment, a result which we also observe with the EMG activities. Also, the average of the rectified EMG shows a sharply increasing signal at the onset of the reaching task (**Figures 4A** and **Figures 5A**), but as the trials become desynchronized over time they do not exhibit the same sharp decrease at the end of the reaching task, nor is the signal as visible at the onset/completion of the retrieval task. Quantitatively, the EMG signals fall in amplitude by as much as 50% after the completion of the reaching task, rendering the activity during the retrieval task almost undetectable. **Figures 4B,E** and **Figures 5B,E** show alignment with the onset of the retrieval task but there are similar problems here as well.

#### **TIME-WARPED AVERAGING OF ECoG SIGNALS**

It is clear that a translational shift in time is insufficient to properly align voluntary movements. Next we explore the results that can be obtained when a non-linear alignment is used. Results are shown in **Figures 4C,F** and **Figures 5C,F**, which we compare to the earlier results obtained through conventional averaging of the same trials. The movement-related components (ERD/ERS) are clearly delineated in the warped average spectrograms. Moreover, we see that the time-course of these components matches that of the muscular activity. That is, the gamma ERS and beta ERD appear only when the muscles are activated. Synchronization of the gamma activity is clearly observed for both the reaching and retrieval tasks as single components.

[Figure caption fragment: ... task with no time-warping used. Movement onset is denoted by a dash-dotted line ("-."). The average spectrogram was normalized with respect ... non-linear warping of the time axis prior to averaging. **(D–F)** Shows the same for Subject #2.]

Gamma activity for the rest period between the reaching and retrieval tasks is more complicated. For Subject 4, the activity returns to the baseline during the hold phase of the transition between the reaching and the retrieval tasks. For two of the subjects (Subjects 1 and 2), the activity for the hold phase is reduced, but does not go entirely to the baseline. Nevertheless we observe that time warping produces a much clearer "quiet period" in neural activity when compared to alignment by movement onset only. Subject 3 did not conduct the trials according to the instructions provided. There was no hold period in between the tasks and as such no clear drop in gamma activity was observed.

Statistical analysis shows that variability in the spectrograms was reduced significantly for all subjects (*p <* 1%) with the exception of Subject 1 (*p <* 40%). While variability for Subject 1 was also reduced after warping, it is of interest to investigate why the results differed dramatically for this one subject. When we examined the trajectories for this subject, we noticed that the subject often overshot the target point (this was observed in more than one third of their trials). Since the time-registration path is calculated from the trajectories themselves, we therefore calculated the RMS error from warping the kinematic trajectories alone. What we found was that the error for Subject 1 greatly exceeded the error of the other subjects (by at least a factor of 2).

## **DISCUSSION**

In this paper, we presented a new method for alignment of neural events over repeated experimental trials. Self-paced unconstrained voluntary movements are prone to movement variability in initiation time, duration and speed. The movement-related cortical activities would therefore occur at different time instances across different trials. These temporal mismatches add to the difficulty in identifying the salient neural response underlying movement activity. Given that there exists a direct functional relationship between neural activity and arm kinematics, we hypothesize that neural events can be better aligned if the kinematic profiles are identical on each trial. Kinematic signals were used to find a non-linear transformation of the time axis as calculated by the method of dynamic time-warping. The transformations serve to remove any temporal variabilities in the way the task was performed. The time transformations were then applied to the spectrogram of the corresponding ECoG signal. This resulted in well-delineated movement-related components including event-related synchronization/desynchronization. Finally, the spectrograms that are now aligned were averaged. The outcome is a vastly improved visualization of the neural activity for complex arm movements. When the results are compared to that of conventional averaging with trials aligned only by movement onset, we see instead the movement-related components to be either blurred or absent. The components found through alignment via warping can be traced back to specific events like movement termination and initiation.

Epochs of EEG data are traditionally aligned with respect to an event of interest (e.g., movement onset). However, if the investigator wishes to study another event in the same data set, the data must be realigned to a new marker. Dynamic time-warping eliminates the need for this as an entire trial is aligned in one go. This has the distinct advantage of allowing for the best possible representation of neural activity across an entire trial. Earlier works have indicated that there is a direct, functional linear relationship between cortical activity and arm velocity (Paninski, 2003; Leuthardt et al., 2004; Chin et al., 2007; Schalk et al., 2007; Pistohl et al., 2008; Bougrain and Liang, 2009; Ganguly et al., 2009; Zhuang et al., 2010). We have made use of this relationship to develop our method. The trials are aligned in accordance with the hypothesis that specific patterns of neural activity (particularly in the beta and gamma activity bands) correspond linearly with movement velocity. Alignment with velocity would therefore result in the alignment of neural activity.

We chose relatively well-delineated motor tasks, but not because of any limitation in our methodology. So long as the association with velocity holds, the method of temporal alignment should work for more complex tasks (e.g., a task involving both reaching and manipulation). However, it is entirely conceivable that our current approach is not complete and that future studies will allow for better methods of temporal alignment involving more complicated movement tasks.

Our findings with ECoG data show a very distinct pattern of activity over the course of the movement. The power in the beta band (12–30 Hz) was found to be attenuated during the motor tasks whereas power in the gamma band (65–140 Hz) increased correspondingly over the same time period. Beta activity is typically believed to have an inhibitory effect on movement while gamma oscillations are believed to facilitate movement (Pfurtscheller et al., 2003; Jurkiewicz et al., 2006). Activity in both bands returns to normal levels after cessation of movement, although the time course of recovery is very different from that of movement initiation, where the change in neural activity is more abrupt. Similar patterns in neural activity have been found in other recording paradigms: single unit recording, local field, and scalp recordings (Graimann et al., 2003, 2004; Pfurtscheller et al., 2003; Mehring et al., 2004).

One might ask whether it is possible to obtain time-registration paths directly from the spectrograms and to use them to verify the results obtained from kinematics. In theory this is possible, although in practice the neural response is far noisier and thus warping with the spectrogram is unlikely to yield time-registration paths with the same consistency as the kinematics. Nevertheless, we can attempt such an analysis by restricting warping to only the gamma activity, which is relatively clean in our ECoG recordings. Gamma activity was band-passed between 65 and 90 Hz to encompass movement-related gamma activity for all three subjects. We then used gamma to obtain new time-registration paths with which the kinematic signals were warped. An RMS error was calculated between the warped kinematic signal and the target signal. The RMS error was then compared to the error that would be obtained if only onset alignment was used. We obtained *p*-values of 1%, 13%, and 9% for subjects 1–3 respectively, results which trend toward significance despite the noisy nature of the neural signals.

We note that subjects with motor impairments often show greater variability in executing motor tasks. As such, the differences between the movement phases can be less marked. In fact, motor coherency studies in the basal ganglia have demonstrated that subjects with Parkinson's disease show more distinctive changes when subjects are on medication rather than off (Cassidy et al., 2002; Levy et al., 2002; Williams et al., 2002; Androulidakis et al., 2007; Jenkinson and Brown, 2011). One might speculate that the disease state introduces variability that would in turn make temporal alignment more difficult. In more extreme cases, where there are imagined but no actual arm movements, our methods would require modification. For such cases, dynamic time warping can be carried out directly on the spectrogram without the use of kinematics. This is typically what happens for speech where one utterance is warped directly into another utterance (Sakoe and Chiba, 1978; Rabiner and Juang, 1993). Although we did not study covert movements, earlier ECoG and fMRI measurements have demonstrated that task-related cortical activity is similar for imagined movements and actual movements (Porro et al., 1996; Graimann et al., 2004; Leuthardt et al., 2004; Shenoy et al., 2008; Miller et al., 2010). For example, Miller et al. (2010) showed that the ECoG gamma response of the motor cortex for imagined movements is initially 25% lower than that of overt movements. However, over time subjects learned to use imagined movements to control a BCI system, with the induced response eventually exceeding the response of overt movements.

Similarly, Leuthardt et al. (2004) showed that modulations of gamma activity are highly correlated with imagined joystick movements. Because of this correlation, one could attempt to warp gamma activity from one imagined movement to another in the manner shown earlier. However, since neural activity is far noisier than the kinematic trajectory, the warping of neural response will be less consistent. Nevertheless, this approach holds promise for movement classification when no kinematic measurement is available.

The results of this study were obtained through the alignment of the spectrograms for cortical activities. An alternative approach is to carry out the alignment directly on the time-domain signal itself. This is not advisable because warping the signal directly distorts its frequency content. Nevertheless, we show an example of what would happen if such an operation were carried out. Warping was applied directly to the ECoG signal followed by a calculation of its spectrogram. **Figure 6** compares the averaged spectrogram obtained by this new method with the spectrograms obtained from the original (and preferred) method. We note that the results from warping the time-domain signals are less clear and show obvious distortions due to the prolongation of harmonic signals from line noise as seen in **Figures 6B,D,F,H**.

There are a number of ways in which a BMI system can decode user intention. One way is to decode single trials by means of, say, movement onset detection followed by a real-time mapping of neural activity onto kinematic movement (Schalk et al., 2007; Pistohl et al., 2008; Wang et al., 2012; Lew et al., 2014; Xu et al., 2014). A second way is to classify neural activity into a number of distinct states or tasks (e.g., rest vs. reaching vs. grasping) (Chin et al., 2007; Pistohl et al., 2012). Our immediate interest is with the latter type of BMI, which is more limited but still holds important potential applications. Activity is classified using a database of templates, each corresponding to a different task. Our present study is focused on finding better ways to obtain an optimal set of templates through prior time alignment of individual trials before averaging. While we do not directly consider this problem here, it is a simple extension of our methods to allow for classification. This can be done in one of several ways. One way is to remove the reliance on kinematics and to warp one neural spectrogram directly onto another. The score obtained in time-alignment will indicate to which class the movement belongs. One expects that movements drawn from the same class will yield a lower score than warping two movements belonging to different classes (Jeong et al., 2011). A second approach is to develop an optimal series of templates by which any given movement can then be scored by comparing it to a "gold standard." The highest score then defines the class of movement. Obviously many of our ideas are motivated by the classic work done in speech recognition (Rabiner and Juang, 1993). While most modern implementations of automatic speech recognition systems use statistical models like hidden Markov models or neural networks, the basic principles remain the same.
Our choice of dynamic time warping was motivated by the ease of implementation as well as its relevance toward the basic scientific question of finding the underlying neural components of upper limb movements.
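
The first classification route can be illustrated with a short sketch. It is not part of the present study: it assumes that each trial and each template has been reduced to a one-dimensional feature sequence (for example, a gamma-band power time course), that the symmetric step pattern is used for the alignment score, and that the template with the lowest normalized score determines the class. The names `dtw_cost` and `classify` and the template labels are hypothetical.

```python
def dtw_cost(a, b):
    """Normalized DTW score between two 1-D sequences (symmetric step pattern)."""
    inf = float("inf")
    # g has one padding row and column so that the first cell equals 2 * d.
    g = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    g[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d = abs(a[i - 1] - b[j - 1])
            g[i][j] = min(g[i - 1][j] + d,          # vertical step
                          g[i][j - 1] + d,          # horizontal step
                          g[i - 1][j - 1] + 2 * d)  # diagonal step
    return g[-1][-1] / (len(a) + len(b))


def classify(trial, templates):
    """Return the label of the template with the lowest alignment score."""
    return min(templates, key=lambda label: dtw_cost(trial, templates[label]))
```

For instance, with templates `{"rest": [0, 0, 0, 0, 0], "reach": [0, 1, 2, 1, 0]}`, a slower trial such as `[0, 0, 1, 2, 2, 1, 0]` aligns to the reach template at zero cost and is therefore assigned to that class.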

## **REFERENCES**


from electrocorticography. *Neurosurg. Focus* 27:E11. doi: 10.3171/2009.4.FOCUS0990


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 15 March 2014; accepted: 10 December 2014; published online: 13 January 2015.*

*Citation: Talakoub O, Popovic MR, Navaro J, Hamani C, Fonoff ET and Wong W (2015) Temporal alignment of electrocorticographic recordings for upper limb movement. Front. Neurosci. 8:431. doi: 10.3389/fnins.2014.00431*

*This article was submitted to Neuroprosthetics, a section of the journal Frontiers in Neuroscience.*

*Copyright © 2015 Talakoub, Popovic, Navaro, Hamani, Fonoff and Wong. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## A confidence metric for using neurobiological feedback in actor-critic reinforcement learning based brain-machine interfaces

#### *Noeline W. Prins<sup>1</sup>\*, Justin C. Sanchez<sup>1,2,3</sup> and Abhishek Prasad<sup>1</sup>*

*<sup>1</sup> Department of Biomedical Engineering, University of Miami, Coral Gables, FL, USA*

*<sup>2</sup> Department of Neuroscience, University of Miami, Coral Gables, FL, USA*

*<sup>3</sup> Miami Project to Cure Paralysis, University of Miami, Coral Gables, FL, USA*

#### *Edited by:*

*David Guiraud, Institut National de la Recherche en Informatique et Automatique, France*

#### *Reviewed by:*

*Jack Digiovanna, École Polytechnique Fédérale de Lausanne, Switzerland*

*Maureen Clerc, Institut National de la Recherche en Informatique et Automatique, France*

*Kaitlin Elizabeth Cassady, University of Minnesota, USA*

#### *\*Correspondence:*

*Noeline W. Prins, Neuroprosthetics Research Group, Department of Biomedical Engineering, University of Miami, 1251 Memorial Drive, MEA 203, Coral Gables, FL 33146, USA*

*e-mail: n.prins@umiami.edu*

Brain-Machine Interfaces (BMIs) can be used to restore function in people living with paralysis. Current BMIs require extensive calibration that increases set-up times, and external inputs for decoder training that may be difficult to produce in paralyzed individuals. Both these factors have presented challenges in transitioning the technology from research environments to activities of daily living (ADL). For BMIs to be seamlessly used in ADL, these issues should be handled with minimal external input, thus reducing the need for a technician/caregiver to calibrate the system. Reinforcement Learning (RL) based BMIs are a good tool to use when there is no external training signal and can provide an adaptive modality to train BMI decoders. However, RL based BMIs are sensitive to the feedback provided to adapt the BMI. In actor-critic BMIs, this feedback is provided by the critic and the overall system performance is limited by the critic accuracy. In this work, we developed an adaptive BMI that could handle inaccuracies in the critic feedback in an effort to produce more accurate RL based BMIs. We developed a confidence measure, which indicated how appropriate the feedback is for updating the decoding parameters of the actor. The results show that with the new update formulation, the critic accuracy is no longer a limiting factor for the overall performance. We tested and validated the system on three different data sets: synthetic data generated by an Izhikevich neural spiking model, synthetic data with a Gaussian noise distribution, and data collected from a non-human primate engaged in a reaching task. All results indicated that the system with the critic confidence built in always outperformed the system without the critic confidence. Results of this study suggest the potential application of the technique in developing an autonomous BMI that does not need an external signal for training or extensive calibration.

#### **Keywords: brain-machine interface, reinforcement learning, Hebbian, actor-critic, feedback**

## **INTRODUCTION**

In recent years, Brain-Machine Interfaces (BMIs) have been shown to restore movement to people living with paralysis via control of external devices such as computer cursors (Wolpaw and McFarland, 2004; Simeral et al., 2011), robotic arms (Hochberg et al., 2006, 2012; Collinger et al., 2013), or one's own limbs through functional electrical stimulation (FES) (Moritz et al., 2008; Pohlmeyer et al., 2009; Ethier et al., 2012). Studies have shown that BMI control can be affected by several factors such as the type of neural signals used (Wessberg et al., 2000; Mehring et al., 2003; Andersen et al., 2004; Sanchez et al., 2004), long-term stability of the input signals (Santhanam et al., 2006; Flint et al., 2013), type of training signals used for decoders (Miller and Weber, 2011), type of decoders (linear, non-linear, static, adaptive) (Kim et al., 2006; Shenoy et al., 2006; Bashashati et al., 2007; Li et al., 2011), and cortical plasticity that occurs during BMI use (Sanes and Donoghue, 2000; Birbaumer and Cohen, 2007; Daly and Wolpaw, 2008). Other factors include the type of signal used [local field potentials (LFPs), electrocorticograms (ECoG), single or multiunit activity] and the long-term stability of the signals (Schwartz et al., 2006; Chestek et al., 2011; Prasad et al., 2012). Additionally, the performance can also be affected by perturbations such as loss or gain of neurons, noise in the system, electrode failure, and changes in the neuronal firing characteristics (Maynard et al., 1997; Shoham et al., 2005; Patil and Turner, 2008; Pohlmeyer et al., 2014). These factors are dynamic in nature and affect long-term BMI performance. Therefore, there is a need to produce more stable, high performance BMIs that are less affected by these daily changes in the neural input space so that they can be reliably implemented in activities of daily living (ADL).

Traditionally, BMIs utilize a decoder that translates neural signals into executable actions by finding the mapping between the neural activity and output commands. Due to the non-stationarity of the neural data (Snider and Bonds, 1998), many of these decoders need to adapt their parameters in order to find an optimal mapping between the neural control signals and the output motor actions. Commonly used decoders (such as Wiener models and Kalman filters) are trained using supervised learning (SL) techniques that require a training data set and a desired output value, which is usually a real or inferred kinematic signal from a limb (Schalk et al., 2007; Gilja et al., 2012). However, this paradigm poses challenges for paralyzed individuals who may not be able to generate a training kinematic signal in order to create a stable mapping between the motor control signals and the BMI command outputs. Maladaptive cortical reorganization occurring due to non-use of the paralyzed limbs further hinders the reliable extraction of training kinematic signals in such individuals (Elbert and Rockstroh, 2004; Di Pino et al., 2012). Studies have used motor imagery, baseline neural activity, random initialization of the decoder, and ipsilateral limb movements to create training signals that can be used to initialize the BMI decoder and then refine the decoder during the experiment (Pfurtscheller and Neuper, 2001; Bai et al., 2010). All these approaches are based on the SL paradigm, where the presence of an external training signal is critical to achieve optimal BMI control and where time-consuming initial calibration of the BMI decoder (which can range from 10 min to about an hour) is required before each session to adapt to the perturbations in the neural environment.

Unsupervised learning (UL) techniques provide an alternative to SL models as they rely only on the structure of the input data and find patterns within the data itself (Shenoy and Rao, 2005; Rao, 2010; Vidaurre et al., 2011; Gürel and Mehring, 2012). This is particularly useful for BMI applications where the user may not be able to generate reliable kinematic signals and the input signals are affected by the changing dynamics of the neural environment. However, if the input space changes in an unpredictable manner or perturbations are present, unsupervised decoders may not be mapped appropriately to the behavior since they rely on the structure of the training data. For example, *k*-means, an unsupervised clustering method, uses the structure of the training data to define clusters. When the statistics of the data change between training and testing, an optimal solution is not guaranteed (Fisher and Principe, 1996; Snider and Bonds, 1998; Antoni and Randall, 2004). Therefore, in order to address these challenges we have utilized a semi-supervised learning technique based on Reinforcement Learning (RL), which depends on performance outcomes and not on explicit training signals (Sutton and Barto, 1998). In comparison to SL techniques, RL uses instantaneous feedback to modify its parameters but does not require an explicit training signal. Since there is a structure already present (due to its feedback), RL is able to respond to perturbations better than UL. The basic idea of RL is for an "agent" to take actions in an "environment" and receive an instantaneous "reward", with the aim of maximizing the cumulative or long-term reward the "agent" receives. In this case, the "agent" is an intelligent system (e.g., BMI decoder), which selects an action out of many available actions with an aim to maximize the long-term reward. An action, for example, move left or move up, will change the state of the environment (action space) from one state to another.
The "reward" is the evaluation of the action selected depending upon its outcome. A good outcome will lead to a high reward and vice versa.

Theoretical models of learning have been developed for different brain areas which suggest that the cerebellum, the basal ganglia, and the cerebral cortex are specialized for different types of learning (Houk and Wise, 1995). SL, based on an error signal, has been proposed to be handled by the cerebellum, while the cerebral cortex is specialized for UL and the basal ganglia are specialized for RL based on the reward signal (Doya, 2000). We used a particular class of RL known as actor-critic RL in this study, which provides us with a framework to obtain the reward feedback from a different source than that of the action. The "actor" decides which action to choose, while the "critic" gives feedback on the appropriateness of this decision. In other words, the critic criticizes the choice made by the actor. In contrast to SL decoders, RL does not need an explicit training signal. RL also gives a framework for adding more biological realism into the structure of the decoder design. We have shown earlier that actor-critic RL can serve as a framework for using evaluative feedback in neuroprosthetic devices (Mahmoudi and Sanchez, 2011). This framework provides a structure where a user and the agent can co-exist and work toward a common goal. We have also shown how convergence, generalization, accuracy and perturbations take place in a Hebbian RL framework (Mahmoudi et al., 2013) and that adaptation is necessary for maintaining BMI performance following neural perturbations (Pohlmeyer et al., 2014). In these studies, the actor was driven by the motor neural data and the critic feedback was computed by comparing the action taken to the desired action. The drive is to move toward an autonomous BMI which does not need to know the desired action and would not need an external training signal of any kind.
Therefore, to bring biological realism to the design of a fully autonomous BMI system, we have investigated the possibility of using a reward signal from the brain itself to drive the critic (Prins et al., 2013). There are multiple reward areas in the brain from which such information can be extracted, such as the striatum (Phillips, 1984; Wise and Bozarth, 1984; Wise and Rompré, 1989; Schultz et al., 1992, 2000; Tanaka et al., 2004), cingulate (Shima and Tanji, 1998; Bush et al., 2002; Shidara and Richmond, 2002), and orbitofrontal cortices (Rolls, 2000; Schultz et al., 2000; Tremblay and Schultz, 2000). Most notable is the striatum, which is involved in the perception action reward cycle (PARC) (Apicella et al., 1991; Pennartz et al., 1994; Hollerman et al., 1998; Kelley, 2004; Nicola, 2007), the circular flow of information from the environment to sensory and motor structures and back again to the environment during the processing of goal-directed behavior. All adaptive behaviors require the PARC and the control of goal-directed actions relies on the operation of such an information-movement cycle. A critic driven by such a biological source (biological critic) would not only mimic a biological system and add more biological realism, but would also be a step toward an autonomous BMI that does not need a training signal. The challenge, however, is how to incorporate a biological critic into this actor-critic RL framework to maximize the BMI performance. We have found from preliminary analysis that the reward signals and reward representations are diverse, which leads to lower accuracy when they are classified. This matters because the overall performance of the decoder model is limited by the critic accuracy (Pohlmeyer et al., 2014): updating the system with wrong feedback perturbs the temporal sequence of the RL trajectory and can lead to a suboptimal decoding solution.
When the critic feedback is less than perfect, the actor is only able to achieve an accuracy with the critic accuracy as its upper limit (Pohlmeyer et al., 2014). Therefore, there is a need to develop a framework that can handle inaccuracies due to uncertainty in the critic feedback so that a biological critic can be used to drive an autonomous BMI.

In this study, we developed a novel method for decoupling the overall performance from the accuracy of the critic by adding a confidence measure in the critic feedback. Using this method, the system only updates when the critic is accurate. The accuracy can be derived from the distance to the boundary of the decision surface separating rewarding and non-rewarding actions. We performed simulations for this novel method on both synthetic and non-human primate (NHP) data to show that the overall performance can be increased above the critic accuracy to create high performance BMIs. We used a two-choice task to show proof of concept that a system with a built-in confidence measure is able to perform significantly better than a system without the confidence measure. Such a system can be expanded to complex tasks that include a larger number of targets where the critic output is still in the form of two states, similar to the one shown in this study (Mahmoudi et al., 2013). This new method of confidence driven updates is particularly effective when the accuracy of the biological critic is low.

## **METHODS**

#### **HEBBIAN REINFORCEMENT LEARNING**

We used the actor-critic RL paradigm to test our decoder, in which the BMI decoder that decodes the action is embedded within the actor architecture itself. We modified the weight updates according to the Hebbian rule, an approach called Hebbian Reinforcement Learning (HRL) (Pennartz, 1997). RL learns by interaction to map neural data to output actions in order to maximize the cumulative reward. For this, there are two functions: the value and policy functions. The value function provides the reward value and the policy function provides a method of choosing from a variety of available actions. In actor-critic RL, the structure is such that the policy is independent of the value function. The policy is given by the "actor" and the value function is given by the "critic" (Sutton and Barto, 1998). The actor chooses which action to execute out of the many actions possible and the parameters of the actor are changed according to the evaluative feedback given by the critic (**Figure 1A**).

The Hebbian learning rule specifies how much the weights between two neurons must be changed in proportion to their activation (Pennartz, 1997; Bosman et al., 2004). HRL is a class of associative RL where the local presynaptic and postsynaptic activity in the network is correlated with a global reinforcement signal (Gullapalli, 1991; Kaelbling, 1994). **Figure 1B** shows the network structure we are using for our model where the actor is an artificial neural network (ANN) with 3 layers. The input layer receives motor neural data and the output layer gives the value for each action available. Each processing node in the output layer represents one possible action. The policy we are using is the "greedy policy," which says that the action with the highest value is chosen and implemented. Each node in the hidden and output layers is a processing element (PE).

**FIGURE 1 | Architecture of the actor-critic reinforcement learning (RL). (A)** Classical actor-critic RL architecture as adapted for Brain-Machine Interface (BMI). The actor maps the neural commands into actions to control the external device. The actor is driven by the motor neural commands. The critic gives an evaluative feedback about the action taken based on its reward. This evaluative feedback is used to update the weights of the actor. The critic is driven by the neural data from the striatum for an autonomous BMI. **(B)** Actor network structure in the actor-critic RL; fully connected feed forward neural network with binary nodes, with 5 nodes in the hidden layer. The policy function used is the "greedy" policy which selects the node with the highest value at the output layer and channels that action to the environment. The critic gives an evaluative feedback to all nodes in the output and hidden layers. This modulates the synaptic weight updates based on the local pre- and postsynaptic activity.

Each of these PEs applies Equation 2 in its entirety, which is known as the associative reward-penalty algorithm in adaptive control theory (Barto and Anandan, 1985). The input to each PE is *xi* (the firing rate of neuron *i* in a given bin) and the output is *xj*. For the output node *j*, with the transfer function *f*(·), *xj* is given by

$$x_j = \text{sgn}\left[P_j\right] = \text{sgn}\left[f\left(\sum_{i} w_{ij} x_i\right)\right] \tag{1}$$

where *Pj* = *f*(Σ<sub>*i*</sub> *wijxi*). We have used a hyperbolic tangent as the transfer function. The weight update rule for HRL is given by:

$$
\Delta w_{ij} = \mu^{+} r \left(x_j - P_j\right) x_i + \mu^{-} \left(1 - r\right) \left(1 - x_j - P_j\right) x_i \tag{2}
$$

where the reward *r* (−1 ≤ *r* ≤ 1) evaluates the "appropriateness" of the PE's output *xj* due to the input *xi*. *μ*<sup>+</sup> and *μ*<sup>−</sup> represent the learning rates for the reward and penalty components, respectively (Mahmoudi et al., 2013). The first term corresponds to the reward and the second term corresponds to the penalty. There are two special cases of this equation. In the first case, *r* = 1, only the first term contributes and the weight update equation (Equation 2) becomes:

$$
\Delta w_{ij} = \mu^{+} r \left(x_j - P_j\right) x_i \tag{3}
$$

This means that in rewarding trials (*r* = 1), only the positive component contributes to the weight update. But in non-rewarding trials (*r* = −1), both terms contribute and the system is more sensitive to the negative feedback. The second case is when *Pj* approaches *xj*; only the second term then contributes, hence the weight update becomes:

$$
\Delta w_{ij} = \mu^{-} \left(1 - r\right) \left(1 - x_j - P_j\right) x_i \tag{4}
$$

In this case, the system will only adapt for negative feedback. When both of the above conditions are met (*r* = 1 and *Pj* → *xj*), the weights will not update further. During instances where there is no weight update, the system has consolidated the functional relationship between input and output. Unless and until there is negative feedback, the system will not update further.
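Equations 1 and 2 and the greedy policy can be sketched for a single processing element as follows. This is a minimal illustration rather than the authors' implementation: the function names, the learning-rate values, and the convention sgn(0) = +1 are assumptions made for the example.

```python
import math

def pe_forward(weights, inputs):
    """Equation 1 for one processing element: return (P_j, x_j)."""
    p = math.tanh(sum(w * x for w, x in zip(weights, inputs)))
    x_out = 1.0 if p >= 0.0 else -1.0  # sgn; sgn(0) = +1 is an assumption
    return p, x_out

def hrl_delta(x_in, x_out, p, r, mu_plus=0.1, mu_minus=0.05):
    """Equation 2 for one synapse; the learning rates are placeholder values."""
    reward_term = mu_plus * r * (x_out - p)
    penalty_term = mu_minus * (1.0 - r) * (1.0 - x_out - p)
    return (reward_term + penalty_term) * x_in

def greedy_action(output_values):
    """Greedy policy: index of the output node with the highest value."""
    return max(range(len(output_values)), key=lambda j: output_values[j])

weights = [0.5, -0.2]   # hypothetical starting weights
inputs = [1.0, 0.5]     # hypothetical normalized firing rates
p, x_out = pe_forward(weights, inputs)

# Same starting point, updated once with positive and once with negative feedback.
rewarded = [w + hrl_delta(x, x_out, p, r=1) for w, x in zip(weights, inputs)]
penalized = [w + hrl_delta(x, x_out, p, r=-1) for w, x in zip(weights, inputs)]
```

In this example a rewarded update pushes *Pj* toward the output that was produced, while a penalized update pushes it away, and the update vanishes once *r* = 1 and *Pj* = *xj*.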

#### **CRITIC CONFIDENCE**

The decoder in the actor incorporated a confidence measure that indicated the accuracy of the critic. This was motivated by our previous findings that the overall performance of the system was affected by the critic accuracy (Pohlmeyer et al., 2014) and that the accuracy of extracting reward signal from the neural data was less than 90% (Prins et al., 2013). The formulation adds an additional term in the HRL weight update equation (Equation 2), which indicated how much confidence the critic had in the feedback value. We defined this term as the confidence (*ρ*) and hence, the modified HRL weight update equation (Equation 2) becomes:

$$
\Delta w_{ij} = \mu^{+} \rho r \left(x_j - P_j\right) x_i + \mu^{-} \left(1 - \rho r\right) \left(1 - x_j - P_j\right) x_i \tag{5}
$$

where *ρ* is the confidence in the feedback, *r*. Here, the critic determines the appropriateness of the action taken by the actor. The critic gives an output of ±1 (*r* = ±1) indicating if it was an action to be rewarded or penalized. In addition, the critic also gives a value of the confidence (*ρ*) it has on the feedback given. If the confidence is high, the actor is updated but if it is low, the actor is not updated. This is to be determined by the value of *ρ* given by the critic. Depending on the confidence given after each action is taken, the actor weights are updated only when the critic confidence is high. Since noise in feedback data can tend to add uncertainty closer to the decision boundary, more noisy data can result in lower levels of confidence and the actor weights are not updated as frequently. This system however, does not address the problem of mislabeled critic trials (i.e., wrong feedback with high confidence). By not updating (i.e., not changing the weights) when the confidence in critic feedback is low, it provides a mechanism for preventing inaccuracies from entering into the system. The trade-off for this approach is that the number of samples needed to train the system can be more since every sample may not be used if the confidence is low.

In the simulations, we varied the critic accuracy from 50 to 100%. An N% accurate critic means that (100 − N)% of the time it will be incorrect. The actor is blind to N, but for these simulations we provided Boolean confidence information to the actor (*ρ* ∈ {0, 1}). Thus, in these simulations, the actor with confidence does not know how accurate the critic is, but knows exactly when the critic provided accurate feedback. This actor does not adapt at all if the feedback was inaccurate (i.e., *ρ* = 0). In contrast, the standard actor (without confidence) adapts fully to both the accurate and inaccurate feedback.
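The simulated critic described above can be sketched as below. The helper names, the learning rates, and the fixed seed are hypothetical. The sketch skips the update outright when *ρ* = 0, following the statement that the actor does not adapt at all in that case; this gating is an assumed reading for Boolean *ρ* rather than a literal evaluation of Equation 5.

```python
import random

def critic_schedule(n_trials, accuracy, rng):
    """Mark which trials get correct feedback so that exactly
    round(n_trials * accuracy) of them are correct, in random order."""
    n_correct = round(n_trials * accuracy)
    flags = [True] * n_correct + [False] * (n_trials - n_correct)
    rng.shuffle(flags)
    return flags

def critic_feedback(action_correct, feedback_ok):
    """Return (r, rho): r is the +/-1 feedback, rho is 1 only if it is right."""
    true_r = 1 if action_correct else -1
    r = true_r if feedback_ok else -true_r
    rho = 1 if feedback_ok else 0
    return r, rho

def gated_delta(x_in, x_out, p, r, rho, mu_plus=0.1, mu_minus=0.05):
    """Equation 2 for one synapse, applied only when rho == 1 (assumed gating)."""
    if rho == 0:
        return 0.0
    return (mu_plus * r * (x_out - p)
            + mu_minus * (1 - r) * (1 - x_out - p)) * x_in

rng = random.Random(0)                       # fixed seed, for repeatability only
schedule = critic_schedule(100, 0.7, rng)    # a 70% accurate critic, 100 trials
```

The standard actor would call the same update with `rho=1` on every trial, so the two systems differ only on the trials where the critic is wrong.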

#### **GENERATING NEURAL DATA**

We generated synthetic neural data and used it to test the HRL update equation both without (Equation 2) and with confidence (Equation 5) to compare the system performance. The performance in each session was quantified by the number of correct actions for that particular session. For synthetic data, one session was considered as one simulation and each session consisted of 100 trials (actions). We also included additional noise by changing the stimulus (how the synaptic current, *I*, is generated in Equation 6). For each different set of *I*, we generated data, performed the simulations and tested the performance. Finally, we tested the robustness of the model by using neural data from an NHP performing a two-choice reaching task and compared performance. For the NHP data, one simulation consisted of 97 trials collected over 2 consecutive days. The results presented are a mean of 1000 simulations for both synthetic and NHP data.

#### *Generating MI synthetic data for the actor*

The synthetic neural data used to test the model was generated by the standard Izhikevich method (Izhikevich, 2003) where the model was given by

$$v' = 0.04v^2 + 5v + 140 - u + I \tag{6}$$

$$u' = a\left(bv - u\right) \tag{7}$$

with the auxiliary after-spike resetting

$$\text{if } v \ge +30 \text{ mV, then } \begin{cases} v \leftarrow c \\ u \leftarrow u + d \end{cases} \tag{8}$$

Here *v* is the membrane potential of the neuron and *u* is a membrane recovery variable, which accounts for the activation/inactivation of ionic currents and provides negative feedback to *v*. After the spike reaches its apex (+30 mV), the membrane voltage and the recovery variable are reset. The synaptic current is given by the variable *I*, which was calculated from a stimulus of "1" for a spike and "0" at all other times. For excitatory cells, *a* = 0.02, *b* = 0.2, and (*c*, *d*) = (−65, 8) + (15, −6) · *e*<sup>2</sup>, where *e* is a random variable uniformly distributed on [0, 1] (Izhikevich, 2003). We generated two motor states (motor state 1 and motor state 2) using the above model to depict two actions. The neural data was generated in 3 ensembles: one ensemble tuned to each state (the activity of that ensemble correlated with one state) and a third ensemble not tuned to either state, simulating noise in real neural data.
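A single neuron following Equations 6 to 8 can be simulated as in the sketch below. It assumes a 1 ms time step with two half-steps for *v*, as in the reference code published with Izhikevich (2003), and a hypothetical `gain` that converts the 0/1 stimulus into the current *I*; the paper does not state how that conversion was done.

```python
import random

def excitatory_params(rng):
    """Draw (a, b, c, d) for an excitatory cell: (c, d) = (-65, 8) + (15, -6) e^2."""
    e = rng.random()                      # uniform on [0, 1)
    return 0.02, 0.2, -65.0 + 15.0 * e ** 2, 8.0 - 6.0 * e ** 2

def simulate_izhikevich(stimulus, a, b, c, d, gain=10.0):
    """Integrate Equations 6-8 in 1 ms steps; return the spike time indices.

    stimulus: one 0/1 value per millisecond. I = gain * stimulus is an
    assumed mapping, not taken from the paper.
    """
    v, u = c, b * c
    spike_times = []
    for t, s in enumerate(stimulus):
        if v >= 30.0:                     # Equation 8: after-spike reset
            spike_times.append(t)
            v, u = c, u + d
        current = gain * s
        for _ in range(2):                # two 0.5 ms half-steps for stability
            v += 0.5 * (0.04 * v * v + 5.0 * v + 140.0 - u + current)
        u += a * (b * v - u)              # Equation 7
    return spike_times

rng = random.Random(1)                    # fixed seed, for repeatability only
a, b, c, d = excitatory_params(rng)
driven = simulate_izhikevich([1] * 1000, 0.02, 0.2, -65.0, 8.0)
silent = simulate_izhikevich([0] * 1000, 0.02, 0.2, -65.0, 8.0)
```

With a constant stimulus the neuron fires tonically, and with no stimulus it settles to rest without spiking, which gives a quick sanity check before building tuned ensembles from such units.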

#### *Neural perturbations: additional noise in data*

While the synthetic data was generated using a biologically realistic model, there are dynamic factors that contribute to forms of noise not considered in the model. These include neurons dropping out, electrodes deteriorating or breaking, and encapsulation. Without making the model more complicated to mimic the noisy physiological system, we introduced additional noise to the synthetic data by adding a probability component to the stimulus, which generated the *I* in Equation 6. The actual value of noise in the stimulus was drawn from a Gaussian distribution instead of being the "1" or "0" used before. The number of neurons with this additional noise was varied from 0 to 100% in 10% increments. This additional probability component resulted in overlapping classes; the higher the probability component, the more the generated states overlapped. This was verified graphically using the first two principal components, which confirmed that as the probability component used to generate *I* was increased, the overlap of the two classes also increased.
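One possible reading of this perturbation is sketched below: a chosen fraction of neurons has its 0/1 stimulus replaced by Gaussian draws. The function name and the `mean` and `std` values are hypothetical, since the text does not give the parameters of the Gaussian or say exactly how it was combined with the original stimulus.

```python
import random

def add_stimulus_noise(stimuli, noisy_fraction, rng, mean=0.5, std=0.25):
    """Replace the stimulus of a random subset of neurons with Gaussian values.

    stimuli: one list of 0/1 stimulus values per neuron.
    noisy_fraction: share of neurons to perturb (0.0 to 1.0).
    mean, std: assumed Gaussian parameters, not taken from the paper.
    """
    n_noisy = round(noisy_fraction * len(stimuli))
    noisy_ids = set(rng.sample(range(len(stimuli)), n_noisy))
    return [
        [rng.gauss(mean, std) for _ in train] if idx in noisy_ids else list(train)
        for idx, train in enumerate(stimuli)
    ]

rng = random.Random(0)                          # fixed seed, for repeatability only
stimuli = [[0, 1, 0, 1] for _ in range(10)]     # toy stimulus, 10 neurons
noisy = add_stimulus_noise(stimuli, 0.4, rng)   # perturb 40% of the neurons
```

Sweeping `noisy_fraction` from 0.0 to 1.0 in steps of 0.1 would correspond to the 0 to 100% range described above.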

#### *Simulations using NHP data*

To validate our simulation results, a two-choice decision making task was designed and neural signals were acquired while the monkey performed the task. We varied the critic accuracy from 50 to 100% in 10% increments and evaluated the performance. The experiments were conducted with a marmoset monkey (Callithrix jacchus) implanted with a 16-channel microwire array (Tucker Davis Technologies (TDT), Alachua, FL) targeting the hand and arm region in the primary motor cortex (MI). Neural data was acquired at 24,414.06 Hz using a TDT RZ2 system and bandpass filtered at 300–5000 Hz. Thresholds were set manually by the experimenter and 20 multi-unit signals were isolated in real time based on the shape and amplitude of the waveforms. We did not distinguish between single unit and multi-unit activity. All the procedures were consistent with the National Research Council Guide for the Care and Use of Laboratory Animals and were approved by the University of Miami Institutional Animal Care and Use Committee.

The task was a two-choice decision making task where the monkey was trained to move a robot arm to one of two targets to receive a food reward (**Figure 2**). A trial was initiated by the monkey when he placed his hand on a touchpad for a random (700–1200 ms) hold period. The trial onset was an audio cue that corresponded to a robot arm moving upwards from behind an opaque shield and presenting its gripper in front of the animal. The gripper held either a desirable (waxworm or marshmallow, "A" trials) or undesirable (wooden bead, "B" trials) object. Simultaneously, the A (red) or B (green) spatial target LED corresponding to the type of object in the gripper was illuminated. For A trials, the monkey had a 2 s window to reach to a second sensor to move the robot to A, while for B trials, he was required to keep his hand still on the touchpad for 2.5 s and the robot would move to B target. If the robot moved to the target illuminated, for both A and B trials, the monkey received a food reward. If the animal either did not interact with the task or performed the wrong action, these trials were removed from the analysis. The firing rate over a 2 s window following the trial start cue was used as input to the decoder.

## **RESULTS**

We tested the model using 3 different data sets in one-step (classification) mode. The data sets used were: (1) synthetic data generated by an Izhikevich neural spiking model, (2) synthetic data with a Gaussian noise distribution, and (3) data collected from a nonhuman primate engaged in a reaching task. We varied the critic accuracy from 50 to 100% and ran two sets of simulations (S1 and S2) for each of the three data sets: S1 updated the actor at every trial, and S2 updated it only when the critic feedback was correct (i.e., confidence high). This was performed to compare whether it was better to adapt after each trial or only when the critic feedback was correct. For the purpose of these simulations, we used the correct critic feedback to indicate a high confidence of "1" and an incorrect critic feedback to indicate a low confidence of "0." In practice, the confidence would have to be determined empirically from the critic data, which would require an in-depth evaluation that was not the focus of this study. Since the decoder started in a naïve state, we normalized the inputs in pseudo-real time before feeding them to the network. This prevented any bias due to differences in the magnitude of the inputs. This was done by keeping a real-time record of the highest firing rate detected for each input, which was then used to continually update the normalization parameters throughout the session (Pohlmeyer et al., 2014).
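The pseudo-real-time normalization described above can be sketched as a running maximum per input channel. The class name and the choice to scale each rate into [0, 1] by dividing by the running maximum are assumptions; the exact scaling is specified in Pohlmeyer et al. (2014), not here.

```python
class RunningMaxNormalizer:
    """Scale each input by the highest firing rate seen so far on that input."""

    def __init__(self, n_inputs):
        self.max_rate = [0.0] * n_inputs

    def __call__(self, rates):
        # Update the per-channel record first, then normalize with it.
        self.max_rate = [max(m, r) for m, r in zip(self.max_rate, rates)]
        return [r / m if m > 0 else 0.0 for r, m in zip(rates, self.max_rate)]

normalize = RunningMaxNormalizer(n_inputs=2)
first = normalize([10.0, 0.0])    # channel 1 has not fired yet
second = normalize([5.0, 4.0])
third = normalize([20.0, 2.0])    # a new maximum on channel 0
```

Because the record only grows, early trials are scaled against less information than later ones, which is why the scheme is pseudo-real-time rather than a fixed calibration.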

#### **COMPARISON OF ACTOR'S PERFORMANCE WITH AND WITHOUT CONFIDENCE MEASURE**

**Figure 3A** shows how the performance level increased as the critic accuracy increased. The actor which was updated every time is shown in blue. Its performance was always below the 1:1 curve, showing how the actor performance is limited by the critic accuracy. However, the system where the actor was updated only when the critic was confident (shown in red) was able to perform above the critic accuracy level as seen in the figure. The performance increased from 50% (±6.6%) to 70% (±8.8%) at critic accuracy of 50% and further improved from 87% (±10.4%) to 92% (±6.9%) at critic accuracy of 90%. A critic accuracy of 90% means that the critic gave correct feedback in 90% of the trials and wrong feedback in 10% of the trials. For example, in our simulations each consisting of 100 trials, a 70% accurate critic gave correct feedback in 70 trials and wrong feedback in 30 trials. If there was no confidence built in, the actor assumed that the feedback was always correct. In this new system with confidence built in, we reduced the confidence of the wrong feedback to zero. At lower critic accuracies (50, 60, and 70%), the system with the confidence outperformed the system without the confidence by approximately 20%. The performance of the two systems showed a significant difference for all critic accuracy levels from 50 to 90% (Student's paired *t*-test, two-tailed, alpha 0.001; shown with ∗ in the figure). By updating weights accurately, the system learned the optimal mapping and stabilized with time. Given that the system began with random initial conditions, there was no guarantee that the system would stabilize. **Figure 3B** gives a summary of the number of simulations out of 1000 that stabilized after 50 trials and 70 trials with and without the confidence. Convergence or stability was defined as maintaining 100% accuracy (over the last 50 trials or the last 30 trials). The number of simulations that did stabilize at lower critic accuracies was higher for the system with the confidence measure. At higher critic accuracy levels, the overall performance was no longer limited by the critic accuracy but by the data itself. As the critic accuracy increased, the difference in performance between the two systems became smaller and converged to a single value (94 ± 5.8%) since at 100% critic accuracy, both systems effectively have the same update equation.

**FIGURE 2 | The experiment where the monkey controls the robot arm. (A)** A trials associated with a motor high and the left target. Sequence of events (a) monkey triggers trial (b) Robot moves out from opaque screen, target A lights up (c) Monkey makes arm movement (d) Robot moves to target A. **(B)** B trials associated with a motor low and the right target. Sequence of events (a) monkey triggers trial (b) Robot moves out from opaque screen, target B lights up (c) Monkey keeps hand still (d) Robot moves to target B.

**FIGURE 3 | (A)** Performance of the BMI vs. the critic accuracy with and without confidence inbuilt (mean ± standard deviation; one thousand simulations; one hundred trials per simulation). Red: New update rule with confidence. Blue: Previous method with no confidence. Black: 1:1 relationship. Critic accuracy was varied from 50 to 100% with 100% being the best. <sup>∗</sup>Shows the values which showed a statistically significant difference (alpha 0.001). The overall performance of the blue curve is limited by the accuracy of the critic but the overall performance of the red curve is able to go beyond the critic accuracy, decoupling the performance from the critic accuracy. **(B)** Stability of the system without (green/blue) and with (purple/red) confidence. Plot shows the number of simulations that maintained 100% accuracy beyond 50 trials (green/purple) and beyond 70 trials (blue/red).
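The stability criterion used for Figure 3B (100% accuracy maintained beyond trial 50 or beyond trial 70) can be expressed as a short check. The function names and the toy outcome lists are hypothetical and are only meant to show how the counts could be obtained from per-trial outcomes.

```python
def stabilized(trial_correct, after_trial):
    """True if every trial after `after_trial` was decoded correctly."""
    tail = trial_correct[after_trial:]
    return bool(tail) and all(tail)

def count_stabilized(simulations, after_trial):
    """Number of simulations that kept 100% accuracy beyond `after_trial`."""
    return sum(stabilized(sim, after_trial) for sim in simulations)

# Three toy simulations of 100 trials each (True = correct action).
sims = [
    [False] * 40 + [True] * 60,   # perfect from trial 41 onward
    [False] * 60 + [True] * 40,   # perfect from trial 61 onward
    [True] * 99 + [False],        # a single error on the last trial
]
beyond_50 = count_stabilized(sims, 50)
beyond_70 = count_stabilized(sims, 70)
```

Under this definition a single late error disqualifies a simulation, so the count is a strict measure of convergence rather than of average accuracy.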

**Figure 4** shows the details of the action selected in each trial and also the critic values for that particular trial. **Figure 4A** has two sets of simulations S1 and S2 and **Figure 4B** also has two sets of simulations S1 and S2. Each simulation started with random initial conditions. **Figures 4A,B** show two such examples with two different critic accuracy levels. The critic accuracy was changed randomly based on the percentage given to the decoder. In **Figure 4A**, the critic is 60% accurate and the top subplot shows the performance of the system if the actor was updated every time (S1). The overall performance in this case is 47%. The first trial was correct, but the critic gave wrong feedback and the actor weights were updated with this erroneous feedback, causing the second trial to be wrong. When the critic gave correct feedback during the third trial, the system started performing correctly. However, due to the erroneous feedback the performance was not stable. Even when the actor chose the correct action, if the critic provided wrong feedback, it decreased the performance. In contrast, the second subplot shows the performance when the actor was updated with a confidence level (S2). For the same neural data, order of trials and critic feedback, the performance of the second system is 80%. Even though the critic gave wrong feedback at first, the actor learned to ignore this and was able to have a better outcome. **Figure 4B** shows the performance of the two systems when the critic accuracy was 80%. The top subplot shows when there was no confidence measure and the actor updated every time (S1). The bottom subplot shows the actor updating only when the critic was correct (S2). The critic provided a similar output at the beginning. The first system started with appropriate random weights and continued to do well with correct critic feedback at the beginning. However, erroneous critic feedback at trial 3 caused the system to perform wrongly in the next trial. In contrast, the second system started with random weights which caused the first trial to be wrong, but the system received good feedback and was able to perform correctly in the subsequent trials. In the first 5 trials, the first system performed better than the second. However, since the second system's actor weights were only updated when the critic feedback was good, it took longer for the second system to learn the ideal mapping.

#### **NEURAL PERTURBATIONS—ADDITIONAL NOISE IN DATA**

**Figure 5A** shows how the system with the critic confidence level still performed better than the system which updates the actor weights every time even with the additional noise. At lower critic accuracies, the system which updated at every trial performed at chance level (50% performance), while the system with the critic confidence performed better (at critic accuracies 80% and below the difference in the performance was approximately 10%). However, as the critic accuracy increased (beyond 70%), the system accuracy did not increase as expected in both curves (i.e., both systems stayed below the 1:1 curve). This was due to the limitations in the input data as the data to the decoder was noisy and the states were not as clearly separable. As noted in the previous section, the performance of the two systems showed significant difference for all critic accuracy levels from 50 to 90% (Student's paired *t*-Test, with a two-tailed distribution, alpha 0.001—shown with ∗ in the figure). In **Figure 5A**, the probability component used to generate *I* was 40%, which was most similar to the NHP data shown in the next section. **Figure 5B** shows how different noise levels affected the overall performance as the critic accuracy increased. Each colored trace is a different noise level as shown in the legend. With low noise levels, the system was still able to perform amidst the critic inaccuracies. However, as the noise level increased, the system performed at chance (50%) at low critic accuracy levels and performed marginally above chance even at higher critic accuracy levels.

#### **SIMULATIONS USING NHP DATA**

These results are shown in **Figure 6** where the blue trace shows the performance of the actor updating every time and the red trace shows the actor updating only when the critic is confident. Similar to the results of the synthetic data, we can see an improvement (from 50 to 63% at critic accuracy of 50% and from 77 to 83% at critic accuracy of 90%) in the overall performance by adding the confidence measure in the update equation. This is more apparent at lower critic accuracies (at alpha = 0.001, critic accuracies of 50–90% showed a significant difference, shown with ∗ in the figure). At higher critic accuracies, the system which only updates when the critic is confident is still able to do better but the difference in the percentages was smaller. At lower critic accuracies (80% and below) the difference in performance is approximately 13% and at 90% critic accuracy the difference in performance is approximately 7%. Ninety percent critic accuracy means that 9 out of 10 feedback values given by the critic are correct. When the critic feedback was always correct, the two systems converged to approximately the same performance value.

## **DISCUSSION**

In this paper, we demonstrated that adding a confidence level in the feedback to an RL-based decoder can be used to deal with uncertainty in the critic feedback to improve the decoder performance. The introduction of a confidence component in the HRL weight update equation provided guidance on when to update the actor so that the decoder only updated when the feedback was correct with a high confidence. This is important as we seek to utilize biological signals for the critic in order to build autonomous BMIs for use in diverse activities of daily living (ADL) environments. Preliminary work suggested that the accuracy of extracting this reward signal in animal subjects was less than 90% (Prins et al., 2013), thus indicating that some form of confidence metric will ultimately be needed for real BMI use. In this work, the effects of the critic confidence were tested and the results indicated that the system with the confidence level incorporated outperformed the system without the confidence level at all critic accuracies below 100%. This was the case for all 3 different data sets we examined: artificial neural data generated by the Izhikevich method (Izhikevich, 2003), neural data with additional noise, and data recorded from the MI of an NHP. The system was particularly effective at lower critic accuracies (<80%). For NHP data the system with the confidence built in performed approximately 13% better than the system without the confidence measure at critic accuracy levels of 50, 60, and 70%. At critic accuracies of 80 and 90%, the system with the confidence performed 12 and 7% better, respectively, than the system without the confidence. For synthetic data with no additional noise, the system with the confidence performed approximately 20% better than the system without the confidence at lower critic accuracies (50, 60, and 70%). At 80% critic accuracy, the difference in performance was 15% and at 90% critic accuracy, this value was 5%. When the critic accuracy was low, updating only when the confidence was high resulted in the actor receiving less erroneous feedback, thus causing the system to perform better over time. At higher critic accuracies, since the actor gets correct feedback most of the time, the difference between the two systems, though still noteworthy, was small. Both systems converged to the same value when the critic is 100% accurate. As discussed previously, the neural data proposed for the critic input yielded less than perfect accuracies, which made it necessary to find an alternate way to deal with the actor update rule.

#### **NOISY NEURAL DATA**

Noisy neural signals as well as complex neural representation of reward make it a challenging task to classify rewarding vs. non-rewarding information with a high accuracy (Schultz et al., 1997; O'doherty, 2004; Knutson et al., 2005). Building a confidence into the critic feedback improved the performance of the system when the data was contaminated with noise and when the multiple neural representations caused difficulty in extracting the single feedback signal required by the actor-critic decoder. We tested how overlapping classes in the motor data can influence the ability of the decoder to predict the correct action; the more the classes overlap, the lower the decoding accuracy. To add noise to the data, we used a Gaussian distribution in the stimulating current, which resulted in reducing the stimulating current of a certain percentage of neurons in the ensembles that were already tuned. Here, we also showed that with limited noise in the motor data, the system was able to maintain performance. When the motor neural data was noisy, the limiting factor became how well the motor neural data represented the task.

#### **OVERCOMING INHERENT ISSUES WITH RL—TIME FOR CONVERGENCE**

Due to the inherent nature of RL, which learns through interaction, the time taken to reach an optimal condition in the weights can be longer than for supervised decoders (Beggs, 2005). The agent needs to "explore" its environment in order to have a better understanding of how each action changes the state of the environment. Once the agent has learned enough about the environment, it will "exploit" the situation or carry out the optimal action. In RL, there is always a dilemma between exploration and exploitation. Before the agent knows the optimal action and can exploit it, the agent has to take several sub-optimal actions in order to explore the environment. The more exploration that takes place, the better understanding it will have of its environment, but the longer it will take to reach an optimal solution. In the case of BMIs, the agent does not have many trials to explore as each trial comes at a cost. Due to this, the experience of an agent in the BMI setting is very limited. In previous studies, we have used real time "epoching" of the data to speed the initial adaptation from the purely random initialization weights to functionally useful ones as a method of increasing experience with limited data. Another method for overcoming RL limitations is to use a memory of past trials. Here, we used a memory size of 1 trial. For more complicated tasks, a memory size of 70 trials has been found to give the optimum results (Mahmoudi et al., 2013; Pohlmeyer et al., 2014).

difference in the performance becomes smaller as the critic accuracy increases suggesting as before that the critic is no longer the limitation, but the nature of the input data itself.

#### **EXTRACTING OPTIMAL REWARD SIGNAL FOR BIOLOGICAL CRITIC FEEDBACK**

There are several regions of the brain that can be used to extract a reward signal for the critic, which include the striatum (Phillips, 1984; Wise and Bozarth, 1984; Wise and Rompré, 1989; Schultz et al., 1992; Tanaka et al., 2004), cingulate (Shima and Tanji, 1998; Bush et al., 2002; Shidara and Richmond, 2002), and orbitofrontal cortices (Rolls, 2000; Schultz et al., 2000; Tremblay and Schultz, 2000). Whichever region is selected, the critic will need to decode the reward as well as the confidence it has in its decision. One possible method of decoding the confidence is to use the distance to the boundary of a decision surface: the closer a data point is to the decision boundary, the less confidence the critic has in its decision, and the further away the data point is, the more confidence it has. This method assumes that the misclassifications are due to overlapping classes and not due to mislabeled trials. This concept will be further developed in future work.
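The distance-to-boundary idea can be illustrated with a linear decision surface. This is only a sketch of the concept left for future work above: the linear classifier, the function names, the weights, and the `threshold` that turns a distance into a Boolean confidence are all assumptions.

```python
import math

def signed_distance(w, b, x):
    """Signed distance from point x to the hyperplane w . x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm

def critic_output(w, b, x, threshold):
    """Return (r, rho): reward label from the side of the boundary,
    confidence 1 only if the point lies at least `threshold` away from it."""
    dist = signed_distance(w, b, x)
    r = 1 if dist >= 0 else -1
    rho = 1 if abs(dist) >= threshold else 0
    return r, rho

w, b = [3.0, 4.0], 0.0                                    # hypothetical trained boundary
far_reward = critic_output(w, b, [3.0, 4.0], threshold=1.0)
near_reward = critic_output(w, b, [0.1, 0.0], threshold=1.0)
far_penalty = critic_output(w, b, [-3.0, -4.0], threshold=1.0)
```

A graded confidence, for example a saturating function of the distance, would be a natural alternative to the hard threshold, but the simulations in this study used Boolean confidence only.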

In this paper, we developed a new formulation for an actor-critic BMI decoder in order to be able to use biological feedback signals. Since RL does not need an explicit training signal to train the decoder, it can be used to develop next-generation BMIs that self-calibrate in scenarios where the user is paralyzed and cannot generate a kinematic reference or training signal. The actor-critic RL paradigm also gives us the flexibility to develop a fully autonomous BMI provided the critic can be driven by a biological source and thus reduce set up times and the need for calibrations.

## **ACKNOWLEDGMENTS**

This work was supported by DARPA REPAIR Contract #N66001-10-C-2008. The authors would like to thank the University of Miami Division of Veterinary Resources (DVR) for the animal care support. The authors would also like to thank the reviewers for their comments and suggestions.

## **SUPPLEMENTARY MATERIAL**

The Supplementary Material for this article can be found online at: http://www.frontiersin.org/Journal/10.3389/fnins.2014.00111/abstract

## **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 02 February 2014; accepted: 29 April 2014; published online: 26 May 2014. Citation: Prins NW, Sanchez JC and Prasad A (2014) A confidence metric for using neurobiological feedback in actor-critic reinforcement learning based brain-machine interfaces. Front. Neurosci. 8:111. doi: 10.3389/fnins.2014.00111*

*This article was submitted to Neuroprosthetics, a section of the journal Frontiers in Neuroscience.*

*Copyright © 2014 Prins, Sanchez and Prasad. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Reinforcement learning for adaptive threshold control of restorative brain-computer interfaces: a Bayesian simulation

## *Robert Bauer 1,2\* and Alireza Gharabaghi 1,2\**

*<sup>1</sup> Division of Functional and Restorative Neurosurgery and Division of Translational Neurosurgery, Department of Neurosurgery, Eberhard Karls University Tuebingen, Tuebingen, Germany*

*<sup>2</sup> Neuroprosthetics Research Group, Werner Reichardt Centre for Integrative Neuroscience, Eberhard Karls University Tuebingen, Tuebingen, Germany*

#### *Edited by:*

*Mitsuhiro Hayashibe, University of Montpellier, France*

#### *Reviewed by:*

*Kyuwan Choi, ATR Computational Neuroscience Laboratories, Japan*

*Inaki Iturrate, Ecole Polytechnique Fédérale de Lausanne, Switzerland*

*Jaime Ibáñez Pereda, Spanish National Research Council, Spain*

#### *\*Correspondence:*

*Robert Bauer and Alireza Gharabaghi, Division of Functional and Restorative Neurosurgery and Division of Translational Neurosurgery, Department of Neurosurgery, Eberhard Karls University, Otfried-Mueller-Str. 45, 72076 Tuebingen, Germany e-mail: robert.bauer@cin.uni-tuebingen.de; alireza.gharabaghi@uni-tuebingen.de*

Restorative brain-computer interfaces (BCI) are increasingly used to provide feedback of neuronal states in a bid to normalize pathological brain activity and achieve behavioral gains. However, patients and healthy subjects alike often show a large variability, or even inability, of brain self-regulation for BCI control, known as BCI illiteracy. Although current co-adaptive algorithms are powerful for *assistive* BCIs, their inherent class switching clashes with the operant conditioning goal of *restorative* BCIs. Moreover, due to the treatment rationale, the classifier of restorative BCIs usually has a constrained feature space, thus limiting the possibility of classifier adaptation. In this context, we applied a Bayesian model of neurofeedback and reinforcement learning for different threshold selection strategies to study the impact of threshold adaptation of a linear classifier on optimizing restorative BCIs. For each feedback iteration, we first determined the thresholds that result in minimal action entropy and maximal instructional efficiency. We then used the resulting vector for the simulation of continuous threshold adaptation. We could thus show that threshold adaptation can improve reinforcement learning, particularly in cases of BCI illiteracy. Finally, on the basis of information-theory, we provided an explanation for the achieved benefits of adaptive threshold setting.

**Keywords: reinforcement learning, classification accuracy, neurofeedback, functional restoration, neurorehabilitation, brain-computer interface, brain-machine interface, brain-robot interface**

## **INTRODUCTION**

Restorative brain-computer and brain-machine interfaces (BCI/BMI)—emerging rehabilitation technologies for neurofeedback training—seek to reduce disease-specific symptoms in a variety of brain disorders (Wyckoff and Birbaumer, 2014). Unlike classical *assistive* BCIs, whose goal is to replace lost functions by controlling external devices, the main focus of these *restorative* approaches is to provide contingent feedback of specific neuronal states, thereby selectively inducing use-dependent neuroplasticity to normalize pathological brain activity and achieve behavioral gains (Daly and Wolpaw, 2008; Birbaumer et al., 2009). However, affected patients—and even healthy subjects—often show a large variability, or even inability of brain self-regulation, referred to as BCI illiteracy (Vidaurre and Blankertz, 2010). This condition is often related to a low signal-to-noise ratio of the targeted brain activity caused by either physiological (e.g., the depth of the signal source in EEG-based approaches) or pathological (e.g., loss of neural tissue after stroke) mechanisms, or is a result of a misalignment of the mental strategy used by the subject and the brain states targeted by the classifier.

This misalignment may occur when the subject explores different strategies in the course of BCI training, whereas the classifier is usually trained on the first strategy only. Alternative strategies applied by the subject therefore become insufficient. To address these shortcomings, various machine learning techniques and coadaptive algorithms have been proposed. These adjust the brain state targeted by the classifier to the strategy switching of the subject so as to maximize the classification accuracy (Vidaurre et al., 2011; Bryan et al., 2013). Such approaches are powerful for *assistive* BCIs which can, for example, detect the subject's intention to move and to operate external devices. However, in these approaches, the classifier adapts (Vidaurre et al., 2011; Bryan et al., 2013), and so the subject has no incentive to achieve specific brain states. These adaptation approaches therefore clash with the goal of *restorative* BCIs to modify neuronal activity via operant conditioning, i.e., to achieve specific brain states regarded as beneficial for motor recovery.

Due to the treatment rationale of modulating specific brain features, the classifier of restorative BCIs is usually constrained. In the case of motor rehabilitation, for example, the feature space might be restricted to event-related spectral perturbation in the β-range (Gharabaghi et al., 2014). Moreover, event-related desynchronization has been shown to reflect the excitability of the corticospinal system (Takemi et al., 2013). This interaction between a constrained classifier and the subject, who should be rewarded for achieving specific brain states, poses a special challenge for the optimization of neurofeedback in restorative BCI approaches. Thus, classifier adaptation might affect the treatment rationale of the intervention. In this context, threshold adaptation might be an alternative approach for restorative interventions.

However, we have no theoretical or empirical knowledge as to how threshold adaptation during an intervention might affect reinforcement learning. In restorative BCIs, classifiers are often based on linear discriminant analysis (Theodoridis and Koutroumbas, 2009), e.g., automatic feature weighting based on common spatial patterns (Ang et al., 2014) or the visual inspection and selection of spatially weighted frequency bands (Ramos-Murguialday et al., 2013). These linear methods are characterized by threshold selection, i.e., the definition of a specific value on a one-dimensional continuum spanned between the two states that are to be differentiated. Changing this threshold will modify the sensitivity and the specificity of the classifier regardless of the feature weights (Thompson et al., 2013). The selection of this threshold is currently determined by the intent to maximize the classification accuracy (Thomas et al., 2013; Thompson et al., 2013). Furthermore, the magnitude of classification accuracy is usually perceived as the measure to determine the subject's ability to perform the neurofeedback task (Blankertz et al., 2010; Hammer et al., 2012).

Within the framework of communication theory, a high classification accuracy pertains to a good signal-to-noise ratio of the feedback, i.e., it represents sufficient specificity and sensitivity of the feedback (Thompson et al., 2013). Since there is evidence that erroneous feedback affects the reward system (Balconi and Crivelli, 2010), training at the threshold which results in maximum classification accuracy might be considered as the optimal instructional efficacy.

However, to date, no theoretical or empirical work is available on the relationship between instructional efficacy, threshold adaptation and classification accuracy. We therefore present a theoretical framework for adaptive approaches in restorative BCIs. More specifically, we analyzed how classification accuracy is related to instructional efficacy and whether this instructional efficacy can be improved by threshold adaptation. This research question is related to three components: (1) The theoretical framework to model a neurofeedback environment. (2) The simulation of neurofeedback learning. (3) Adequate measures for instructional efficacy.

On the psychological level, neurofeedback training is aptly described as reinforcement learning (Sherlin et al., 2011). Several mathematical algorithms, most of which were developed as machine learning algorithms (Sutton, 1998; Strens, 2000; Szepesvári, 2010) are now available for reinforcement learning. For various reasons, the simulation of reinforcement learning in the present study is based on a Bayesian algorithm (Strens, 2000). There is ample evidence that sensorimotor integration and learning can be appropriately simulated with a Bayesian model (Körding and Wolpert, 2004; Tin and Poon, 2005; Genewein and Braun, 2012). Bayesian reinforcement learning includes an implicit balancing of exploitation and exploration without the need for additional parameters (Strens, 2000). It has also been proposed as an optimal calculus for defining the rational action selection of human agents (Jacobs and Kruschke, 2011). We therefore developed a Bayesian reinforcement learning model for restorative brain-computer interfaces, and explored the predictions of this model for different threshold adaptation strategies and classification accuracies.

## **MATHEMATICAL MODEL OF THE NEUROFEEDBACK ENVIRONMENT**

The basic element of any neurofeedback learning environment is that the subject is in a specific state (s), selects one of two possible actions (a), and is rewarded on the basis of the state (s′) resulting from this action selection. The training action (aT) places the subjects into the training state (sT), which is supposed to be rewarded, and (aF) places the subjects into the false state (sF), which is not supposed to be rewarded.

In any neurofeedback task, the subject can select either the false action (aF) (e.g., rest or insufficient neuromodulation), or the trained action (aT) (i.e., sufficient neuromodulation). In an ideal neurofeedback intervention, the therapist has perfect knowledge about the current state of the subject and can reward accordingly. In a practical neurofeedback intervention, the subject's current state is determined with only limited specificity and sensitivity, resulting in the possibility of reward for both the trained action P(r|aT) and the false action P(r|aF).

In addition, the state space is usually not discrete, but continuous. By including a parameter (δ) for the step size of one action, a continuous state space can be modeled. Assuming that the step size for both actions is equal but that it is taken in different directions, the current state position (σ) in this continuum can be calculated as the number of times the trained action is chosen instead of the false action, i.e., σ = nδ − mδ. The trained action moves the subject one step toward the trained state, whereas the false action moves the subject one step toward the false state (see **Figure 1A**). This enables us to set a threshold (θ) in the state continuum to determine the probability of reward for the trained action P(r|aT) and for the false action P(r|aF).

In any neurofeedback environment, the classification at each threshold will therefore result in particular probabilities for reward, thus leading to the characteristic curve shape (see **Figure 1B**). At each point defined by state (σ) and threshold (θ), the reward rate will adhere to a binomial distribution. The shape across the threshold/state dimension can be adequately modeled by a logistic function (see **Figure 1B**), which is defined by the discriminatory steepness (D) and the relative position, i.e., the distance (Δ) between the two functions.

$$\hat{P}\left(r \mid a\_T; \;\theta, \Delta, \sigma\right) = \frac{1}{1 + e^{D(\theta - \Delta + \sigma)}}$$

$$\hat{P}\left(r \mid a\_F; \;\theta, \Delta, \sigma\right) = \frac{1}{1 + e^{D(\theta + \Delta + \sigma)}}$$

$$\sigma = (n - m)\delta$$

We therefore postulate that any neurofeedback task based on linear discrimination is fully described by the subject's position in a continuous state space σ, i.e., the history of selected actions n and m, the subject's step size δ, the threshold θ set by the instructor, the classifier steepness D and the distance Δ between the reward probabilities with *D, Δ* ∈ ℝ<sub>≥0</sub>, *θ, σ* ∈ ℝ and *n, m* ∈ ℕ<sub>0</sub>. This function returns symmetric curves, with the shape depending on D only, and the location of each curve depending on Δ and δ.

**FIGURE 1 | (A)** Depiction of the state-action element fundamental to any neurofeedback environment on the basis of linear discrimination. At any state (s), the subject selects one of two actions (aF, aT), resulting in a subsequent state step in the opposite direction (aF: false action; aT: trained action). **(B)** Shows the probability of reward for a given action (blue aT and red aF) as a function of the threshold θ. The dot markers indicate the reward probabilities at different thresholds acquired from a real dataset (a right-handed female subject performing a neurofeedback task based on motor imagery-related β-modulation over sensorimotor regions with contingent haptic feedback, identical to the task described elsewhere, Vukelić et al., 2014). The red and blue traces are logistic functions fitted to the raw data.

The parameters σ, θ, Δ, and δ are in arbitrary units and point in the same dimension. We propose that D and Δ are determined by the features selected for the classifier, in particular their signal-to-noise ratio and their relative weight. Regardless of these two parameters, the probability of reward for each action is a result of the threshold θ, which is set by the instructor, and the state position σ, which is the result of the subject's history of selected actions and the ability to switch between states, i.e., the step size δ. In this respect, δ and Δ define the shape of classification accuracy across the θ/σ dimension. On account of this common influence, the classification accuracy has ambiguously been interpreted as indicating not only the classifier performance (Thompson et al., 2013) but also the subject's ability (Blankertz et al., 2010; Vidaurre and Blankertz, 2010). However, Δ is determined by the classifier and δ is determined by the subject. By altering the environmental parameters discrimination D, step size δ and distance Δ, this parametric model enables us to model specific neurofeedback environments. The hatted P indicates that the shape of the reward probability function remains fixed by keeping the discrimination D, the step size δ and the distance Δ constant within the model. It should be noted that, for a fixed environment *P̂*, the distribution of reward for any of the two actions is fully defined by the threshold θ and the state σ.
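As a minimal sketch, the two logistic reward functions of this section can be written in Python as follows. The function and argument names (`reward_prob`, `p_reward_trained`, `p_reward_false`, `offset`) are illustrative assumptions and do not come from the article; the logistic is evaluated in a numerically stable form, which is a choice of this sketch.

```python
import math


def reward_prob(theta, sigma, D, offset):
    """Logistic reward probability 1 / (1 + exp(D * (theta + offset + sigma)))."""
    z = D * (theta + offset + sigma)
    if z >= 0:
        # Equivalent form that avoids overflow for large positive z.
        e = math.exp(-z)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))


def p_reward_trained(theta, sigma, D, Delta):
    # Trained action: curve shifted by -Delta, as in the first equation.
    return reward_prob(theta, sigma, D, -Delta)


def p_reward_false(theta, sigma, D, Delta):
    # False action: curve shifted by +Delta, as in the second equation.
    return reward_prob(theta, sigma, D, +Delta)


if __name__ == "__main__":
    D, Delta, sigma = 1.7, 1.0, 0.0  # example values, chosen for illustration
    for theta in (-2.0, 0.0, 2.0):
        pt = p_reward_trained(theta, sigma, D, Delta)
        pf = p_reward_false(theta, sigma, D, Delta)
        print(f"theta={theta:+.1f}  P(r|aT)={pt:.3f}  P(r|aF)={pf:.3f}")
```

Raising the threshold lowers both reward probabilities, while the trained action always remains the more likely of the two to be rewarded for Δ > 0.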

## **MATHEMATICAL MODEL OF NEUROFEEDBACK LEARNING**

By setting the threshold θ, the instructor may therefore influence the probability distribution of reward for both the trained action P(r|aT) and the false action P(r|aF), even without direct knowledge about P(aT) and P(aF). The subject controls P(aT) and P(aF), although he/she has no direct knowledge about P(r|aT) and P(r|aF). As a rational agent, the subject will attempt to increase P(r), i.e., exploring and exploiting the most rewarding action, on the basis of the knowledge about the reward probability distribution gained from earlier attempts (Ortega and Braun, 2010a). This can be simulated with a Bayesian reinforcement learning model (Strens, 2000). Within this framework, the probability of reward for each action is a binomial distribution that is perceived by the subject as a beta distribution. The beta distribution is a conjugate prior for the binomial distribution. It is a continuous probability distribution on the interval [0,1] and is controlled by the parameters α and β, which allow modeling of the subject's belief P′ about the true reward probabilities P.

$$P'\left(r \mid a\_T\right) \sim Beta\left(\alpha\_T, \beta\_T\right)$$

$$P'\left(r \mid a\_F\right) \sim Beta\left(\alpha\_F, \beta\_F\right)$$

In practical terms, the anticipated rewards rT and rF for each action are determined by the relative values of α and β, while the confidence of the subject that the anticipated value is true is determined by the magnitude of α and β. For the novice subject, the beta distribution parameters for the false and the trained action (αF, αT, βF, βT) are set to 1, and the belief is therefore a uniform distribution.

$$r\_T = \frac{\alpha\_T}{\alpha\_T + \beta\_T}$$

$$r\_F = \frac{\alpha\_F}{\alpha\_F + \beta\_F}$$

Since the instructor has only limited knowledge about the action performed by the subject, i.e., the specificity and the sensitivity of the classifier are not perfect, the magnitude of reward has to be identical for aT and aF, and only their probabilities differ. By way of a practical example: a robotic orthosis extending the hand of a stroke patient contingent with specific brain states would provide the same haptic/proprioceptive feedback regardless of whether the control signal is achieved by motor imagery-related brain modulation (the intended neurofeedback training) or by neck muscle artifacts projecting to the scalp (Gharabaghi et al., 2014). The false and the trained action will thus result in rewards of identical quality, but with different probability. This is important because it allows us to run the simulation without any scaling factor for reward (Ortega and Braun, 2010b). The subject's reward belief is therefore sufficiently represented by the belief about the reward probabilities.
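The roles of the relative values and the magnitude of α and β can be illustrated with a short Python sketch. It uses the standard mean and variance of a beta distribution; taking the variance as a stand-in for the subject's (lack of) confidence is an assumption of this example, and the function names are hypothetical.

```python
def anticipated_reward(alpha, beta):
    """Mean of Beta(alpha, beta): the anticipated reward r = alpha / (alpha + beta)."""
    return alpha / (alpha + beta)


def belief_variance(alpha, beta):
    """Variance of Beta(alpha, beta); it shrinks as alpha + beta grows."""
    total = alpha + beta
    return alpha * beta / (total * total * (total + 1.0))


if __name__ == "__main__":
    # Novice prior, then two beliefs with the same ratio but different magnitude.
    for alpha, beta in ((1, 1), (8, 2), (80, 20)):
        r = anticipated_reward(alpha, beta)
        v = belief_variance(alpha, beta)
        print(f"Beta({alpha},{beta}): anticipated reward={r:.2f}, variance={v:.5f}")
```

Beta(8, 2) and Beta(80, 20) anticipate the same reward rate, but the second belief is far narrower, which corresponds to a more confident subject.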

In each learning iteration, the subject selects an action on the basis of a higher probability of reward than the alternative action. This can be calculated since the subject's confidence that the reward for an action is higher than a certain value x is given by the cumulative Beta distribution function defined by the action parameters α and β.

$$F(\mathbf{x}; \alpha, \beta) = \frac{Beta(\mathbf{x}; \alpha, \beta)}{Beta(\alpha, \beta)}$$

By comparing the relative confidence of both actions, the probability for each action to be selected can be calculated as follows:

$$P\left(a\_{T}\right) = \frac{F\left(r\_{F}; \alpha\_{T}, \beta\_{T}\right)}{F\left(r\_{T}; \alpha\_{F}, \beta\_{F}\right) + F\left(r\_{F}; \alpha\_{T}, \beta\_{T}\right)}$$

$$P\left(a\_{F}\right) = \frac{F\left(r\_{T}; \alpha\_{F}, \beta\_{F}\right)}{F\left(r\_{T}; \alpha\_{F}, \beta\_{F}\right) + F\left(r\_{F}; \alpha\_{T}, \beta\_{T}\right)}$$

In practical terms, if the subject has little confidence that one action is more likely to return a reward than the other action, both actions will be performed with the same probability, i.e., P(aT) equals P(aF). If the subject is very confident that aT is more likely to return a reward than aF, aT will be more probable, whereas, in the limiting case, P(aT) and P(aF) would equal one and zero, respectively. Learning in a neurofeedback environment is therefore modulated by the subject's beliefs and confidence about the probability for reward by each action.
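The action-selection rule can be sketched in Python with the standard library only. Two points are assumptions of this example and not statements of the article: the displayed F is written as the regularized incomplete beta function (the probability mass below x), whereas the sketch uses its complement so that it expresses the confidence that the reward is higher than x, as the text describes; and the incomplete beta function is approximated by midpoint integration, which is adequate for α, β ≥ 1. All function names are hypothetical.

```python
import math


def beta_cdf(x, a, b, steps=2000):
    """Regularized incomplete beta function I_x(a, b) by the midpoint rule (a, b >= 1)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    h = x / steps
    area = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        area += math.exp(log_norm + (a - 1) * math.log(t) + (b - 1) * math.log1p(-t))
    return min(1.0, area * h)


def action_probabilities(alpha_T, beta_T, alpha_F, beta_F):
    """Return (P(aT), P(aF)) from the relative confidence in each action."""
    r_T = alpha_T / (alpha_T + beta_T)
    r_F = alpha_F / (alpha_F + beta_F)
    # Assumption: confidence that an action pays better than the rival's anticipated reward.
    conf_T = 1.0 - beta_cdf(r_F, alpha_T, beta_T)
    conf_F = 1.0 - beta_cdf(r_T, alpha_F, beta_F)
    if conf_T + conf_F == 0.0:
        return 0.5, 0.5
    p_T = conf_T / (conf_T + conf_F)
    return p_T, 1.0 - p_T


if __name__ == "__main__":
    print(action_probabilities(1, 1, 1, 1))  # novice: no preference
    print(action_probabilities(9, 1, 1, 9))  # strong belief in the trained action
```

With the uniform prior both actions are equally likely, and a belief that strongly favors aT drives P(aT) toward one, which matches the two limiting cases described in the text.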

In each learning iteration, the action is selected at random on the basis of the subject's belief and confidence in the reward probability (Thompson, 1933; Ortega and Braun, 2010a). The state position σ is subsequently updated by taking a step of the size δ in the chosen direction (false action n + 1, trained action m + 1). Depending on the threshold θ set by the instructor within the otherwise fixed environment *P̂*, a binomial distribution defines the probability for reward. Sampling from this distribution determines whether the action is rewarded (α + 1) or not (β + 1), and the subject will subsequently adjust his/her belief. Afterwards, the next learning iteration begins. Please note that, in this framework, every iteration has an undefined duration. Later in the discussion section, we will reveal how a learning iteration can be understood in a practical application.
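One sampled learning iteration, as described above, might look as follows in Python. This is a sketch under stated assumptions: the probability of the trained action is passed in as `p_trained` instead of being derived from the belief, the state is held in a plain dictionary with hypothetical keys, and σ is taken as (n − m)δ with n counting false and m counting trained actions, so that the trained action raises the reward probability in the logistic functions.

```python
import math
import random


def reward_prob(theta, sigma, D, offset):
    """Numerically stable 1 / (1 + exp(D * (theta + offset + sigma)))."""
    z = D * (theta + offset + sigma)
    if z >= 0:
        e = math.exp(-z)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))


def learning_iteration(state, p_trained, theta, D, Delta, delta, rng):
    """Sample an action, step the state, sample the reward, update the belief."""
    trained = rng.random() < p_trained
    if trained:
        state["m"] += 1  # trained action: m + 1
    else:
        state["n"] += 1  # false action: n + 1
    sigma = (state["n"] - state["m"]) * delta
    p_reward = reward_prob(theta, sigma, D, -Delta if trained else Delta)
    rewarded = rng.random() < p_reward
    # Rewarded: alpha + 1, not rewarded: beta + 1, for the selected action only.
    key = ("alpha_" if rewarded else "beta_") + ("T" if trained else "F")
    state[key] += 1
    return trained, rewarded


if __name__ == "__main__":
    rng = random.Random(1)  # fixed seed so the run is repeatable
    state = {"n": 0, "m": 0, "alpha_T": 1, "beta_T": 1, "alpha_F": 1, "beta_F": 1}
    for _ in range(5):
        print(learning_iteration(state, 0.5, 0.0, 1.7, 1.0, 0.1, rng), state)
```

Averaging many such sampled runs corresponds to the Monte-Carlo approach; only the belief about the selected action changes in each iteration.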

#### **COMPUTATIONAL APPROACH**

The mathematical model presented here would enable us to estimate the anticipated course of learning for different environments and thresholds by a Monte-Carlo simulation. In this study, we were particularly interested in the anticipated course of learning. Directly increasing the parameters of the Beta distribution by the expectation values for the updates is computationally more efficient than a full computational simulation followed by an averaging across simulations. During each learning iteration, the parameters determining the subject's belief and the state position were therefore updated according to the following formulae:

$$\begin{aligned} \sigma\_{i+1} &= (n\_i - m\_i)\,\delta = \sigma\_i + E\left[P\left(a\_{Ti}\right) - P\left(a\_{Fi}\right)\right]\delta \\ \alpha\_{i+1} &= \alpha\_i + E\left[P\left(a\_i\right)\hat{P}\left(r\mid a, \theta, \Delta, \sigma\_i\right)\right] \\ \beta\_{i+1} &= \beta\_i + \left(1 - E\left[P\left(a\_i\right)\hat{P}\left(r\mid a, \theta, \Delta, \sigma\_i\right)\right]\right) \end{aligned}$$

Between subsequent learning iterations, the probabilities for reward were updated according to the following formulae:

$$\hat{P}\left(r\mid a\_T; \theta, \Delta, \sigma\right) = \frac{1}{1 + e^{D(\theta - \Delta + \sigma)}}$$

$$\hat{P}\left(r\mid a\_F; \theta, \Delta, \sigma\right) = \frac{1}{1 + e^{D(\theta + \Delta + \sigma)}}$$

The subject's probability for action selection is of a dynamical nature, as can be readily recognized from these iteratively updated functions.
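The expectation-based update can be sketched in Python as follows. Several points are assumptions of this example: the α and β increments follow the update formulae as printed; the confidence used for action selection is the complement of the regularized incomplete beta function, approximated by midpoint integration; and the sign of the σ step follows the verbal description (false action n + 1, trained action m + 1, σ = (n − m)δ), so that σ decreases when the trained action dominates. All names are hypothetical.

```python
import math


def beta_cdf(x, a, b, steps=2000):
    """Regularized incomplete beta function I_x(a, b) by the midpoint rule (a, b >= 1)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    h = x / steps
    area = sum(
        math.exp(log_norm + (a - 1) * math.log((k + 0.5) * h)
                 + (b - 1) * math.log1p(-(k + 0.5) * h))
        for k in range(steps)
    )
    return min(1.0, area * h)


def action_probabilities(belief):
    """belief = (alpha_T, beta_T, alpha_F, beta_F); returns (P(aT), P(aF))."""
    a_T, b_T, a_F, b_F = belief
    r_T, r_F = a_T / (a_T + b_T), a_F / (a_F + b_F)
    conf_T = 1.0 - beta_cdf(r_F, a_T, b_T)  # assumption: complement of F
    conf_F = 1.0 - beta_cdf(r_T, a_F, b_F)
    if conf_T + conf_F == 0.0:
        return 0.5, 0.5
    p_T = conf_T / (conf_T + conf_F)
    return p_T, 1.0 - p_T


def reward_prob(theta, sigma, D, offset):
    """Numerically stable 1 / (1 + exp(D * (theta + offset + sigma)))."""
    z = D * (theta + offset + sigma)
    if z >= 0:
        e = math.exp(-z)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))


def expected_update(belief, sigma, theta, D, Delta, delta):
    """One expectation-based iteration; returns (new_belief, new_sigma)."""
    a_T, b_T, a_F, b_F = belief
    p_T, p_F = action_probabilities(belief)
    gain_T = p_T * reward_prob(theta, sigma, D, -Delta)  # expected reward via aT
    gain_F = p_F * reward_prob(theta, sigma, D, +Delta)  # expected reward via aF
    # alpha grows by the expected reward, beta by its complement (as printed).
    new_belief = (a_T + gain_T, b_T + 1.0 - gain_T, a_F + gain_F, b_F + 1.0 - gain_F)
    new_sigma = sigma + (p_F - p_T) * delta  # assumption on the sign, see lead-in
    return new_belief, new_sigma


if __name__ == "__main__":
    belief, sigma = (1.0, 1.0, 1.0, 1.0), 0.0
    for i in range(5):
        belief, sigma = expected_update(belief, sigma, 0.0, 1.7, 1.0, 0.1)
        print(i + 1, round(action_probabilities(belief)[0], 3), round(sigma, 3))
```

Because expectation values replace sampling, a single pass yields the anticipated course of learning without averaging across repeated runs.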

### **MEASURES OF INSTRUCTIONAL EFFICIENCY**

The goal of a neurofeedback intervention is to increase the probability of the trained action. As mentioned earlier, this can be affected only by modulating the belief and confidence of the subject about the reward rates for the trained and the false actions, respectively. If the features and thresholds were not adapted, learning would depend on parameters inherent to the subject only, i.e., step size δ. However, the instructor has the option of either adapting the feature weights (affecting D and Δ directly, and σ indirectly) or changing the threshold θ between iterations whenever the environment is fixed (constant D and Δ) due to a certain treatment rationale. In a restorative BCI environment, threshold adaptation will therefore be used to influence the instructional efficiency of the neurofeedback intervention.

However, to explore the predictions of the simulation, objective measures for the instructional efficiency (IE) of the neurofeedback have to be defined. Since the subject's belief and confidence are dynamical, the most straightforward measure would be to take the probability of the trained action for a given threshold θ at each learning iteration i. This would have the advantage of being directly comparable to the optimal learning outcome, which is *P(aT)* = 1. A further advantage of this approach is that the measure can be translated into entropy with regard to the action selection. This, in turn, can be psychologically interpreted as the subject's uncertainty as to which action is more rewarding. During the course of the training, the subject's uncertainty H should be reduced to zero, and, accordingly, the instructor's goal would also be to reduce the action-entropy to zero. The uncertainty or action entropy H can be calculated as follows:

$$H\_{i, \theta} = -P\left(a\_{T, i}, \theta\right) \log\_2 P\left(a\_{T, i}, \theta\right) - P\left(a\_{F, i}, \theta\right) \log\_2 P\left(a\_{F, i}, \theta\right).$$
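A short Python sketch of the action entropy follows. It uses the conventional Shannon sign, so that H is non-negative and reaches zero when one action is certain, and it treats 0 · log₂ 0 as zero; the function name is an assumption of this example.

```python
import math


def action_entropy(p_trained):
    """Binary entropy (in bits) of the action selection, with P(aF) = 1 - P(aT)."""
    h = 0.0
    for p in (p_trained, 1.0 - p_trained):
        if p > 0.0:  # the limit of p * log2(p) for p -> 0 is 0
            h -= p * math.log2(p)
    return h


if __name__ == "__main__":
    for p in (0.5, 0.9, 1.0):
        print(f"P(aT)={p:.1f}  H={action_entropy(p):.3f} bit")
```

The entropy is maximal (1 bit) when both actions are equally probable and vanishes once the subject always selects the same action.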

However, this measure does not divulge whether the subject actually learned in the course of the training, since he/she could have started already with a high probability for the trained action, e.g., if he/she were familiar with the task. This means that the degree to which a subject's uncertainty is reduced might serve as an alternative dynamical measure. Such a measure should consider that a subject's maximum reduction of uncertainty is the difference between the current level of uncertainty and the maximum level of certainty. In accordance with this logic, Georges (1931) defined instructional efficiency as the ratio of the actual gain to the maximum possible gain which can be formulated as follows:

$$IE\_{i, \theta} = \frac{P\left(a\_{T, i+1}, \theta\right) - P\left(a\_{T, i}, \theta\right)}{1 - P\left(a\_{T, i}, \theta\right)} = \frac{dP\left(a\_{T, i}, \theta\right) / di}{P\left(a\_{F, i}, \theta\right)}$$

Due to the fact that the formula of instructional efficiency IE includes a divisor converging to zero, a singularity will, at some point, occur as lim<sub>P(aF,i,θ)→0</sub> IE<sub>i,θ</sub>. This singularity indicates the transition to zero action entropy, and thus the achievement of the training goal.
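The ratio of actual to maximum possible gain can be sketched in Python as below. Returning infinity when no further gain is possible is a choice made for this example to mark the singularity; the article does not specify how it was handled numerically, and the function name is hypothetical.

```python
import math


def instructional_efficiency(p_now, p_next):
    """Actual gain in P(aT) divided by the maximum possible gain 1 - P(aT)."""
    remaining = 1.0 - p_now  # equals P(aF) at the current iteration
    if remaining <= 0.0:
        return math.inf  # singularity: the training goal has been reached
    return (p_next - p_now) / remaining


if __name__ == "__main__":
    # The same absolute gain counts for more when less room for improvement is left.
    print(instructional_efficiency(0.50, 0.55))
    print(instructional_efficiency(0.90, 0.95))
```

A gain of 0.05 starting from P(aT) = 0.9 closes half of the remaining gap, whereas the same gain starting from 0.5 closes only a tenth of it.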

## **RESEARCH QUESTIONS**

With these methodical discussions in mind, we now can explore the instructional efficiency of different threshold setting procedures.

### **FIRST STUDY**

The most frequently used threshold in BCI applications is the one resulting in maximum classification accuracy (Theodoridis and Koutroumbas, 2009).

$$1. \ \acute{\theta}\_1 = \arg\max\_{\theta} \left( P\left(r \mid a\_T, \theta\right) + P\left(\neg r \mid a\_F, \theta\right) \right)$$

The first research goal was to clarify whether instructional efficiency is optimal at this threshold, or whether alternative thresholds might result in a lower action entropy H or in a better instructional efficiency IE. Furthermore, even if the classification accuracy were maximal for a certain threshold, its magnitude could still vary. A classification accuracy of below 70%, for example, has been proposed as an indicator of BCI-illiteracy (Vidaurre and Blankertz, 2010). Furthermore, accuracies close to chance level and close to perfect classification are of particular interest when seeking to improve restorative BCIs. We therefore simulated different classification accuracies, i.e., 55, 70, and 95%, by using a fixed distance Δ of 1 and setting the discriminatory steepness value D to 0.4, 1.7, or 5.9, respectively. We termed these the illiterate, moderate and expert environments accordingly (see **Figure 2**).
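The threshold of maximum classification accuracy can be located with a simple grid search, sketched here in Python. The sketch assumes that classification accuracy is the mean of P(r|aT) and P(¬r|aF) at σ = 0, i.e., that both actions are equally frequent. Under that assumption, the three D values give roughly 55, 70, and 95% when each curve is offset by half of the distance (±0.5 for a distance of 1), whereas an offset of ±1 gives higher values; the offset is therefore exposed as the parameter `half_distance`, which is a name introduced for this example and not notation from the article.

```python
import math


def reward_prob(theta, sigma, D, offset):
    """Numerically stable 1 / (1 + exp(D * (theta + offset + sigma)))."""
    z = D * (theta + offset + sigma)
    if z >= 0:
        e = math.exp(-z)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))


def classification_accuracy(theta, D, half_distance, sigma=0.0):
    """Mean of P(r | aT) and P(not r | aF), assuming equally frequent actions."""
    hit = reward_prob(theta, sigma, D, -half_distance)
    correct_rejection = 1.0 - reward_prob(theta, sigma, D, +half_distance)
    return 0.5 * (hit + correct_rejection)


def best_fixed_threshold(D, half_distance, thetas):
    """Grid search for the threshold with maximum classification accuracy."""
    return max(thetas, key=lambda th: classification_accuracy(th, D, half_distance))


if __name__ == "__main__":
    thetas = [k / 10 for k in range(-100, 101)]  # -10 ... 10 in steps of 0.1
    for name, D in (("illiterate", 0.4), ("moderate", 1.7), ("expert", 5.9)):
        theta = best_fixed_threshold(D, 0.5, thetas)
        acc = classification_accuracy(theta, D, 0.5)
        print(f"{name:10s} D={D}: theta={theta:+.1f}, accuracy={acc:.3f}")
```

By symmetry of the two logistic curves, the grid search selects θ = 0 at σ = 0 for every steepness D.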

### **SECOND STUDY**

We went on to hypothesize that threshold adaptation, i.e., purposefully changing the threshold between iterations, improves the instructional efficiency (IE) and results in lower action entropy (H). To explore the effect of adaptive threshold-setting, we first determined which thresholds resulted in minimal action entropy and maximal instructional efficiency at each iteration across a range of thresholds. Then, instead of using fixed thresholds, we applied the resulting vector as a reference table for the simulation.

$$\begin{aligned} 1. \quad \overrightarrow{\theta}\_{i,1} &= \arg\min\_{\theta} \left( \overrightarrow{H}\_{i,\theta} \right) \\ 2. \quad \overrightarrow{\theta}\_{i,2} &= \arg\max\_{\theta} \left( \overrightarrow{IE}\_{i,\theta} \right) \end{aligned}$$

In practice this meant that, for every iteration, we determined the threshold with the best instructional efficiency and the threshold with the lowest action entropy, resulting in two vectors of thresholds. We then repeated the simulation. In these adaptive runs, we used the respective threshold vector instead of the fixed threshold.
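Building the two reference vectors amounts to a per-iteration argmin and argmax over the threshold grid, as in the following Python sketch. It assumes that the fixed-threshold runs have already produced two tables, `H[i][j]` and `IE[i][j]`, holding the action entropy and the instructional efficiency at iteration i and threshold `thetas[j]`; these names and the toy numbers in the demo are made up for illustration.

```python
def threshold_vectors(thetas, H, IE):
    """Per-iteration thresholds of minimum entropy and maximum efficiency."""
    columns = range(len(thetas))
    theta_min_entropy = [thetas[min(columns, key=row.__getitem__)] for row in H]
    theta_max_efficiency = [thetas[max(columns, key=row.__getitem__)] for row in IE]
    return theta_min_entropy, theta_max_efficiency


if __name__ == "__main__":
    thetas = [-1.0, 0.0, 1.0]
    # Toy tables with two iterations (rows) and three thresholds (columns).
    H = [[1.00, 0.90, 0.95],
         [0.80, 0.70, 0.60]]
    IE = [[0.00, 0.02, 0.01],
          [0.05, 0.01, 0.00]]
    by_entropy, by_efficiency = threshold_vectors(thetas, H, IE)
    print("min-entropy thresholds:   ", by_entropy)
    print("max-efficiency thresholds:", by_efficiency)
```

In an adaptive run, the threshold applied at iteration i would then be read from the chosen vector instead of being held constant.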

#### **REALIZATION**

All simulations were performed for each research question and environment using 10,000 iterations (i), for thresholds (θ) ranging from −10 to 10 and a step size (δ) of 0.1. The prior belief of the subject was initialized by setting αF, αT, βF, and βT to 1. The computations were realized with custom-written code in Matlab R2014a on a Windows 7 machine. The pseudocode example (**Figure 3**) provides a clearer description of this algorithm.

## **RESULTS**

#### **EXPLORATION OF THRESHOLD SELECTION**

We observed a characteristic beam-like shape of progression toward minimal entropy originating from the threshold of maximum classification accuracy (see black trace in **Figure 4**). In all environments, reduction of entropy first commenced at the threshold of maximum classification accuracy, particularly in environments with higher classification accuracy. Interestingly enough, the range of thresholds that resulted in a reduction of action entropy was narrower for the expert than for the illiterate environment (see **Figures 4A–C**). Later, the transition between high and low entropy was at higher thresholds than at maximum classification accuracy (CA) thresholds. However, once learning commenced, transition to low entropy was more rapid. This was expressed by a highly asymmetric pattern of entropy reduction (see **Figure 4**).

It is also worth mentioning that the thresholds which resulted in minimum action entropy and maximum instructional efficiency were not identical to those for maximum classification accuracy and that they varied during the iterations (see **Figure 4**). The pattern was similar across environments, and was characterized by an early negative and late positive deflection of the action entropy minima (blue trace in **Figure 4**), which occurred earlier and more steeply for the instructional efficiency maxima (red trace in **Figure 4**). The negative deflection peaked between iterations 9 and 10 at a threshold of −1.3 for the illiterate environment, between iterations 5 and 8 at a threshold of −0.3 for the moderate environment, and between iterations 3 and 4 at a threshold of −0.1 for the expert environment. The positive deflection peaked between iterations 319 and 322 at a threshold of 9.1 for the illiterate environment, between iterations 198 and 202 at a threshold of 3.7 for the moderate environment, and between iterations 141 and 155 at a threshold of 1.6 for the expert environment. The magnitude of the deflections was therefore higher for low classification accuracy, whereas transitions were faster for higher classification accuracy.

**FIGURE 3 | Shows in pseudo code the computations performed for the reinforcement learning simulation, with the first study exploring the effect of different fixed thresholds, and the second the effect of threshold adaptation on the basis of the findings from the first study.**

#### **EXPLORATION OF THRESHOLD ADAPTATION**

Threshold adaptation was performed either following the vector of thresholds that resulted in maximum instructional efficiency (see red trace in **Figure 4**) or minimum action entropy (see blue trace in **Figure 4**), and compared to a threshold fixed at maximum classification accuracy. The comparison showed that adaptation based on the instructional efficiency resulted in a phase of comparatively higher action entropy during the training. Subsequently, however, the entropy decreased more rapidly and more steeply, as indicated by a crossing of the trace for adaptation (instructional efficiency) with the trace for fixed threshold (see **Figure 5**). This pattern was most pronounced for the illiterate environment (see **Figure 5A**), and similar in shape, but with lower magnitude for the other environments (see **Figures 5B,C**). Interestingly enough, the final relative entropy was also smaller for the illiterate environment (see **Figure 5A**).

In the illiterate environment, adaptation on the basis of efficiency resulted in higher action entropy, i.e., a less successful performance, between iterations 24 and 931 and in lower action entropy, i.e., a better performance, thereafter. Adaptation based on entropy was less successful than training with a fixed threshold between iterations 3 and 4 and from 48 onwards (see **Figure 5A**).

In the same vein, adaptation based on entropy was not as good in the moderate environment as training with a fixed threshold between iterations 3 and 6 and from 37 onwards, whereas adaptation based on efficiency resulted in a poorer performance at iteration 3 and between iterations 12 and 959 and in a better performance thereafter (see **Figure 5B**). In the expert environment, adaptation based on efficiency resulted in a poorer performance between iterations 3 and 74 and a better performance thereafter, and adaptation based on entropy resulted in a poorer performance between iterations 3 and 15, but in a better performance thereafter (see **Figure 5C**). In summary, efficiency-based adaptation was superior to entropy-based adaptation in all conditions, with an initial decrease and a subsequent increase of performance. The magnitude of improvement increased from the expert to the moderate environment and peaked in the illiterate environment. In the moderate and in the illiterate condition, these improvements commenced later, i.e., at ∼1000 iterations.

## **DISCUSSION**

In this study, we developed a model of neurofeedback and reinforcement learning that allows—on a theoretical level—an evaluation of different threshold selection approaches and their potential to optimize neurofeedback in restorative BCIs. We pursued two research questions:

## **DYNAMIC vs. FIXED THRESHOLD**

The first goal was to investigate whether thresholds other than the threshold resulting in maximum classification accuracy would be reasonable within the context of neurofeedback. We observed that learning occurred earliest at the threshold of maximum classification accuracy. However, the pattern of entropy reduction was asymmetric, and we detected a dynamic pattern of early negative and late positive deflection for the thresholds resulting in maximum instructional efficiency or minimum action entropy (see **Figure 4**). We hypothesize that these two findings (dynamics, asymmetry) indicate that threshold adaptation can be superior to training with any fixed threshold. Furthermore, we ascertained that the magnitude of the deflection is greater for environments with lower classification accuracy. This indicates that the effect of adaptation might be even more pronounced for illiterate than for expert subjects.
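The notion of a "threshold of maximum classification accuracy" can be illustrated with a minimal sketch. It assumes a simplified environment that is not taken from the simulation itself: the feature is normally distributed with unit variance, the two actions are equally likely, their means lie symmetrically around zero, and the classifier rewards every feature value above the threshold. The function names and the three `separation` values are hypothetical and chosen only to mimic low, medium, and high classification accuracy.

```python
import math


def exceed_prob(theta, mean, sd=1.0):
    """P(feature > theta) for a normally distributed feature (assumed model)."""
    return 0.5 * math.erfc((theta - mean) / (sd * math.sqrt(2.0)))


def classification_accuracy(theta, separation, sd=1.0):
    """Accuracy of a threshold classifier for two equally likely actions."""
    true_positive = exceed_prob(theta, +separation / 2.0, sd)
    true_negative = 1.0 - exceed_prob(theta, -separation / 2.0, sd)
    return 0.5 * (true_positive + true_negative)


def best_threshold(separation, thresholds):
    """Threshold on the grid that maximizes classification accuracy."""
    return max(thresholds, key=lambda t: classification_accuracy(t, separation))


# Hypothetical separations standing in for the three environments.
thresholds = [i / 10.0 for i in range(-40, 41)]
for label, separation in [("illiterate", 0.5), ("moderate", 1.5), ("expert", 3.0)]:
    theta = best_threshold(separation, thresholds)
    accuracy = classification_accuracy(theta, separation)
    print(f"{label:10s} best threshold {theta:+.1f}, accuracy {accuracy:.3f}")
```

Under these assumptions the accuracy curve is symmetric around its maximum, which is the property contrasted here with the asymmetric pattern of entropy reduction.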

## **ADAPTATION MIGHT IMPROVE REINFORCEMENT LEARNING**

Our second research goal addressed the question as to whether adaptation can theoretically improve the efficiency of the intervention. To answer this question, we used the threshold vectors resulting in maximum instructional efficiency and minimum action entropy derived from the first study, and applied them dynamically during a second training. For this analysis, we used the time course of action entropy as an outcome measure (see **Figure 5**). We ascertained that threshold adaptation based on action entropy was worse than training with a fixed threshold. By contrast, adaptation for instructional efficiency caused a delayed onset of action entropy reduction, but with a subsequently steeper slope, thus resulting in a stronger and faster overall decrease.

Due to this finding, we consider threshold adaptation as potentially superior to training with a fixed threshold. This effect was especially pronounced for the BCI illiterate condition. We also discovered that the late deflection was strongest in this condition. Since a strong deflection leads to a reduced reward rate, this result indicates that subjects can maintain a low action entropy, even under conditions of reduced reward. This is indicative of successful operant conditioning which is resistant to extinction when reinforcement is lacking. This might be an important asset with regard to the long-term clinical efficacy of restorative BCIs.
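To make the outcome measure concrete, the following sketch computes a time course of action entropy for a simulated subject trained with a fixed threshold. It is not the simulation shown in Figure 3. As a stand-in, it assumes a subject who keeps a Beta belief about the reward probability of each of two actions and chooses actions by posterior sampling (Thompson sampling). The Gaussian reward model, the parameter values, and all function names are likewise assumptions made for illustration.

```python
import math
import random


def reward_prob(theta, mean, sd=1.0):
    """P(reward | action): probability that the feature exceeds the threshold."""
    return 0.5 * math.erfc((theta - mean) / (sd * math.sqrt(2.0)))


def binary_entropy(p):
    """Entropy (in bits) of a choice between two actions."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def simulate_action_entropy(theta, separation=2.0, iterations=200,
                            draws=1000, seed=0):
    """Action entropy per iteration for a posterior-sampling subject."""
    rng = random.Random(seed)
    p_reward = {"trained": reward_prob(theta, +separation / 2.0),
                "false": reward_prob(theta, -separation / 2.0)}
    # Beta(1, 1) prior per action, stored as [rewarded + 1, unrewarded + 1].
    belief = {"trained": [1.0, 1.0], "false": [1.0, 1.0]}
    trace = []
    for _ in range(iterations):
        # Monte Carlo estimate of the probability of choosing the trained action.
        wins = 0
        for _draw in range(draws):
            if rng.betavariate(*belief["trained"]) > rng.betavariate(*belief["false"]):
                wins += 1
        p_trained = wins / draws
        trace.append(binary_entropy(p_trained))
        # The subject acts, the environment rewards, the belief is updated.
        action = "trained" if rng.random() < p_trained else "false"
        if rng.random() < p_reward[action]:
            belief[action][0] += 1.0
        else:
            belief[action][1] += 1.0
    return trace


trace = simulate_action_entropy(theta=0.0)
print(f"action entropy: start {trace[0]:.3f} bit, end {trace[-1]:.3f} bit")
```

In this sketch, a threshold schedule could be explored by passing a different `theta` at each iteration instead of a fixed value. Whether such a schedule reproduces the delayed but steeper entropy reduction reported here would have to be checked against the original simulation.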

## **ASYMMETRIC DIVERGENCE OF REWARD PROBABILITY**

Furthermore, our first study suggests that the effect of adaptation is linked to the transition from negative to positive deflection and to the asymmetry of learning across different thresholds (see **Figure 4**). Such asymmetry might be relevant for a number of reasons. The probability of reward is the information that is essential to the subject if he/she is to learn which action is more rewarding (Ortega and Braun, 2010b). The distance between the reward probability distribution for the trained and the false action therefore constitutes the most important piece of information for the subject with regard to the question as to which action is better. While classification accuracy is symmetric, measures for the distance of two distributions usually are not, as indicated by the Kullback-Leibler divergence that can be calculated as follows:

$$\begin{aligned}
1. \quad KL\left(P\left(r\mid a_T, \theta\right), P\left(r\mid a_F, \theta\right)\right) &= P\left(r\mid a_T, \theta\right) \log_2 \frac{P\left(r\mid a_T, \theta\right)}{P\left(r\mid a_F, \theta\right)}\\
2. \quad KL\left(P\left(r\mid a_F, \theta\right), P\left(r\mid a_T, \theta\right)\right) &= P\left(r\mid a_F, \theta\right) \log_2 \frac{P\left(r\mid a_F, \theta\right)}{P\left(r\mid a_T, \theta\right)}
\end{aligned}$$

This point-wise Kullback-Leibler divergence for each threshold measures the relative informational content of the reward gained by preferring the trained action (see **Figure 6A**) or the reward lost by preferring the false action (see **Figure 6B**). The visualization for different classification accuracies shows that the gain information peaks at positive thresholds (see **Figure 6A**), while the loss information peaks at negative thresholds (see **Figure 6B**). As classification accuracy increases, the divergence becomes stronger and narrower without affecting the peak location. We postulate that these two stable peaks explain not only the asymmetry and the decreased magnitude of deflection but also the narrow learning space for the expert environment (see **Figure 4**). In the same vein, classification accuracy narrows down and assumes a more peaked shape in the expert environment (see **Figure 2**). This indicates that the classification accuracy encompasses a zone in which learning may occur, while the ideal threshold within this zone would have to be selected dynamically in accordance with the subject's current bias. This perspective would tally with the theory that the classification accuracy is the zone of proximal development (Schnotz and Kürschner, 2007; Bauer and Gharabaghi, 2015).
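The two point-wise divergence terms can be evaluated numerically with a short sketch. It rests on an assumed reward model that is not specified in this section: the feature is normally distributed with unit variance, the means of the trained and the false action lie symmetrically around zero, and a reward is given whenever the feature exceeds the threshold. The `separation` value and all function names are hypothetical.

```python
import math


def reward_prob(theta, mean, sd=1.0):
    """P(r | a, theta): probability that the feature exceeds the threshold."""
    return 0.5 * math.erfc((theta - mean) / (sd * math.sqrt(2.0)))


def pointwise_kl(p, q):
    """p * log2(p / q), the point-wise term of equations 1 and 2."""
    return p * math.log2(p / q)


def divergence_profiles(separation, thresholds):
    """Gain (equation 1) and loss (equation 2) terms for each threshold."""
    gain, loss = [], []
    for theta in thresholds:
        p_trained = reward_prob(theta, +separation / 2.0)
        p_false = reward_prob(theta, -separation / 2.0)
        gain.append(pointwise_kl(p_trained, p_false))
        loss.append(pointwise_kl(p_false, p_trained))
    return gain, loss


thresholds = [i / 10.0 for i in range(-40, 41)]
gain, loss = divergence_profiles(2.0, thresholds)

# The gain term is positive; the loss term is negative because the false
# action is rewarded less often, so its strongest value is the minimum.
gain_peak = thresholds[max(range(len(gain)), key=gain.__getitem__)]
loss_peak = thresholds[min(range(len(loss)), key=loss.__getitem__)]
print(f"gain term peaks at threshold {gain_peak:+.1f}")
print(f"loss term peaks at threshold {loss_peak:+.1f}")
```

For the assumed separation, the gain term peaks at a positive threshold and the loss term at a negative one, while classification accuracy in the same model is symmetric around zero. This mirrors the asymmetry described above, although the exact peak locations depend on the assumed distributions.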

#### **LIMITATION TO SIMULATION AND LINEAR CLASSIFICATION**

It should be noted that our study is based on simulated, not empirical, data. Within this limit, our findings suggest that threshold adaptation is capable of increasing the instructional efficacy of a restorative BCI, and that it might improve learning particularly for conditions with low classification accuracy. However, this threshold adaptation is specifically applicable to linear classification approaches. Classification algorithms which are non-linear or which classify in multiple dimensions (Theodoridis and Koutroumbas, 2009) might well show different behavior. Additionally, reinforcement learning might be of less importance for assistive or communication BCIs. In these approaches, the performance of the classifier will probably remain the most important design factor (Thompson et al., 2013). We therefore propose the hypothesis that threshold adaptation is particularly suitable for approaches dealing with linear classification in the constrained feature space of neurofeedback training and restorative BCIs (Vidaurre et al., 2011; Bryan et al., 2013).

### **FUTURE APPLICATIONS AND VALIDATION**

The simulation applied in this study is based on the theory of reinforcement learning, meaning that the subject continually updates his/her beliefs about the most rewarding action. Learning iterations are an essential aspect of this conceptual framework. But how do these learning iterations translate into the practical world of neurofeedback training and restorative BCI?

We argue that the duration of a single iteration is not an *absolute* measure such as, for example, one feedback trial or 1 iteration/min of training. Instead, we suggest that it be considered as a *relative* measure of information processing that is performed by the subject in a given training environment. This being the case, every iteration is based on the processing of one unit of reward, while the instructional efficiency of one iteration serves as a measure for the efficiency of one bit of reward to reduce entropy, i.e., to change the belief of the subject toward the training goal (Ortega and Braun, 2010b). Accordingly, the duration of a single iteration may be considered as the time required to communicate one bit of information to the subject and for the information to be processed by the subject. It therefore stands to reason that the bit-rate of restorative BCIs may differ in the same way as that of assistive/communication BCIs (Thompson et al., 2013). In this context, both quantitative and qualitative influences might affect the bit-rate. Longer interventions might be more effective as they transfer a larger amount of information, resulting in a dosage effect. Moreover, some feedback modalities, such as visual or haptic/proprioceptive feedback, might be more informative than others (Gomez-Rodriguez et al., 2011; Parker et al., 2011). Furthermore, the rate at which information could be processed might be determined by specific traits of the subject, e.g., psychological traits such as cognitive resources (Schnotz and Kürschner, 2007) or physiological and anatomical traits such as the parietofrontal network (Buch et al., 2012; Vukelić et al., 2014). In this respect, both physiological and pathological aspects might limit the capacity of a communication channel. In healthy subjects, for example, the extraneous load caused by distractions or feedback overload from multiple senses might impair information processing (Clark, 2006). In pathological conditions, e.g., following a stroke, patients with impaired afferent pathways (Szameitat et al., 2012) might benefit less from proprioceptive feedback than stroke survivors without this impairment. Furthermore, technological limits, such as the time-resolution of the classifier or the inherent signal-to-noise ratio, may also limit the maximum attainable rate (Sanei, 2007).

On a more positive note, according to our theory, limitations in one domain might be compensated by achievements in another. Such additional measures to increase the learning rate might include the coupling of the neurofeedback training with brain stimulation (Lefebvre et al., 2012; Gharabaghi et al., 2014), the monitoring of cognitive resources and engagement based on physiological measures (Smith et al., 2001; Novak et al., 2010; Koenig et al., 2011; Grosse-Wentrup and Schölkopf, 2012), and/or patient screening for treatment eligibility (Stinear et al., 2012; Bauer et al., 2014).

The model presented here might serve as a theoretical basis to integrate this abundance of research into the framework of Bayesian reinforcement learning. Further research will be required to confirm our predictions. Most importantly, however, these findings serve to stimulate empirical studies to seek alternatives to the "maximum classification accuracy" paradigm and to explore threshold adaptation as a tool for increasing the instructional efficiency of restorative BCIs.

## **ACKNOWLEDGMENTS**

RB was supported by the Graduate Training Centre of Neuroscience, International Max Planck Research School, Tuebingen, Germany. AG was supported by grants from the German Research Council [DFG GH 94/2-1, DFG EC 307], and the Federal Ministry for Education and Research [BFNT 01GQ0761, BMBF 16SV3783, BMBF 03160064B, BMBF V4UKF014].

## **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 25 July 2014; accepted: 24 January 2015; published online: 12 February 2015. Citation: Bauer R and Gharabaghi A (2015) Reinforcement learning for adaptive threshold control of restorative brain-computer interfaces: a Bayesian simulation. Front. Neurosci. 9:36. doi: 10.3389/fnins.2015.00036*

*This article was submitted to Neuroprosthetics, a section of the journal Frontiers in Neuroscience.*

*Copyright © 2015 Bauer and Gharabaghi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*
