Edited by: Markus Diesmann, RIKEN Brain Science Institute, Japan
Reviewed by: Michael Hines, Yale University, USA; Abigail Morrison, Bernstein Center Freiburg, Germany
*Correspondence: James Kozloski, Computational Biology Center, IBM T. J. Watson Research Center, 1101 Kitchawan Road, Room 05-144, Yorktown Heights, NY, USA. e-mail:
This is an open-access article subject to a non-exclusive license between the authors and Frontiers Media SA, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and other Frontiers conditions are complied with.
Neural tissue simulation extends requirements and constraints of previous neuronal and neural circuit simulation methods, creating a tissue coordinate system. We have developed a novel tissue volume decomposition and a hybrid branched cable equation solver. The decomposition divides the simulation into regular tissue blocks and distributes them on a parallel multithreaded machine. The solver computes neurons that have been divided arbitrarily across blocks. We demonstrate thread, strong, and weak scaling of our approach on a machine with more than 4000 nodes and up to four threads per node. Scaling synapses to physiological numbers had little effect on performance, since our decomposition approach generates synapses that are almost always computed locally. The largest simulation included in our scaling results comprised 1 million neurons, 1 billion compartments, and 10 billion conductance-based synapses and gap junctions. We discuss the implications of our ultrascalable Neural Tissue Simulator and use our results to estimate the requirements for a simulation at the scale of a human brain.
Techniques to simulate the electrophysiology of neurons have progressed steadily since the middle of the last century, from single compartment models of Hodgkin and Huxley (
With each technique now established in the field, a next step in extending simulations of the nervous system is to impose constraints derived specifically from neural tissue (Markram,
We define neural tissue simulation first to include multi-compartment Hodgkin–Huxley models of neurons derived from anatomical reconstructions of real neurons. Second, simulations must support synaptic coupling between compartments and attempt to match synaptic distributions from real tissue. Finally, neural tissue simulations must meet the following additional requirements, which distinguish them from other neural circuit simulations:
Every model in the simulation is embedded within the three-dimensional coordinate system of a neural tissue;
Coordinates for all models are available during initialization and simulation;
Model dependencies, communication, and calculation are functions of these coordinates.
These requirements affect not only how a neural tissue simulation is initialized and calculated, but also its possible outcomes and the types of scientific questions it can be used to answer. We therefore review essential constraints derived from these requirements together with their associated simulation techniques and expected effects on a simulation's outcome and scientific value.
Structural constraints on neural tissue simulation guide the arrangement and coupling of compartments, channels, and synapses to compose neurons and neural tissue. These constraints permit only certain compositions of structural elements and thus aim to create a three-dimensional replica of a real neural tissue.
Consider first a neuron model's branch topology. Branches constrain the geometry and number of compartments coupled to create a compartmental neuron model. Compartmental models of neurons simulate the currents that flow within and across a neuron's membrane in order to calculate the voltage of each compartment. The distance between branch points in a compartmental model directly impacts the model's simulated electrophysiology, and thus its signaling properties in a circuit (Krichmar et al.,
The placement of channel models within the topology of a neuron contributes to the topology's impact on simulation outcomes (Schutter and Bower,
Beyond topological constraints, neural tissue simulations impose geometrical constraints on neurons, branches, and compartments, derived from measurements of real neuron reconstructions. These constraints first assign three-dimensional tissue coordinates to each compartment and branch point. Coordinates do not change a neuron's simulated physiology directly, however, since the solution to the Hodgkin–Huxley model for a cable depends not on the precise spatial locations of its compartments, but only on their size and coupling. Instead, geometrical constraints affect a neuron's physiology when other models (such as other neuron branches) are coupled to it (such as through synapses) by some function of its tissue coordinates (such as a proximity measure).
For example, in neural circuit models, a synapse is generated by first identifying a specific pair of compartments to which the pre- and post-synaptic components of the synapse are coupled. Similar to branch junctions and channels, the precise distance along branches where synapse models occur may affect the physiology and signaling properties of certain neurons (Ascoli and Atkeson,
In addition to compartment, channel, and synapse placement, other relationships between neuron models can be derived from neural tissue coordinates. For example, when every simulated membrane conductance is associated with a coordinate in the tissue, the ability to calculate extracellular field potentials Traub et al. (
Modeling the functional properties of neural tissue involves simulating tissue dynamics at many scales, from the electrodynamics of individual cell membranes, to emergent neuron, circuit, and whole tissue phenomena. Functional constraints, such as the types and parameters of ion channel, axonal compartment, and synapse models used, each have significant effects on what results a simulation can achieve.
To capture the varied electrical properties of the membrane of a single neuron (Achard and Schutter,
Channel models often exhibit highly non-linear relationships between conductance changes and their dependencies (for example, the voltage-dependent conductance of the fast sodium channel model). In practice, this makes neuron models’ simulated physiology susceptible to small changes in the parameters of their ion channel models (e.g., time constants, peak conductances). These susceptibilities are pronounced in dendrites (Segev and London,
Functional constraints on models of presynaptic axonal compartments vary by approach. Most neural circuit and neural tissue simulations do not model presynaptic compartments explicitly but instead assume axons are independent of the electrical integration properties of the neuron and never fail to transmit an action potential (or “spike”) generated at the soma. Greater functional constraints can be imposed on neural tissue simulations by solving the Hodgkin–Huxley equations for compartments representing the complete or partial axonal arbor. These constraints then allow certain phenomena to emerge that could influence a neural tissue simulation. First, failures of action potential propagation can occur at certain points along an axon, introducing uncertainty surrounding the signaling role of action potentials transmitted through otherwise reliable axons (Mathy et al.,
Functional constraints on synapse models also vary by approach and are closely related to the chosen model of the presynaptic compartments. When presynaptic voltages are calculated, kinetic models of presynaptic voltage-dependent neurotransmitter release (Destexhe et al.,
Certain simulation packages allow users to perform neural tissue simulations, including GENESIS (Bower and Beeman,
Briefly, these applications initially calculated large networks of many neurons by solving each neuron wholly on a single computational node of a parallel machine, then communicating spikes logically over the machine's network to nodes where other neurons are likewise wholly solved. Communication between nodes in these applications, as in the Blue Brain Project, models the
Because the propagation of spikes along the axon is not modeled physiologically, as in a compartmental solution of the axonal arbor's voltages, logical spike passing requires parameterizing each synapse with an estimate of the interval between spike initiation at the soma and arrival at the synapse. The technique makes use of computational event buffers for integrating synaptic inputs in the order in which presynaptic neurons spiked. These buffers receive spike times from presynaptic neurons, sent at simulation intervals longer than the numerical integration time step, and corresponding to the presumed minimum spike delay in the network (Morrison et al.,
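The buffered delivery scheme described above can be sketched as follows. This is an illustrative reconstruction in Python, not code from any of the cited simulators; the per-synapse delays are assumed parameters standing in for the estimated soma-to-synapse intervals.

```python
import heapq

class SpikeBuffer:
    """Per-target event buffer for logical spike passing.

    Spikes arrive as (initiation_time, synapse_id) pairs; each synapse
    carries a fixed delay estimating soma-to-synapse conduction time.
    Events are popped in delivery-time order once per integration step.
    """
    def __init__(self, delays):
        self.delays = delays   # synapse_id -> delay (ms), assumed values
        self.events = []       # min-heap of (delivery_time, synapse_id)

    def receive(self, spike_time, synapse_id):
        # Delivery time = spike initiation at the soma + presumed delay.
        heapq.heappush(self.events,
                       (spike_time + self.delays[synapse_id], synapse_id))

    def due(self, t):
        # Return synapses whose buffered spikes arrive at or before t.
        out = []
        while self.events and self.events[0][0] <= t:
            out.append(heapq.heappop(self.events)[1])
        return out
```

Because delays differ per synapse, spikes sent in one communication interval can still be integrated in the correct temporal order at each target.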
The assumptions about axons made by logical spike passing (i.e., fully reliable action potential transmission and complete isolation from dendritic and somatic integration) are known to be false for axons exhibiting certain functional constraints, as discussed above (Schmitz et al.,
In addition to integrating inputs across chemical synapses, neural tissue simulation requires the ability to simulate currents across electrical synapses created by gap junctions. The numerical approach employed in previous neural tissue simulation applications to solve the dendritic compartments’ voltages requires that electrical synapses between compartments be solved differently from chemical synapses and separately from the integration of the neuronal arbor, since they affect coupled compartments instantaneously. Because the numerics of electrical synapses are not very stiff, however, computing gap junctional currents once per time step (which ignores the off-diagonal Jacobian contribution) is sufficient for numerical stability using fixed-point iteration methods (Hines and Carnevale,
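To illustrate why this per-time-step treatment suffices, consider a minimal sketch (with made-up parameters, not the simulator's own model) of two leaky compartments coupled by a gap junction, where the junctional current is computed from the previous step's voltages rather than solved implicitly:

```python
import numpy as np

def step_gap_junction(v, g_leak, e_leak, g_gap, c_m, dt):
    """One explicit Euler step for two leaky compartments coupled by a
    gap junction. The junctional current uses the previous step's
    voltages, i.e., the off-diagonal Jacobian terms are ignored, as in
    the per-time-step treatment described above. All parameters are
    illustrative."""
    i_gap = g_gap * (v[1] - v[0])            # current into compartment 0
    dv0 = (g_leak * (e_leak - v[0]) + i_gap) / c_m
    dv1 = (g_leak * (e_leak - v[1]) - i_gap) / c_m
    return np.array([v[0] + dt * dv0, v[1] + dt * dv1])

# Two compartments starting far apart: the gap junction pulls their
# voltages together quickly, while the leak relaxes both toward e_leak.
v = np.array([-80.0, -40.0])
for _ in range(1000):
    v = step_gap_junction(v, g_leak=0.1, e_leak=-65.0, g_gap=0.5,
                          c_m=1.0, dt=0.01)
```

The voltage difference decays stably even though the coupling is treated explicitly, consistent with the non-stiff character of gap junctional currents.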
One important measure of a parallel computational approach is scalability, and we therefore considered these solutions’ scalability and efficiency. Strong scaling refers to a solution's ability to solve problems of constant size faster with more processors. Weak scaling describes how a solution's runtime varies with the number of processors for a fixed problem size per processor. Speedup is the ratio of the runtime of a sequential algorithm to that of the parallel algorithm run on some number of processors. A solution that exhibits ideal strong scaling has a speedup that is equal to the number of processors, whereas one exhibiting ideal weak scaling has a runtime independent of the number of processors. Efficiency is a measure of a solution's ability to use processors well, and is defined as a solution's speedup relative to the ideal speedup. Solutions that exhibit excellent scaling are those with efficiencies close to 1.
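These definitions can be stated concretely. The timings below are invented purely for illustration:

```python
def speedup(t_serial, t_parallel):
    """Speedup: serial runtime divided by parallel runtime."""
    return t_serial / t_parallel

def efficiency(t_serial, t_parallel, p):
    """Efficiency: achieved speedup relative to the ideal speedup p.
    Values near 1 indicate excellent strong scaling."""
    return speedup(t_serial, t_parallel) / p

# Hypothetical timings: a 100 s serial run completes in 1.6 s
# on 64 processors.
s = speedup(100.0, 1.6)          # 62.5
e = efficiency(100.0, 1.6, 64)   # ~0.98, close to ideal
```

Under ideal weak scaling, by contrast, `t_parallel` would stay constant as the problem size and `p` grow together.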
The problem of balancing the load of computation between processors on a parallel machine has also motivated techniques for splitting compartmental models of neurons across multiple processors in certain simulators (Hines et al.,
To provide for efficient communication between parallel calculations, an algorithm must exploit knowledge of the physical network that carries messages between nodes in the parallel machine. Ideally a simulation will minimize the number of nodes a message must traverse to arrive at its destination (Almasi et al.,
The implementation of communication between computational nodes has limited the long-term scalability of certain approaches. Both point-to-point strategies (Goddard and Hood,
A more reasonable choice when this inevitability arises is the MPI collective
We have exploited the structural constraints of neural tissue simulation to create new methods that decompose a neural tissue into volumes, then map each volume and the models it contains directly to a computational node of a parallel machine to create the ultrascalable Neural Tissue Simulator. Our approach computes the voltage in every axonal compartment, allowing spikes to propagate between nodes of a parallel machine over the shortest possible paths, ultimately arriving at synapses based on the dynamics of our physiological model of the whole axon. This novel approach to communication in the tissue is ultimately scalable, since no long-range communication in the machine's communication network is required. Furthermore, the approach allows a broader range of scientific questions to be addressed in more accurate models of the tissue, for example by obviating the need to estimate conduction times between somata and synapses or to restrict gap junctions to only certain regions of the neuron.
The simulations we employed to test the Neural Tissue Simulator included several physiological models, coupled according to the structural and functional constraints of neural tissue simulation. Our simulations deviated from biological accuracy (for example, our cortical columns were stacked along the radial axis of the tissue) whenever necessary to provide a better test of the simulator (for example, to examine scaling in all three dimensions). The simulator supports all requirements of neural tissue simulation and therefore can be used to create biologically validated simulations of neural tissue.
To create the simulator, we employed a model graph simulation infrastructure (Figure
Models include declared types and computational phases, executed in parallel across multiple threads referencing shared memory, multiple processes referencing distributed memory, or both. Each model is implemented once during initialization on a single computational node. It then connects to other models throughout the simulation. When a model attempts to connect to a model implemented on another node, it is noted as a sending model to that node. When a model attempts to receive a connection from a model on another node, a model proxy is created, and the connection is made from the model proxy to the model. Models sharing a computational phase have no data dependencies and reference other models or model proxies through MDL-declared model interfaces. Interprocess communication occurs only on declared phase boundaries, when data changed by models within that phase is marshaled, communicated to other processes, and demarshaled into model proxies (Figure
To create the extensible Neural Tissue Simulator, we defined a core set of models in MDL (available as Supplementary Material). This list can be extended using MDL and the existing application to include any number of additional tissue components from neuroscience. The core models included:
A model of a neuron branch, parameterized with the number of branch compartments, and solved implicitly at different Gaussian forward elimination and back-substitution computational phases.
Two models of a junction between neuron branches (including somata), one solved explicitly at different predictor and corrector computational phases preceding and following Gaussian elimination in branches, and the other solved implicitly as part of its proximal branch.
Fast sodium and delayed rectifier potassium channel models, based on the original Hodgkin–Huxley model. These channel models were coupled to all axon and soma compartments and solved in a single phase prior to all branches and junctions.
Conductance-based AMPA and GABAA synapse models (Destexhe et al.,
A gap junction model comprising two connexons, for electrically coupling compartments from different neurons through a fixed resistance and solved together with chemical synapses and channels.
A “functor” in our simulation infrastructure is defined in MDL and expresses how the simulator instantiates, parameterizes, and connects specific models based on arguments passed to it in GSL. All functors are therefore executed during simulation initialization and iterate over specified sets of models. We designed a Neural Tissue Functor as a key component of the Neural Tissue Simulator (MDL definition available as Supplementary Material). Its arguments include a file containing a neural tissue structural specification and parameter files targeting channels and synapses to specific components of the tissue (examples available as Supplementary Material). The structural specification comprises neuron identifiers (structural layer, structural type, and physiological type) and coordinates that embed the neuron in the three-dimensional coordinate system of the tissue.
We derived the current tissue specification from cortical tissue. Unlike cortex, which scales in two dimensions, our tissue specification scaled in three, with the sole purpose of studying the performance of the simulator and evaluating the novel numerical methods reported here for accuracy and stability. The specification included the following structural patterns:
Minicolumns, comprising 20 reconstructed neurons from
Columns, comprising 20 × 20 minicolumns, spaced at 25 μm intervals within the two dimensions of the cortical sheet, whose component neurons’ axons and dendrites extended hundreds of micrometers beyond the boundaries of the minicolumn or column.
Tissue blocks, comprising
Neuron | Segments | Branches | Compartments | Compartments per branch | Length (×
---|---|---|---|---|---
C050896A-P3 | 1794 | 187 | 833 | 4.5 | 20
C261296A-P1 | 3308 | 437 | 1400 | 3.2 | 20
C261296A-P3 | 2955 | 376 | 1154 | 3.1 | 20
C261296A-P2 | 3652 | 302 | 1171 | 3.9 | 30
C040896A-P3 | 2910 | 241 | 1074 | 4.5 | 30
C120398A-P3 | 2140 | 135 | 644 | 4.8 | 40
C010600A2 | 3120 | 303 | 1473 | 4.9 | 30
C050800E2 | 2965 | 178 | 861 | 4.8 | 40
C200897C-I1 | 3863 | 399 | 1495 | 3.7 | 30
C120398A-P2 | 1490 | 80 | 437 | 5.5 | 40
C010600B1 | 9436 | 720 | 2868 | 4.0 | 40
C120398A-P1 | 1400 | 61 | 273 | 4.5 | 40
C190898A-P2 | 1873 | 122 | 540 | 4.4 | 30
C190898A-P3 | 2268 | 146 | 699 | 4.8 | 20
C250500A-I4 | 3097 | 191 | 1375 | 7.2 | 20
C180298A-P2 | 2863 | 186 | 1326 | 7.1 | 20
C040600A2 | 3871 | 301 | 1401 | 4.7 | 40
C240797B-P3 | 1407 | 110 | 468 | 4.3 | 30
C050600B1 | 4626 | 539 | 1987 | 3.7 | 30
C280199C-P1 | 2390 | 149 | 592 | 4.0 | 40
Mean | 3071.4 | 258.2 | 1103.6 | 4.6 | 30.5
SD | 1744.0 | 167.9 | 611.9 | 1.1 | 8.3
Finally, the simulation graphs we declared using GSL (example available as Supplementary Material) express the tissue as a set of contiguous, regularly shaped volumes, each mapped to a coordinate in a three-dimensional grid. This mapping provides a means to automatically decompose the tissue into network partitions that are easily mapped onto the computational nodes and communication network topology of a supercomputer such as Blue Gene (e.g., one volume per grid coordinate per Blue Gene node). In addition, these volumes allow access to tissue components by grid coordinates. For example, recording and stimulation electrode models declared directly in GSL (i.e.,
The model initialization methods we employ build on existing computational techniques. First, we utilize a touch detection technique developed in our lab in collaboration with the Blue Brain Project (Kozloski et al.,
Touch detection requires performing a costly calculation of distance between two line segments up to
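The segment-to-segment distance at the heart of touch detection can be computed with the standard closest-points parametrization. The sketch below follows that textbook derivation and is not necessarily the simulator's own implementation; the `is_touch` criterion (sum of radii plus an optional slack) mirrors the relaxable criterion distance described below.

```python
import numpy as np

def segment_distance(p0, p1, q0, q1, eps=1e-12):
    """Minimum distance between 3D segments p0-p1 and q0-q1, via the
    standard closest-points parametrization with clamping of both
    parameters to their segment endpoints."""
    u, v, w = p1 - p0, q1 - q0, p0 - q0
    a, b, c = u @ u, u @ v, v @ v
    d, e = u @ w, v @ w
    denom = a * c - b * b          # zero for parallel/degenerate segments
    s = 0.0 if denom < eps else float(np.clip((b * e - c * d) / denom, 0.0, 1.0))
    # Re-solve each parameter for the other's clamped value.
    t = 0.0 if c < eps else float(np.clip((b * s + e) / c, 0.0, 1.0))
    s = 0.0 if a < eps else float(np.clip((b * t - d) / a, 0.0, 1.0))
    return float(np.linalg.norm((p0 + s * u) - (q0 + t * v)))

def is_touch(p0, p1, rp, q0, q1, rq, slack=0.0):
    """Touch criterion: segment axes closer than the sum of the two
    radii, optionally relaxed by a slack distance (e.g., to model the
    reach of dendritic spines)."""
    return segment_distance(p0, p1, q0, q1) <= rp + rq + slack
```

Pairing every local segment against every other makes the quadratic cost of this kernel, and hence the value of restricting it to segments within a shared volume, apparent.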
Tissue volumes are right rectangular prisms created by slicing a neural tissue multiple times in each of its three dimensions. The number of volumes created in this way equals the number of computational nodes of the machine. Slicing planes are chosen to accomplish histogram equalization of branch segments across slices in each of the three dimensions of slicing (Kozloski et al.,
During work partitioning, each node of Blue Gene loads some number of unique neuron reconstructions from the structural specification, then all nodes in parallel calculate which volumes in the tissue are intersected by each of the neurons’ branch segments. Because each machine node is assigned a single tissue volume to aggregate segments for touch detection, partitioning determines which nodes receive each branch segment during redistribution of the tissue data. Whenever a segment traverses a slice plane, all nodes assigned the volumes it intersects aggregate it for touch detection. This ensures that every touch will be detected at least once. We call this method of distributing data according to neural tissue volumes a “tissue volume decomposition.”
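The assignment of segments to volumes can be sketched as follows, assuming the per-dimension slicing planes are given. The conservative grid-range cover here stands in for the simulator's exact intersection test; any segment crossing a slice plane is reported to every volume in its span, so each touch is detected at least once.

```python
from bisect import bisect_right

def volume_index(point, planes):
    """Map a 3D point to the (i, j, k) grid coordinate of its tissue
    volume. `planes` holds, per dimension, the sorted interior slicing
    coordinates produced by histogram equalization."""
    return tuple(bisect_right(planes[d], point[d]) for d in range(3))

def volumes_for_segment(a, b, planes):
    """Volumes a segment may intersect: every volume in the axis-aligned
    grid range spanned by its endpoints (a conservative cover)."""
    lo = volume_index([min(a[d], b[d]) for d in range(3)], planes)
    hi = volume_index([max(a[d], b[d]) for d in range(3)], planes)
    return [(i, j, k)
            for i in range(lo[0], hi[0] + 1)
            for j in range(lo[1], hi[1] + 1)
            for k in range(lo[2], hi[2] + 1)]
```

Each returned grid coordinate corresponds to one machine node, so this cover directly determines where a segment is sent during redistribution.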
The touch detection algorithm proceeds within each volume and on each node in parallel, creating a unique set of touches across the distributed memory of the machine, then writes touch data to disk or redistributes it for use by the Neural Tissue Functor to initialize synapses. The parallel algorithm can detect billions of touches per hour (for our largest calculation, 25.5 billion touches in 2.5 h on 4,096 nodes of Blue Gene/P (Sosa,
Geometric constraints on touch detection may be relaxed by increasing the minimum distance between branch segments required for a touch to occur. Typically this criterion distance is defined as the sum of the two segments’ radii, but our application allows it to be increased arbitrarily. Increasing the touch criterion distance allows for the identification of more compartment pairs for possible selection and synapse creation, and thus the possibility of constructing a greater diversity of neural circuits from a single tissue by sampling a larger distribution of touches. For certain distances and certain branch types, the approach is also a reasonable model of the constraints on neural circuit development within a tissue, since dendritic spines have been identified as a critical mechanism for neurons to increase the distance over which synapses may form (Chklovskii,
An additional approach to modifying touch distributions and thus synapse creation and potential circuit configuration involves the simulation of neural tissue development. In real neural tissue, concentration and electrical gradients, and fields generated by surrounding neurons are capable of deforming the trajectory of a growing branch in predictable and stereotyped ways. They accomplish this by their interaction with sensing and motility components packed into the specialized tip of a growing branch, known as the growth cone (Hong and Nishiyama,
We created a modeling abstraction of these mechanisms and implemented a neural tissue growth simulator (Kozloski,
The neurons in the neural tissue simulations we used to test the Neural Tissue Simulator (Table
Unlike previous approaches, neurons are divided at points within branches where a single branch intersects a slicing plane and is divided into two branches (i.e., at “cut points”). A single neuron is therefore most often mapped to multiple machine nodes in our data decomposition. The neural tissue is first divided into volumes by the Neural Tissue Functor and each volume assigned to a machine node. Slicing planes are chosen to accomplish a weighted histogram equalization of compartments across slices in each of the three dimensions of slicing. Weights for each compartment are calculated based on a sum of weights of its associated channel and branch models. These weights were determined experimentally and reflect the expected computational load of each model. In this way, we approximate an optimally balanced distribution of work for solving branches on the computational nodes of Blue Gene. Each compartment is then assigned to the volume containing the compartment's proximal endpoint (except for branch points, which are assigned to the same volume as their proximal compartment) and initialized on the machine node assigned that volume (Figure
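Along one slicing dimension, the weighted histogram equalization might look like the sketch below, where `weights` stands in for the experimentally determined per-compartment cost estimates; this is an illustration of the idea, not the Functor's exact algorithm.

```python
import numpy as np

def slicing_planes(coords, weights, n_slices):
    """Choose slice-plane positions along one dimension so that each of
    the n_slices slabs holds roughly equal total compartment weight
    (weighted histogram equalization)."""
    order = np.argsort(coords)
    x = np.asarray(coords, float)[order]
    w = np.asarray(weights, float)[order]
    cum = np.cumsum(w)
    # Interior plane positions where cumulative weight crosses each
    # equal fraction of the total.
    targets = cum[-1] * np.arange(1, n_slices) / n_slices
    idx = np.searchsorted(cum, targets)
    return x[np.minimum(idx, len(x) - 1)]
```

With uniform weights this reduces to plain histogram equalization; heavier compartments pull the planes toward themselves, narrowing their slabs.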
For a parallel architecture, link bandwidth typically measures the number of bytes per second that can be communicated from one machine node to another through any number of wires that connect the pair directly. Despite some overhead involved in sending data in packets, link bandwidth is a good estimate of the effective bandwidth between adjacent nodes on the machine's communication network (Al-Fares et al.,
Because not all nodes are adjacent on today's large parallel machines, a network topology determines the number of links that separate nodes. For Blue Gene/P, this topology is a three-dimensional torus, which attempts to maximize the bisection bandwidth of the machine, while ensuring that a large number of adjacent nodes (nearest neighbors) exist for any node on the network. Bisection bandwidth is the bandwidth across the minimum number of links that divide the machine into two partitions with equal numbers of nodes (Al-Fares et al.,
The Neural Tissue Simulator exploits nearest neighbor communication in the massively parallel architecture of Blue Gene to create fixed communication costs that are wholly dependent on the local structure of neural tissue (Figure
Because communication across synapses by simulation elements imparts almost no additional machine network communication cost, the marginal cost of adding a synapse to a simulation is simply the constant cost of computing the synapse's state and current (Figure
Thus, in our tissue volume decomposition, nearly all communication in a simulation occurs between adjacent nodes of Blue Gene/P and traverses only one link. Communication efficiency in our simulation is therefore determined by link bandwidth, not bisection bandwidth. For this reason, the amount of data communicated over links remains constant as the size of the simulation grows proportionally with the size of the machine (i.e., its number of nodes). In contrast, poor scaling almost always results from network congestion due to irregular communication patterns that require longer communication paths and/or greater message lengths as the simulation and machine size grow, though at the scale of present neural tissue simulations, this does not appear to be the case for other approaches (Hines et al.,
Our tissue volume decomposition is able to exploit Blue Gene's innovative toroidal network topology because it provides a natural mapping from nearest neighbor volume to volume communication in a neural tissue to nearest neighbor node to node communication on the machine's network (Figure
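The nearest-neighbor mapping can be illustrated with a small sketch of a node's neighbors on a 3D torus; the grid shape and coordinates are hypothetical.

```python
def torus_neighbors(coord, dims):
    """The six nearest neighbors of a node at grid coordinate `coord`
    on a 3D torus of shape `dims` (coordinates wrap, as on Blue
    Gene/P's toroidal network). When adjacent tissue volumes are mapped
    to these nodes, volume-to-volume traffic traverses a single link."""
    x, y, z = coord
    X, Y, Z = dims
    return [((x + 1) % X, y, z), ((x - 1) % X, y, z),
            (x, (y + 1) % Y, z), (x, (y - 1) % Y, z),
            (x, y, (z + 1) % Z), (x, y, (z - 1) % Z)]
```

Because a cut branch only ever spans the boundary between two adjacent volumes, its continuation always lies on one of these six single-link neighbors.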
Despite the constraints and advantages of nearest neighbor communication, the communication engine we devised for the Neural Tissue Simulator supports communication across all possible paths in the network and uses only the
Initialization of the simulation has three stages: tissue development, touch detection, and physiological initialization (each performed by the Neural Tissue Functor). We showed previously that both tissue development and touch detection scale with the number of branch segment interactions (i.e., touches, forces, etc.) calculated within a volume (Kozloski et al.,
Physiological initialization requires iterating through branch segments and touches, and identifying which segments require connections to local instantiations of the neural tissue models, and which must connect to model proxies (Figures
Our model simulation methods build on existing numerical approaches for solving Hodgkin–Huxley type mathematical models of branched neurons, as described in Supplementary Information. Here we report our numerical approach to solving these equations, which can be considered a tunable hybrid of Hines’ method (Hines,
The “Gold Standard” for numerically solving Hodgkin–Huxley neuron models is the Hines algorithm (Hines,
These observations, combined with a fully implicit, nearest neighbor numerical method (Figure
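At the level of a single unbranched cable, the implicit step in these methods reduces to a tridiagonal solve by forward elimination and back-substitution (the Thomas algorithm); Hines' ordering generalizes this to branched trees, whose matrices remain nearly tridiagonal. A minimal sketch:

```python
import numpy as np

def thomas_solve(lower, diag, upper, rhs):
    """Thomas algorithm: O(n) forward elimination and back-substitution
    for a tridiagonal system, the core of an implicit solve of an
    unbranched cable. `lower` and `upper` have length n-1."""
    n = len(diag)
    d = np.array(diag, float)
    r = np.array(rhs, float)
    for i in range(1, n):                  # forward elimination
        m = lower[i - 1] / d[i - 1]
        d[i] -= m * upper[i - 1]
        r[i] -= m * r[i - 1]
    x = np.empty(n)
    x[-1] = r[-1] / d[-1]
    for i in range(n - 2, -1, -1):         # back-substitution
        x[i] = (r[i] - upper[i] * x[i + 1]) / d[i]
    return x
```

In a branched neuron, branch points introduce a few off-tridiagonal entries; numbering compartments from the leaves toward the soma lets elimination proceed branch by branch while preserving this O(n) cost.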
In contrast to Hines’ approach to solving the neuron's membrane potential with an implicit numerical scheme and a single linear system for the entire neuron, Rempe and Chopp (
It is important to note that Rempe and Chopp (
The sizable approximations made in constructing these models, combined with the stability properties of Rempe and Chopp's method, permit a reasonable choice between the accuracy found in Hines’ method and the speed increase that Rempe and Chopp's method allows via parallelization and spatial adaptation. Our method implements this choice in a manner tunable by a single parameter.
We developed a hybrid method that trades off the accuracy and stability afforded by Hines’ fully implicit numerical method (Hines,
Now consider a fully implicit parallel solver that replaces all explicit junctions between branches with implicit junctions (Figure
The advantage of this approach is that all branches of the same branch order can be solved fully implicitly and in parallel, allowing a tissue volume decomposition to assign branches to machine nodes based on volume, where volume boundaries introduce additional implicit or explicit junctions at cut points. The disadvantage is that the proliferation of computational phases as neurons grow in complexity (approaching the physiological range of 10–10² branch orders) requires a synchronization point in the parallel algorithm wherever communication between contiguous branches of different orders occurs, each of which imposes a cost on performance of the parallel simulation.
An alternative approach incorporates Rempe and Chopp's explicit junctions at select points in the neuron's branched arbor (Figure
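One way to picture the tunable trade-off is the following simplified sketch, in which a single `max_order` parameter caps the compute order and marks where explicit (predictor-corrector) junctions decouple a branch from its parent. This illustrates the idea only; it is not the simulator's actual assignment logic.

```python
def assign_compute_orders(parent, max_order):
    """Assign each branch a compute order capped at max_order. A branch
    whose proximal-to-distal order would exceed the cap is instead
    joined to its parent by an explicit junction and solved in the same
    implicit phase as order-max_order branches, limiting the number of
    synchronization points. `parent[i]` is branch i's parent index, or
    -1 at the soma; parents are assumed to precede children."""
    n = len(parent)
    order = [0] * n
    explicit = [False] * n
    for i in range(n):
        if parent[i] >= 0:
            o = order[parent[i]] + 1
            if o > max_order:
                explicit[i] = True    # decouple with an explicit junction
                o = max_order
            order[i] = o
    return order, explicit
```

With `max_order = 0` every junction is explicit (Rempe and Chopp's limit); with a cap above the deepest branch order every junction is implicit (the Hines limit).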
For a test simulation of 128,000 neurons, 42 million branches, and 135 million compartments on 512 nodes of Blue Gene/P, we varied the maximum compute order from 0 to 7 (Figure
Our test simulations used a time step of 10 μs, which compares favorably to that used in previous work when compartment size is taken into account. The largest published simulation to date, for example, utilized a time step of 25 μs in simulating approximately 300 electrical compartments in 300 branches per neuron, and did not explicitly model the axon (King et al.,
Here we provide a step-by-step sequence for the workflow of simulation initialization and execution, complete with required data inputs, calculations, and outputs. Note that all steps following model code generation are executed by a single executable run on Blue Gene/P.
The MDL parser compiles MDL (see
A configure script initiates C++ code compilation, targeting the executable for a specific platform (Blue Gene/P for this study).
The
The GSL parser reads GSL (see
A volume mapper assigns each volume to a node of Blue Gene/P, ensuring that tissue volume and node coordinates preserve nearest neighbor communication patterns on the hardware.
The Tissue Functor reads the tissue specification file and computes a globally consistent scheme for dividing neuron-related initialization work. The functor loads different neurons from
The Tissue Functor performs a resampling algorithm, transforming neurons in memory by creating points spaced at regular multiples of their containing fibers’ diameters.
Communication between all nodes of the machine enumerates all points in the tissue, constructing a global point histogram, which is then equalized in three dimensions to create a volume slicing.
Neuron segments are communicated to all volumes they traverse, according to the volume slicing scheme.
The Tissue Functor executes a neuron growth algorithm (optional), sequentially adding each segment of each neuron while subjecting each segment to specified forces from segments already in the tissue.
The Tissue Functor executes a touch detection algorithm, and computes touches between all neuron segments in each volume. Touches are detected using a parameterized probability, which may save memory in extremely dense tissues. Globally consistent random number generation allows touch detection to remain consistent across all nodes of Blue Gene/P.
The Tissue Functor aggregates computational costs associated with each neuron segment based on cost estimates for compartments, channels, and synapses, provided by a simulation designer in a
With this scheme, the Tissue Functor communicates touches and neuron segments to nodes of Blue Gene/P responsible for implementing models or model proxies that depend on these data (for example synapses, each of which depends on one touch and two segments, and compartments, each of which depends on a segment). A simulation designer provides parameterized mappings in
The GSL parser creates all tissue models, including branches, channels, and synapses. A simulation designer parameterizes in a
The GSL parser initializes other models such as stimulation and recording electrodes, connecting each to specified models within the tissue. In addition, specified simulation triggers (for example, to control data collection and simulation termination) are initialized.
The GSL parser interprets the specified phase structure of the simulation and maps different models’ phases to declared simulation phases.
The GSL parser initiates the simulation, in which each iteration comprises a sequence of phases that: solve ion channel and synapse states, predict branch junction states, forward eliminate branches of appropriate compute orders, back-substitute branches of appropriate compute orders, correct branch junction states, evaluate all triggers, and collect data if necessary. Between each phase, model state modified during a phase and required by other models is marshaled, communicated by
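The iteration structure described above can be outlined as a sequence of named phases with a communication step between them. The skeleton below uses assumed names, not the original API:

```cpp
#include <cassert>
#include <functional>
#include <vector>

// A simulation phase: a name plus the work it performs on local model state.
struct Phase {
    const char* name;
    std::function<void()> run;
};

// Run one iteration: execute each phase, then marshal and exchange any state
// modified during that phase and required by other models.
void runIteration(std::vector<Phase>& phases,
                  const std::function<void()>& exchangeState) {
    for (Phase& p : phases) {
        p.run();          // e.g., solve channels/synapses, predict junctions,
                          // forward eliminate, back substitute, correct, ...
        exchangeState();  // communicate state required by dependent models
    }
}
```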
The simulation terminates when the end criterion is satisfied. All models and simulation data are destructed on all nodes of Blue Gene/P.
All neuromorphological data presented derive from the publicly accessible database found at
Here we report the results of scaling the Neural Tissue Simulator using several methods in order to evaluate its performance over a variety of simulation and machine sizes and configurations.
The Blue Gene/P runtime system allows applications to run in symmetric multiprocessor ("SMP") mode, wherein each of its 4-core nodes is equivalent to an SMP machine (Sosa,
We observed thread scaling in our test simulations of 128,000 neurons, 40 million branches, and 135 million compartments on 512 nodes of Blue Gene/P (Figure
Strong scaling refers to the ability of an application to perform a fixed-size calculation faster on machines of larger sizes. We observed strong scaling of the Neural Tissue Simulator when we performed a simulation of 16,000 neurons on Blue Gene/P machine sizes ranging from 64 to 4,096 nodes (Figure
To express performance results in units more relevant to neural tissue simulation, we normalized the simulation times and expressed them in terms of the amount of time each processor of any machine must compute in order for the entire machine to compute a given amount of physiological time for a single neuron. We measured this time at 1.0 processor seconds per neuron millisecond when there is an average of 250 neurons per node, and 1.8 for an average of 4 neurons per node. Thus, while strong scaling could likely continue to produce shorter overall simulation times on machines larger than those we used, the normalized measurements indicate that the benefit of larger machines for problems of this size will likely diminish.
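The normalization just described can be written out explicitly. A minimal sketch, assuming the metric is wall-clock seconds times processors, divided by neurons times simulated milliseconds (the function name is ours):

```cpp
#include <cassert>
#include <cmath>

// Processor-seconds of compute per simulated neuron-millisecond: lower is
// better, and a flat value across machine sizes indicates ideal scaling.
double procSecondsPerNeuronMs(double wallSeconds, double processors,
                              double neurons, double simulatedMs) {
    return wallSeconds * processors / (neurons * simulatedMs);
}
```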
Weak scaling refers to the ability of an application to perform calculations of varying sizes in the same amount of time, provided that as the calculation size grows, the machine size grows proportionally. We observed excellent weak scaling of the Neural Tissue Simulator when we performed simulations of different sizes while holding the average number of neurons per node constant at 250 on Blue Gene/P machine sizes ranging from 64 to 4,096 nodes (Figure
| | Total | Per node (±SD) |
|---|---|---|
| Nodes | 4,096 | N/A |
| Threads | 16,384 | 4 |
| Neurons | 1,024,000 | ∼250 |
| Branches | 344,474,059 | 84,100 ± 7,406 |
| Junctions | 208,947,659 | 51,012 ± 4,026 |
| Compartments | 1,083,289,600 | 264,475 ± 7,582 |
| Na channels | 330,613,914 | 80,716 ± 7,440 |
| KDR channels | 330,613,914 | 80,716 ± 7,440 |
| AMPA synapses | 8,186,972,360 | 1,998,772 ± 720,155 |
| GABAA synapses | 2,255,068,948 | 550,553 ± 169,064 |
| Connexons | 7,626,124 | 1,861 ± 820 |
The simulations we performed, while intended to test the performance and scaling of a parallel application for calculating the Hodgkin–Huxley equations for branched neurons using a novel tissue volume decomposition, have also provided physiological results worth noting. Specifically, by recording simultaneously from many basket cell interneurons in the upper layers of our tissue simulations, we observed action potentials that originated in the axons and then back-propagated into somata, where they failed to regenerate full-scale action potentials (Figure
Furthermore, while previous neural tissue simulations have employed fewer synaptic inputs than we show here, the Neural Tissue Simulator clearly achieved physiological scales of synaptic input (∼10,000 synapses/neuron), in part because our tissue volume decomposition computes synapses locally. We also note that an average of 510 additional synapses/neuron had a significant effect on recordings from the same neurons' somata (Figure
Finally, these recordings represent the first from a tissue simulated at this scale (>1 million neurons, comprising on average 1,104 compartments/neuron; Table
Estimating the computational requirements for simulating an entire human brain is difficult for a number of reasons. First, though neural tissue simulation is technically most suited to making this estimate (for reasons we list below), neuroscience still has no predictive models for global brain function. Without consensus on which modeling approach is most functionally appropriate for whole-brain simulations, any estimate will require an assumption-based approach to whole-brain simulation. Second, connectivity in the brain is not fully known even among animal models, where the quest for a complete connectome has just begun (Eisenstein,
The assumptions and arguments we make for the following analysis address each of these difficulties. First, we base our analysis on the assumption that neural tissue simulation is an appropriate approach to whole-brain simulation. This approach has several benefits, including its biophysical grounding, its predictive nature, and its implementation in connected compartments with known physical extent. The difficulties of this approach, however, derive from insufficient simulation constraints (and therefore a large number of free parameters) in existing data sets. Each constraint and parameter, however, is at least in principle measurable in real tissue.
Second, we exploit the volume-filling nature of neural tissue to make the assumption that all connectivity in the brain is regular and local (Figure
Finally, our maximum target simulated time duration is 1 day, since a diurnal cycle captures almost all brain dynamics. Our minimum target simulated time duration is 200–500 ms, which constitutes an appropriate timescale for modeling a single brain state transition (such as one involved in solving a simple perceptual-behavioral task).
We derive the computational requirements for neural tissue simulation targeting a whole human brain, whose volume equals ∼1 liter, by first decomposing it into 10⁷ tissue volumes, yielding:
10⁸ μm³/volume.
10⁶ compartments/volume, assuming 100 μm³/compartment.
10¹⁰ flop/50 μs simulated time step/volume, assuming 10⁴ flop/time step/compartment (includes Hodgkin–Huxley branched cable solutions, plus 10 ion channels or synapses/compartment).
64 kB communicated/volume face/time step, assuming 32 bytes communicated/spanning compartment in each direction, and 10 μm²/compartment cross section.
2.25 GB of memory/volume, including 250 MB of simulation overhead, plus 1.60 kB/compartment and 64 bytes/channel or synapse.
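The first three figures above follow directly from the stated assumptions; the arithmetic is checked below (all values are from the text, with 1 liter = 10¹⁵ μm³):

```cpp
#include <cassert>

// Decomposing a ~1 liter brain into 1e7 tissue volumes.
constexpr double brainVolumeUm3     = 1e15;  // 1 liter in cubic micrometers
constexpr double numVolumes         = 1e7;
constexpr double um3PerVolume       = brainVolumeUm3 / numVolumes;       // 1e8
constexpr double um3PerCompartment  = 100.0;
constexpr double compartmentsPerVol = um3PerVolume / um3PerCompartment;  // 1e6
constexpr double flopPerCompartment = 1e4;   // per 50 us time step
constexpr double flopPerStepPerVol  = compartmentsPerVol * flopPerCompartment;
```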
These tissue volume requirements for 10⁷ volumes are then directly mapped to a massively parallel machine architecture such as a hypothetical Blue Gene possessing 10⁷ computational nodes (∼10× larger than Blue Gene/P's scaling limit), yielding the following machine requirements:
10⁹ flops/node (Blue Gene/P scale), with a simulation time to real time factor of 2 × 10⁵, yielding a simulated time duration of ∼400 ms per day of computation, or ∼3 s per week of computation.
6.4 kB/s of link bandwidth (in each direction, to accommodate packet overhead, well within Blue Gene/P scale).
3 GB of memory/node (Blue Gene/P scale); 30 PB of total machine memory.
6 GB/s memory bandwidth, assuming all simulation state must be traversed ∼3 times in each simulation time step (Blue Gene/P scale).
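The run-time factor quoted in the first machine requirement follows from the per-volume flop count: 10¹⁰ flop per 50 μs step, computed at 10⁹ flops per node, takes 10 s of wall clock per step, a slowdown of 2 × 10⁵ relative to real time (∼432 ms simulated per day of computation):

```cpp
#include <cassert>

constexpr double flopPerStep = 1e10;
constexpr double nodeFlops   = 1e9;                       // Blue Gene/P scale
constexpr double wallPerStep = flopPerStep / nodeFlops;   // 10 s per step
constexpr double simPerStep  = 50e-6;                     // 50 us simulated
constexpr double slowdown    = wallPerStep / simPerStep;  // ~2e5
constexpr double msPerDay    = 86400.0 * 1000.0 / slowdown;  // ~432 ms/day
```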
The estimated computational requirements for simulating a human brain described here are reasonable, and could be accomplished on a machine such as Blue Gene/P using an application such as the Neural Tissue Simulator. Today, Blue Gene/P is designed for a maximum size approximately one order of magnitude smaller than what we estimate is required. Furthermore, to achieve a full day of simulated time (our upper bound for scientifically useful simulations) in the same run time (1 day to 1 week) a machine that can achieve a 10⁴× increase in speed is required (i.e., exaflop), which is expected within a decade.
We extrapolated these requirements from our measurements of neural tissue simulations on Blue Gene/P using the Neural Tissue Simulator. Simulations that explicitly model molecular diffusion or extracellular field potentials may have a significantly higher computational cost that is not included in this estimate. We also recognize the possibility that through other efforts, a less costly modeling approach to explaining global brain function (i.e., a more abstracted, less detailed simulation) may provide a theoretical basis for predictive whole-brain simulations. In that event, computational requirements would be smaller.
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
The Supplementary Material for this article can be found online at
Our model simulation methods build on existing numerical approaches for solving Hodgkin–Huxley type mathematical models of branched neurons (Segev and Burke,
where
where
The current across the membrane at any point along a neuron is the sum of all channel and synaptic currents, as well as any externally applied currents:
where
and
where the rate functions are taken to be:
We also assume a synaptic current
In these equations,
where
are the concentrations of neurotransmitter released as functions of presynaptic voltage. In all calculations, we used the values αq = 0.0011 ms−1, βq = 0.19 ms−1, αr = 0.005 ms−1, and βr = 0.18 ms−1.
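Because the equations in this passage were typeset as images and are missing from the text, the standard forms they refer to, consistent with the Hodgkin–Huxley formalism and the synaptic rate constants quoted above, can be summarized as (a reconstruction up to these standard shapes, not the article's exact typography):

```latex
% Membrane current: Hodgkin–Huxley channel, leak, synaptic, and applied terms
I_m = \bar{g}_{\mathrm{Na}} m^3 h (V - E_{\mathrm{Na}})
    + \bar{g}_{\mathrm{K}} n^4 (V - E_{\mathrm{K}})
    + g_L (V - E_L) + I_{\mathrm{syn}} + I_{\mathrm{app}}
% Gating kinetics for each channel state variable x (m, h, or n)
\frac{dx}{dt} = \alpha_x(V)\,(1 - x) - \beta_x(V)\,x
% Transmitter-gated synaptic conductance with gating variable s (q or r above),
% driven by the presynaptic transmitter concentration [T](V_{pre})
I_{\mathrm{syn}} = \bar{g}_{\mathrm{syn}}\, s\,(V - E_{\mathrm{syn}}), \qquad
\frac{ds}{dt} = \alpha\,[T](V_{\mathrm{pre}})\,(1 - s) - \beta\, s
```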
We describe neurons as a collection of points (
Following Hines (
where
This can be rearranged to yield
where
and
To complete the full Crank–Nicolson step, we then update as follows:
Following Hines (
which can be solved for
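The missing equation can be reconstructed from the standard implicit gating-variable update this passage describes (a reconstruction, not the article's exact equation): discretizing $dx/dt = \alpha(1 - x) - \beta x$ implicitly over a step $\Delta t$ gives

```latex
\frac{x^{n+1} - x^{n}}{\Delta t} = \alpha\,(1 - x^{n+1}) - \beta\, x^{n+1}
\quad\Longrightarrow\quad
x^{n+1} = \frac{x^{n} + \Delta t\,\alpha}{1 + \Delta t\,(\alpha + \beta)}
```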
Synapse states are solved similarly. Assuming
which can be solved for:
At implicit branch points, we have
which can be rearranged to yield
Again, to complete the full Crank–Nicolson step, we then update as follows:
Following Rempe and Chopp (
which can be rearranged to yield
Once this predicted value has been used to update all of the branches associated with the junction, the junction's membrane potential is corrected with the new branch values:
Finally, to complete the full Crank–Nicolson step, we update as follows:
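In outline (our paraphrase, since the article's equations were images), the junction predictor-corrector cycle described above proceeds as follows:

```latex
% Predictor (assumed extrapolation of the junction potential):
V_J^{\mathrm{pred}} = V_J^{n} + \tfrac{1}{2}\bigl(V_J^{n} - V_J^{n-1}\bigr)
% Each branch's tridiagonal system is then solved with V_J^{pred} as a known
% boundary value; the corrector recomputes V_J from the updated branch
% solutions, and the full Crank–Nicolson step completes by extrapolation
% from the half step:
V^{n+1} = 2\,V^{n+1/2} - V^{n}
```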
We would like to acknowledge Charles Peck, IBM Research, and our two reviewers, for their critical input on the manuscript and Kathleen Falcigno for help with its preparation. Michael Pitman, IBM Research, helped conceive and implement the Neural Tissue Development algorithm. Blake Fitch, IBM Research, consulted and provided critical input on the tissue volume decomposition design.
¹Note that
²Note that the Model Graph Simulator allows easy replacement of any model with another. For example, the axon branch model could easily be replaced by a simpler model of logical spike propagation, should this be desirable to a user. Note that such an approach would propagate a spike to each branch point in the axonal arbor (rather than to each synapse), at which point it would continue as two spikes, thereby potentially reducing communication required by the standard logical spike passing approach.
³All simulations used to test the Neural Tissue Simulator simulated 100 ms of neuron physiology.