Edited by: Eilif Benjamin Muller, École Polytechnique Fédérale de Lausanne, Switzerland
Reviewed by: Jeanette Hellgren Kotaleski, Karolinska Institute, Sweden; Daniel Keller, École Polytechnique Fédérale de Lausanne, Switzerland
*Correspondence: Weiliang Chen, Computational Neuroscience Unit, Okinawa Institute of Science and Technology, 1919-1 Tancha, Onna-son, Kunigami, Okinawa, 904-0495, Japan e-mail:
This article was submitted to the journal Frontiers in Neuroinformatics.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
STEPS is a stochastic reaction-diffusion simulation engine that implements a spatial extension of Gillespie's Stochastic Simulation Algorithm (SSA) in complex tetrahedral geometries. An extensive Python-based interface is provided to STEPS so that it can interact with the large number of scientific packages in Python. However, a gap existed between the interfaces of these packages and the STEPS user interface, where supporting toolkits could reduce the amount of scripting required for research projects. This paper introduces two new supporting toolkits that support geometry preparation and visualization for STEPS simulations.
Advanced research on neuronal signaling pathways frequently requires assistance from computational modeling and simulations, causing the development of several molecular reaction-diffusion simulators in recent years. In this domain, the general assumption of mass action kinetics in a well-mixed volume is often invalid, whilst stochasticity and spatiality have been demonstrated to play essential roles in regulating behaviors of the system (Santamaria et al.,
STEPS, short for STochastic Engine for Pathway Simulation, is a GPL-licensed, reaction-diffusion simulator that implements a spatial extension of Gillespie's SSA in tetrahedral geometries (Hepburn et al.,
One feature that distinguishes STEPS from other spatial SSA simulators is its extensive Python interface. Python (
As a stochastic pathway simulation engine, the user interface of STEPS is mostly generic and focuses on simulation control and data access. Thus, STEPS users rely on the massive number of scientific Python packages to achieve varying research objectives, from simple plots of molecule distribution, to complicated results analysis. However, some customized toolkits are required to reduce the amount of scripting by the user. The collection of these Python-based, customized toolkits is called the “STEPS supporting environment,” part of which has been described previously (Hepburn et al.,
This paper describes two new supporting toolkits in this environment: the geometry preparation toolkit handles production of geometry data for STEPS simulations, while the visualization toolkit provides runtime visualization of simulations. In the following sections we will describe the details of both toolkits, including their functionality and the underling mechanisms and development principles. We will also provide examples to showcase their use, and discuss plans for the future improvements of the toolkits and the overall supporting environment.
Geometry preparation is an important prerequisite for reaction-diffusion simulation. It involves multiple procedures, starting with “geometry construction,” where a surface/volume mesh, or a set of geometry boundary representations, is created. Substructures and specific regions in the geometry that are of research interest or require extra simulation controls are then identified in a procedure of “component identification.” This is followed by “model association,” where biochemical models are assigned to the geometry components. Finally, the outcomes of these procedures are integrated together and prepared for simulation.
Reaction-diffusion simulators commonly accept formatted text files as data input, where geometry is described either as a combination of predefined primitives like spheres and cubes, or as a surface or volume mesh. The data is then dealt with differently among simulators. SSA based simulators like MesoRD and NeuroRD generate cubic meshes according to the input primitive geometries, while particle based simulators like Smoldyn and MCell establish mathematical boundary representations of the geometries. Data files for simplified geometries can be produced manually, but the generation of complex or realistic geometries, like those based on reconstructions from electron microscopic imaging, often relies on third party professional applications. Therefore, toolkits that integrate the geometry generator and the simulator can be beneficial. One example is CellBlender (
Different from MesoRD and NeuroRD, STEPS does not generate meshes itself, but makes use of professional mesh generators. A generic mesh importing mechanism is provided, together with importing functions for common mesh formats such as Abaqus (
As mesh generation is mostly controlled by CUBIT itself, the geometry preparation toolkit focuses on facilitating the remaining procedures that support five major functionalities in CUBIT and STEPS.
In STEPS, geometry components are identified as groups of tetrahedrons and triangles. Technically this means that to create a component one has to select mesh elements and produce a list of their indices. There are several ways to select mesh elements in CUBIT depending on the condition. For simple geometries, components can be predefined before mesh generation and used to guide the generation process. They can also be separated and identified easily by simple mathematical spacing after mesh construction. However, these approaches become inadequate as the complexity and irregularity of the geometry increase, where extra support is necessary to ensure accuracy and efficiency.
In order to explain the element selection mechanism, we first classify mesh elements into two different categories, skin and inner elements. Skin elements are directly visible from the outside, while inner elements are covered by skin elements and are thus not directly visible. In CUBIT, skin elements can be selected directly using box/polygon range selection. CUBIT also provides an x-ray option which, together with range selection, is able to select all elements within the range, regardless of whether they are skin or inner, as shown in Figure
Elements selected in CUBIT can be output as a Python list that contains their indices. However, for reasons of computational efficiency STEPS uses an internal indexing system, which is different from the one generated in CUBIT. Thus, a mapping mechanism is necessary for data enquiry and index conversion between the two systems. This mapping is handled by “ElementProxy,” a Python-based utility object implemented in STEPS. In short, the ElementProxy is a generic object for the storage of both geometry data as well as the mapping of indexing systems for a given type of geometry elements such as vertex, triangle and tetrahedron. During mesh import, coordinates, connectivity and other geometry data of an element are recorded in the associated ElementProxy object, together with its original index. The ElementProxy then assigns a STEPS index for the element and stores the mapping between them. The object is implemented purely in Python so that it can be archived and retrieved by the standard Pickle module in Python. This mechanism allows the toolkit to translate element indices between STEPS and CUBIT, which is necessary for the construction of the steps.geom.Tetmesh geometry object used for reaction-diffusion simulation in STEPS (Hepburn et al.,
STEPS spatial geometry consists of two basic components: “compartment” and “patch.” A compartment, described by a collection of tetrahedrons in the mesh, is a 3D volume within which molecules can diffuse and react. A patch, described by a collection of triangles, is a 2D surface connected to one or two compartments, where molecules may be embedded. “Surface reaction” and “surface diffusion” can be assigned to a patch to describe membrane-related phenomena such as molecular channeling, transportation, and lateral diffusion. Compartments and patches can be based on lists of, respectively, tetrahedral and triangle indices from the element selection process shown in Figure
Beside compartments and patches, smaller geometry element groups, for example, tetrahedrons that form a spine in a spiny dendrite, often need to be accessed repeatedly, either for the change of simulation parameters or for recording of results. In general, they can be stored as Python lists in an external file and loaded from the file during simulation. However, the manual creation and maintenance of external files can be exhausting, particularly for large simulations. Alternatively, the Tetmesh object provides an auxiliary “Region of Interest” (ROI) dataset where element lists can be named and stored. ROI datasets are accessible by name once created. A set of ROI operation APIs are also implemented in STEPS so that stored elements can be reused in the simulation.
To form a spatial reaction-diffusion system, groups of reaction and diffusion rules (“volume systems”) defined in the biochemical model need to be added to corresponding compartments in the geometry, and groups of defined surface reaction rules and other surface phenomena (“surface systems”) need to be added to related patches. Volume systems and surface systems are defined separately in a steps.model.Model object. STEPS associates biochemical systems with geometry components by storing system ids in corresponding components in the geometry object. The model and geometry objects are then combined to construct the stochastic spatial solver (steps.solver.Tetexact). The separation of biochemical model definition and geometry description not only helps modelers to maintain focus, but also enhances the reusability of scripts as a single model definition can be reused with different geometries, and vice versa.
In practice, biochemical model and geometry are often prepared by different individuals, therefore it is necessary for a Tetmesh to be stored in a file and retrieved later for simulation. This functionality is provided by the MeshIO utility, which saves and loads a Tetmesh object, compartment and patch definitions, biochemical model association and lists of element groups, to and from an xml file.
Though geometry preparation can be accomplished manually using the above mechanisms, the toolkit combines these mechanisms and provides flexible pipeline functions in Python that significantly reduce the labor required. For example, selected tetrahedrons in CUBIT can be directly used to create a compartment with biochemical system association in Tetmesh geometry within a single function call in the toolkit, instead of going through the steps of index translation, compartment object creation and model association. This is particularly beneficial when using complex geometries.
The importance of visualization for spatial reaction-diffusion simulations is a matter of debate. Though visualization provides an intuitive way for understanding simple biochemical models, its value for simulations with complex biochemical systems and geometries is unclear. This leads to divergent strategies in existing simulators. Some simulators, for example Smoldyn and MesoRD, implement built-in runtime visualization support. Other simulators such as MCell focus on post-simulation result playback using third-party applications. Both approaches have their advantages and disadvantages, thus whether a simulator supports one over another mainly depends on developer preference and application focus. Runtime visualization provides immediate information of how the simulated system behaves, important for model debugging and runtime simulation adjustment. However, a considerable amount of computational resource is required, reducing the overall efficiency of the simulator. Moreover, modern neuroscience simulations are often executed on clusters where no visualization is allowed. Therefore, runtime visualization is often implemented as an optional feature that can be switched off when necessary. Post-simulation result playback does not affect runtime performance of the simulation significantly, although history data storage is required. The amount of history data increases proportionally to simulation time, making this approach resource-consuming for long simulations. In addition, result playback can only be visualized after a simulation is completed, so it is unable to perform runtime adjustment of the simulation.
STEPS implements a Python-based, interactive 3D visualization toolkit for spatial reaction diffusion simulations. Currently, the toolkit focuses on supporting runtime visualization, but simulation recording and playback will be added as extensions in the future. Despite the general understanding that visualization is limited to simulations with simple models and geometries and mostly for demonstration purposes, the STEPS visualization toolkit attempts to provide efficient, accurate and comprehensible visualization support for simulations with complex biochemical models and geometries, a goal that is not trivial to achieve. Here we detail the challenges encountered during the toolkit development and explain the solutions taken to tackle those challenges.
The fundamental goal of the visualization toolkit is to visualize simulations with complex biochemical model and geometry. A major challenge lies in the presentation, that is, how to produce human comprehensible visual output of a complex system. Visualization support in existing simulators often adopt an “All-In-One” strategy, where all molecules as well as the full geometry are displayed in a single window. Although this approach may be adequate for models with several reactions and simple geometries, due to the limitation of human perception, the visual output of such a presentation soon becomes incomprehensible as the complexity of the system increases.
To address this problem, the STEPS visualization toolkit abandons the “All-In-One” approach and introduces the “Component Assembly” concept to the implementation instead. Figure
Visual components are then assembled in a “display,” an interactive 3D window environment that displays assigned components. One pivotal feature of the visualization toolkit is the “Many-To-Many” association between visual components and displays: instead of creating a single display window, the toolkit allows multiple displays to be created for a single simulation. Multiple visual components can be assembled in a display and each visual component can also appear in multiple displays. Visual components that appear in multiple displays maintain a single instance of internal data and synchronize their visual appearance among all displays when the data is updated during simulation, thus the increase of memory cost is insignificant.
This implementation provides flexible solutions for different visual scenarios that may be encountered in practice. One common example is the “Global-ROI” scenario, where a single window displays all geometry components and molecule changes as a global view of system behavior, while a number of displays highlight changes of specific molecule species in different geometry regions. Another example is the “Species of Interest” scenario. In a complex simulation, molecules in different parts of the same geometry region often visually overlap with each other, significantly reducing the comprehensibility of the visualization. With the visualization toolkit, molecule species that are of interest can be isolated from the others and visualized in several displays separately, with the same static component as the geometry background of all displays.
Visual components and displays are extensions of generic OpenGL visual items provided by PyQtGraph (
Displays with dynamic visual components need to be further assigned to a “simulation control” so that visualization can be synchronized with the simulation. Multiple simulation instances with different biochemical models and geometries can be assigned to a controller, where a background execution thread is generated for each of the simulations using Python's multithreading mechanism. Although all simulations are executed in parallel, they share a global configuration of simulation end time and visual update interval, which can be overwritten during simulation. The simulation control also unlocks the Global Interpreter Lock (GIL) in Python, thus users can interact with the visual system freely even when the simulations are in execution in the background.
Visualization of SSA-based spatial reaction diffusion simulation faces an intrinsic representation challenge that seldom appears in particle-based simulations, where the spatial position of each molecule is tracked and recorded accurately through simulation. The fact that SSA-based simulators do not track molecule movement but monitor the quantity changes of molecules in each subvolume means that the exact position of individual molecules is not known. Different approximations have been used to solve this problem. For instance, MesoRD allows users to predefine the maximum number of molecules that can be visualized per cubic subvolume. Based on this value, it then generates all possible molecule positions in advance by evenly partitioning the axes of the subvolume space. During simulation, each subvolume updates its condition iteratively and determines whether a molecule should appear on any of the positions. However, this approach was not suitable for STEPS visualization for several reasons. First, tetrahedral subvolumes have a much wider range of size and shape compared to the ones in a cubic mesh, thus it is practically difficult to partition the space evenly for each subvolume. Second, if all molecule positions are generated in advance it is possible for a molecule to be shown at a fixed position over time, giving the wrong impression that no movement has occurred for that molecule where instead conceptually it has changed position inside the subvolume. Third, as the maximum number of visible molecules is fixed for each subvolume, subvolumes with high concentrations of molecules may be visually over-simplified due to a lack of available positions, while the ones with low concentrations retain large amounts of unused coordinate data. Finally, the number of coordinates that need to be generated scales linearly with the number of subvolumes in the simulation, causing a large memory cost for simulations with fine meshes even if the amount of molecules in the system is small.
Because of these reasons, the STEPS visualization toolkit, instead, adopts a runtime generation approach for molecule visualization. At each visual update iteration, tetrahedral and triangular SSA subsystems in every dynamic visual component calculate the number of molecules within themselves and generate the exact number of corresponding random positions. The toolkit uses a fast algorithm that guarantees all these random positions are uniformly distributed and bound by the subsystem's geometry. These positions are then fed to individual visual components and rendered in the corresponding displays as dots with different sizes and colors, predefined in the component. The process repeats when the simulation reaches the next visual update interval. One exception is the multiple-state “channel species” on patches, whose positions are permanent after initialization except when they diffuse inside the membrane.
In the above solution, the number of random positions generated at each iteration equals the total number of molecules over all visual components. While this is achievable for simulations with a small numbers of molecules, as this number increases it becomes difficult and eventually unfeasible to render them due to limited graphical resources. Therefore the visualization toolkit regulates the position generation with two restrictions. The first restriction is the “maximum amount of points” that can be generated for each visual component. Once the number of molecules in a component exceeds this maximum, a reducing function is called to lessen the amount of points generated according to the second restriction: “maximum point density,” defined as the maximum number of possible points being generated per unit of measurement (m3 for tetrahedron, and m2 for triangle). For each associated tetrahedron/triangle of the visual components, the maximum number of points that can be generated within is determined by multiplying the density with its volume or area, reflecting the proportional distribution of molecules. The density can be either predefined by the user, or adjusted automatically according to the ratio of maximum against actual amount of points that will be generated when the auto-adjust mode in the reducing function is enabled. Each visual component has its own maximum amount and density configuration so that they can be specified for individual species but remain consistent within the component.
The runtime point generation approach requires fast data synchronization between visualization and simulation, which is often considered to be a weakness of pure Python applications. In our toolkit, this issue is managed by allowing direct data interfacing between STEPS and NumPy. NumPy is a Python extension package that supports large, multi-dimensional array construction and fast array operation. It is currently the standard Python package for numerical computing and is supported by many scientific computing packages. NumPy arrays are also the fundamental data structure for PyQtGraph, the package we used to implement our visualization toolkit. Using the SWIG interface, NumPy arrays can be directly accessed by other C/C++ packages, including STEPS. At each visual update cycle, molecule distribution data from the simulation is written directly in formatted NumPy arrays that will be assigned to visual components, eliminating expensive STEPS-Python-NumPy data copying. Our implementation also further speeds up the visual system by implementing all computational intensive routines, such as random point generation, in C++.
Although the approach described above provides an intuitive grasp of how the simulation performs, information acquired from it is generally vague and qualitative. Important modeling information such as concentration and spatial distribution changes of molecules can only be observed with more accurate, quantitative analysis of the simulation. For this reason our visualization toolkit also implements a set of data plotting functions that enable dynamic monitoring of the amount and spatial distribution of molecules, which can be synchronized with the visualization updates during simulation. The quantitative plotting allows modelers to rapidly validate and debug their models at the early stage of model development, which is essential for complex computational models.
Application of the above toolkits highly depends on the conditions and research interests of specific projects. In this section we present two examples that originate from our previous research to explain how the toolkits can be used in practice. The meshes and Python scripts used for these simulations can be downloaded from ModelDB (
The first example is the inositol 1,4,5-trisphosphate receptor (IP3R) model described in (Doi et al.,
To create a suitable geometry for the simulation, we extract a triangular spine morphology from an electron microscopic reconstruction of spiny dendrites (
In the visualization, Ca2+ in cytosol and ER is represented in orange, while IP3 in cytosol is represented in red, using the “compartment species” visual component. Different IP3 receptor sites on the membrane are represented as different states of a “patch channel” component with individual color and transparency configurations. Native and Ca2+ bound receptor states are colored in blue with different transparencies, while the IP3 bound state and the open state are colored in magenta with 20 and 100% opacity, respectively.
Figure
In order to quantitatively analyze the relationship between the number of open states of the IP3 receptor and cytosolic Ca2+ concentration, we create dynamic plots with these two measures and monitor their changes throughout the simulation. As shown in Figure
The second example originates from our previous research (Santamaria et al.,
Four meshes were generated for this example, using project specified scripts for the CUBIT Python API. The mesh generation script is available upon request and can be modified to produce variations of the meshes. Each mesh consisted of a cylinder of 20 μm length and 0.7 μm diameter, representing the dendritic shaft. We then randomly attached a number of simplified spines, each formed by a spherical head and a cylindrical neck, onto the shaft cylinder. Spines were generated according to statistics from EM studies (Harris and Stevens,
Four simulations are assigned to and executed by a simulation control, each of which simulates molecule diffusion in one of the four meshes. States of the simulations are visualized in separated displays. In each display, the mesh for the simulation is rendered by the compartment mesh component. As this research mainly focuses on the molecule distribution in the dendritic shaft, we use the shaft tetrahedron indices stored in the ROI dataset to create a tetrahedron species component that only displays molecules inside these tetrahedrons. This is a better solution compared to the one where all molecules in the simulation are displayed, particularly for meshes with high spine densities (Figure
To quantitatively visualize the difference of molecule distribution caused by varying spine density, we plot the spatial distributions along dendritic shafts using the visualization toolkit (Figure
In this paper we have described two supporting toolkits for STEPS that are implemented in Python. We've introduced the geometry preparation toolkit that integrates CUBIT with STEPS via Python, allowing complete mesh preparation solutions for STEPS simulations. We've also analyzed approaches to improve efficiency, accuracy and comprehensibility of visualization for spatial reaction diffusion simulations, which are adopted in our implementation of the visualization toolkit. Two examples are presented to showcase the application of the toolkits in real research projects. The IP3R model example demonstrates how compartments and patches are identified and created in realistic spine morphology using the geometry preparation toolkit, and how the simulation can be visualized properly by splitting molecule species in multiple displays. The anomalous diffusion example showcases the usage of “Regions of Interest” datasets for visually filtering molecules in a specific region. This example also demonstrates how multiple simulations are executed and visualized simultaneously for comparison.
The toolkits are components of the STEPS supporting environment, where Python-based submodules are implemented to close the gaps between interfaces of various Python packages and the generic interface of STEPS. The Python world is an open and rapidly growing community where hundreds of new packages are available to the public everyday. On one hand, this provides rich and flexible package options for research projects using STEPS, on the other hand, packages selected to implement a customized toolkit may soon be out of date or lack improved features provided in new packages. Therefore, instead of detailing the package-dependent, technical implementation of the toolkits, we've concentrated on introducing novel, underlying mechanisms and principles involved. The approaches described in this paper are beneficial not only to the implementation of current toolkits, but also to the design and implementation of toolkits for other simulators in the same category.
At the moment the STEPS supporting environment is not yet completed, and the existing toolkits can be further improved in several aspects. The generation of biochemical models remains text based, requiring significant amount of human efforts in scripting and maintenance, despite the availability of the SBML (Hucka et al.,
As for the toolkits described in this paper, the geometry support toolkit requires CUBIT, which is commercially licensed. We anticipate alternatives with similar functionality that can be obtained freely so that the whole geometry preparation process can be achieved without extra financial cost. One candidate is TetGen, whose format has been supported in STEPS since early versions, although it still lacks several features such as graphical interaction with meshes. So far, the visualization toolkit supports visualization of spatial reaction diffusion systems, but does not yet support visualization of new features in STEPS version 2, such as membrane potential and current, which is implemented in the EField system (Hepburn et al.,
STEPS 2.2 with both toolkits described in this paper, as well as API references and a user manual, can be accessed from
Weiliang Chen designed, implemented and tested the toolkits described, as well as drafted the manuscript. Erik De Schutter conceived of and supervised the STEPS project and helped draft the manuscript. Both authors contributed to the manuscript and read and approved the submission.
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
This work was supported by OISTSC. CUBIT was used under rights and license from Sandia National Laboratories: Sandia License Number 09-N06832.
The Supplementary Material for this article can be found online at: