The aim of high-resolution connectomics is to reconstruct complete neuronal connectivity in a tissue. Currently, the only technology capable of resolving the smallest neuronal processes is electron microscopy (EM). Thus, a common approach to network reconstruction is to perform (error-prone) automatic segmentation of EM images, followed by manual proofreading by experts to fix errors. We have developed an algorithm and software library that not only improves the accuracy of the initial automatic segmentation, but also points out the image coordinates where it is likely to have made errors. Our software, called gala (graph-based active learning of agglomeration), improves the state of the art in agglomerative image segmentation. It is implemented in Python and makes extensive use of the scientific Python stack (numpy, scipy, networkx, scikit-learn, scikit-image, and others). We present here the software architecture of the gala library, and discuss several design decisions that we consider generally useful for other segmentation packages. We also discuss the current limitations of the gala library and how we intend to address them.
Connectomics, the elucidation of complete neuronal circuits, requires resolutions as low as 5–10 nm to distinguish the smallest neuronal processes, but also fields of view hundreds of micrometers across or more, as neurons can easily span those distances. This size disparity results in large image volumes of at least 10 gigavoxels and often orders of magnitude larger. Neurons are visible as distinct regions, or segments, in this 3-dimensional image volume. Combining an accurate segmentation with the position of pre- and post-synaptic sites in the image (Kreshuk et al.,
Various methods have been proposed (and implemented) to go from images to neuronal morphology and connectivity. Cardona et al. (
The algorithm, called GALA (graph-based active learning of agglomeration), works by repeatedly consulting a gold standard segmentation (prepared by human annotators) as it agglomerates sub-segments according to its current best guess. (Note: throughout this paper, we will use “GALA” to describe the algorithm, and “gala” or “Gala” for the Python library and software.) It thus accumulates a training dataset used to fit a classifier, which guides subsequent agglomeration decisions. Furthermore, through the probability output of the classifier, it can estimate its own confidence in whether two segments should be merged, and this estimate can be used for proofreading.
GALA outperformed previous agglomeration methods for automatic segmentation of an isotropic (10 × 10 × 10 nm resolution) focused ion beam scanning electron microscope (FIBSEM) dataset of
Since we have described the GALA algorithm in detail elsewhere (Nunez-Iglesias et al.,
GALA belongs to a class of segmentation algorithms called agglomerative algorithms, in which segments are formed by merging smaller segments. Other examples include mean agglomeration (Arbeláez et al.,
GALA uses machine learning to obtain a merge priority function or policy, which dictates which pair of segments to merge next. It has three main prerequisites to learn this function:
- an initial oversegmentation of the image into small sub-segments;
- a gold standard segmentation of a training volume, to serve as ground truth; and
- a pixel-level intensity map, which is optional but required for most features. This is usually the
For the first requirement, we have used the watershed algorithm (Vincent and Soille,
The second requirement is a completely segmented image to be used as ground truth. For neuronal EM images, we used ground truth segmentation generated with the open-source Raveler software (Olbris et al., in preparation), while a large ground truth body exists for natural images in the Berkeley Segmentation Data Set (BSDS) (Martin et al.,
For the final requirement, we have used Ilastik (Sommer et al.,
Given the above, we create a region adjacency graph, or RAG (implemented in
First, we import the relevant gala modules and read in the data using the
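A minimal sketch of these first steps, modeled on the usage example in gala's documentation (the HDF5 filenames are placeholders, and module names may vary between gala versions):

    from gala import imio, features, agglo, classify, evaluate as ev

    # read the training data: ground truth, boundary probability map, and
    # watershed oversegmentation
    gt_train, pr_train, ws_train = map(imio.read_h5_stack,
                                       ['train-gt.h5', 'train-p1.h5',
                                        'train-ws.h5'])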
Next, we create a feature manager. These can be concatenated using the
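For example, a moments manager and a histogram manager can be combined into a composite manager that concatenates their feature vectors (again a sketch following gala's documented example):

    fm = features.moments.Manager()    # mean, variance, etc. of boundary values
    fh = features.histogram.Manager()  # histogram of boundary probabilities
    fc = features.base.Composite(children=[fm, fh])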
Using the feature manager, watershed oversegmentation, ground truth segmentation, and probability map, we can create a region adjacency graph
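Continuing the sketch, the training RAG is built from the oversegmentation and probability map, and the training dataset is accumulated by agglomerating while consulting the ground truth:

    g_train = agglo.Rag(ws_train, pr_train, feature_manager=fc)
    # learn_agglomerate returns training data: X holds feature vectors,
    # y the merge/don't-merge labels
    (X, y, w, merges) = g_train.learn_agglomerate(gt_train, fc)[0]
    y = y[:, 0]  # gala offers several labeling schemes; use the first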
With the training dataset, we can train a classifier, using scikit-learn syntax. Indeed, any scikit-learn classifier can be used here.
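For example, a random forest (the hyperparameters here are arbitrary):

    from sklearn.ensemble import RandomForestClassifier

    # any classifier exposing scikit-learn's fit/predict_proba interface
    # will work
    rf = RandomForestClassifier(n_estimators=100).fit(X, y)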
By composing the feature map and the classifier, we obtain a policy: a function whose input is a graph and two nodes (representing segments) and whose output is a number in [0, 1].
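In the sketch, this composition is a single call:

    # compose the feature map and the trained classifier into a merge
    # priority function (policy)
    learned_policy = agglo.classifier_probability(fc, rf)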
This policy is then used to segment a test (previously unseen) volume. We agglomerate the superpixels until the classifier returns a merge probability of 0.5, which corresponds to even odds that a proposed merge is correct. (This assumes a well-calibrated classifier, meaning that the output corresponds to the probability of a sample feature vector belonging to the “+1” class. Bostrom (
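Continuing the sketch on a test volume (filenames again placeholders):

    pr_test, ws_test = map(imio.read_h5_stack,
                           ['test-p1.h5', 'test-ws.h5'])
    g_test = agglo.Rag(ws_test, pr_test, learned_policy, feature_manager=fc)
    g_test.agglomerate(0.5)  # stop at even odds of a correct merge
    seg_test = g_test.get_segmentation()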
Because gala was created as research software, it implements a number of additional agglomerative segmentation algorithms, including mean boundary, oriented mean boundary (Arbeláez et al.,
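For example, mean boundary agglomeration uses the same Rag interface, swapping the learned policy for a hand-designed one (parameter names may differ between gala versions):

    # mean boundary agglomeration: merge priority is the average boundary
    # probability between two segments
    g_mean = agglo.Rag(ws_test, pr_test,
                       merge_priority_function=agglo.boundary_mean)
    g_mean.agglomerate(0.5)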
The gala API presents a simple tool to obtain state of the art segmentations, and also allows the exploration of a complete set of hierarchical agglomerative segmentation strategies. Further, because the segmentation strategy is learned, it can be applied with very little modification to many different domains, as demonstrated by its success in natural image segmentation as well as two different kinds of EM data.
One of the most generally reusable parts of the gala library is the evaluation module in
Among these, we focused the most effort on the VI metric for its numerous advantages (Meila,
The two components of VI are computed efficiently (
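For example, continuing the earlier sketch (the output shown is illustrative, chosen to match the numbers discussed next):

    # the two conditional entropies of the watershed oversegmentation
    # against the ground truth
    print(ev.split_vi(ws_train, gt_train))
    # array([ 0.185,  1.64 ])  (illustrative output)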
This is interpreted as follows: the undersegmentation VI of the watershed superpixels compared to the ground truth segmentation is 0.18 bits. That is, the average watershed basin will have slightly over 97% overlap with one ground truth body and 3% with another (since −0.97 log2(0.97) − 0.03 log2(0.03) ≈ 0.19, close to the measured 0.185). In contrast, its oversegmentation VI is 1.64 bits, which means that most ground truth segments are split into more than 3 watershed basins. (The conditional entropy of a perfect (1/3, 1/3, 1/3) split is 1.58 bits; a perfect 50-50 split has an entropy of 1 bit.)
By computing the two conditional entropies at each segmentation threshold, we generate the split VI plot (Figure
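A sketch of how such a plot could be generated, continuing the example above (the ground truth filename, the threshold grid, and the axis order of split_vi's output are assumptions here):

    import numpy as np
    import matplotlib.pyplot as plt

    gt_test = imio.read_h5_stack('test-gt.h5')  # placeholder filename
    g = agglo.Rag(ws_test, pr_test, learned_policy, feature_manager=fc)
    vi_points = []
    for t in np.arange(0.05, 1.0, 0.05):
        g.agglomerate(t)  # continue agglomeration up to threshold t
        vi_points.append(ev.split_vi(g.get_segmentation(), gt_test))
    vi_points = np.array(vi_points)

    # one point per threshold: oversegmentation on one axis,
    # undersegmentation on the other
    plt.plot(vi_points[:, 1], vi_points[:, 0], 'o-')
    plt.xlabel('oversegmentation (bits)')
    plt.ylabel('undersegmentation (bits)')
    plt.show()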
This approach is in contrast to the commonly used plot of VI against segmentation threshold (see e.g., Andres et al.,
In addition to the flexible Python library interface, we developed a set of command-line scripts to perform common gala functions, such as training and segmenting. This is the primary way of interacting with gala in a production environment. The scripts make use of the excellent
In this section, we focus on a few design elements that we consider essential to gala's success.
Many segmentation libraries assume 2D or 3D data, or provide separate functions for each (see, for example, Achanta et al.,
We instead abstracted away the notion of a neighboring pixel (or voxel) with a
A great many algorithms in computer vision can be parameterized by neighboring voxels. Thus, we encourage developers to write these using n-dimensional logic from the start to increase the range of applications of their software. The numpy library's excellent ndarray object was essential for our n-dimensional support, making gala a prime example of the success of the Python ecosystem for scientific computing.
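As a small illustration of such n-dimensional logic (a sketch, not gala's implementation), the offsets to a pixel's neighbors can be computed for any dimensionality in a few lines of numpy:

    import numpy as np

    def neighbor_offsets(ndim, connectivity=1):
        # all offsets in {-1, 0, 1}^ndim, excluding zero, filtered by
        # city-block distance; connectivity=1 gives the 2*ndim face
        # neighbors in any dimension
        offsets = np.indices((3,) * ndim).reshape(ndim, -1).T - 1
        dist = np.abs(offsets).sum(axis=1)
        return offsets[(dist > 0) & (dist <= connectivity)]

    print(len(neighbor_offsets(2)))  # 4 neighbors in 2D
    print(len(neighbor_offsets(3)))  # 6 neighbors in 3D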
An example of a critical feature when determining the probability that two segments should be merged is the average pixel-level probability of boundary (Ren and Malik,
The above concept can be generalized to a large class of features. We found that, in many cases, caching intermediate computations dramatically reduced the cost of feature computation. For example, to compute the standard deviation of the pixel probabilities, we cache the sum of the squared pixel probabilities, along with the simple sum and the count. To compute a histogram, we cache the unnormalized histogram, whose bins are added elementwise when two boundaries are combined.
We therefore devised a single class, which we call a feature manager, that is responsible for defining the cached values, and for computing the feature vector from the cached values. This has enabled less obvious features, including, for example, some based on the convex hull of the segment. The convex hull feature manager stores as a cache the convex hull of each node. When two nodes are merged, the resulting convex hull can be computed faster by starting from the two initial hulls, rather than from the newly formed segment, since these have fewer vertices than the segments themselves. The manager then uses the hull to compute features such as segment convexity, by comparing the volume of the convex hull to the volume of the segment.
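To illustrate the pattern (a toy example, not gala's exact feature manager interface): a moments-style manager caches the count, sum, and sum of squares of each segment's pixel values; merging two segments is then elementwise addition of caches, and features are derived from the cache alone:

    import numpy as np

    class ToyMomentsManager:
        """Hypothetical feature manager illustrating the caching pattern."""

        def create_cache(self, values):
            # count, sum, and sum of squares of a segment's pixel values
            return np.array([values.size, values.sum(), np.sum(values ** 2)])

        def update_cache(self, cache1, cache2):
            # merging two segments: just add their caches
            return cache1 + cache2

        def compute_features(self, cache):
            n, s, ss = cache
            mean = s / n
            var = max(ss / n - mean ** 2, 0.0)  # guard against rounding
            return np.array([mean, np.sqrt(var)])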
We have strived to make it easy to develop new feature managers, which will be useful as new, more sophisticated 3D segmentation features are developed (Bogovic et al.,
Given the vast heterogeneity of our initial feature space, we wished to use a random forest (RF) classifier.
Because of this, it is trivial to try different classifiers for the learning and agglomeration steps of
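Continuing the earlier training sketch, swapping classifiers is a one-line change for any estimator exposing scikit-learn's fit/predict_proba interface:

    from sklearn.ensemble import RandomForestClassifier
    from sklearn.linear_model import LogisticRegression

    # any scikit-learn-compatible classifier can back the merge policy
    for clf in (RandomForestClassifier(n_estimators=100),
                LogisticRegression()):
        policy = agglo.classifier_probability(fc, clf.fit(X, y))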
In short, by using established interfaces, we were able to future-proof our software. We recommend that anyone looking to build software in the Python ecosystem take a long look at related libraries to match interfaces as closely as possible.
Although we have discussed many of gala's strengths and its positive design aspects, it does currently have limitations, which we describe here, along with potential fixes.
As currently implemented, gala requires a fully segmented volume from which to learn. In our experiments, this has become the major bottleneck when starting segmentations on new data. Therefore, a priority in gala development going forward is the ability to mask volumes so that partial ground truth can be used.
Gala's implementation, based on NetworkX, is slow and has a high memory footprint. However, many improvements are within easy reach.
Firstly, we currently store feature caches and compute feature vectors as separate arrays. For large graphs, this incurs a large time overhead from repeated memory allocation, and a large memory overhead from the dictionaries required to store all the separate arrays. However, because we are performing a hierarchical agglomeration, we know that the number of nodes and edges is bounded by twice the initial number. Therefore, we can pre-allocate an initial array of shape
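A sketch of the pre-allocation idea (the layout is illustrative): agglomerating n initial nodes creates at most n − 1 new ones, so an array with 2n rows can hold every node cache the algorithm will ever need:

    import numpy as np

    def preallocate_node_caches(n_initial_nodes, cache_len):
        # at most n - 1 merges, each creating one new node: 2n rows suffice
        return np.zeros((2 * n_initial_nodes, cache_len))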
Additionally, the graph currently stores indices to the voxels comprising each node and boundary, which is unnecessary. Space and time could be saved by storing only a single seed voxel per node and rebuilding the node on demand with a flood fill.
Finally, we chose the heavy
Like most academic software, gala is a mixture of new algorithms, some good design, and a variety of questionable decisions left over from a time of different priorities. We wrote this description in the hope that the existing and future functionality, the better parts of the software, and the lessons learned will be of value to the wider research community. We particularly emphasize that Don Knuth's famous maxim that “premature optimization is the root of all evil” (Knuth,
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
This work was supported by the Howard Hughes Medical Institute and the Victorian Life Sciences Computation Initiative. We thank Dmitri Chklovskii for financial support and useful discussions, and Jake VanderPlas for helpful suggestions for the “evaluate” submodule. We also thank Mathew A. Saunders and the rest of the FlyEM team for testing and feedback on the software.