The following describes Felix Z. Hoffmann's project within the Open Science Fellows program (English version of the website in preparation) to produce an open and reproducible computational research publication. Updates can also be found in the project's GitHub repository.
Open computational research study
Replicating computational research studies has been shown to be exceedingly difficult (Topalidou et al. 2015). The main factors that make replication problematic or even impossible are the unavailability of code coupled with incomplete descriptions in the article, the unavailability of data, and the use of software packages and environments that cannot be recreated on current systems.
These problems are addressed in many guidelines for reproducible computational research (Sandve et al. 2013; Rostami et al. 2017). In brief, a computational research publication should contain:
- fully documented data and code
- the software libraries used
- the full computational environment.
In reality, very few published computational studies satisfy all of these criteria. While guidelines on what should be shared are plentiful, they don’t help researchers with the practical question of how to do that. How to document and share code so that the full research process can be traced back, and how to make the computational environment available to others, are hard problems that need dedicated effort to answer.
My intention for the Wikimedia Open Science Fellowship is twofold. First, I want to publish my own computational research openly, fully satisfying the above criteria. By going through this process in coordination with, and with the support of, the mentors of the program, I will have to find answers to the problems above. As a second goal, I want to create documentation for the practical solutions I find. It is my hope that these guides and tutorials will take some of the burden off other computational scientists interested in publishing openly and will be able to support them in the process of openly publishing their results.
Below I describe my research, the tools I’m considering using to publish my research openly and, finally, the proposed form of the tutorials I will create in the scope of this fellowship.
Connectivity in local neuronal circuits of cortical areas has been reported to be non-random (Song et al. 2005; Perin, Berger, and Markram 2011), but the connection principles underlying the formation of the observed patterns remain unclear. In contrast to many plasticity-based approaches, I analysed how non-random connectivity patterns may emerge purely from the stereotypical morphology of neurons in these circuits.
The study’s main results are products of complex computations: in a first step I generate large random networks with specific connectivity rules, motivated by stereotypical neuron morphology, and in a second step I analyse the structure of the resulting networks. My findings, which closely resemble the structure of local cortical circuits (Song et al. 2005; Perin, Berger, and Markram 2011), imply that constraints from neuron morphology may already play a large role in determining a cortical circuit’s connectivity patterns. This work has been well received beyond the borders of my lab and I now want to move towards publishing these results.
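As a rough illustration of this two-step procedure (generate a network from a connectivity rule, then analyse its structure), the following minimal sketch may help. It is not the code or the rule used in the study: the distance-dependent connection probability, the function names `generate_network` and `reciprocity_ratio`, and all parameter values are assumptions chosen only for illustration.

```python
import math
import random


def generate_network(n, scale, rng):
    """Place n neurons uniformly in the unit square and connect each ordered
    pair (i, j) independently with probability exp(-distance / scale).

    The distance-dependent rule is a hypothetical stand-in, not the
    connectivity rule used in the study.
    """
    positions = [(rng.random(), rng.random()) for _ in range(n)]
    edges = set()
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # no self-connections
            distance = math.dist(positions[i], positions[j])
            if rng.random() < math.exp(-distance / scale):
                edges.add((i, j))
    return edges


def reciprocity_ratio(n, edges):
    """Observed number of bidirectionally connected pairs divided by the
    number expected if every connection were drawn independently with the
    same overall connection probability."""
    unordered_pairs = n * (n - 1) / 2
    p = len(edges) / (n * (n - 1))  # overall connection probability
    observed = sum(1 for (i, j) in edges if i < j and (j, i) in edges)
    expected = unordered_pairs * p ** 2
    return observed / expected


rng = random.Random(42)  # fixed seed so the run can be repeated
n = 200
edges = generate_network(n, scale=0.1, rng=rng)
ratio = reciprocity_ratio(n, edges)
print(f"{len(edges)} connections, reciprocity ratio {ratio:.2f}")
```

A ratio above 1 means that bidirectional connections occur more often than in a uniformly random network with the same connection probability. In this sketch that happens because nearby neurons are likely to be connected in both directions.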
Tools supporting openly sharing computational research
When sharing a large body of code and the results of complex simulations, it is crucial to document which piece of code generated which set of data using which set of parameters. Sumatra, an automated electronic lab notebook, is an open source tool with the purpose of tying this information together. Early on in my research I started using Sumatra to keep track of my computations, and my involvement with the software led to my participation in the Google Summer of Code program in 2014 to further enhance the tool’s functionality. If published alongside the code and data, the Sumatra lab notebook allows the provenance of the generated data to be fully tracked.
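To make concrete what kind of information such a record ties together, here is a hand-rolled sketch using only the Python standard library. It does not use or reproduce Sumatra's API. The function names, the file names `simulate.py` and `network.json`, and the parameter values are all hypothetical.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def sha256_of_file(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def make_record(script_path, parameters, output_path):
    """Tie together which code produced which data with which parameters."""
    return {
        "script": Path(script_path).name,
        "script_sha256": sha256_of_file(script_path),
        "parameters": parameters,
        "output": Path(output_path).name,
        "output_sha256": sha256_of_file(output_path),
    }


# Demonstration with throwaway files standing in for real code and data.
with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "simulate.py"    # hypothetical script name
    output = Path(tmp) / "network.json"   # hypothetical result file
    script.write_text("print('simulation placeholder')\n")
    output.write_text(json.dumps({"edges": []}))
    parameters = {"n_neurons": 1000, "seed": 42}  # hypothetical parameters

    record = make_record(script, parameters, output)
    print(json.dumps(record, indent=2))
```

Storing content hashes rather than only file names means that a later change to the script or the data can be detected when the record is checked again. A dedicated tool would typically also capture details such as the version-control revision and the run time, which this sketch leaves out.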
Computational research studies often rely on a complex configuration of libraries and additional installed software. Only if this configuration is made public as well can the research results be reproduced without significant effort. Docker is a relatively new virtualization tool that allows computational environments to be shared. As a modern open source initiative, Docker has been portrayed as a promising tool for reproducible research (Boettiger 2015). I’ve been using Docker successfully in my work for a while, and making the Docker image part of the publication would give anyone with the Docker engine installed immediate access to the computational environment I used to generate the results.
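A container image captures the whole environment. As a much lighter complement, and not a substitute for it, the configuration can also be written down from within Python. The sketch below uses only the standard library (`platform` and `importlib.metadata`, available from Python 3.8); the function name `environment_snapshot` and the choice of recorded fields are assumptions made for illustration.

```python
import json
import platform
from importlib import metadata


def environment_snapshot():
    """Collect the interpreter version, platform string, and the versions
    of all installed Python distributions."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip distributions with broken or missing metadata
            packages[name] = dist.version
    return {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "platform": platform.platform(),
        "packages": dict(sorted(packages.items())),
    }


snapshot = environment_snapshot()
print(json.dumps(snapshot, indent=2))
```

Such a snapshot only lists Python-level dependencies. System libraries and other installed software are not covered, which is the gap a shared Docker image is meant to close.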
The form and content of the tutorials will be strongly shaped by the answers I find during the course of the fellowship. It’s important to me that the guides I create can be used as is, while also giving references to possible alternative solutions. I’ve previously written about reproducible research and other technical implementations on my blog, with good success, but I would like to consider more general platforms for publishing these guides as well.
Boettiger, Carl. 2015. “An Introduction to Docker for Reproducible Research, with Examples from the R Environment.” ACM SIGOPS Operating Systems Review 49 (1): 71–79. doi:10.1145/2723872.2723882.
Perin, Rodrigo, Thomas K. Berger, and Henry Markram. 2011. “A Synaptic Organizing Principle for Cortical Neuronal Groups.” Proceedings of the National Academy of Sciences 108 (13): 5419–24. doi:10.1073/pnas.1016051108.
Rostami, Vahid, Junji Ito, Michael Denker, and Sonja Grün. 2017. “[Re] Spike Synchronization and Rate Modulation Differentially Involved in Motor Cortical Function.” ReScience 3 (1). doi:10.5281/zenodo.583814.
Sandve, Geir Kjetil, Anton Nekrutenko, James Taylor, and Eivind Hovig. 2013. “Ten Simple Rules for Reproducible Computational Research.” Edited by Philip E. Bourne. PLoS Computational Biology 9 (10): e1003285. doi:10.1371/journal.pcbi.1003285.
Song, Sen, Per Jesper Sjöström, Markus Reigl, Sacha Nelson, and Dmitri B. Chklovskii. 2005. “Highly Nonrandom Features of Synaptic Connectivity in Local Cortical Circuits.” PLoS Biology 3 (3): e68. doi:10.1371/journal.pbio.0030068.
Topalidou, Meropi, Arthur Leblois, Thomas Boraud, and Nicolas P. Rougier. 2015. “A Long Journey into Reproducible Computational Neuroscience.” Frontiers in Computational Neuroscience 9. doi:10.3389/fncom.2015.00030.
- Sumatra 0.7: http://neuralensemble.org/sumatra/
- Docker: https://www.docker.com/
- 3 Diagrams per Page: http://felix11h.github.io/blog/