# Wikiversity:Fellow-Programm Freies Wissen/Einreichungen/The Book of Statistical Proofs

**Name** & **Contact**: Joram Soch, joram.soch@bccn-berlin.de

**Institution**: Bernstein Center for Computational Neuroscience & Charité-Universitätsmedizin, Berlin
- Final report (published on 28.05.2020)
## Project description
## Summary

Thanks to the increased application of advanced statistics (such as Bayesian inference and machine learning) and the increased availability of computing resources (such as computing clusters or high-performance GPUs), the recent past has seen many disciplines develop a “computational” branch: computational neuroscience, computational chemistry and computational sociology, to name just a few. In parallel, while a growing amount of data is being collected (“big data”), there are also growing concerns about the reproducibility of data analyses and the replicability of empirical studies (“replication crisis”).

In this situation, the empirical sciences need sound methodology. Sound methodological developments, in turn, almost universally rely on statistical theorems to justify the new statistical techniques proposed for analyzing empirical data. However, while most of this statistical theory is easy to retrieve, proofs of those theorems are often difficult to obtain, because they are contained in expensive books, locked behind publisher paywalls or distributed across multiple sources. This hinders both the understanding of statistical theory and the development of cutting-edge techniques. The core objective of the proposed project is to close this gap by collecting important statistical proofs into a centralized, open and collaboratively edited archive,

## Objectives

Within this project, I want to develop and establish a wiki-like archive that collects important proofs from statistics and probability theory, to be used by all kinds of computational researchers when developing new methods for data analysis. To this end, I will (i) work out a taxonomy of statistical proofs, (ii) choose an implementation for the proof archive, (iii) collect statistical proofs from distributed sources and (iv) communicate the proof archive to the interested community.
## Dissemination

As an expert on Bayesian model selection (BMS) for general linear models (GLM), I want to contribute my statistical knowledge to the envisaged archive. Besides this, I have already won over a colleague of mine to contribute proofs related to the multivariate general linear model (MGLM). The Bernstein Center for Computational Neuroscience in Berlin is a fairly large community of mathematically literate scientists. Once the archive is established, I plan to advertise it to colleagues at my institution and to reference it on posters presented at conferences as well as in papers submitted for peer review, such that the project attains its open science value and becomes a truly collaborative enterprise.

## Reuse

As the material collected within this project is to be assembled in a stable repository (see below), it will be available online, for free, to everyone, whether they are computational researchers or mathematically interested laypeople. Additionally, provided that the project has become collaborative by the end of the funding period, it will be supported by the community that edits and curates the archive (including myself).

## Milestones

Based on the project goals outlined above, the project consists of four distinct work packages (WP), the completion of each of which constitutes an individual milestone (MS):

- WP 1: to finalize the table of contents for *The Book of Statistical Proofs*, for which a preliminary draft already exists; to be completed by 09/2019.
- WP 2: to decide on the actual implementation of the proof archive, candidates being a Wikibooks entry (such as this), a GitHub repository of LaTeX files (such as this) or GitHub pages with Jekyll integration (see here); to be completed by 09/2019.
- WP 3: to collect proofs from distributed sources, such as textbooks, journal articles and Wikipedia articles; to be worked on until 06/2020.
- WP 4: to communicate the proof archive to the community, e.g. via a review article, blog posts and Twitter communication; to be completed by 03/2020.
## Use of funds

The proposed project is collaborative by nature. To facilitate and incentivize the unpaid work going into this open science resource, for the first 200 proofs submitted to

## Contribution to the Wikimedia projects

Depending on what kind of implementation is chosen for the proof archive (see above), the collected resources can be summarized into an open-content textbook (

## Contribution to open science

The proposed project contributes to open science in three ways: First, important information that is, in part, the foundation of today’s statistical practice in the empirical sciences will no longer be contained in expensive books, locked behind publisher paywalls or distributed across multiple sources, but will be freely and openly available to everyone (