Received: 29 June 2015/Accepted: 29 June 2015
Abstract. This resource describes WooW-II, a two-day workshop on open workflows for quantitative social scientists. The workshop is divided into five main parts, each of which typically consists of an introductory tutorial and a hands-on assignment. The specific tools discussed in this workshop are Markdown, Pandoc, Git, GitHub, R, and RStudio, but the theoretical approach applies to a wider range of tools (e.g., LaTeX and Python). By the end of the workshop, participants should be able to reproduce a paper of their own and make it available in an open form by applying the concepts and tools introduced.
As in most social sciences, regional science provides virtually no training on workflow design and the choice of appropriate tools, especially not from the viewpoint of open science (Healy 2011, Arribas-Bel 2014). Students and young researchers typically receive no guidance as to why or how they should adopt habits that favor open science principles in their research activity. This is unfortunate, because learning and adopting new tools and workflows require a large time investment, which will only pay off in the long run. The best time to get started is early in one's career, when one still has (some) time available to invest. Therefore, this workshop is specifically aimed at young researchers and covers the main ideas behind a well-designed workflow with openness, transparency and reproducibility in mind. At the same time, the content provides an introductory, hands-on overview of a set of free tools that have been designed with such values in mind.
We do not get into every detail of each tool. Instead, we aim to give a gentle introduction, to provide further material, and to place the tools in the appropriate context. Particular emphasis is placed on how certain tools contribute to building a coherent open workflow and how they relate to each other. The main areas reviewed are: mark-up languages such as Markdown; reference managers, particularly open and free ones such as BibTeX, which is compatible with LaTeX; conversion tools such as Pandoc; open environments for statistical computing such as R or Python; version control systems such as Git; and online hosting on open repositories such as GitHub. At the end of the workshop, participants should be able to reproduce a paper of their own and make it available in an open form by applying the concepts and tools introduced. Materials are organized on a website that is openly hosted on GitHub and licensed under Creative Commons, meaning that access, remixing and redistribution are permitted.
The workshop is organized in two main blocks. The first block, a single session, introduces basic concepts such as open science, transparency and reproducibility. Here, we stress the relevance of paying attention to the way science is carried out and connect it to the choice of tools that allow such values to be seamlessly embraced in the day-to-day practice of quantitative research in social science. The second, longer block includes four sessions with hands-on overviews of specific tools that have been designed with open science principles in mind and that hence provide the ingredients of a well-thought-out open workflow. The delivery alternates presentation time with hands-on practice, allowing participants to get a real taste of what using the tools involves and therefore to experience their advantages.
The five sessions are presented as follows: