The Scholar’s Backpack:
Using virtual environments to support modern research practice.

Bret Davidson | Eka Grguric

NCSU Libraries

bretdavidson.github.io/cni-2016

Agenda

  • Open science as problem space
  • Open science as modern research practice
  • Open science at NC State
  • Scholar's Backpack

Open Science: what is it?

  • Open Access
  • Open Data
  • Open Notebooks
  • Open Source

Open Science is a return to first principles of scientific practice.

Royal Society

Nullius in Verba

"Take nobody's word for it."

Open Science can
increase reproducibility.

Five Schools of Thought

by Sönke Bartling & Sascha Friesike

Editors, http://book.openingscience.org/

  • Infrastructure
  • Public
  • Measurement
  • Democratic
  • Pragmatic

Why Libraries?

Hunt Library

Aligns with core library values

  • information access
  • open peer review
  • community-based knowledge creation
  • the preservation and dissemination of research
  • libraries are champions of open (open source; open data)

Libraries

are about

supporting their users

Academic Libraries

are about

supporting research practice

Ongoing disruption by digital technologies in modern research practice

Hypothetical Open Science Workflow

Open Science Workflow

101 Innovations in Scholarly Communication, https://innoscholcomm.silk.co/

Policy Shifts

in support of open

OECD Policy Paper

Ecosystem of Support for Modern Research Practice at NCSU Libraries

Research Support at NCSU Libraries
Visualization Wall
Visualization Workshops
Makerspaces at NCSU Libraries
Wolfpack Citizen Science Challenge

The NCSU Libraries'
Open Science Initiative

Goals

  • explore open science practice at NCSU
  • better understand researcher needs in context

Take a non-prescriptive
user-centered approach.

Create opportunities for communication.

Open Science Unconference

Follow-up Informal Interviews

  • Modern Research Skills Gap
  • Insufficient Incentives

Summer of Open Science

Summer of Open Science Event Series

Goals

  • Hands-on skill building
  • Provide networking opportunities
  • Increase visibility of library spaces & services

Skills

  • Scholarly identity creation
  • Scientific computing
  • Building a website
  • Data harvesting
  • Code collaboration

The Planning Team

Representation from both technical
and non-technical departments.

Summer of Open Science

  • Workshops
    • Intro to the Command Line Interface
    • Web Scraping with Python
    • Understand and Build Your Scholarly Identity
    • Scientific Computing with Python & Raspberry Pi
    • Build Your Scholarly Website the Easy Way

  • Events
    • Meetups
    • End-of-Summer Showcase

Workshop Instructors

SoS Python Workshop in Makerspace

Scientific Computing with Python & Raspberry Pi

40-person waiting list

SoS Python Workshop in Makerspace 2

Interdisciplinary Need:
over 40 departments across ~16 colleges

Takeaways

  • Libraries are well positioned to fill gaps in the curriculum
  • "Open Science" attracted a range of disciplines
  • High demand for introductory skill training, particularly coding skills (Python)
  • Interest in interdisciplinary research sharing
  • Summer presents interesting opportunities and challenges

Virtual Environments for Reproducible Computing

Technical workshops are
ripe for disaster.

What could go wrong?

  • Images reset overnight
  • Improper permissions
  • Network connectivity issues
  • Mismatched language versions
  • Missing packages
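
Two of these failure modes, language versions and missing packages, can be caught before a workshop starts. The following is a minimal sketch, not part of the workshop materials: the helper name `check_environment`, the minimum version, and the package list are all illustrative assumptions.

```python
import importlib.util
import sys


def check_environment(min_version, packages):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    # Compare the running interpreter against the minimum (major, minor).
    if sys.version_info[:2] < tuple(min_version):
        problems.append(
            "Python %d.%d or newer is required" % tuple(min_version)
        )
    # find_spec returns None when a top-level module cannot be located.
    for name in packages:
        if importlib.util.find_spec(name) is None:
            problems.append("Missing package: " + name)
    return problems


if __name__ == "__main__":
    # Illustrative import names only (assumption); adjust per workshop.
    for problem in check_environment((3, 0), ["requests", "bs4", "matplotlib"]):
        print(problem)
```

Running a check like this on each machine only reports problems. It does not fix them, which is the gap the virtual environment approach later in the deck aims to close.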

Instructor Challenges

  • Consistency across user environments
  • Consistency of course materials
  • Time to provision computing environments
  • Ease of collaboration

Student Challenges

  • Basic data types and structures
  • Module system
  • Retrieve a web page with Requests
  • Parse content with Beautiful Soup
  • Generate a word cloud with matplotlib
  • Control Structures
  • Exception Handling
  • Working with the file system
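
The retrieve, parse, and count steps in this list can be sketched in a few lines. The workshop used Requests and Beautiful Soup; this sketch deliberately swaps in the Python standard library (`html.parser` and `collections.Counter`) so it runs with no extra packages, and it parses an inline sample string instead of downloading a page. The class name, function name, and sample markup are assumptions made for illustration. The resulting word counts are the kind of input a word cloud would be drawn from.

```python
from collections import Counter
from html.parser import HTMLParser
import re


class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are outside script/style elements.
        if self._skip_depth == 0:
            self.chunks.append(data)


def word_counts(html):
    """Return a Counter of lowercase words found in the page text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks)
    return Counter(re.findall(r"[a-z']+", text.lower()))


# Sample markup stands in for a page that would normally be downloaded.
SAMPLE = (
    "<html><body><h1>Open Science</h1>"
    "<p>Open data, open source.</p>"
    "<script>var open = 1;</script></body></html>"
)
counts = word_counts(SAMPLE)
print(counts.most_common(3))
```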

Computing Tasks
vs.
Computing Environments

Many Options

  • Custom Operating System Images
  • Custom Distributions, e.g. Anaconda
  • Interactive Environments, e.g. Jupyter

Our Approach

  • Vagrant for managing the operating system
  • Ansible for provisioning and configuration
  • Course or lab specific packages and resources

Easy!

  1. Install Vagrant
  2. Install VirtualBox
  3. Clone project repo
  4. `vagrant up`
  5. `vagrant ssh`
  6. Execute code!
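
Steps 1, 2, and 4 can be checked and driven from a short script. This is a hedged sketch rather than anything shipped with the project: the function names are hypothetical, and it assumes the `vagrant` and `VBoxManage` executables are on the PATH once installed. Some systems do not add `VBoxManage` to the PATH, so a missing entry is a hint rather than proof.

```python
import shutil
import subprocess

# Executables the steps rely on: the Vagrant CLI and VirtualBox's VBoxManage.
REQUIRED_TOOLS = ("vagrant", "VBoxManage")


def missing_tools(which=shutil.which):
    """Return the required executables that are not found on the PATH."""
    return [tool for tool in REQUIRED_TOOLS if which(tool) is None]


def start_environment(project_dir, run=subprocess.run):
    """Run `vagrant up` inside the cloned project directory."""
    missing = missing_tools()
    if missing:
        raise RuntimeError("Install first: " + ", ".join(missing))
    # check=True raises CalledProcessError if Vagrant exits with an error.
    return run(["vagrant", "up"], cwd=project_dir, check=True)
```

The `which` and `run` parameters exist only so the helpers can be exercised without Vagrant installed. After `start_environment` returns, `vagrant ssh` is still run by hand because it opens an interactive session.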

This is reproducible computing!

Benefits

  • Consistent environment user to user
  • Single target for course materials
  • Faster provisioning for new workshops
  • Repeatable course to course

Rise of Scholarly Code

Researcher Challenges

  • Consistency across lab environments
  • Ability to see results of code
  • Consistency across time
  • Ease of collaboration

github.com/NCSU-Libraries/scholars-backpack

python-vagrant

Features

  • Python
  • R and RStudio
  • Jupyter Notebook Server
  • Example Notebooks

Vagrant

Create and configure lightweight, reproducible, and portable development environments.

Usage

  • Easy installation through binary package
  • Flexible configuration via text-based configuration file
  • Single command: `vagrant up`

Ansible

"Automation engine" for provisioning
and configuration management.

Provisioning

"To make something available."

Installation!

Configuration Management

"Establish and maintain consistency of an environment."

Provisioning

  • Text editor
  • Python & R
  • Git
  • Web Browser
  • etc.

Configuration

  • Start Jupyter notebook server
  • Set environment variables
  • Set default login directory
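
Maintaining consistency usually means a configuration step can be run repeatedly and leave the same result each time. The sketch below illustrates that idea in plain Python. It is not how Ansible or the project implements it: `ensure_line` is a hypothetical helper, and the `cd /vagrant` line is only an assumed example of setting a default login directory.

```python
import os
import tempfile


def ensure_line(path, line):
    """Make sure `line` is present in the file at `path`.

    Returns True if the file was changed, False if it already matched.
    """
    existing = []
    if os.path.exists(path):
        with open(path, encoding="utf-8") as handle:
            existing = handle.read().splitlines()
    if line in existing:
        # Already in the desired state, so do nothing.
        return False
    existing.append(line)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(existing) + "\n")
    return True


# Demonstration against a throwaway file rather than a real shell profile.
with tempfile.TemporaryDirectory() as workdir:
    profile = os.path.join(workdir, "profile")
    first = ensure_line(profile, "cd /vagrant")
    second = ensure_line(profile, "cd /vagrant")
    print(first, second)
```

The first call changes the file and the second does not, so running the same configuration on every machine converges on one state.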

Benefits

  • Improved consistency
  • Ability to see results of code
  • Ease of collaboration

Future Work

Richer Environment

  • Broader scientific computing
  • Improved adherence to best practices
  • Docker containers for portability

Embedded Use

  • Curricular use
  • Laboratory use

Summary

Open Science represents a new framework for research and provides an opportunity for libraries to engage researchers in new ways.

NCSU Libraries has run workshops and outreach around this framework, and there is evidence of strong interest across disciplines.

We are redeploying existing technical resources and cutting-edge technology in ways that used to be difficult or impossible.

This approach has helped us identify a new leadership role for libraries in open research support.

Thanks!

bret_davidson@ncsu.edu

eka_grguric@ncsu.edu | @egrguric

github.com/NCSU-Libraries/scholars-backpack

bretdavidson.github.io/cni-2016