VM Brain-Gut Coaching Class (VMBGCC) Analysis

Analysis pipeline for evaluating the Virginia Mason Brain-Gut Coaching Class program — a quality-improvement study examining patient-reported outcomes (IBS-SSS, PHQ-2, GAD-7), healthcare utilization, and thematic patterns from 182 patients across 34 class sessions.

Repository Structure

.
├── code/                   # All analysis scripts
│   ├── run_*.R             # Entry-point wrappers (start here)
│   ├── VMBGCC_*.Rmd        # Core analysis notebooks (literate programming)
│   ├── VMBGCC_*.R          # Auto-generated from Rmd (do not edit directly)
│   └── VMBGCC_functions.R  # Shared utility functions
├── data/
│   ├── inputData/          # Raw Excel data (not tracked in git)
│   └── outputData/         # Cleaned data and analysis results
├── documents/              # Drafts, literature, notes, study documents
└── figures/                # Publication-ready PDF and PNG figures

Pipeline Overview

The analysis is organized into five phases: cleaning runs first, the three analysis phases follow in any order, and figure generation runs last. Each phase has:

A run_*.R wrapper — the entry point you should execute. It checks for required upstream outputs and auto-runs dependencies if they are missing.
A VMBGCC_*.Rmd notebook — the core analysis code with embedded documentation. These can also be knit interactively in RStudio for an exploratory workflow.
A VMBGCC_*.R script — auto-generated by knitr::purl() from the Rmd. Do not edit these directly; they are overwritten each run.
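
The dependency check described above could look roughly like the sketch below. It is an illustration only: `ensureUpstream()`, its arguments, and the stand-in cleaning function are hypothetical names, not code from this repository, and the real wrappers may be organized differently.

```r
# Hypothetical sketch of the "check upstream outputs, run dependencies if
# missing" step of a run_*.R wrapper. Names here are assumptions.
ensureUpstream <- function(requiredFiles, runUpstream) {
  missingFiles <- requiredFiles[!file.exists(requiredFiles)]
  if (length(missingFiles) > 0) {
    message("Missing upstream output(s): ",
            paste(basename(missingFiles), collapse = ", "))
    # In a real wrapper this might be: function() source("code/run_cleaning.R")
    runUpstream()
    stillMissing <- requiredFiles[!file.exists(requiredFiles)]
    if (length(stillMissing) > 0) {
      stop("Upstream phase did not produce: ",
           paste(basename(stillMissing), collapse = ", "))
    }
  }
  # Return (invisibly) the files that had to be regenerated
  invisible(missingFiles)
}

# Demonstration in a temporary directory with a stand-in "cleaning" phase
demoDir <- tempfile("vmbgcc_demo_")
dir.create(demoDir)
cleanedFile <- file.path(demoDir, "bgccClean.rds")
fakeCleaning <- function() saveRDS(data.frame(id = 1:3), cleanedFile)

firstRun <- ensureUpstream(cleanedFile, fakeCleaning)   # runs the stand-in phase
secondRun <- ensureUpstream(cleanedFile, fakeCleaning)  # nothing missing, nothing runs
```

Passing the upstream phase in as a function keeps the check itself free of hard-coded paths, so the same helper can serve every wrapper.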

Execution Order

┌──────────────────────────────────────────────────────────────────────┐
│  Phase 1: CLEANING                                                   │
│  run_cleaning.R  →  VMBGCC_cleaning.Rmd                             │
│  Reads raw Excel  →  Produces cleaned RDS/CSV + diagnosis matrix     │
└──────────────────────┬───────────────────────────────────────────────┘
                       │
         ┌─────────────┼─────────────┐
         ▼             ▼             ▼
┌─────────────┐ ┌────────────┐ ┌────────────┐
│  Phase 2:   │ │  Phase 3:  │ │  Phase 4:  │
│ DESCRIPTIVES│ │  OUTCOMES  │ │  THEMATIC  │
│  run_       │ │  run_      │ │  run_      │
│ descriptives│ │ outcomes.R │ │ thematic.R │
│       .R    │ │            │ │            │
└──────┬──────┘ └─────┬──────┘ └─────┬──────┘
       │              │              │
       └──────────────┼──────────────┘
                      ▼
         ┌────────────────────────┐
         │  Phase 5: FIGURES      │
         │  run_figures.R         │
         │  Reads all upstream    │
         │  results → pub-ready   │
         │  figures and tables    │
         └────────────────────────┘

Phases 2–4 are independent of each other and can be run in any order (or in parallel). Phase 5 requires all prior phases to be complete.

Phase Details

Phase	Entry Point	Core Notebook	Reads	Produces
1. Cleaning	`run_cleaning.R`	`VMBGCC_cleaning.Rmd`	Raw Excel (4 sheets)	`bgccClean.rds/.csv`, `thematicCoding.rds`, `diagnosisOneHot.rds`
2. Descriptives	`run_descriptives.R`	`VMBGCC_descriptives.Rmd`	`bgccClean.rds`	Table 1, EDA figures, temporal trends
3. Outcomes	`run_outcomes.R`	`VMBGCC_outcomes.Rmd`	`bgccClean.rds`	`outcomeResults.rds` (Wilcoxon tests, effect sizes, mixed models)
4. Thematic	`run_thematic.R`	`VMBGCC_thematic.Rmd`	`bgccClean.rds`, `thematicCoding.rds`	`themeResults.rds` (prevalence, co-occurrence, micro-goals)
5. Figures	`run_figures.R`	`VMBGCC_figures.Rmd`	All of the above	32 publication-ready figures (PDF + PNG)

Quick Start

Full pipeline (from terminal)

cd /path/to/VM_brainGutCoaching
Rscript code/run_figures.R

Each wrapper checks for missing upstream outputs, so this single command runs every phase that has not yet produced its outputs. On a fresh checkout that is the entire pipeline, end to end (cleaning → descriptives → outcomes → thematic → figures).

Step-by-step (interactive in RStudio)

# 1. Clean raw data
source("code/run_cleaning.R")

# 2-4. Run analyses (any order)
source("code/run_descriptives.R")
source("code/run_outcomes.R")
source("code/run_thematic.R")

# 5. Generate figures (requires 1-4)
source("code/run_figures.R")

Or open the individual .Rmd files in RStudio and knit/run chunks interactively for exploratory work.

Shared Utilities

VMBGCC_functions.R provides functions used across all phases:

Function	Purpose
`savePlot()`	Save ggplot objects as PDF and/or PNG at 600 DPI
`toNumericSafe()`	Numeric conversion handling "Unknown" / blank → NA
`excelDateToR()`	Convert Excel serial date numbers to R Date objects
`ibsSSSBand()`	Classify IBS-SSS severity (Remission / Mild / Moderate / Severe)
`gad7Band()`	Classify GAD-7 severity (Minimal / Mild / Moderate / Severe)
`phq2Screen()`	Binary PHQ-2 depression screen (Positive ≥ 3 / Negative)
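
For orientation, the conversion and banding helpers might be written along the lines below. This is a sketch, not the contents of VMBGCC_functions.R: the argument names are assumptions, and the cut-points are the commonly published ones (IBS-SSS below 75 / 75 to 174 / 175 to 299 / 300 and above; GAD-7 0 to 4 / 5 to 9 / 10 to 14 / 15 to 21), which the study code may or may not use exactly.

```r
# Illustrative re-implementations; the real definitions live in
# VMBGCC_functions.R and may differ in signature and edge-case handling.

toNumericSafe <- function(x) {
  x <- trimws(as.character(x))
  x[x %in% c("", "Unknown")] <- NA_character_  # blanks and "Unknown" become NA
  suppressWarnings(as.numeric(x))
}

excelDateToR <- function(serial) {
  # Excel's 1900 date system counts days from 1899-12-30
  as.Date(as.numeric(serial), origin = "1899-12-30")
}

ibsSSSBand <- function(score) {
  # Left-closed intervals: [75, 175) is Mild, [175, 300) is Moderate, etc.
  cut(score, breaks = c(-Inf, 75, 175, 300, Inf), right = FALSE,
      labels = c("Remission", "Mild", "Moderate", "Severe"))
}

gad7Band <- function(score) {
  cut(score, breaks = c(-Inf, 5, 10, 15, Inf), right = FALSE,
      labels = c("Minimal", "Mild", "Moderate", "Severe"))
}

phq2Screen <- function(score) {
  # NA scores stay NA rather than being counted as Negative
  ifelse(score >= 3, "Positive", "Negative")
}

# Example calls on made-up values (not study data)
toNumericSafe(c("12", "Unknown", " "))
excelDateToR(44927)
ibsSSSBand(c(50, 175, 320))
gad7Band(c(3, 12))
phq2Screen(c(2, 3))
```

Using `right = FALSE` in `cut()` keeps each boundary score in the higher band, which matches how the thresholds are usually stated.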

Debug / Inspection Utilities

Script	Purpose
`read_excel_sheets.R`	List all Excel sheets with structure and preview
`debug_cleaning.R`	Troubleshoot data matching and encoding issues
`inspect_data_deep.R`	Check row counts, unique values, summary statistics

Prerequisites

R packages (loaded automatically by each script):

Data wrangling: dplyr, tidyr, stringr, forcats
Excel I/O: openxlsx
Statistics: rstatix, effectsize, lme4, pwr
Visualization: ggplot2, patchwork, ggalluvial, UpSetR
Tables: gtsummary, gt, flextable
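
Loading a package is not the same as installing it, so it can help to confirm that everything is available before the first run. The snippet below is one possible check, assuming the scripts do not install packages themselves; the variable names are illustrative.

```r
# Packages listed above, grouped as in this README
requiredPkgs <- c(
  "dplyr", "tidyr", "stringr", "forcats",          # data wrangling
  "openxlsx",                                      # Excel I/O
  "rstatix", "effectsize", "lme4", "pwr",          # statistics
  "ggplot2", "patchwork", "ggalluvial", "UpSetR",  # visualization
  "gtsummary", "gt", "flextable"                   # tables
)

# system.file() returns "" for a package that is not installed,
# and does not load the package as a side effect
isInstalled <- vapply(
  requiredPkgs,
  function(pkg) nzchar(system.file(package = pkg)),
  logical(1)
)
missingPkgs <- requiredPkgs[!isInstalled]

if (length(missingPkgs) > 0) {
  message("Not installed: ", paste(missingPkgs, collapse = ", "))
  message("Install them with install.packages() before running the pipeline.")
} else {
  message("All required packages are installed.")
}
```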

Data

Raw data is stored in data/inputData/ and is not tracked in version control. The source file is a multi-sheet Excel workbook containing patient demographics, pre/post outcome scores, healthcare utilization metrics, thematic coding from consensus review, and a one-hot diagnosis matrix.

All output files in data/outputData/ follow the naming convention VMBGCC.<date>_<description>.<ext>.
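
As an illustration of that convention, the helpers below build a conforming file name and look up the most recent matching file. Both function names are hypothetical, and the `YYYYMMDD` date format is an assumption (this README does not state the format); it is chosen here because it sorts chronologically.

```r
# Hypothetical helpers for the VMBGCC.<date>_<description>.<ext> convention.
# The date format is an assumption, not taken from the repository.
outputFileName <- function(description, ext, date = Sys.Date()) {
  sprintf("VMBGCC.%s_%s.%s", format(date, "%Y%m%d"), description, ext)
}

latestOutput <- function(dir, description, ext) {
  pattern <- sprintf("^VMBGCC\\..+_%s\\.%s$", description, ext)
  hits <- sort(list.files(dir, pattern = pattern, full.names = TRUE))
  if (length(hits) == 0) NA_character_ else hits[length(hits)]
}

# Demonstration with empty placeholder files in a temporary directory
demoOut <- tempfile("outputData_")
dir.create(demoOut)
olderName <- outputFileName("bgccClean", "rds", as.Date("2025-01-31"))
newerName <- outputFileName("bgccClean", "rds", as.Date("2025-02-14"))
file.create(file.path(demoOut, c(olderName, newerName)))

newest <- basename(latestOutput(demoOut, "bgccClean", "rds"))
```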
