
Priming Effects in LLMs

Disclaimer: This project — including all code, analysis, and the research report — was implemented entirely by AI (Claude Code). Matthew Khoriaty (@AMindToThink) directed the research questions and reviewed the outputs but cannot fully vouch for the correctness of the implementation or statistical analysis. The full Claude Code conversation logs are available in logs/conversation/ for transparency.

Systematic investigation of whether modern LLMs exhibit priming effects analogous to those documented in human psycholinguistics. Tests syntactic, semantic, lexical, temporal decay, and pragmatic priming across GPT-4o-mini and Claude 3.5 Haiku.

Key finding: GPT-4o-mini exhibits a strong, dose-dependent dative syntactic priming effect (0% → 30%, p < 0.0001), mirroring cumulative priming in humans.

See REPORT.md for the full write-up.

Quick Start

```bash
# Install dependencies
uv sync --all-extras

# Set your OpenRouter API key
export OPENROUTER_API_KEY_mk_era_1=sk-or-...

# Dry run (2 trials, one model)
uv run python -m experiments.runner --trials 2 --models gpt4o-mini

# Full run (~2,500 API calls, both models)
uv run python -m experiments.runner

# Run a single experiment
uv run python -m experiments.runner exp1 --trials 30

# Generate figures from saved results
uv run python -c "
import pandas as pd
from pathlib import Path
from analysis.plots import generate_all_figures
all_results = {p.stem: pd.read_parquet(p) for p in Path('data/processed').glob('*.parquet')}
generate_all_figures(all_results)
"

# Run tests
uv run pytest tests/ -v
```

Project Structure

```
priming_effects/
├── src/
│   ├── config.py          # Model specs, API key, experiment defaults
│   ├── client.py          # Async OpenRouter client with caching + retries
│   ├── cache.py           # SHA-256 disk cache for API responses
│   ├── classifiers.py     # Voice, dative, semantic, register classifiers
│   └── stimuli.py         # All experimental stimuli
├── experiments/
│   ├── base.py            # BaseExperiment ABC
│   ├── exp1_syntactic.py  # Syntactic priming (voice + dative, dose 0/1/3/5)
│   ├── exp2_semantic.py   # Semantic field leakage
│   ├── exp3_lexical_boost.py  # Shared vs. different verb priming
│   ├── exp4_decay.py      # Priming persistence over filler turns
│   ├── exp5_pragmatic.py  # Hedging/formal/assertive register mirroring
│   └── runner.py          # CLI entry point
├── analysis/
│   ├── stats.py           # Chi-squared, Cohen's h, bootstrap CIs, decay fit
│   └── plots.py           # 7 figures
├── tests/
│   ├── test_classifiers.py
│   └── test_stats.py
├── data/processed/        # Result DataFrames (parquet)
├── figures/               # Output plots (PNG)
└── REPORT.md              # Full research report
```

Experiments

| # | Experiment | What it tests | Key result |
|---|-----------|---------------|------------|
| 1 | Syntactic Priming | Dose-response (0, 1, 3, 5 primes) for voice + dative | Dative: 0% → 30% (GPT-4o-mini, p < 0.0001) |
| 2 | Semantic Priming | Domain vocabulary leakage | Null — models compartmentalize well |
| 3 | Lexical Boost | Shared vs. different verb amplification | Floor effect (0% passive) |
| 4 | Decay Curve | Priming over 0–10 filler turns | Floor effect |
| 5 | Pragmatic Style | Register mirroring | Near-zero with keyword classifier |
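Experiments 1, 3, and 4 hinge on labeling each completion as active vs. passive voice (and PO vs. DO dative). The real classifiers live in src/classifiers.py; purely as an illustration of the approach, a crude be-verb-plus-participle heuristic for voice might look like this (the repository's actual implementation may differ):

```python
import re

# Deliberately simple: matches a be-verb, an optional -ly adverb, then a
# word ending in -ed/-en. Irregular participles ("thrown", "sung") and
# adjectival participles would need extra handling in a real classifier.
PASSIVE_RE = re.compile(
    r"\b(?:is|are|was|were|been|being|be)\s+(?:\w+ly\s+)?\w+(?:ed|en)\b",
    re.IGNORECASE,
)


def classify_voice(sentence: str) -> str:
    """Label a sentence 'passive' if it contains a be-verb + participle
    pattern, else 'active'."""
    return "passive" if PASSIVE_RE.search(sentence) else "active"
```

A classifier this simple also explains why floor effects matter: if the model never produces a passive completion, no heuristic can detect a priming shift.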

Models

Both models are accessed via OpenRouter (https://openrouter.ai/api/v1):

- openai/gpt-4o-mini
- anthropic/claude-3.5-haiku
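OpenRouter exposes an OpenAI-compatible chat-completions endpoint, so a request is the standard payload shape with OpenRouter model slugs. A minimal sketch of assembling one request (the env-var name matches the Quick Start; the helper function itself is illustrative):

```python
import os


def build_request(model: str, prompt: str, temperature: float = 0.7) -> dict:
    """Assemble an OpenAI-style chat-completions payload. POST it as JSON
    to https://openrouter.ai/api/v1/chat/completions with an
    'Authorization: Bearer <key>' header."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }


payload = build_request("openai/gpt-4o-mini", "Describe the scene in one sentence.")
headers = {
    "Authorization": f"Bearer {os.environ.get('OPENROUTER_API_KEY_mk_era_1', '')}",
    "Content-Type": "application/json",
}
```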

Tests

32 unit tests covering classifiers (voice, dative, semantic overlap, register) and statistical functions (Cohen's h, bootstrap CIs, chi-squared, decay fitting):

```bash
uv run pytest tests/ -v
```
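For reference, Cohen's h (one of the tested statistics in analysis/stats.py) measures the difference between two proportions via an arcsine transform; a minimal sketch of the standard formula:

```python
import math


def cohens_h(p1: float, p2: float) -> float:
    """Effect size for two proportions:
    h = 2*arcsin(sqrt(p1)) - 2*arcsin(sqrt(p2)).
    Rough interpretation: |h| ≈ 0.2 small, 0.5 medium, 0.8 large."""
    return 2 * math.asin(math.sqrt(p1)) - 2 * math.asin(math.sqrt(p2))


# The headline dative result (0% -> 30%) corresponds to h ≈ 1.16,
# well past the conventional 0.8 threshold for a large effect.
h = cohens_h(0.30, 0.0)
```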
