This is the evaluation framework for the DebateLLM Fallacy Detector, a DeBERTa-base model specialized in identifying logical fallacies in text. It was developed as part of my Google Summer of Code (GSoC) project.
The primary aim of this suite is to rigorously evaluate the performance of RowdyI7er/DebateLLM. It includes tools for:
- Small-Scale Verification: Initial assessment of the model with 32 statements.
- Large-Scale Performance Benchmarking: A comprehensive test using 300 unique logical fallacy statements.
- Detailed Reporting: Automated generation of performance metrics, class breakdowns, and confusion analysis.
- Word-Level Explainability (SHAP): Heatmap-based analysis showing which specific keywords trigger each fallacy label (GSoC Highlight).
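As a rough illustration of the "Detailed Reporting" item, here is a minimal sketch of how accuracy, a per-class breakdown, and confusion counts can be derived from paired gold and predicted labels. The function names are hypothetical and are not taken from the scripts in this repository, whose actual implementation may differ.

```python
from collections import Counter


def confusion_counts(y_true, y_pred):
    """Count how often each (true_label, predicted_label) pair occurs."""
    return Counter(zip(y_true, y_pred))


def accuracy(y_true, y_pred):
    """Fraction of statements whose predicted label matches the gold label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_metrics(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label seen."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = confusion_counts(y_true, y_pred)
    metrics = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = pairs[(label, label)]
        # Predicted as this label, but the gold label was something else.
        fp = sum(n for (t, p), n in pairs.items() if p == label and t != label)
        # Gold label is this one, but the model predicted something else.
        fn = sum(n for (t, p), n in pairs.items() if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (
            2 * precision * recall / (precision + recall)
            if precision + recall
            else 0.0
        )
        metrics[label] = (precision, recall, f1)
    return metrics
```

The off-diagonal entries of confusion_counts (pairs where the two labels differ) are the starting point for a confusion analysis, since they show which fallacies are mistaken for which.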
The model covers the following 8 categories:
- Ad Hominem
- Appeal to Authority
- Bandwagon
- False Dilemma
- Hasty Generalization
- No Fallacy (Standard Factual Statement)
- Slippery Slope
- Strawman
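To make the label set concrete, the sketch below decodes a vector of raw classifier scores into one of these eight categories. The label order (alphabetical, as listed above) is an assumption; the model's real mapping is defined by its own configuration and should be preferred. decode_prediction is a hypothetical helper, not part of this repository.

```python
import math

# Assumed label order (alphabetical, as in the list above). The model's
# actual index-to-label mapping may differ.
LABELS = [
    "Ad Hominem",
    "Appeal to Authority",
    "Bandwagon",
    "False Dilemma",
    "Hasty Generalization",
    "No Fallacy",
    "Slippery Slope",
    "Strawman",
]


def softmax(logits):
    """Convert raw scores to probabilities in a numerically stable way."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode_prediction(logits, labels=LABELS):
    """Return (label, probability) for the highest-scoring class."""
    if len(logits) != len(labels):
        raise ValueError("expected one logit per label")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]
```

For example, decode_prediction([0, 0, 5, 0, 0, 0, 0, 0]) picks "Bandwagon" under the assumed ordering, because index 2 carries the largest score.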
├── scripts/ # Main evaluation and report generation logic.
│ ├── evaluate_300.py # Large-scale (300 statements) inference script.
│ └── ...
├── reports/ # Detailed Markdown performance reports.
│ └── performance_report_300.md
├── data/ # Results and label mappings.
│ └── evaluation_results_300.json
└── research/ # Investigation and debug history.
pip install torch transformers tqdm
To replicate the performance test and generate a fresh report:
- Execute the Evaluation:
python scripts/evaluate_300.py
- Generate the Report:
python scripts/generate_report_300.py
Check the reports/ folder for the generated report (performance_report_300.md in the tree above).
To understand why the model predicted a certain fallacy:
python scripts/explain_fallacy.py "Your input sentence here"
The script saves a heatmap HTML report in reports/explanations/. Open this file in your browser to see the word-level saliency map.
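To give a feel for what a word-level saliency map contains, here is a minimal sketch that uses leave-one-out occlusion as a simpler stand-in for SHAP: each word is scored by how much the predicted probability drops when that word is removed. This is not the method used by explain_fallacy.py, and toy_score is a hypothetical placeholder for the real model's probability for one fallacy label.

```python
import html


def occlusion_scores(words, score_fn):
    """Score each word by the drop in score_fn when that word is removed."""
    base = score_fn(" ".join(words))
    return [
        base - score_fn(" ".join(words[:i] + words[i + 1:]))
        for i in range(len(words))
    ]


def render_heatmap(words, scores):
    """Render words as HTML spans; red opacity tracks positive attribution."""
    # Normalize by the largest magnitude; fall back to 1.0 if all are zero.
    peak = max((abs(s) for s in scores), default=0.0) or 1.0
    spans = []
    for word, score in zip(words, scores):
        alpha = max(score, 0.0) / peak
        spans.append(
            f'<span style="background: rgba(255, 0, 0, {alpha:.2f})">'
            f"{html.escape(word)}</span>"
        )
    return "<p>" + " ".join(spans) + "</p>"


def toy_score(text):
    """Placeholder model: 0.5 per bandwagon cue word, capped at 1.0."""
    cues = {"everyone", "everybody", "popular"}
    hits = sum(token in cues for token in text.lower().split())
    return min(1.0, 0.5 * hits)


if __name__ == "__main__":
    words = "Everyone believes it so it must be true".split()
    print(render_heatmap(words, occlusion_scores(words, toy_score)))
```

Occlusion needs only one extra model call per word, which keeps the sketch short. SHAP instead averages contributions over many word subsets, so its attributions behave better when words interact.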