This is the official code for the paper "Virus: Harmful Fine-tuning Attack for Large Language Models Bypassing Guardrail Moderation"
A first-of-its-kind AI benchmark for evaluating the protection capabilities of large language model (LLM) guard systems (guardrails and safeguards)
Terraform-Guardrail (TerraGuard) MCP is an open-source governance framework that enforces consistent, executable guardrails for Terraform across CI/CD pipelines, helping teams deliver secure, compliant infrastructure at scale without slowing down development.
A companion repository for llm-router containing a collection of pipeline-ready plugins. Features a masking interface for anonymizing sensitive data and a guardrail system for validating input/output safety against defined policy rules.
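The masking-plus-guardrail pattern described above can be sketched in a few lines. This is a hypothetical illustration, not the actual llm-router plugin API: the function names `mask_sensitive` and `guardrail_check`, the `[EMAIL]` placeholder, and the toy policy terms are all assumptions for the sake of the example.

```python
import re

# Hypothetical masking interface: anonymize sensitive data (here, email
# addresses) before the text is routed to a model. The regex and the
# "[EMAIL]" token are illustrative choices, not llm-router conventions.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def mask_sensitive(text: str) -> str:
    """Replace email addresses with a placeholder token."""
    return EMAIL_RE.sub("[EMAIL]", text)

# Hypothetical guardrail: validate input/output against defined policy
# rules. Real systems use classifiers or rule engines; a term blocklist
# stands in here as the simplest possible policy check.
BLOCKED_TERMS = {"password dump", "exploit kit"}

def guardrail_check(text: str) -> bool:
    """Return True if the text passes the policy rules."""
    lowered = text.lower()
    return not any(term in lowered for term in BLOCKED_TERMS)

# Pipeline order matters: mask first, then validate the masked text.
masked = mask_sensitive("Contact alice@example.com for access.")
allowed = guardrail_check(masked)
```

In a pipeline-ready plugin system, each stage would typically expose a common interface so the router can compose masking and validation steps in sequence.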
Kavach AI provides robust, multi-layered content moderation and safety guardrails for AI systems. It helps protect your AI applications from harmful content, jailbreak attempts, prompt injections, and other security vulnerabilities.
🔒 Enhance Terraform governance with a Python-based MCP server and CLI, ensuring faster workflows and stronger compliance for safer infrastructure deployments.