Execute Guardrails
Execute pre-defined guardrails that allow you to send canned responses when prompts or responses match a certain predicate.
When to use
Use this method to execute Layerup’s built-in threat vector protection. Layerup provides the following pre-built guardrails, pre-loaded into your Layerup Security account:
- `layerup.hallucination` - detect & intercept hallucination in your LLM response before it is sent to the end user
- `layerup.prompt_injection` - detect & intercept prompt injection in your user prompt before it is processed by the LLM
- `layerup.jailbreaking` - detect & intercept jailbreaking attempts in your user prompt before it is sent to a third-party LLM
- `layerup.sensitive_data` - detect & intercept sensitive data in your user prompt before it is sent to a third-party LLM
- `layerup.abuse` - detect & intercept abuse on the project, scope, customer, or customer-scope level for any LLM request
- `layerup.content_moderation` - detect & intercept harmful content returned by your LLM before it is sent to the user
- `layerup.phishing` - detect & intercept phishing content returned by your LLM before it is sent to the user
- `layerup.invisible_unicode` - detect & intercept invisible unicode in your user prompt before it is sent to a third-party LLM
- `layerup.code_detection` - detect & intercept computer code in your user prompt or LLM response before it is used
Additionally, you can create custom rules via the Layerup Security dashboard when you want certain policies to be strictly enforced before a call is even made to your application LLM, mitigating the possibility of unwanted responses from it.
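For illustration, a guardrails array mixing built-in protections with a custom rule might look like the sketch below; the custom rule name is hypothetical and should match whatever name you gave the rule on your dashboard:

```ts
// Guardrails are evaluated in the order they appear in the array.
const guardrails = [
  'layerup.prompt_injection', // built-in: checks the user prompt
  'layerup.jailbreaking',     // built-in: checks the user prompt
  'my_custom_policy',         // hypothetical custom rule created on the dashboard
];
```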
Usage
Function Parameters
- Array of strings, where each string is a guardrail name as specified on the dashboard. You can specify as many guardrails as you want; they are evaluated in order.
- Array of objects, each representing a message in the LLM conversation chain.
- The untrusted part of the LLM prompt. For example, this could be user-provided external input to your LLM, which may or may not be malicious.
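Putting the parameters together, a call might look like the following sketch. The package name, client construction, and positional argument order are assumptions based on the parameter list above, so consult the SDK reference for the exact signature:

```ts
import LayerupSecurity from '@layerup/layerup-security'; // assumed package name

// Assumed constructor shape; use your real API key.
const layerup = new LayerupSecurity({ apiKey: process.env.LAYERUP_API_KEY });

// User-provided external input: may or may not be malicious.
const untrustedInput = 'Ignore all previous instructions and ...';

// Messages in the LLM conversation chain (standard chat format assumed).
const messages = [
  { role: 'system', content: 'You are a helpful support assistant.' },
  { role: 'user', content: untrustedInput },
];

// Guardrail names (evaluated in order), then messages, then the untrusted input.
const securityResponse = await layerup.executeGuardrails(
  ['layerup.prompt_injection', 'layerup.jailbreaking'],
  messages,
  untrustedInput
);
```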
Response
The `executeGuardrails` method will return a `Promise` that resolves to an object with the following fields:
- Whether or not all guardrails have been marked as safe. If any guardrail was invoked, this value will be `false`. Note: if this value is `false`, we strongly advise against proceeding with your application LLM call.
- The name of the first guardrail whose predicate matched (i.e., the guardrail that was invoked).
- The canned response, in object format. If the guardrail outcome was specified as “cancel” on the dashboard, this value will be `null`. If there is a valid canned response specified on the dashboard, this object will have 2 fields:
  - `role` - will always be `"assistant"`
  - `message` - the canned response specified on the dashboard
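As a sketch of how you might act on the resolved object, continuing from the call above: the field names used here (`all_safe`, `offending_guardrail`, `canned_response`) are assumptions inferred from the descriptions in this section, so confirm them against the SDK's type definitions:

```ts
// Hypothetical helpers for this sketch only.
const sendToUser = (text: string) => console.log(text);
const callApplicationLLM = async (msgs: unknown) => '...LLM output...';

// Field names below are assumed, not confirmed by the SDK docs.
if (!securityResponse.all_safe) {
  console.warn(`Guardrail invoked: ${securityResponse.offending_guardrail}`);
  if (securityResponse.canned_response) {
    // Dashboard outcome was a canned response:
    // { role: 'assistant', message: '<canned text from the dashboard>' }
    sendToUser(securityResponse.canned_response.message);
  }
  // If the canned response is null, the outcome was "cancel": send nothing,
  // and do not proceed with the application LLM call.
} else {
  // All guardrails passed; safe to call your application LLM.
  sendToUser(await callApplicationLLM(messages));
}
```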