Execute Guardrails
Execute pre-defined guardrails that allow you to send canned responses when prompts or responses match a certain predicate.
When to use
Use this method to execute Layerup’s built-in threat vector protection. Layerup provides the following pre-built guardrails, pre-loaded into your Layerup Security account:
- `layerup.hallucination` - detect & intercept hallucination in your LLM response before it is sent to the end user
- `layerup.prompt_injection` - detect & intercept prompt injection in your user prompt before it is processed by the LLM
- `layerup.jailbreaking` - detect & intercept jailbreaking attempts in your user prompt before it is sent to a third-party LLM
- `layerup.sensitive_data` - detect & intercept sensitive data in your user prompt before it is sent to a third-party LLM
- `layerup.abuse` - detect & intercept abuse on the project, scope, customer, or customer-scope level for any LLM request
- `layerup.content_moderation` - detect & intercept harmful content returned by your LLM before it is sent to the user
- `layerup.phishing` - detect & intercept phishing content returned by your LLM before it is sent to the user
- `layerup.invisible_unicode` - detect & intercept invisible unicode in your user prompt before it is sent to a third-party LLM
- `layerup.code_detection` - detect & intercept computer code in your user prompt or LLM response before it is used
Additionally, you can create custom rules via the Layerup Security dashboard when you want certain policies to be strictly enforced before a call is even made to your application LLM, mitigating the possibility of unwanted responses from it.
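For illustration, a guardrails array mixing built-in protections with a custom rule might look like the sketch below; the custom rule name is hypothetical and should match whatever name you gave the rule on your dashboard:

```ts
// Guardrails are evaluated in the order they appear in the array.
const guardrails = [
  'layerup.prompt_injection', // built-in: checks the user prompt
  'layerup.jailbreaking',     // built-in: checks the user prompt
  'my_custom_policy',         // hypothetical custom rule created on the dashboard
];
```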
Usage
Function Parameters
- Array of strings, where each string is a guardrail name as specified on the dashboard. You can specify as many guardrails as you want; they are evaluated in order.
- Array of objects, each representing a message in the LLM conversation chain.
- The untrusted part of the LLM prompt. For example, this could be user-provided external input to your LLM, which may or may not be malicious.
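Putting the parameters together, a call might look like the following sketch. The package name, client construction, and positional argument order are assumptions based on the parameter list above, so consult the SDK reference for the exact signature:

```ts
import LayerupSecurity from '@layerup/layerup-security'; // assumed package name

// Assumed constructor shape; use your real API key.
const layerup = new LayerupSecurity({ apiKey: process.env.LAYERUP_API_KEY });

// User-provided external input: may or may not be malicious.
const untrustedInput = 'Ignore all previous instructions and ...';

// Messages in the LLM conversation chain (standard chat format assumed).
const messages = [
  { role: 'system', content: 'You are a helpful support assistant.' },
  { role: 'user', content: untrustedInput },
];

// Guardrail names (evaluated in order), then messages, then the untrusted input.
const securityResponse = await layerup.executeGuardrails(
  ['layerup.prompt_injection', 'layerup.jailbreaking'],
  messages,
  untrustedInput
);
```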
Response
The `executeGuardrails` method will return a `Promise` that resolves to an object with the following fields:
- Whether or not all guardrails have been marked as safe. If any guardrail was invoked, this value will be `false`. Note: if this value is `false`, we strongly advise against proceeding with your application LLM call.
- The name of the first guardrail whose predicate matched (i.e., the guardrail that was invoked).
- The canned response, in object format. If the guardrail outcome was specified as “cancel” on the dashboard, this value will be `null`. If there is a valid canned response specified on the dashboard, this object will have 2 fields:
  - `role` - will always be `"assistant"`
  - `message` - the canned response specified on the dashboard
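As a sketch of how you might act on the resolved object, continuing from the call above: the field names used here (`all_safe`, `offending_guardrail`, `canned_response`) are assumptions inferred from the descriptions in this section, so confirm them against the SDK's type definitions:

```ts
// Hypothetical helpers for this sketch only.
const sendToUser = (text: string) => console.log(text);
const callApplicationLLM = async (msgs: unknown) => '...LLM output...';

// Field names below are assumed, not confirmed by the SDK docs.
if (!securityResponse.all_safe) {
  console.warn(`Guardrail invoked: ${securityResponse.offending_guardrail}`);
  if (securityResponse.canned_response) {
    // Dashboard outcome was a canned response:
    // { role: 'assistant', message: '<canned text from the dashboard>' }
    sendToUser(securityResponse.canned_response.message);
  }
  // If the canned response is null, the outcome was "cancel": send nothing,
  // and do not proceed with the application LLM call.
} else {
  // All guardrails passed; safe to call your application LLM.
  sendToUser(await callApplicationLLM(messages));
}
```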