When to use

Use this method to execute Layerup’s built-in threat vector protection. Layerup has provided a list of pre-built guardrails that are pre-loaded into your Layerup Security account:

  • layerup.hallucination - detect & intercept hallucination in your LLM response before it is sent to the end user
  • layerup.prompt_injection - detect & intercept prompt injection in your user prompt before it is processed by the LLM
  • layerup.jailbreaking - detect & intercept jailbreaking attempts in your user prompt before it is sent to a third-party LLM
  • layerup.sensitive_data - detect & intercept sensitive data in your user prompt before it is sent to a third-party LLM
  • layerup.abuse - detect & intercept abuse on the project, scope, customer, or customer-scope level for any LLM request
  • layerup.content_moderation - detect & intercept harmful content returned by your LLM before it is sent to the user
  • layerup.phishing - detect & intercept phishing content returned by your LLM before it is sent to the user
  • layerup.invisible_unicode - detect & intercept invisible unicode in your user prompt before it is sent to a third-party LLM
  • layerup.code_detection - detect & intercept computer code in your user prompt or LLM response before it is used

Additionally, create custom rules via the Layerup Security dashboard when you want certain policies to be strictly adhered to, before even making a call to your application LLM. Mitigate any possibility of unwanted responses from your application LLM.

Usage

Function Parameters

guardrails
array
required

Array of strings, where each string is the guardrail name specified on the dashboard. You can specify as many guardrails as you want, which will be evaluated in order.

messages
array
required

Array of objects, each representing a message in the LLM conversation chain.

untrusted_input
string

This is the untrusted part of the LLM prompt. For example, this could be a user provided external input to your LLM, which may or may not be malicious.

metadata
object

Metadata object, as specified here.

Response

The executeGuardrails method will return a Promise that resolves to an object with the following fields:

all_safe
boolean

Whether or not all guardrails have been marked as safe. If any guardrail was invoked, this value will be false.

Note: if the response is false, we strongly advise against proceeding with your application LLM call.

offending_guardrail
string

This is the name of the first guardrail whose predicate was matched and invoked.

canned_response
object

This is the canned response in an object format.

If the guardrail outcome was specified as “cancel” on the dashboard, then this value will be null.

If there is a valid canned response specified on the dashboard, this object will have 2 fields:

  • role - will always be "assistant"
  • message - the canned response specified on the dashboard