@@ -1,5 +1,5 @@
 ---
-title: Output-Invariance and Time-Based Testing – Practical Techniques for Black-Box Enumeration of LLMs
+title: Output-Invariant and Time-Based Testing – Practical Techniques for Black-Box Enumeration of LLMs
 excerpt: Abusing inherent context and sluggishness in LLMs for stealthy enumeration of prompt injection points.
 tags:
 - ai
@@ -43,13 +43,13 @@ We'll assume a **Direct Prompt Injection** scenario, i.e. a tester/attacker is i |
 
 Let's take a look at the first method.
 
-## Output-Invariance Testing
+## Output-Invariant Testing
 
-The idea is quite simple, I just think the term "output-invariance testing" sums it up nicely.
+The idea is quite simple, I just think the term "output-invariant testing" sums it up nicely.
 
 The key idea is to take a base request/response, change the input slightly without changing context, and aim to keep the LLM response unchanged.
 
-Output-invariance is always relative to some base request. So any mention of "output-invariant prompt" means there are two prompts: a base prompt and a modified test prompt.
+Output-invariance is always relative to some base request. Any mention of "output-invariant prompt" implies two prompts: a base prompt and a modified test prompt.
 
 ### Concept
 
@@ -105,12 +105,12 @@ However, the LLM implementation would return the same response: |
 }
 ```
 
-This is because LLMs have something traditional implementations don't: they "understand" context and language. It "recognises" *Michael Scott* resembles a name, and the phrase *My name is* indicates the following text is a name.
+This is because LLMs have something traditional implementations don't: they "understand" context and language. It "recognises" `Michael Scott` resembles a name, and the phrase `My name is` indicates the following text is a name.
 
 {% image "assets/same-picture.jpg", "jw-60", "Corporate needs you to find the differences between Trump and Musk. GPT: ..." %}
 
 {% alert "success" %}
-The key idea behind the Output-Invariance Testing is to take a base (HTTP) request, then **change a field slightly but aim to keep the LLM response— the output— invariant (unchanged)**.
+The key idea behind Output-Invariant Testing is to take a base (HTTP) request, then **change a field slightly but aim to keep the LLM response— the output— invariant (unchanged)**.
 {% endalert %}
 
 To reiterate, we have two requests/responses involved:
@@ -367,7 +367,7 @@ The rise of LLM applications is a clear signal for penetration testers and red-t |
 
 3. Scaling and automation is a natural follow-up topic when discussing enumeration.
 
-4. After making the Pam Same Picture meme, a thought occurred to me: would LLMs also normalise typos? Would they consider something like "bubble tea" and "bublbe tea" to be the *same picture*? That may be another avenue for output-invariant attacks.
+4. After making the Pam Same Picture meme, a thought occurred to me: would LLMs also normalise typos? Would they consider something like `bubble tea` and `bublbe tea` to be the *same picture*? This may be another option for output-invariant attacks.
 
 ### Further Resources
 
|