Commit 1395c8c

Pluggable BBR proposal (following reviewers' comments and feedback)
1 parent 1e5849b commit 1395c8c

1 file changed

+38
-151
lines changed
  • docs/proposals/1964-pluggable-bbr-framework

@@ -1,4 +1,4 @@
1-
# Pluggable Body-Based Routing (BBR) Framework
1+
# Pluggable Body-Based Routing (BBR) Framework
22

33
Author(s): @davidbreitgand @srampal
44

@@ -20,36 +20,43 @@ See [this document](https://docs.google.com/document/d/1So9uRjZrLUHf7Rjv13xy_ip3
2020

2121
The pluggable BBR framework aims to address the following goals:
2222

23+
### Immediate Goals
24+
2325
- Avoid monolithic architecture
2426
- Mimic pluggability and configurability of the scheduling subsystem without coupling between the two
25-
- Enable organizing plugins into a topology for ordered and concurrent execution
26-
- Avoid redundant recurrent body parsing across plugins in a topology for the sake of performance
2727
- Limit changes to the BBR feature to avoid any changes in the rest of the code base
2828
- Follow best practices and experience from the Scheduling subsystem
2929
pluggability effort. For example, extending the system to support the above
3030
should be through implementing well defined `Plugin` interfaces and registering
3131
them in the BBR subsystem; any configuration would be done in the
3232
same way (e.g., code and/or configuration file)
3333
- Reuse common code from EPP, such as `TypedName`, wherever it makes sense, but avoid reusing specialized code with non-BBR functionality to prevent misuse
34+
- Provide reference plugin implementation(s).
35+
36+
### Extended Goals
37+
38+
- Enable organizing plugins into a topology for sequential and concurrent execution. Note that although BBR stands for Body-Based Routing and this proposal does not aim at general payload processing, routing decisions might require pre-processing and post-processing operations
39+
- Avoid redundant recurrent body parsing across plugins in a topology for the sake of performance
3440
- Enable extensible collection and registration of metrics using lessons from the pluggable scheduling sub-system
35-
- Provide reference plugin implementations.
3641

3742
## Non-Goals
3843

3944
- Modify existing GIE abstractions
4045
- Fully align plugins, registries, and factories across BBR and EPP
4146
- Dynamically reconfigure plugins and plugin topologies at runtime
47+
- Enable extensibility of the BBRPlugin registration mechanisms in third party extensions
4248

4349
## Proposal
4450

4551
### Overview
4652

47-
There is an embedded `BBRPlugin` interface building on the `Plugin` interface adopted from EPP. This interface should be implemented by any BBR plugin. Each pluigin is identified by its `TypedName` (adopted from EPP), where `TypedName().Type` gives the string representing the type of the plugin and `TypedName().Name()` returns the string representing the plugins implementation. BBR is refactored to implement the registered factory pattern. To that end, a `PluginRegistry` interface and its implementation are added to register `BBRPlugin` factories and concrete implementations created by the factories.
48-
In addition, a `PluginsChain` interface is defined to define an order of plugin executions. In the future, `PluginsChain` will be replaced by `PluginsDAG` to allow for more complex topological order and concurrency.
53+
The framework defines a `BBRPlugin` interface that embeds the `Plugin` interface adopted from EPP. This interface should be implemented by any BBR plugin. Each plugin is identified by its `TypedName` (also adopted from EPP), where `TypedName().Type` gives the string representing the type of the plugin and `TypedName().Name` gives the string representing the plugin's implementation. BBR is refactored to follow the registered factory pattern.
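
As a rough illustration of the registered factory pattern mentioned above, the sketch below keeps a map from plugin type to factory and creates instances on demand. It is not the proposal's implementation: `factoryRegistry`, `newFactoryRegistry`, and `noopPlugin` are hypothetical names, `TypedName` and `BBRPlugin` are simplified local stand-ins for the EPP and BBR types, and the duplicate-rejection rule is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// TypedName is a local stand-in for the EPP type: a plugin type plus the
// name of a concrete implementation.
type TypedName struct {
	Type string
	Name string
}

// BBRPlugin is a simplified stand-in for the interface in this proposal.
type BBRPlugin interface {
	TypedName() TypedName
	Execute(requestBodyBytes []byte) (headers map[string]string, mutatedBodyBytes []byte, err error)
}

// PluginFactoryFunc follows the factory signature shown in the proposal.
type PluginFactoryFunc func() (BBRPlugin, error)

// factoryRegistry is a hypothetical in-memory registry keyed by plugin type.
type factoryRegistry struct {
	factories map[string]PluginFactoryFunc
}

func newFactoryRegistry() *factoryRegistry {
	return &factoryRegistry{factories: make(map[string]PluginFactoryFunc)}
}

// RegisterFactory rejects empty keys, nil factories, and duplicate keys
// (the duplicate rule is an assumption of this sketch).
func (r *factoryRegistry) RegisterFactory(typeKey string, f PluginFactoryFunc) error {
	if typeKey == "" || f == nil {
		return errors.New("type key and factory must be set")
	}
	if _, exists := r.factories[typeKey]; exists {
		return fmt.Errorf("factory for %q already registered", typeKey)
	}
	r.factories[typeKey] = f
	return nil
}

// CreatePlugin looks up the factory for typeKey and builds a new instance.
func (r *factoryRegistry) CreatePlugin(typeKey string) (BBRPlugin, error) {
	f, ok := r.factories[typeKey]
	if !ok {
		return nil, fmt.Errorf("no factory registered for %q", typeKey)
	}
	return f()
}

// noopPlugin is a placeholder implementation used only for illustration.
type noopPlugin struct{ name TypedName }

func (p *noopPlugin) TypedName() TypedName { return p.name }

func (p *noopPlugin) Execute(body []byte) (map[string]string, []byte, error) {
	return map[string]string{}, body, nil
}

func main() {
	reg := newFactoryRegistry()
	_ = reg.RegisterFactory("MetadataExtractor", func() (BBRPlugin, error) {
		return &noopPlugin{name: TypedName{Type: "MetadataExtractor", Name: "simple-model-selector"}}, nil
	})
	p, err := reg.CreatePlugin("MetadataExtractor")
	if err != nil {
		panic(err)
	}
	fmt.Println(p.TypedName().Type, p.TypedName().Name)
}
```

Keying the registry by plugin type keeps construction separate from execution, so a plugin can be selected by configuration without the caller knowing its concrete Go type.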
54+
55+
In addition, as extended functionality, a `PluginsChain` interface defines the order in which plugins execute. In the future, `PluginsChain` might be replaced by `PluginsDAG` to allow for more complex topological ordering and concurrency.
4956

5057
`PluginsChain` only contains ordered `BBRPlugin` types registered in the `PluginRegistry`. `RequestPluginsChain` and `ResponsePluginsChain` are optionally configured for handling requests and responses respectively. If no configuration is provided, default `PluginsChain` instances will be configured automatically.
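
The ordered execution described above could look roughly like the following sketch. It makes several assumptions that the proposal does not fix: each plugin receives the body returned by the previous one, headers are merged with later plugins overwriting earlier ones, and the first error stops the chain. `runChain` and `pluginFunc` are hypothetical names, and `BBRPlugin` is reduced to its `Execute` method.

```go
package main

import (
	"fmt"
	"strings"
)

// BBRPlugin is reduced here to the Execute method from the proposal.
type BBRPlugin interface {
	Execute(requestBodyBytes []byte) (headers map[string]string, mutatedBodyBytes []byte, err error)
}

// pluginFunc adapts a plain function to BBRPlugin (illustrative helper).
type pluginFunc func([]byte) (map[string]string, []byte, error)

func (f pluginFunc) Execute(b []byte) (map[string]string, []byte, error) { return f(b) }

// runChain executes plugins in order and returns the final body and the
// merged headers. On error it returns the original body untouched.
// Assumption: on header key conflicts the later plugin wins.
func runChain(chain []BBRPlugin, body []byte) ([]byte, map[string]string, error) {
	merged := make(map[string]string)
	current := body
	for i, p := range chain {
		headers, out, err := p.Execute(current)
		if err != nil {
			return body, nil, fmt.Errorf("plugin %d failed: %w", i, err)
		}
		for k, v := range headers {
			merged[k] = v
		}
		if out != nil {
			current = out
		}
	}
	return current, merged, nil
}

func main() {
	tag := pluginFunc(func(b []byte) (map[string]string, []byte, error) {
		return map[string]string{"X-Example-Seen": "true"}, b, nil
	})
	upper := pluginFunc(func(b []byte) (map[string]string, []byte, error) {
		return map[string]string{}, []byte(strings.ToUpper(string(b))), nil
	})
	body, headers, err := runChain([]BBRPlugin{tag, upper}, []byte("hello"))
	fmt.Println(string(body), headers, err)
}
```

A different merge policy (for example, rejecting conflicting header values) would only change the inner loop, which is one reason to keep merging inside the chain rather than in individual plugins.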
5158

52-
Depending on a `BBRPlugin` functionality and implementation, the plugin might require full or selective body parsing. To save the parsing overhead, if there is at least one `BBRPlugin` in the `PluginsChain` that requires full body parsing, the parsing is performed only once into a shared official appropriate `openai-go` struct (either `openai.CompletionNewParams` or `openai.ChatCompletionNewParams` depending on the request endpoint). This struct is shared for read-only to all plugins in the chain. Each `BBRplugin` receives the shared struct by value. If a plugin needs to mutate the body, in the initial implementation, it MUST work on its own copy, and the a mutated body is returned separately by each plugiin.
59+
Depending on its functionality and implementation, a `BBRPlugin` might require full or selective body parsing. To save parsing overhead, if at least one `BBRPlugin` in the `PluginsChain` requires full body parsing, the body is parsed only once into the appropriate official `openai-go` struct (either `openai.CompletionNewParams` or `openai.ChatCompletionNewParams`, depending on the request endpoint). This struct is shared read-only with all plugins in the chain, and each `BBRPlugin` receives it by value. If a plugin needs to mutate the body, in the initial implementation it MUST work on its own copy, and each plugin returns its mutated body separately.
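
The parse-once rule can be sketched as follows. The marker interface `NeedsFullParsing` with its `FullParsingNeeded()` method comes from this proposal; everything else is an assumption for illustration. In particular, `chatRequest` is a small stand-in for the `openai-go` structs so that the example has no external dependencies, and `parseOnce`, `anyNeedsFullParsing`, `headerOnly`, and `inspector` are hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// NeedsFullParsing is the optional marker interface from the proposal.
// Implementing the method is the signal; it returns nothing.
type NeedsFullParsing interface {
	FullParsingNeeded()
}

// chatRequest stands in for openai.ChatCompletionNewParams (assumption:
// only the model field matters for this sketch).
type chatRequest struct {
	Model string `json:"model"`
}

// headerOnly does not need the parsed body; inspector does.
type headerOnly struct{}
type inspector struct{}

func (inspector) FullParsingNeeded() {}

// anyNeedsFullParsing reports whether at least one plugin carries the marker.
func anyNeedsFullParsing(plugins []any) bool {
	for _, p := range plugins {
		if _, ok := p.(NeedsFullParsing); ok {
			return true
		}
	}
	return false
}

// parseOnce parses the body at most once, and only if some plugin asks for
// it. The returned bool reports whether parsing actually happened.
func parseOnce(plugins []any, body []byte) (chatRequest, bool, error) {
	var req chatRequest
	if !anyNeedsFullParsing(plugins) {
		return req, false, nil
	}
	if err := json.Unmarshal(body, &req); err != nil {
		return req, false, err
	}
	return req, true, nil
}

func main() {
	body := []byte(`{"model":"example-model"}`)
	req, parsed, err := parseOnce([]any{headerOnly{}, inspector{}}, body)
	fmt.Println(req.Model, parsed, err)
}
```

One caveat worth keeping in mind: passing a Go struct by value copies only its top-level fields, so any slices, maps, or pointers inside the shared struct still refer to the same underlying data. A plugin that mutates such fields would need a deep copy.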
5360

5461
### Suggested Components
5562

@@ -60,19 +67,16 @@ The sketch of the proposed framework is shown in the figure below.
6067

6168
```go
6269
// ------------------------------------ Defaults ------------------------------------------
70+
6371
const DefaultPluginType = "MetadataExtractor"
6472
const DefaultPluginImplementation = "simple-model-selector"
6573

6674
// BBRPlugin defines the interface for plugins in the BBR framework
6775
type BBRPlugin interface {
6876
plugins.Plugin
6977

70-
// RequiresFullParsing indicates whether full body parsing is required
71-
// to facilitate efficient memory sharing across plugins in a chain.
72-
RequiresFullParsing() bool
73-
7478
// Execute runs the plugin logic on the request body.
75-
// A plugin's imnplementation logic CAN mutate the body of the message.
79+
// A plugin's implementation logic CAN mutate the body of the message.
7680
// A plugin's implementation MUST return a map of headers
7781
// If no headers are set by the implementation, the map must be empty
7882
// A value of a header in an extended implementation NEED NOT be identical to the value of that same header as would be set
@@ -81,159 +85,38 @@ type BBRPlugin interface {
8185
// which, say, stands for "select a best model for this request at minimal cost"
8286
// A plugin implementation of "semantic-model-selector" sets X-Gateway-Model-Name to any valid
8387
// model name from the inventory of the backend models and also mutates the body accordingly
84-
// In contrast,
85-
Execute(requestBodyBytes []byte) (
86-
headers map[string]string,
87-
mutatedBodyBytes []byte,
88-
err error,
89-
)
90-
}
91-
9288

93-
// placeholder for BBRPlugin constructors
94-
type PluginFactoryFunc func() bbrplugins.BBRPlugin //concrete constructors are assigned to this type
95-
96-
// PluginRegistry defines operations for managing plugin factories and plugin instances
97-
type PluginRegistry interface {
98-
RegisterFactory(typeKey string, factory PluginFactoryFunc) error //constructors
99-
RegisterPlugin(plugin bbrplugins.BBRPlugin) error //registers a plugin instance (the instance MUST be created via the factory first)
100-
GetFactory(typeKey string) (PluginFactoryFunc, error)
101-
GetPlugin(typeKey string) (bbrplugins.BBRPlugin, error)
102-
GetFactories() map[string]PluginFactoryFunc
103-
GetPlugins() map[string]bbrplugins.BBRPlugin
104-
ListPlugins() []string
105-
ListFactories() []string
106-
CreatePlugin(typeKey string) (bbrplugins.BBRPlugin, error)
107-
ContainsFactory(typeKey string) bool
108-
ContainsPlugin(typeKey string) bool
109-
String() string //human readable string for logging
110-
}
111-
112-
// PluginsChain is used to define a specific order of execution of the BBRPlugin instances stored in the registry
113-
// The BBRPlugin instances
114-
type PluginsChain interface {
115-
AddPlugin(typeKey string, registry PluginRegistry) error //to be added to the chain the plugin should be registered in the registry first
116-
AddPluginAtInd(typeKey string, i int, r PluginRegistry) error //only affects the instance of the plugin chain
117-
GetPlugin(index int, registry PluginRegistry) (bbrplugins.BBRPlugin, error) //retrieves i-th plugin as defined in the chain from the registry
118-
Length() int
119-
ParseChatCompletion(data []byte) (openai.ChatCompletionNewParams, error) //parses the bytes slice into an appropriate openai-go struct
120-
ParseCompletion(data []byte) (openai.CompletionNewParams, error) //likewise
121-
GetSharedMemory(which string) interface{} //returns an appropriate shared open-ai struct dependent on whether which
122-
//corresponds to Completion or ChatCompletion endpoint requested in the body
123-
Run(bodyBytes []byte, registry PluginRegistry) ([]byte, map[string]string, error) //return potentially mutated body and all headers map safely merged
124-
String() string
89+
Execute(requestBodyBytes []byte) (headers map[string]string, mutatedBodyBytes []byte, err error)
12590
}
126-
//NOTE: for simplicity, in the initial PR, PluginsChain instance will be defined request only
127-
```
128-
129-
### Defaults
13091

131-
```go
132-
133-
const (
134-
//A deafult plugin implementation of this plugin type will always be configured for request plugins chain
135-
//Even though BBRPlugin type is not (yet) a K8s resource, it's logically akin to `kind`
136-
//MUST start wit an upper case letter, use CamelNotation, only aplhanumericals after the first letter
137-
PluginTypePattern = `^[A-Z][A-Za-z0-9]*$`
138-
MaxPluginTypeLength = 63
139-
DefaultPluginType = "MetaDataExtractor"
140-
// Even though BBRPlugin is not a K8s resource yet, let's make its naming compliant with K8s resource naming
141-
// Allows: lowercase letters, digits, hyphens, dots.
142-
// Must start and end with a lowercase alphanumeric character.
143-
// Middle characters group can contain lowercase alphanumerics, hyphens, and dots
144-
// Middle and rightmost groups are optional
145-
PluginNamePattern = `^[a-z0-9]([-a-z0-9.]*[a-z0-9])?$`
146-
DefaultPluginName = "simple-model-extractor"
147-
MaxPluginNameLength = 253
148-
//Well-known custom header set to a model name
149-
ModelHeader = "X-Gateway-Model-Name"
150-
)
151-
```
152-
153-
### Current BBR reimplementation as BBRPlugin
15492

155-
```go
156-
/ ------------------------------------ DEFAULT PLUGIN IMPLEMENTATION ----------------------------------------------
157-
158-
type simpleModelExtractor struct { //implements the MetadataExtractor interface
159-
typedName plugins.TypedName
160-
requiresFullParsing bool
161-
}
162-
163-
// defaultMetaDataExtractor implements the MetadataExtractor interface and extracts only the mmodel name AS-IS
164-
type defaultMetaDataExtractor struct {
165-
typedName plugins.TypedName
166-
requiresFullParsing bool //this field will be used to determine whether shared struct should be created in this chain
167-
}
168-
169-
// NewSimpleModelExtractor is a factory that constructs SimpleModelExtractor plugin
170-
// A developer who wishes to create her own implementation, will implement the BBRPlugin interface and
171-
// use Registry and PluginsChain to register and execute the plugin (together with other plugins in a chain)
172-
func NewDefaultMetaDataExtractor() BBRPlugin {
173-
return &defaultMetaDataExtractor{
174-
typedName: plugins.TypedName{
175-
Type: DefaultPluginType,
176-
Name: "simple-model-extractor",
177-
},
178-
requiresFullParsing: false,
179-
}
180-
}
181-
182-
func (s *defaultMetaDataExtractor) RequiresFullParsing() bool {
183-
return s.requiresFullParsing
184-
}
185-
186-
func (s *defaultMetaDataExtractor) TypedName() plugins.TypedName {
187-
return s.typedName
93+
// NeedsFullParsing is an optional capability interface.
94+
// Plugins that require full body parsing implement this marker method.
95+
// The method has no return value; presence of the method is the signal.
96+
type NeedsFullParsing interface {
97+
FullParsingNeeded()
18898
}
18999

190-
// Execute extracts the "model" from the JSON request body and sets X-Gateway-Model-Name header.
191-
// This implementation intentionally ignores metaDataKeys and does not mutate the body.
192-
// It expects the request body to be a JSON object containing a "model" field.
193-
// A nil for metaDataKeysToHeaders map SHOULD be specified by a caller for clarity
194-
// The metaDataKeysToHeaders is explicitly ignored in this implementation
195-
// This implementation is simply refactoring of the default BBR implementation to work with the pluggable framework
196-
func (s *defaultMetaDataExtractor) Execute(requestBodyBytes []byte) (
197-
headers map[string]string,
198-
mutatedBodyBytes []byte,
199-
err error) {
200-
201-
type RequestBody struct {
202-
Model string `json:"model"`
203-
}
204-
205-
h := make(map[string]string)
206-
207-
var requestBody RequestBody
100+
// placeholder for BBRPlugin constructors
101+
// Concrete constructors are assigned to this type
208102

209-
if err := json.Unmarshal(requestBodyBytes, &requestBody); err != nil {
210-
// return original body on decode failure
211-
return nil, requestBodyBytes, err
212-
}
103+
type PluginFactoryFunc func() (bbrplugins.BBRPlugin, error)
213104

214-
if requestBody.Model == "" {
215-
return nil, requestBodyBytes, fmt.Errorf("missing required field: model")
216-
}
105+
### Defaults
217106

218-
// ModelHeader is a constant defined in ./pkg/bbr/plugins/interfaces
219-
h[ModelHeader] = requestBody.Model
107+
If no specific plugin is configured, a default plugin instance that sets the `X-Gateway-Model-Name` header is configured automatically. The default plugin only sets the header and does not mutate the body.
220108

221-
// Body is not mutated in this implementation hence returning original requestBodyBytes. This is intentional.
222-
return h, requestBodyBytes, nil
223-
}
109+
### Current BBR reimplementation as BBRPlugin
224110

225-
func (s *defaultMetaDataExtractor) String() string {
226-
return fmt.Sprintf(("BBRPlugin{%v/requiresFullParsing=%v}"), s.TypedName(), s.requiresFullParsing)
227-
}
228-
```
111+
The current BBR logic will be reimplemented as a `BBRPlugin` according to this proposal, following the phased approach detailed in the next section.
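
For orientation, a default plugin that only sets the `X-Gateway-Model-Name` header without mutating the body might look like the sketch below. It is not the final implementation: `defaultModelPlugin` and `modelHeader` are hypothetical names, and returning an empty (non-nil) header map on errors is an assumption made here to match the rule that the map must be empty when no headers are set.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// modelHeader is the well-known header carrying the model name.
const modelHeader = "X-Gateway-Model-Name"

// defaultModelPlugin is a hypothetical default plugin: it reads the "model"
// field from the JSON body and exposes it as a header.
type defaultModelPlugin struct{}

// Execute returns the headers to set and the body, which is always returned
// unchanged in this sketch.
func (defaultModelPlugin) Execute(requestBodyBytes []byte) (map[string]string, []byte, error) {
	var payload struct {
		Model string `json:"model"`
	}
	if err := json.Unmarshal(requestBodyBytes, &payload); err != nil {
		// Return the original body on decode failure.
		return map[string]string{}, requestBodyBytes, err
	}
	if payload.Model == "" {
		return map[string]string{}, requestBodyBytes, errors.New("missing required field: model")
	}
	return map[string]string{modelHeader: payload.Model}, requestBodyBytes, nil
}

func main() {
	p := defaultModelPlugin{}
	headers, body, err := p.Execute([]byte(`{"model":"example-model","prompt":"hi"}`))
	fmt.Println(headers[modelHeader], string(body), err)
}
```

Because the body is returned as-is, this plugin does not need the fully parsed request and would not implement the optional full-parsing marker.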
229112

230113
### Implementation Phases
231114

232-
The pluggable framework will be implemented iteratively over several phases.
115+
The pluggable framework will be implemented iteratively over several phases and a series of small PRs.
233116

234-
1. Introduce `BBRPlugin` `MetadataExtractor`, interface, registry, plugins chain, sample plugin implementation (`SimpleModelExtraction`) and its factory. Plugin configuration will be implemented via environment variables set in helm chart
235-
1. Introduce a second plugin interface, `ModelSelector` and sample plugin implementation
236-
1. Introduce shared struct (shared among the plugins of a plugins chain)
117+
1. Introduce the `BBRPlugin` interface, the `MetadataExtractor` plugin type, the registry, the default plugin implementation (`simple-model-selector`), and its factory. Plugin configuration will be implemented via environment variables set in the Helm chart
118+
1. Introduce plugins topology (initially a `PluginsChain`)
119+
1. Introduce shared struct (shared among the plugins of a plugins chain) to avoid redundant body parsing
237120
1. Introduce an interface for guardrail plugin, introduce simple reference implementation, experiment with plugins chains on request and response messages
238121
1. Refactor metrics as needed to work with the new pluggable framework
239122
1. Implement configuration via manifests similar to those in EPP
@@ -243,9 +126,13 @@ The pluggable framework will be implemented iteratively over several phases.
243126

244127
## Open Questions
245128

129+
1. More elaborate topology definition and execution
246130
1. More elaborate shared memory architecture for the best performance
131+
1. Considerations for handling newer OpenAI APIs
132+
1. The OpenAI API continues to evolve. Most recently, the Responses API was added alongside the ChatCompletions endpoint, and it includes some stateful logic. The design will also be extended to cover the OpenAI Responses API. For example, `PluginsChain` might be extended to provide common utilities that help with state caching, or plugins might handle state entirely on their own.
247133
1. TBA
248134

249135
## Note
250136

251-
The proposed interfaces can slightly change from those implemented in the initial [PR 1981](https://github.com/kubernetes-sigs/gateway-api-inference-extension/pull/1981)
137+
The proposed interfaces can slightly change from those implemented in the initial [PR 1981](https://github.com/kubernetes-sigs/gateway-api-inference-extension/pull/1981).
138+
The initial PR will be refactored into a series of small PRs which should be evaluated in reference to this proposal.
