Commit 6a47094: Release 0.8.0b7

1 parent: 7a53baa

File tree

182 files changed: +10838, -4824 lines


.mock/definition/__package__.yml

Lines changed: 541 additions & 1016 deletions
Large diffs are not rendered by default.

.mock/definition/datasets.yml

Lines changed: 45 additions & 14 deletions
@@ -176,7 +176,7 @@ service:
               question: Who wrote Hamlet?
             target:
               answer: William Shakespeare
-        action: add
+        action: set
         commit_message: Add two new questions and answers
       response:
         body:
@@ -196,24 +196,53 @@ service:
         datapoints:
           - messages:
               - role: user
-                content: >-
-                  Hi Humanloop support team, I'm having trouble
-                  understanding how to use the evaluations feature in your
-                  software. Can you provide a step-by-step guide or any
-                  resources to help me get started?
+                content: |
+                  How do i manage my organizations API keys?
             target:
-              feature: evaluations
-              issue: needs step-by-step guide
+              response: >-
+                Hey, thanks for your questions. Here are steps for how to
+                achieve: 1. Log in to the Humanloop Dashboard
+
+
+                2. Click on "Organization Settings."
+                If you do not see this option, you might need to contact your organization admin to gain the necessary permissions.
+
+                3. Within the settings or organization settings, select the
+                option labeled "API Keys" on the left. Here you will be able
+                to view and manage your API keys.
+
+
+                4. You will see a list of existing API keys. You can perform
+                various actions, such as:
+                - **Generate New API Key:** Click on the "Generate New Key" button if you need a new API key.
+                - **Revoke an API Key:** If you need to disable an existing key, find the key in the list and click the "Revoke" or "Delete" button.
+                - **Copy an API Key:** If you need to use an existing key, you can copy it to your clipboard by clicking the "Copy" button next to the key.
+
+                5. **Save and Secure API Keys:** Make sure to securely store
+                any new or existing API keys you are using. Treat them like
+                passwords and do not share them publicly.
+
+
+                If you encounter any issues or need further assistance, it
+                might be helpful to engage with an engineer or your IT
+                department to ensure you have the necessary permissions and
+                support.
+
+
+                Would you need help with anything else?
           - messages:
               - role: user
                 content: >-
-                  Hi there, I'm interested in fine-tuning a language model
-                  using your software. Can you explain the process and
-                  provide any best practices or guidelines?
+                  Hey, can do I use my code evaluator for monitoring my
+                  legal-copilot prompt?
             target:
-              feature: fine-tuning
-              issue: process explanation and best practices
-        action: add
+              response: >-
+                Hey, thanks for your questions. Here are steps for how to
+                achieve: 1. Navigate to your Prompt dashboard.
+                2. Select the `Monitoring` button on the top right of the Prompt dashboard
+                3. Within the model select the Version of the Evaluator you want to turn on for monitoring.
+
+                Would you need help with anything else?
         commit_message: Add two new questions and answers
       response:
         body:
@@ -770,6 +799,8 @@ service:
             evaluator_version_id: evaluator_version_id
             created_at: '2024-01-15T09:30:00Z'
             updated_at: '2024-01-15T09:30:00Z'
+  source:
+    openapi: openapi/openapi.auto.json
   display-name: Datasets
   docs: >+
     Datasets are collections of input-output pairs that you can use within

.mock/definition/evaluations.yml

Lines changed: 56 additions & 4 deletions
@@ -125,7 +125,9 @@ service:
             arguments_type: target_required
             return_type: number
             evaluator_type: python
-            code: def evaluate(answer, target):\n return 0.5
+            code: |-
+              def evaluate(answer, target):
+                  return 0.5
           version_logs_count: 1
           total_logs_count: 1
           inputs:
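The recurring fixture change in these hunks swaps an escaped plain scalar for a YAML `|-` block scalar. In an unquoted YAML scalar, `\n` is two literal characters (backslash, n) rather than a newline, so the old `code` value was never valid Python source; the block-scalar form carries real newlines. A minimal sketch of the difference, assuming PyYAML is installed:

# Contrast the old plain-scalar fixture with the new "|-" block scalar.
# Assumes PyYAML (pip install pyyaml); the snippets mirror the diff above.
import yaml

old = yaml.safe_load(r"code: def evaluate(answer, target):\n return 0.5")
new = yaml.safe_load(
    "code: |-\n"
    "  def evaluate(answer, target):\n"
    "      return 0.5\n"
)

print(repr(old["code"]))  # 'def evaluate(answer, target):\\n return 0.5'
print(repr(new["code"]))  # 'def evaluate(answer, target):\n    return 0.5'

# The block-scalar form is real, runnable Python; the plain-scalar form
# contains a literal backslash-n and raises a SyntaxError if executed.
namespace: dict = {}
exec(new["code"], namespace)
assert namespace["evaluate"]("a", "b") == 0.5
try:
    exec(old["code"], {})
except SyntaxError:
    print("old fixture is not valid Python source")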
@@ -258,7 +260,9 @@ service:
             arguments_type: target_required
             return_type: number
             evaluator_type: python
-            code: def evaluate(answer, target):\n return 0.5
+            code: |-
+              def evaluate(answer, target):
+                  return 0.5
           version_logs_count: 1
           total_logs_count: 1
           inputs:
@@ -362,7 +366,9 @@ service:
             arguments_type: target_required
             return_type: number
             evaluator_type: python
-            code: def evaluate(answer, target):\n return 0.5
+            code: |-
+              def evaluate(answer, target):
+                  return 0.5
           version_logs_count: 1
           total_logs_count: 1
           inputs:
@@ -507,7 +513,9 @@ service:
             arguments_type: target_required
             return_type: number
             evaluator_type: python
-            code: def evaluate(answer, target):\n return 0.5
+            code: |-
+              def evaluate(answer, target):
+                  return 0.5
           version_logs_count: 1
           total_logs_count: 1
           inputs:
@@ -596,6 +604,7 @@ service:
           spec:
             arguments_type: target_free
             return_type: boolean
+            evaluator_type: llm
           name: name
           version_id: version_id
           created_at: '2024-01-15T09:30:00Z'
@@ -614,6 +623,7 @@ service:
             email_address: email_address
             full_name: full_name
           updated_at: '2024-01-15T09:30:00Z'
+          url: url
     getStats:
       path: /evaluations/{id}/stats
       method: GET
@@ -734,6 +744,26 @@ service:
               inputs:
                 - name: name
           id: id
+          evaluator_logs:
+            - id: id
+              evaluator_logs: []
+              evaluator:
+                path: path
+                id: id
+                spec:
+                  arguments_type: target_free
+                  return_type: boolean
+                  evaluator_type: llm
+                name: name
+                version_id: version_id
+                created_at: '2024-01-15T09:30:00Z'
+                updated_at: '2024-01-15T09:30:00Z'
+                status: uncommitted
+                last_used_at: '2024-01-15T09:30:00Z'
+                version_logs_count: 1
+                total_logs_count: 1
+                inputs:
+                  - name: name
           evaluator_logs:
             - prompt:
                 path: path
@@ -750,9 +780,31 @@ service:
               inputs:
                 - name: name
           id: id
+          evaluator_logs:
+            - id: id
+              evaluator_logs: []
+              evaluator:
+                path: path
+                id: id
+                spec:
+                  arguments_type: target_free
+                  return_type: boolean
+                  evaluator_type: llm
+                name: name
+                version_id: version_id
+                created_at: '2024-01-15T09:30:00Z'
+                updated_at: '2024-01-15T09:30:00Z'
+                status: uncommitted
+                last_used_at: '2024-01-15T09:30:00Z'
+                version_logs_count: 1
+                total_logs_count: 1
+                inputs:
+                  - name: name
         page: 1
         size: 1
         total: 1
+    source:
+      openapi: openapi/openapi.auto.json
   display-name: Evaluations
   docs: >+
     Evaluations help you measure the performance of your Prompts, Tools and LLM

.mock/definition/evaluators.yml

Lines changed: 55 additions & 13 deletions
@@ -61,7 +61,9 @@ service:
             arguments_type: target_required
             return_type: number
             evaluator_type: python
-            code: def evaluate(answer, target):\n return 0.5
+            code: |-
+              def evaluate(answer, target):
+                  return 0.5
           version_logs_count: 1
           total_logs_count: 1
           inputs:
@@ -119,7 +121,9 @@ service:
             arguments_type: target_required
             return_type: number
             evaluator_type: python
-            code: def evaluate(answer, target):\n return 0.5
+            code: |-
+              def evaluate(answer, target):
+                  return 0.5
         commit_message: Initial commit
       response:
         body:
@@ -136,7 +140,9 @@ service:
             arguments_type: target_required
             return_type: number
             evaluator_type: python
-            code: def evaluate(answer, target):\n return 0.5
+            code: |-
+              def evaluate(answer, target):
+                  return 0.5
           version_logs_count: 1
           total_logs_count: 1
           inputs:
@@ -192,7 +198,9 @@ service:
             arguments_type: target_required
             return_type: number
             evaluator_type: python
-            code: def evaluate(answer, target):\n return 0.5
+            code: |-
+              def evaluate(answer, target):
+                  return 0.5
           version_logs_count: 1
           total_logs_count: 1
           inputs:
@@ -261,7 +269,9 @@ service:
             arguments_type: target_required
             return_type: number
             evaluator_type: python
-            code: def evaluate(answer, target):\n return 0.5
+            code: |-
+              def evaluate(answer, target):
+                  return 0.5
           version_logs_count: 1
           total_logs_count: 1
           inputs:
@@ -314,7 +324,9 @@ service:
             arguments_type: target_required
             return_type: number
             evaluator_type: python
-            code: def evaluate(answer, target):\n return 0.5
+            code: |-
+              def evaluate(answer, target):
+                  return 0.5
           version_logs_count: 1
           total_logs_count: 1
           inputs:
@@ -364,7 +376,9 @@ service:
             arguments_type: target_required
             return_type: number
             evaluator_type: python
-            code: def evaluate(answer, target):\n return 0.5
+            code: |-
+              def evaluate(answer, target):
+                  return 0.5
           version_logs_count: 1
           total_logs_count: 1
           inputs:
@@ -419,7 +433,9 @@ service:
             arguments_type: target_required
             return_type: number
             evaluator_type: python
-            code: def evaluate(answer, target):\n return 0.5
+            code: |-
+              def evaluate(answer, target):
+                  return 0.5
           version_logs_count: 1
           total_logs_count: 1
           inputs:
@@ -491,7 +507,9 @@ service:
             arguments_type: target_required
             return_type: number
             evaluator_type: python
-            code: def evaluate(answer, target):\n return 0.5
+            code: |-
+              def evaluate(answer, target):
+                  return 0.5
           version_logs_count: 1
           total_logs_count: 1
           inputs:
@@ -501,8 +519,11 @@ service:
       method: POST
       auth: true
       docs: >-
-        Submit evalutor judgment for an existing Log. Creates a new Log and
-        makes evaluated one its parent.
+        Submit Evaluator judgment for an existing Log.
+
+
+        Creates a new Log. The evaluated Log will be set as the parent of the
+        created Log.
       display-name: Log
       request:
         name: CreateEvaluatorLogRequest
@@ -538,6 +559,9 @@ service:
         provider_latency:
           type: optional<double>
           docs: Duration of the logged event in seconds.
+        stdout:
+          type: optional<string>
+          docs: Captured log and debug statements.
        provider_request:
          type: optional<map<string, unknown>>
          docs: >-
@@ -554,7 +578,7 @@ service:
             Unique identifier for the Session to associate the Log to.
             Allows you to record multiple Logs to a Session (using an ID
             kept by your internal systems) by passing the same `session_id`
-            in subsequent log requests.
+            in subsequent log requests.
         parent_id:
           type: string
           docs: >-
@@ -595,7 +619,9 @@ service:
           type: optional<string>
           docs: The name of the Environment the Log is associated to.
         name: createEvaluatorLogRequestEnvironment
-      judgment: optional<unknown>
+      judgment:
+        type: optional<CreateEvaluatorLogRequestJudgment>
+        docs: Evaluator assessment of the Log.
       spec: optional<CreateEvaluatorLogRequestSpec>
     response:
       docs: Successful Response
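With this change, judgments submitted to the Log endpoint described above are typed rather than `optional<unknown>`. A hedged sketch of such a request using plain `requests` instead of a generated client; the base URL, endpoint path, and auth header are assumptions, while `parent_id` and `judgment` mirror CreateEvaluatorLogRequest:

# Hedged sketch: submit an Evaluator judgment for an existing Log.
# URL, path, and header shape are assumptions, not confirmed by this diff;
# the body fields mirror CreateEvaluatorLogRequest from this file.
import os
import requests

resp = requests.post(
    "https://api.humanloop.com/v5/evaluators/log",  # assumed endpoint path
    headers={"X-API-KEY": os.environ["HUMANLOOP_API_KEY"]},  # assumed header
    json={
        "parent_id": "log_123",  # the evaluated Log; the new Log becomes its child
        "judgment": 0.5,         # boolean | string | list<string> | double
    },
)
resp.raise_for_status()
print(resp.json()["id"])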
@@ -611,6 +637,8 @@ service:
         parent_id: parent_id
         session_id: session_id
         version_id: version_id
+  source:
+    openapi: openapi/openapi.auto.json
 types:
   SrcExternalAppModelsV5EvaluatorsEvaluatorRequestSpec:
     discriminated: false
@@ -619,10 +647,24 @@ types:
       - root.CodeEvaluatorRequest
       - root.HumanEvaluatorRequest
       - root.ExternalEvaluatorRequest
+    source:
+      openapi: openapi/openapi.auto.json
+  CreateEvaluatorLogRequestJudgment:
+    discriminated: false
+    docs: Evaluator assessment of the Log.
+    union:
+      - boolean
+      - string
+      - list<string>
+      - double
+    source:
+      openapi: openapi/openapi.auto.json
   CreateEvaluatorLogRequestSpec:
     discriminated: false
     union:
       - root.LlmEvaluatorRequest
       - root.CodeEvaluatorRequest
       - root.HumanEvaluatorRequest
       - root.ExternalEvaluatorRequest
+    source:
+      openapi: openapi/openapi.auto.json
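CreateEvaluatorLogRequestJudgment is declared `discriminated: false`, so clients match judgment values by shape rather than by a tag field. A hypothetical Python rendering of the union; the alias and helper names are illustrative, not the generated SDK's:

# Hypothetical rendering of CreateEvaluatorLogRequestJudgment in Python
# typing; values are matched structurally, as a non-discriminated union is.
from typing import List, Union

Judgment = Union[bool, str, List[str], float]

def is_valid_judgment(value: object) -> bool:
    # boolean | string | list<string> | double, per the union above
    if isinstance(value, (bool, str, float)):
        return True
    return isinstance(value, list) and all(isinstance(v, str) for v in value)

assert is_valid_judgment(True)
assert is_valid_judgment("pass")
assert is_valid_judgment(["pass", "needs-review"])
assert is_valid_judgment(0.5)
assert not is_valid_judgment({"score": 1})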
