Commit 21b919a

Update 2024-07-01
1 parent ecbf0be commit 21b919a

1 file changed

Lines changed: 347 additions & 0 deletions

File tree

_workshop/2024-07-01

@@ -1 +1,348 @@
Large Language Model Agents llm-agents-mooc

Large Language Model Agents
===========================

MOOC, Fall 2024
---------------

Announcement:

Sign up and learn more about the LLM Agents Hackathon [here](https://rdi.berkeley.edu/llm-agents-hackathon/)!

### Prospective Students

* This course has concluded. Video lectures can still be found in the syllabus below. Please sign up for the [Spring 2025 iteration](https://llmagents-learning.org/sp25) today!
* All certificates have been released! Thank you for a great semester.

Course Staff
------------

| Instructor | Co-instructor |
| :---: | :---: |
| ![](assets/dawn-berkeley.jpg) | ![](assets/XinyunChen.jpg) |
| [Dawn Song](https://people.eecs.berkeley.edu/~dawnsong/) | Xinyun Chen |
| Professor, UC Berkeley | Research Scientist, Google DeepMind |

Guest Speakers
--------------

| ![](assets/Denny Zhou.jpeg) | ![](assets/Shunyu Yao.jpeg) | ![](assets/Chi Wang.jpg) | ![](assets/Jerry Liu.jpg) |
| :---: | :---: | :---: | :---: |
| Denny Zhou | Shunyu Yao | Chi Wang | Jerry Liu |
| ![](assets/Google Deepmind.png) | ![](assets/openai.png) | ![](assets/Google Deepmind.png) | ![](assets/LlamaIndex.png) |
| ![](assets/Burak Gokturk.png) | ![](assets/Omar Khattab.jpg) | ![](assets/Graham Neubig.jpg) | ![](assets/Nicolas Chapados.jpg) |
| Burak Gokturk | Omar Khattab | Graham Neubig | Nicolas Chapados |
| ![](assets/Google.jpg) | ![](assets/databricks.png) | ![](assets/CMU.png) | ![](assets/servicenow.png) |
| ![](assets/Yuandong Tian.png) | ![](assets/Jim Fan.jpeg) | ![](assets/Percy Liang.jpeg) | ![](assets/Ben Mann.jpeg) |
| Yuandong Tian | Jim Fan | Percy Liang | Ben Mann |
| ![](assets/meta ai.jpeg) | ![](assets/nvidia.png) | ![](assets/stanford.png) | ![](assets/Anthropic.png) |

Course Description
------------------

Large language models (LLMs) have revolutionized a wide range of domains. In particular, LLMs have been developed as agents that interact with the world and handle a variety of tasks. As LLM techniques continue to advance, LLM agents are set to be the next breakthrough in AI, transforming daily life through intelligent task automation and personalization. In this course, we will first discuss fundamental concepts that are essential for LLM agents, including the foundation of LLMs, the LLM abilities required for task automation, and infrastructure for agent development. We will also cover representative agent applications, including code generation, robotics, web automation, medical applications, and scientific discovery. We will also discuss the limitations and potential risks of current LLM agents, and share insights into directions for further improvement. Specifically, this course will include the following topics:

* Foundation of LLMs
* Reasoning
* Planning, tool use
* LLM agent infrastructure
* Retrieval-augmented generation
* Code generation, data science
* Multimodal agents, robotics
* Evaluation and benchmarking on agent applications
* Privacy, safety and ethics
* Human-agent interaction, personalization, alignment
* Multi-agent collaboration

Syllabus
--------

| Date | Guest Lecture <br> (3:00PM-5:00PM PST) | Supplemental Readings |
| --- | --- | --- |
| Sept 9 | **LLM Reasoning** <br> Denny Zhou, Google DeepMind <br> [Livestream](https://www.youtube.com/live/QL-FS_Zcmyo) [Intro](https://rdi.berkeley.edu/llm-agents-mooc/slides/intro.pdf) [Slides](https://rdi.berkeley.edu/llm-agents-mooc/slides/llm-reasoning.pdf) [Quiz 1](https://forms.gle/1pb6nkwZPyUqvFPe6) | - [Chain-of-Thought Reasoning Without Prompting](https://arxiv.org/abs/2402.10200) <br> - [Large Language Models Cannot Self-Correct Reasoning Yet](https://arxiv.org/abs/2310.01798) <br> - [Premise Order Matters in Reasoning with Large Language Models](https://arxiv.org/abs/2402.08939) <br> - [Chain-of-Thought Empowers Transformers to Solve Inherently Serial Problems](https://arxiv.org/abs/2402.12875) |
| Sept 16 | **LLM agents: brief history and overview** <br> Shunyu Yao, OpenAI <br> [Livestream](https://www.youtube.com/watch?v=RM6ZArd2nVc) [Slides](https://rdi.berkeley.edu/llm-agents-mooc/slides/llm_agent_history.pdf) [Quiz 2](https://forms.gle/t9ictrAjxTrWd9tr6) | - [WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents](https://arxiv.org/abs/2207.01206) <br> - [ReAct: Synergizing Reasoning and Acting in Language Models](https://arxiv.org/abs/2210.03629) |
| Sept 23 | **Agentic AI Frameworks & AutoGen** <br> Chi Wang, AutoGen-AI <br> **Building a Multimodal Knowledge Assistant** <br> Jerry Liu, LlamaIndex <br> [Livestream](https://www.youtube.com/live/OOdtmCMSOo4) [Chi’s Slides](https://rdi.berkeley.edu/llm-agents-mooc/slides/autogen.pdf) [Jerry’s Slides](https://rdi.berkeley.edu/llm-agents-mooc/slides/MKA.pdf) [Quiz 3](https://forms.gle/osuDEXRmDHJvLi1T7) | - [AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation](https://arxiv.org/abs/2308.08155) <br> - [StateFlow: Enhancing LLM Task-Solving through State-Driven Workflows](https://arxiv.org/abs/2403.11322) |
| Sept 30 | **Enterprise trends for generative AI, and key components of building successful agents/applications** <br> Burak Gokturk, Google <br> [Livestream](https://www.youtube.com/live/Sy1psHS3w3I) [Slides](https://rdi.berkeley.edu/llm-agents-mooc/slides/Burak_slides.pdf) [Quiz 4](https://forms.gle/gcoo9HWGdF3xVAPGA) | - [Google Cloud expands grounding capabilities on Vertex AI](https://cloud.google.com/blog/products/ai-machine-learning/rag-and-grounding-on-vertex-ai?e=48754805) <br> - [The Needle In a Haystack Test: Evaluating the performance of RAG systems](https://towardsdatascience.com/the-needle-in-a-haystack-test-a94974c1ad38) <br> - [The AI detective: The Needle in a Haystack test and how Gemini 1.5 Pro solves it](https://cloud.google.com/blog/products/ai-machine-learning/the-needle-in-the-haystack-test-and-how-gemini-pro-solves-it?e=48754805) |
| Oct 7 | **Compound AI Systems & the DSPy Framework** <br> Omar Khattab, Databricks <br> [Livestream](https://www.youtube.com/live/JEMYuzrKLUw) [Slides](https://rdi.berkeley.edu/llm-agents-mooc/slides/dspy_lec.pdf) [Quiz 5](https://forms.gle/tXzmfgTsdYW5XjLL6) | - [Optimizing Instructions and Demonstrations for Multi-Stage Language Model Programs](https://arxiv.org/abs/2406.11695) <br> - [Fine-Tuning and Prompt Optimization: Two Great Steps that Work Better Together](https://arxiv.org/abs/2407.10930) |
| Oct 14 | **Agents for Software Development** <br> Graham Neubig, Carnegie Mellon University <br> [Livestream](https://www.youtube.com/live/f9L9Fkq-8K4) [Slides](https://rdi.berkeley.edu/llm-agents-mooc/slides/neubig24softwareagents.pdf) [Quiz 6](https://forms.gle/v4AD4vFYCjdK9qeDA) | - [SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering](https://arxiv.org/abs/2405.15793) <br> - [OpenHands: An Open Platform for AI Software Developers as Generalist Agents](https://arxiv.org/abs/2407.16741) |
| Oct 21 | **AI Agents for Enterprise Workflows** <br> Nicolas Chapados, ServiceNow <br> [Livestream](https://www.youtube.com/live/-yf-e-9FvOc) [Slides](https://rdi.berkeley.edu/llm-agents-mooc/slides/agentworkflows.pdf) [Quiz 7](https://forms.gle/VWpd1q23PBFptVCe8) | - [WorkArena: How Capable Are Web Agents at Solving Common Knowledge Work Tasks?](https://arxiv.org/abs/2403.07718) <br> - [WorkArena++: Towards Compositional Planning and Reasoning-based Common Knowledge Work Tasks](https://arxiv.org/abs/2407.05291) <br> - [TapeAgents: a Holistic Framework for Agent Development and Optimization](https://rdi.berkeley.edu/llm-agents-mooc/assets/tapeagents.pdf) |
| Oct 28 | **Towards a unified framework of Neural and Symbolic Decision Making** <br> Yuandong Tian, Meta AI (FAIR) <br> [Livestream](https://www.youtube.com/live/wm9-7VBpdEo) [Slides](https://rdi.berkeley.edu/llm-agents-mooc/slides/102824-yuandongtian.pdf) [Quiz 8](https://forms.gle/MLZy2czaMtJYhTSD7) | - [Beyond A\*: Better Planning with Transformers via Search Dynamics Bootstrapping](https://arxiv.org/abs/2402.14083) <br> - [Dualformer: Controllable Fast and Slow Thinking by Learning with Randomized Reasoning Traces](https://arxiv.org/abs/2410.09918v1) <br> - [Composing Global Optimizers to Reasoning Tasks via Algebraic Objects in Neural Nets](https://arxiv.org/abs/2410.01779) <br> - [SurCo: Learning Linear Surrogates For Combinatorial Nonlinear Optimization Problems](https://arxiv.org/abs/2210.12547) |
| Nov 4 | **Project GR00T: A Blueprint for Generalist Robotics** <br> Jim Fan, NVIDIA <br> [Livestream](https://www.youtube.com/live/Qhxr0uVT2zs) [Slides](https://rdi.berkeley.edu/llm-agents/assets/jimfangr00t.pdf) [Quiz 9](https://forms.gle/SsvzxWugqJD2VB2FA) | - [Voyager: An Open-Ended Embodied Agent with Large Language Models](https://voyager.minedojo.org/) <br> - [Eureka: Human-Level Reward Design via Coding Large Language Models](https://eureka-research.github.io/) <br> - [DrEureka: Language Model Guided Sim-To-Real Transfer](https://eureka-research.github.io/dr-eureka/) |
| Nov 11 | **No Class - Veterans Day** | |
| Nov 18 | **Open-Source and Science in the Era of Foundation Models** <br> Percy Liang, Stanford University <br> [Livestream](https://www.youtube.com/live/f3KKx9LWntQ) [Slides](https://rdi.berkeley.edu/llm-agents-mooc/slides/percyliang.pdf) [Quiz 10](https://forms.gle/VMu6XGEWHhW1xvij6) | - [Cybench: A Framework for Evaluating Cybersecurity Capabilities and Risks of Language Models](https://arxiv.org/abs/2408.08926) |
| Nov 25 | **Measuring Agent capabilities and Anthropic’s RSP** <br> Ben Mann, Anthropic <br> [Livestream](https://www.youtube.com/live/6y2AnWol7oo) [Slides](https://rdi.berkeley.edu/llm-agents-mooc/slides/antrsp.pdf) [Quiz 11](https://forms.gle/QU9EyEyeWP5sjPAu7) | - [Announcing our updated Responsible Scaling Policy](https://www.anthropic.com/news/announcing-our-updated-responsible-scaling-policy) <br> - [Developing a computer use model](https://www.anthropic.com/news/developing-computer-use) |
| Dec 2 | **Towards Building Safe & Trustworthy AI Agents and A Path for Science‑ and Evidence‑based AI Policy** <br> Dawn Song, UC Berkeley <br> [Livestream](https://www.youtube.com/live/QAgR4uQ15rc) [Slides](https://rdi.berkeley.edu/llm-agents/assets/dawn-agent-safety.pdf) [Quiz 12](https://forms.gle/EHwL4Jb3LZzNDw7w7) | - [A Path for Science‑ and Evidence‑based AI Policy](https://understanding-ai-safety.org/) <br> - [DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models](https://arxiv.org/abs//2306.11698) <br> - [Representation Engineering: A Top-Down Approach to AI Transparency](https://arxiv.org/abs/2310.01405) <br> - [Extracting Training Data from Large Language Models](https://www.usenix.org/system/files/sec21-carlini-extracting.pdf) <br> - [The Secret Sharer: Evaluating and Testing Unintended Memorization in Neural Networks](https://www.usenix.org/system/files/sec19-carlini.pdf) |

Completion Certificate
----------------------

LLM Agent course completion certificates will be awarded to students according to the tier requirements below. All assignments are due December 12th, 2024 at 11:59PM PST. To receive your certificate, please complete the [Certificate Declaration Form](https://forms.gle/nYGJLPTdb7af2Dq59) by December 17th, 2024 at 11:59PM PST.

**Trailblazer Tier:**

* Complete all 12 quizzes associated with each lecture
* Pass the written article assignment

**Mastery Tier:**

* Complete all 12 quizzes associated with each lecture
* Pass the written article assignment
* Pass all 3 lab assignments

**Ninja Tier:**

* Complete all 12 quizzes associated with each lecture
* Pass the written article assignment
* Submit a project to the [LLM Agents Hackathon](https://rdi.berkeley.edu/llm-agents-hackathon/)

**Legendary Tier:**

* Complete all 12 quizzes associated with each lecture
* Pass the written article assignment
* Become a prize winner or finalist at the [LLM Agents Hackathon](https://rdi.berkeley.edu/llm-agents-hackathon/)

**Honorary Tier:**

* Awarded to the most helpful and supportive students in Discord!
* Must meet the coursework requirements of the Ninja or Mastery Tier

_NOTE: completing the assignments associated with this course in order to earn a Completion Certificate is completely optional. You are more than welcome to just watch the lectures and audit the course!_

Coursework
----------

All coursework will be released and submitted through the course website.

### Quizzes

All quizzes are released in parallel with (or shortly after) the corresponding lecture. Please remember to complete the quiz each week. Although quizzes are graded on completion, we encourage you to do your best. The questions are all multiple-choice, and there are usually at most 5 per quiz. The quizzes will be posted in the Syllabus section.

An archive of all of the quizzes can be found [here](https://docs.google.com/document/d/1pYvOxt2UWwc3z4QlW2Di5LQT-FJPWZ419mxJT7pFPsU/edit?usp=sharing).

### Written Article

Write a Twitter post, LinkedIn post, or Medium article of roughly 500 words, and include a link to our MOOC website in it.

* Students in the Trailblazer or Mastery Tier should either summarize information from one of the lectures or write a postmortem on their learning experience during our MOOC
* Students in the Ninja or Legendary Tier should write about their hackathon submission

The written article is an effort-based assignment that will be graded as pass or no pass (P/NP). Submit your written article assignment [HERE](https://forms.gle/7ekobPNSWDLBWnDT6).

### Labs

There will be 3 lab assignments to give students some hands-on experience with building agents. Students must pass all 3 lab assignments. All labs are due December 12th, 2024 at 11:59PM PST. Please read the instructions carefully [here](https://rdi.berkeley.edu/llm-agents-mooc/assets/instructions.pdf). Please read the FAQs [here](https://docs.google.com/document/d/12MHAbLh86kGDuc2ddOmLHwjAuHh_rHiuAf_UiFuYcvE/edit?usp=sharing) before asking questions in Discord.

| Assignment | Submission Form |
| --- | --- |
| [Lab 1](https://drive.google.com/drive/folders/1mOisEUkoLBcIcdkdGDiftq4IFAJ3xpzJ?usp=sharing) | [Submission 1](https://forms.gle/kTXzpaJqh3d4BvEWA) |
| [Lab 2](https://drive.google.com/file/d/1GOd6miwZ_PzcqubPt8dNLfbq_9X5TUHf/view?usp=sharing) | [Submission 2](https://forms.gle/5J7ZM1DvmP4TfiAG8) |
| [Lab 3](https://drive.google.com/file/d/1QW3YpNJQdYaIavexlQ9YVsiW7kgT6VOg/view?usp=sharing) | [Submission 3](https://forms.gle/9zEpec81YVr2LUzt7) |

### Hackathon

Check out our hackathon website [here](https://rdi.berkeley.edu/llm-agents-hackathon/). Sign up for the hackathon [here](https://docs.google.com/forms/d/e/1FAIpQLSevYR6VaYK5FkilTKwwlsnzsn8yI_rRLLqDZj0NH7ZL_sCs_g/viewform); every member of the team should sign up individually. Then, complete the team creation form [here](https://docs.google.com/forms/d/e/1FAIpQLSdKesnu7G_7M1dR-Uhb07ubvyZxcw6_jcl8klt-HuvahZvpvA/viewform). There is no limit on team size.

For any questions, please visit our Hackathon FAQ [here](https://docs.google.com/document/d/1P4OBOXuHRJYU9tf1KH_NQWvaZQ1_8wCfNi3MOnCw6RI/edit?usp=sharing). You can also ask questions and find potential team members in our [LLM Agents Discord](https://discord.gg/NWVpQ9rBvd).

Submit your final hackathon project [here](https://forms.gle/jNr8nSH9Cy9qpYcu5) before December 17th, 2024 at 11:59PM PST.

This page was generated by [GitHub Pages](https://pages.github.com).
