Stanford CME295 Transformers & LLMs | Autumn 2025 | Lecture 3 - Transformers & Large Language Models
Key Summary
- Artificial Intelligence (AI) is the science of making machines do tasks that would need intelligence if a person did them. Today’s AI mostly focuses on specific tasks like recognizing faces or recommending products, which is called narrow AI. A future goal is general AI, which would do any thinking task a human can, but it does not exist yet.
- Machine learning is a main way to build AI by letting computers learn from data instead of hard-coded rules. With enough examples, the computer finds patterns and uses them to make predictions on new cases. This shift from rules to learning is why AI has grown so fast.
- Supervised learning uses labeled examples, like photos marked as “cat” or “dog,” to teach a model to predict the right label for new photos. It’s like a student learning from an answer key. This method powers spam filters, image classifiers, and many medical diagnosis tools.
- Unsupervised learning uses data without labels to discover structure on its own. It finds groups, patterns, and relationships, like grouping customers by behavior. This is useful when labeling is too expensive or unclear.
- Reinforcement learning (RL) teaches an agent by giving rewards or penalties for its actions. The agent tries actions, sees the reward, and learns what works over time, like learning to win at chess or Go. RL is also used in robotics and recommendation systems.
- Natural Language Processing (NLP) helps computers understand and generate human language. It powers chatbots, translators, and text generators by turning words into machine-friendly forms. Tasks include understanding meaning, answering questions, and writing responses.
- Computer vision lets computers “see” and interpret images and video. It identifies objects, recognizes faces, and tracks motion, enabling self-driving cars, medical imaging, and security systems. Vision systems learn visual patterns and generalize to new scenes.
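The supervised-learning idea above (learn from labeled examples, then predict labels for new ones) can be sketched in a few lines. This is a minimal pure-Python nearest-centroid classifier; the data points, labels, and function names are all invented for illustration, not part of the lecture.

```python
# Toy supervised learning: a nearest-centroid classifier.
# All data and names here are invented for this sketch.

def fit_centroids(points, labels):
    """Average the training points of each class into one centroid."""
    sums, counts = {}, {}
    for (x, y), label in zip(points, labels):
        sx, sy = sums.get(label, (0.0, 0.0))
        sums[label] = (sx + x, sy + y)
        counts[label] = counts.get(label, 0) + 1
    return {label: (sx / counts[label], sy / counts[label])
            for label, (sx, sy) in sums.items()}

def predict(centroids, point):
    """Assign the label of the closest centroid (squared distance)."""
    x, y = point
    return min(centroids,
               key=lambda c: (centroids[c][0] - x) ** 2 + (centroids[c][1] - y) ** 2)

# Labeled examples: small "cat-like" vs "dog-like" feature vectors.
train_points = [(1.0, 1.0), (1.2, 0.8), (5.0, 5.0), (4.8, 5.2)]
train_labels = ["cat", "cat", "dog", "dog"]

centroids = fit_centroids(train_points, train_labels)
print(predict(centroids, (1.1, 0.9)))  # near the "cat" cluster -> "cat"
print(predict(centroids, (5.1, 4.9)))  # near the "dog" cluster -> "dog"
```

Real systems use far richer models, but the shape is the same: labeled data in, a fitted predictor out, predictions on unseen inputs.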
Why This Lecture Matters
AI is now part of daily life and nearly every industry, from healthcare and finance to transportation and retail. Understanding the basics helps professionals spot real opportunities and avoid hype. With this knowledge, product managers can choose the right approach for features, analysts can design better data projects, engineers can build safer pipelines, and leaders can ask the right questions about risks and ROI. Knowing supervised, unsupervised, and reinforcement learning lets you match methods to problems and data you actually have. Understanding NLP and computer vision makes it easier to plan chatbots, content tools, perception systems, and quality checks. Just as important, awareness of job displacement, bias, and misuse helps teams build responsibly, comply with laws, and protect users. In a job market hungry for AI literacy, being able to explain how AI works, where it helps, and how to safeguard it is a career advantage. This lecture gives a grounded map: what AI is, how it learns, where it applies, and how to steer it with ethics—skills that matter in modern work and society.
Lecture Summary
Overview
This lecture explains Artificial Intelligence (AI) in clear, simple terms and shows how it fits into everyday life. It begins with a classic definition from Marvin Minsky: AI is the science of making machines do things that would need intelligence if a person did them. The talk then divides AI into two main branches. Narrow (or weak) AI focuses on specific tasks—like recognizing faces, playing chess, or recommending products—and is the kind we use most today. General (or strong) AI is a hypothetical future system that could match a human’s ability to learn and reason across any topic, but it does not exist yet. After setting this foundation, the lecture moves into the main ways we actually build AI systems: machine learning, natural language processing (NLP), and computer vision.
The lecture gives special attention to machine learning because it powers much of modern AI. Instead of writing step-by-step rules for every possible situation, we give the computer data, and it learns patterns from that data. The lecture introduces three common types of machine learning. Supervised learning uses labeled examples (like images marked “cat” or “dog”) to teach a model to predict correct labels for new examples. Unsupervised learning works without labels and tries to find hidden structure, such as grouping customers into segments by behavior. Reinforcement learning teaches agents by giving them rewards or penalties for their actions, helping them learn strategies over time, which is how many systems learned to play games like chess and Go.
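The unsupervised case described above (finding structure with no labels, like grouping customers by behavior) can be sketched with a tiny 1-D k-means loop. This is a pure-Python illustration under invented data; the "customer spend" numbers and function name are assumptions, not from the lecture.

```python
# Toy unsupervised learning: 1-D k-means clustering.
# The spend values and names are invented for this sketch.

def kmeans_1d(values, k, iters=20):
    """Alternate assign-to-nearest-center and recompute-center steps."""
    centers = sorted(values)[::max(1, len(values) // k)][:k]  # crude spread-out seeds
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

# Monthly spend for ten customers: two natural groups, no labels given.
spend = [10, 12, 11, 9, 13, 95, 100, 98, 102, 97]
centers, clusters = kmeans_1d(spend, k=2)
print(sorted(round(c) for c in centers))  # the two discovered group centers
```

No one told the algorithm there were "low spenders" and "high spenders"; it discovered the split from the data alone, which is exactly what separates this from the supervised setting.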
Next, the lecture explains two important application areas: NLP and computer vision. NLP lets computers understand, interpret, and generate human language. It enables tasks like machine translation, chatbots, and answering questions. Computer vision allows machines to “see” and make sense of images and video. It supports object detection, face recognition, and motion tracking, which power self-driving cars, security systems, and medical image analysis.
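"Turning words into machine-friendly forms," as the NLP paragraph puts it, often starts with something as simple as word counts. Below is a minimal bag-of-words plus cosine-similarity sketch that matches a query to the more relevant document; the sentences and document names are invented for illustration.

```python
# Toy NLP: bag-of-words vectors and cosine similarity.
# All sentences and names are invented for this sketch.
import math
from collections import Counter

def bag_of_words(text):
    """Lowercase, split on whitespace, count word occurrences."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Similarity between two word-count vectors (1.0 = same direction)."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

query = bag_of_words("how do I translate a document")
docs = {
    "translation": bag_of_words("translate text and document files between languages"),
    "weather": bag_of_words("today the weather is sunny and warm"),
}
best = max(docs, key=lambda name: cosine(query, docs[name]))
print(best)  # shared words pull the query toward the translation doc
```

Modern NLP replaces raw counts with learned embeddings, but the core move is the same: represent text as numbers so a machine can compare and reason about it.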
To connect these ideas to the real world, the lecture surveys where AI is used today: healthcare (diagnosing diseases and suggesting treatments), finance (fraud detection and risk management), transportation (self-driving cars and traffic optimization), entertainment (recommendation engines for movies and music), manufacturing (automation and quality control), and retail (personalized shopping and supply chain optimization). The main message is that AI is already deeply woven into many industries and everyday tools, and its impact is growing.
The lecture also addresses risks and ethics. It highlights three core concerns: job displacement as AI automates certain tasks; bias in AI systems when training data is skewed or unfair, leading to harmful outcomes; and misuse of AI for malicious ends, such as autonomous weapons or invasive surveillance. The lecture emphasizes the need for safeguards and ethical guidelines to ensure AI is developed and used to benefit people while minimizing harm. The overall structure moves from defining AI and its types, to core methods (especially machine learning and its three main styles), to key application domains, and then to societal impacts and responsibilities. By the end, you understand what AI is, how it is built, where it is used, what can go wrong, and how to think about building and using it responsibly.
This lecture is ideal for beginners who want a broad but solid introduction. You do not need math or programming to follow it, though some familiarity with everyday apps (like recommendation systems or translation tools) helps. After studying this material, you will be able to explain narrow vs. general AI, describe the main kinds of machine learning with examples, name what NLP and computer vision do, recognize common uses of AI across industries, and discuss key risks and ethical needs. The content is organized to give a complete picture, so you can confidently talk about AI in school, at work, or in everyday conversations.
Key Takeaways
- Start with the problem, not the model. Write a one-sentence goal, who it helps, and how you will measure success. This prevents building flashy systems that miss real needs. Clear goals guide data collection and model choice.
- Match the method to the data: labeled (supervised), unlabeled (unsupervised), or sequential with rewards (RL). If you lack labels, avoid forcing a supervised approach and consider clustering or anomaly detection. For interactive decisions over time, think RL. The right fit saves effort and boosts results.
- Invest in high-quality, representative data. Bad or biased data leads to unfair or weak models, no matter how fancy the algorithm. Document sources and known limits. Diverse data improves fairness and robustness.
- Split data into training, validation, and test sets. Tune on validation, report final results on the test set only. This keeps you honest about generalization. Avoid peeking at the test set during development.
- Choose simple baselines first. A clear, strong baseline reveals whether complex models are worth it. Compare with fair metrics. Complexity without gain adds risk and cost.
- Use the right metrics for the task. For imbalanced problems, look beyond accuracy to precision, recall, or F1. For recommendations, consider top-k metrics. Metrics should reflect real-world costs of errors.
- Guard against overfitting. Use regularization, early stopping, and cross-validation where helpful. Monitor train vs. validation performance. Simpler models and more data often help.
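Several of the takeaways above (fixed train/validation/test splits, baselines, and looking past accuracy on imbalanced data) can be demonstrated together. This is a pure-Python sketch on synthetic data; the 10% positive rate, the 60/20/20 split, and all names are assumptions chosen for illustration.

```python
# Sketch: data splits and precision/recall on an imbalanced toy problem.
# All data here is synthetic and the ratios are illustrative assumptions.
import random

random.seed(0)
# Imbalanced labels: roughly 10% positives (think "fraud" vs "normal").
data = [(i, 1 if random.random() < 0.1 else 0) for i in range(1000)]
random.shuffle(data)

# Fixed 60/20/20 split: tune on val, report on test only.
train, val, test = data[:600], data[600:800], data[800:]

def precision_recall(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# A "predict the majority class" baseline: accuracy looks great,
# but recall exposes that it never catches a single positive.
y_true = [label for _, label in test]
y_pred = [0] * len(y_true)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
p, r = precision_recall(y_true, y_pred)
print(f"accuracy={accuracy:.2f} precision={p:.2f} recall={r:.2f}")
```

The baseline's high accuracy and zero recall is the whole argument for the metrics takeaway: on imbalanced problems, accuracy alone can make a useless model look strong.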
Glossary
Artificial Intelligence (AI)
AI is when we make machines do tasks that would need intelligence if a person did them. This can include seeing, talking, planning, or learning. AI does not have to think like a human; it just needs to solve the task well. Today most AI is narrow and focused on specific jobs.
Narrow AI (Weak AI)
Narrow AI is designed to do one specific job very well. It cannot easily switch to a very different task. Many apps you use every day rely on narrow AI. It can be superhuman in one thing but clueless in another.
General AI (Strong AI)
General AI is a future idea of a machine that can learn and do any thinking task a human can. It would understand many areas and adapt to new ones. This kind of AI does not exist yet. It is a long-term research goal and a topic of debate.
Machine Learning (ML)
Machine learning lets computers learn patterns from data instead of following hand-written rules. With many examples, the system figures out how inputs relate to outputs. This is why AI has grown so fast. It handles messy, real-world problems better than fixed rules.
