In this video, Sam Charrington speaks with Nicholas Carlini of Google DeepMind about recent advances in adversarial machine learning and model security. They delve into Carlini’s paper, a 2024 ICML Best Paper Award winner, which demonstrates how to steal the last layer of production language models, including ChatGPT and PaLM-2. The conversation explores the implications of model stealing, ethical questions around model privacy, and the significance of the embedding projection layer in language models. They also discuss the remediations OpenAI and Google deployed to counter these attacks, as well as future directions in AI security research. The video also touches on another of Carlini’s papers, which examines the application of differential privacy to large-scale pre-training.