John Schulman, co-founder of OpenAI, discusses the future of AI, reasoning, and reinforcement learning from human feedback (RLHF) in a conversation with Dwarkesh Patel. Schulman explains the distinction between pre-training and post-training: pre-training teaches the model to imitate internet content, while post-training refines it to be more useful and better aligned with human preferences (see the toy sketch below).

He envisions AI models improving significantly over the next five years, handling more complex tasks and acting coherently over extended periods. Schulman emphasizes the importance of continual training and the potential for models to generalize from diverse data.

He also discusses the challenge of keeping AI models safe and aligned with human values, arguing that coordination among AI developers and careful deployment are crucial. The conversation touches on AI's potential to accelerate scientific research, the importance of human oversight in AI-driven processes, and the need for explicit education in AI literacy. Schulman closes by discussing AI's future role in assisting with long-term projects and the potential economic and regulatory implications of AI advancements.
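To make the pre-training vs. post-training distinction concrete, here is a minimal toy sketch, not OpenAI's actual training code: a single-step categorical "language model" over a four-token vocabulary is first trained to imitate data (maximum likelihood, the pre-training objective), then nudged toward high-reward outputs with a REINFORCE-style policy-gradient update (an RLHF-flavored post-training objective). The vocabulary size, reward vector, and hyperparameters are all invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB = 4
logits = np.zeros(VOCAB)  # toy one-step "model": one logit per token

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

# --- Pre-training: imitate data by maximizing log-likelihood of observed tokens ---
data = rng.integers(0, VOCAB, size=1000)  # stand-in for internet text
lr = 0.1
for tok in data:
    p = softmax(logits)
    grad = -p
    grad[tok] += 1.0          # d log p(tok) / d logits = one_hot(tok) - p
    logits += lr * grad       # gradient ascent on log-likelihood

# --- Post-training (RLHF-style): shift the policy toward high-reward outputs ---
reward = np.array([0.0, 0.0, 1.0, 0.2])  # stand-in for a learned preference/reward model
for _ in range(1000):
    p = softmax(logits)
    tok = rng.choice(VOCAB, p=p)           # sample from the current policy
    grad = -p
    grad[tok] += 1.0
    logits += lr * reward[tok] * grad      # REINFORCE: reward-weighted log-prob gradient

print("final policy:", softmax(logits).round(3))
```

After the first loop the policy roughly matches the data distribution; after the second it concentrates on the tokens the reward model prefers, which mirrors, in miniature, how post-training reshapes a pre-trained model's behavior rather than teaching it from scratch.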