In this video, AOS Kumar from Stochastic Programming explores how to access Large Language Models (LLMs) for free using Cloudflare's Workers AI and AI Gateway services. The video gives an overview of both services, their benefits, and real-world applications. Workers AI hosts a catalog of models, including ones for text generation, text embeddings, classification, and translation. AI Gateway sits in front of AI provider APIs and adds caching, rate limiting, and real-time logging and analytics.

AOS Kumar demonstrates how to create a Cloudflare API token, configure the two services, and use Postman to send requests. He explains the advantages of routing calls through Cloudflare, such as reduced latency from cached responses and better visibility into API usage. The video also covers how to connect other AI providers, such as OpenAI and AWS Bedrock, to AI Gateway. He highlights a current limitation with IP address restrictions when using AI Gateway and suggests a potential workaround. The video concludes with a call to action for viewers to explore Cloudflare's AI services and apply them in their own projects.
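
The video issues these requests from Postman; the sketch below reproduces the same calls in Python as a rough guide. The account ID, API token, gateway ID, and model slug are placeholders you would supply from your own Cloudflare dashboard, and the endpoint paths reflect Cloudflare's documented Workers AI and AI Gateway URL formats at the time of writing, so verify them against the current docs.

```python
# Minimal sketch: calling a Workers AI text-generation model over REST,
# first directly and then through AI Gateway for caching/rate limiting/logging.
# ACCOUNT_ID, API_TOKEN, GATEWAY_ID, and MODEL are placeholders, not values
# from the video.
import requests

ACCOUNT_ID = "your_account_id"            # from the Cloudflare dashboard
API_TOKEN = "your_api_token"              # the API token created in the demo
GATEWAY_ID = "your_gateway_id"            # the AI Gateway created in the dashboard
MODEL = "@cf/meta/llama-3-8b-instruct"    # example Workers AI model slug

headers = {"Authorization": f"Bearer {API_TOKEN}"}
payload = {
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain what an AI gateway does in one sentence."},
    ]
}

# 1) Direct Workers AI endpoint.
direct_url = f"https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/ai/run/{MODEL}"
resp = requests.post(direct_url, headers=headers, json=payload, timeout=60)
print(resp.json())

# 2) Same request routed through AI Gateway, which adds caching,
#    rate limiting, and real-time request logs.
gateway_url = (
    f"https://gateway.ai.cloudflare.com/v1/{ACCOUNT_ID}/{GATEWAY_ID}/workers-ai/{MODEL}"
)
resp = requests.post(gateway_url, headers=headers, json=payload, timeout=60)
print(resp.json())
```

The same gateway URL pattern is what makes it possible to front other providers: swapping the `workers-ai` segment for a provider slug such as `openai` (and using that provider's own path and API key) routes those calls through the gateway as well.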