“Idea to Image” (Idea2Img) is a system for automatic image design and generation that uses GPT-4V(ision) for multimodal iterative self-refinement. It helps users turn high-level creative ideas into effective text-to-image (T2I) prompts by iteratively exploring and revising those prompts, guided by feedback and by what the system learns about the characteristics of different T2I models. Through repeated cycles of prompt revision and draft image synthesis, Idea2Img improves the semantic and visual quality of generated images and supports complex input ideas that include interleaved image-text sequences and design instructions. In user preference studies, its results are preferred over those of traditional T2I models.
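
The cycle of prompt revision, draft image synthesis, and feedback described above can be pictured as a simple loop. The following is only a minimal sketch under stated assumptions, not the system's actual implementation: the names `refine_idea`, `Round`, `revise_prompt`, `synthesize`, and `critique` are hypothetical, and the three callables stand in for the roles that a multimodal model such as GPT-4V and a T2I model would play. The toy stand-ins at the bottom make no real model calls and exist only so the loop can run.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional


@dataclass
class Round:
    """One refinement cycle: the prompt tried, the draft image, and the feedback."""
    prompt: str
    image: Any
    feedback: Optional[str]  # None means the draft was accepted


def refine_idea(
    idea: str,
    revise_prompt: Callable[[str, List[Round]], str],
    synthesize: Callable[[str], Any],
    critique: Callable[[str, str, Any], Optional[str]],
    max_rounds: int = 3,
) -> List[Round]:
    """Run revise -> synthesize -> critique cycles (hypothetical interface).

    The history of earlier rounds is passed to `revise_prompt`, so each new
    prompt can take previous feedback into account. The final result is the
    last entry of the returned list.
    """
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    history: List[Round] = []
    for _ in range(max_rounds):
        prompt = revise_prompt(idea, history)     # assumed: done by the multimodal model
        image = synthesize(prompt)                # assumed: done by the T2I model
        feedback = critique(idea, prompt, image)  # assumed: done by the multimodal model
        history.append(Round(prompt, image, feedback))
        if feedback is None:                      # draft accepted, stop early
            break
    return history


# Toy stand-ins (assumptions for illustration only; no model is called).
def toy_revise(idea: str, history: List[Round]) -> str:
    # Fold every piece of feedback received so far into the prompt.
    hints = [r.feedback for r in history if r.feedback]
    return ", ".join([idea] + hints)


def toy_synthesize(prompt: str) -> str:
    # A string placeholder plays the role of the draft image.
    return f"<image for: {prompt}>"


def toy_critique(idea: str, prompt: str, image: Any) -> Optional[str]:
    # Accept once the prompt mentions the desired style; otherwise ask for it.
    return None if "watercolor" in prompt else "watercolor style"


if __name__ == "__main__":
    for r in refine_idea("a fox reading a book", toy_revise, toy_synthesize, toy_critique):
        print(r.prompt, "|", r.feedback)
```

Passing the full history to the revision step is one simple way to let later prompts reflect earlier feedback; how the actual system stores and uses that information is not specified here.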