
Computer-Aided Design (CAD) underpins the design of most physical products today, letting engineers turn 2D sketches into 3D models that can be tested and refined before final production. Mastering CAD software, however, is often difficult: its complexity and extensive command options demand a significant investment of time and practice.
Researchers at MIT are working to ease that learning curve by developing an AI model that operates CAD software much like a human user. Given a 2D sketch, the system can navigate the software’s many options to produce a 3D version of the drawn object.
The MIT team introduced a new dataset called VideoCAD, comprising more than 41,000 examples of 3D models being built in CAD. Trained on these recordings, the AI can carry out CAD tasks much as a human would, making the software significantly more accessible. The work is a step toward a “CAD co-pilot,” envisioned as a tool that can both suggest next actions and automate tedious sequences that would otherwise require extensive manual work.
Ghadi Nehme, a graduate student in the Department of Mechanical Engineering at MIT, notes, “There’s an opportunity for AI to increase engineers’ productivity as well as make CAD more accessible to more people.” Faez Ahmed, an associate professor involved in the project, emphasizes that lowering the barriers to design would let people without extensive CAD training build 3D models more easily and unleash their creativity.
The team’s research will be presented at the upcoming Conference on Neural Information Processing Systems (NeurIPS) in December.
Building on recent advances in AI-driven user interface (UI) agents that can carry out tasks within software, such as compiling information in Excel, the team asked whether similar agents could navigate the far more complex interface of CAD software. Their goal was a UI agent that could execute the right sequence of commands to turn a 2D sketch into its 3D counterpart.
They started with an existing dataset of human-designed CAD objects, which recorded the high-level commands, such as “sketch line” and “circle,” used to create each object. But they soon found that effective training required capturing the finer details of how those commands play out in the interface. Their system therefore translates the abstract commands into concrete actions within the software.
As Nehme explains, this lets the AI break a high-level instruction down into specific click-and-drag actions. The researchers then generated more than 41,000 videos documenting human interactions with CAD, capturing the real-time clicks, mouse movements, and keyboard inputs that served as the model’s training data.
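To make that translation step concrete, here is a minimal, illustrative Python sketch of how a single high-level command might be expanded into the low-level interface events an agent would replay. The command name, toolbar coordinates, and event format are assumptions for illustration only, not the actual VideoCAD schema.

from dataclasses import dataclass

@dataclass
class UIEvent:
    """One low-level interface event: a cursor move, click, drag, or key press."""
    kind: str           # "move", "click", "drag", or "key"
    x: float = 0.0      # normalized screen coordinates (0-1), illustrative only
    y: float = 0.0
    key: str = ""       # key name when kind == "key"

def expand_sketch_line(x0, y0, x1, y1):
    """Expand an abstract 'sketch line' command into concrete UI events.
    The toolbar position and the exact event sequence are hypothetical."""
    return [
        UIEvent("click", 0.05, 0.10),      # select the (assumed) Line tool in the toolbar
        UIEvent("move", x0, y0),           # move the cursor to the start point
        UIEvent("click", x0, y0),          # place the first endpoint
        UIEvent("drag", x1, y1),           # drag toward the second endpoint
        UIEvent("click", x1, y1),          # place the second endpoint
        UIEvent("key", key="escape"),      # leave the sketching tool
    ]

for event in expand_sketch_line(0.30, 0.40, 0.60, 0.40):
    print(event)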
After training on this dataset, dubbed VideoCAD, the AI model can turn a 2D sketch into a fully built 3D shape by operating the CAD software directly. The resulting designs range in complexity from simple components, such as brackets, to intricate architectural structures, and further training could extend the model’s capabilities.
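One way to picture such an agent at inference time is as a closed loop: given the 2D sketch and the current state of the CAD window, a trained policy predicts the next UI action, which is executed before the next observation is taken. The sketch below illustrates that loop with a stand-in policy and a dummy executor; the class and function names are hypothetical and do not come from the paper.

from dataclasses import dataclass

@dataclass
class Action:
    """A single predicted UI action; 'done' marks the end of the rollout."""
    kind: str
    x: float = 0.0
    y: float = 0.0
    key: str = ""
    done: bool = False

class ScriptedPolicy:
    """Stand-in for a trained agent: replays a short, fixed action script."""
    def __init__(self):
        self._script = [
            Action("click", 0.05, 0.10),             # pick a sketch tool (illustrative)
            Action("click", 0.30, 0.40),             # first point
            Action("click", 0.60, 0.40),             # second point
            Action("key", key="escape", done=True),  # finish
        ]
        self._step = 0

    def next_action(self, sketch_image, screenshot):
        action = self._script[min(self._step, len(self._script) - 1)]
        self._step += 1
        return action

def run_agent(policy, sketch_image, max_steps=500):
    """Closed-loop rollout: predict an action, 'execute' it, observe again."""
    screenshot = None             # a real system would capture the CAD window here
    executed = []
    for _ in range(max_steps):
        action = policy.next_action(sketch_image, screenshot)
        executed.append(action)   # a real executor would replay this in the CAD UI
        if action.done:
            break
    return executed

actions = run_agent(ScriptedPolicy(), sketch_image="bracket_sketch.png")
print(f"executed {len(actions)} actions")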
Mehdi Ataei, a senior research scientist at Autodesk Research who was not part of the study, remarked, “VideoCAD is a valuable first step toward AI assistants that help onboard new users and automate repetitive modeling work that follows familiar patterns.” He envisions successors that operate across multiple CAD systems, cover richer functionality, and handle the nuanced workflows that human designers typically use.