In recent years, foundation models, trained on massive datasets, have revolutionized cutting-edge AI research, from natural language processing and computer vision to robotics. Specifically, large language models (LLMs), which focus on understanding and generating human-like text, have transformed chatbots, translation systems, and content creation. Multi-modal models, in turn, transcend the limitations of a single data type, integrating text, images, and sometimes audio to create more comprehensive AI systems. These models can understand context, make connections across modalities, and interact with users in more nuanced and sophisticated ways. In this roundtable, we will discuss these foundation models, their reasoning and cognitive abilities, the limitations they face on the path to AGI, and their exciting future.
Researcher @ Google DeepMind
Associate Professor @ Sharif University of Technology
Chief AI Officer @ Tapsage