05 June 2025 14:00 - 14:30
Beyond the text: Multimodal generative AI for future intelligence business applications
Generative AI is moving beyond text-only models toward multimodal systems that integrate vision, speech, and structured data. This session surveys current developments in multimodal AI, including vision-language models, generative AI for tabular data, and AI-driven workflow automation.
We will look at how businesses can use LLMs, transformers, and diffusion models to build next-generation AI agents capable of comprehending and generating content across multiple modalities. Use cases highlight how financial services, healthcare, and cloud platforms are using multimodal AI to improve customer experiences, automate complex workflows, and accelerate business transformation.
The session offers actionable insights into building scalable multimodal AI pipelines, integrating APIs, and using foundation models such as GPT, DALL·E, and Gemini to build intelligent enterprise applications.