In recent months, you may have heard the phrase ‘multimodal AI’ in Learning & Development. You also might not have, and that’s okay too! The 2024 AI Index Report highlighted it as an emerging AI technology. Many people are, quite rightly, still getting their heads around what AI is, what it means for them and how they may (or may not) use it.
But such is the pace of technological advancement that it would be fair to say there’s no such thing as ‘sitting back and relaxing’ anymore.
The long and short of multimodal AI
In short, multimodal AI combines various modes of data, such as text, images and audio. This allows the AI model to gain a more comprehensive understanding of content and context, which in turn allows for more complex and nuanced outputs.
For a longer, perhaps more engaging explanation (we’ll let you decide!), we’ll use omelets to describe how multimodal AI differs from text-based AI.
A text-based omelet
A user tells AI that they want to make an omelet, so the AI gives them the recipe and method.
The user follows the recipe and makes the omelet, maybe asking the odd question along the way in text form with no other inputs. The result? An omelet. Could be a good omelet, could be a bad omelet, we don’t know.
A multimodal omelet
In this case, a user tells the AI that they want to make an omelet and, like last time, the AI gives them the recipe and method. The user follows the recipe again and starts making the omelet.
They then take a photo of it partway through cooking, share this with the AI and ask if it’s nearly ready. The AI takes in this image and, based on what it sees, responds to let the user know they need to flip the omelet. The result? A better cooked omelet. Maybe… it depends on the chef, but you get what we mean.
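For the more technically minded, the difference between the two omelets can be sketched in a few lines of Python. This is only an illustration of the shape of the input: the `Part` and `Message` names and the photo file name are made up for this example, and nothing here calls a real AI service.

```python
from dataclasses import dataclass, field


@dataclass
class Part:
    kind: str     # "text", "image" or "audio"
    content: str  # the text itself, or a file path for other kinds


@dataclass
class Message:
    parts: list[Part] = field(default_factory=list)

    def modalities(self) -> set[str]:
        # Collect the distinct kinds of data in this message
        return {part.kind for part in self.parts}

    def is_multimodal(self) -> bool:
        # More than one kind of data means the message is multimodal
        return len(self.modalities()) > 1


# The text-based omelet: words only
text_only = Message([Part("text", "How do I make an omelet?")])

# The multimodal omelet: a question plus a photo (hypothetical file name)
with_photo = Message([
    Part("text", "Is my omelet nearly ready?"),
    Part("image", "omelet_in_pan.jpg"),
])

print(text_only.is_multimodal())   # False
print(with_photo.is_multimodal())  # True
```

The point is simply that the second message carries two kinds of data, so a multimodal model has more to work with when it answers.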
Great, but where does multimodal AI fit in L&D?
Of course, making omelets isn’t part of the L&D job description, so let’s take this and apply it to L&D.
Some early practical applications of multimodal AI in L&D
Personalized skills paths
- Multimodal AI can analyze learners’ interactions with different modalities (text, images, videos) to recommend personalized skills, resources and pathways.
- Multimodal AI can adapt assessment formats, providing tailored feedback and support throughout.
Skill Development and Performance Support
- Multimodal AI job aids can provide employees with multimedia resources to support skill development on the job.
- Multimodal AI can analyze performance data from various sources to identify skill gaps, recommend targeted skills interventions, and track progress over time.
Emotional Intelligence and Soft Skills Training
- Multimodal AI can analyze facial expressions, tone of voice, and language patterns to provide feedback on emotional intelligence skills and interpersonal communication.
- AI-powered simulations can recreate real-world scenarios for practicing negotiation, conflict resolution, and other soft skills in a multimodal environment.
Accessibility and Inclusivity
- Multimodal AI can enhance accessibility by providing alternative modalities for content consumption, such as audio descriptions for visual content or text-to-speech for written materials.
- Multimodal AI can assist learners with language barriers by offering translations, subtitles, and visual aids to improve comprehension and engagement.
Steps to using multimodal AI in your L&D
1) Research and identify needs
NOTE: This one is the most important! You shouldn’t go straight for multimodal AI without understanding your ‘why’.
- Define specific learning objectives, challenges and opportunities where multimodal AI can enhance L&D initiatives.
- Research practical applications of multimodal AI in L&D to align with organizational goals.
2) Collaborate with AI experts
- Partner with technology vendors specializing in AI to explore potential solutions.
- Participate in industry events and training sessions to network with experts and stay updated with advancements.
3) Evaluate tech providers
- Research and evaluate AI technology providers offering multimodal solutions tailored to L&D requirements.
- Request demos to see the solutions in action.
4) Develop a business case
- Articulate how multimodal AI can support overall business objectives.
- Identify potential ROI, cost savings, efficiency gains and competitive advantages associated with using the technology.
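As a rough starting point for the ROI part of the business case, the usual calculation (benefit minus cost, divided by cost) can be sketched in Python. The function name and the figures below are hypothetical placeholders for illustration only, not benchmarks; swap in your own estimates.

```python
def simple_roi(total_benefit: float, total_cost: float) -> float:
    """Return ROI as a fraction: (benefit - cost) / cost."""
    if total_cost <= 0:
        raise ValueError("total_cost must be positive")
    return (total_benefit - total_cost) / total_cost


# Hypothetical annual figures, for illustration only
annual_cost = 20_000.0     # e.g. licences plus set-up time
annual_benefit = 26_000.0  # e.g. estimated savings and efficiency gains

roi = simple_roi(annual_benefit, annual_cost)
print(f"Estimated ROI: {roi:.0%}")  # Estimated ROI: 30%
```

A single figure like this won’t capture everything (competitive advantage is hard to put a number on), but it gives stakeholders a simple, comparable headline.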