Real-time Digital Human for Dementia Care
End-to-end AI companion system with an STT + GPT-4o + TTS pipeline, RAG-enhanced dialogue, and an NVIDIA Audio2Face + UE5 avatar. Finalist in the 2025 Cougar Investigator Grant Challenge.
Problem
Dementia patients lack consistent, responsive companionship. Caregivers face burnout, and existing digital solutions cannot sustain natural, multi-turn voice interactions with emotional expressiveness.
Action
Built an end-to-end real-time pipeline: STT for speech recognition (92% average accuracy), GPT-4o for contextual dialogue generation, and TTS for natural voice output. Added fine-tuning and RAG, which improved context retention and recall accuracy by 25%. Combined NVIDIA Audio2Face with Unreal Engine 5 for lip-sync and facial expression animation, boosting realism and immersion by 30%.
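The turn loop described above (STT, retrieval, LLM, TTS) can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `Companion` class, the `retrieve` helper, and the prompt layout are all hypothetical, the three stages are passed in as plain callables rather than real STT, GPT-4o, or TTS clients, and word overlap stands in for whatever vector search the real RAG component uses.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase words with trailing punctuation removed."""
    return {w.strip(".,!?").lower() for w in text.split() if w.strip(".,!?")}


def retrieve(query: str, memories: List[str], k: int = 2) -> List[str]:
    """Rank stored memories by word overlap with the query.

    Assumption: a real system would use embeddings; overlap keeps the
    sketch dependency-free.
    """
    q = tokenize(query)
    scored = [(len(q & tokenize(m)), m) for m in memories]
    scored = [s for s in scored if s[0] > 0]
    scored.sort(key=lambda s: s[0], reverse=True)
    return [m for _, m in scored[:k]]


@dataclass
class Companion:
    # Hypothetical stage interfaces: audio -> text, prompt -> reply, reply -> audio.
    stt: Callable[[bytes], str]
    llm: Callable[[str], str]
    tts: Callable[[str], bytes]
    memories: List[str] = field(default_factory=list)
    history: List[str] = field(default_factory=list)

    def build_prompt(self, user_text: str) -> str:
        """Combine retrieved facts, recent turns, and the new utterance."""
        facts = retrieve(user_text, self.memories)
        lines = ["Known facts:"] + [f"- {f}" for f in facts]
        lines += ["Conversation:"] + self.history[-6:]
        lines.append(f"User: {user_text}")
        return "\n".join(lines)

    def turn(self, audio: bytes) -> bytes:
        """Run one voice turn and return synthesized audio for the avatar."""
        user_text = self.stt(audio)
        reply = self.llm(self.build_prompt(user_text))
        self.history += [f"User: {user_text}", f"Companion: {reply}"]
        return self.tts(reply)
```

In this sketch the audio returned by `turn` is what would be handed to the avatar layer for lip-sync; keeping each stage behind a callable makes it easy to swap in stubs for testing.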
Result
Achieved smooth multi-turn voice interaction with a response latency of 3 to 4 seconds. Context retention and recall accuracy improved by 25% via fine-tuning and RAG. Avatar realism and immersion improved by 30% via Audio2Face + UE5. Selected as a finalist in the 2025 Cougar Investigator Grant Challenge (Shark Tank format).
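A per-stage timing breakdown is one way to see where an end-to-end response latency like the one above is spent. The sketch below is an assumption about how such a measurement could be done, not a record of how it was done here; `timed_turn` and the stage names are hypothetical, and the stages are ordinary callables.

```python
import time
from typing import Callable, Dict, List, Tuple


def timed_turn(
    stages: List[Tuple[str, Callable]],
    payload,
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[object, Dict[str, float]]:
    """Run stages in order, recording seconds spent in each one."""
    timings: Dict[str, float] = {}
    for name, stage in stages:
        start = clock()
        payload = stage(payload)  # output of one stage feeds the next
        timings[name] = clock() - start
    timings["total"] = sum(timings.values())
    return payload, timings
```

Called with a list such as `[("stt", stt), ("llm", llm), ("tts", tts)]`, it returns the final output together with a dictionary of per-stage and total durations, which shows which stage dominates a turn.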
Learnings
Deepened understanding of multimodal AI pipeline integration (STT → LLM → TTS → Avatar), real-time streaming architectures, and designing empathetic systems for vulnerable populations.
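One common pattern in streaming voice pipelines is to forward each complete sentence to TTS as soon as the LLM has produced it, rather than waiting for the full reply. The sketch below illustrates that idea only; it is not taken from the project, `sentences_from_stream` is a hypothetical helper, and the punctuation-based split is deliberately naive (it would also split on a decimal point, for example).

```python
from typing import Iterable, Iterator

SENTENCE_END = (".", "!", "?")


def sentences_from_stream(chunks: Iterable[str]) -> Iterator[str]:
    """Yield complete sentences from incrementally arriving text chunks."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        while True:
            # Position of the earliest sentence terminator, or -1 if none yet.
            hits = [i for i in (buffer.find(p) for p in SENTENCE_END) if i != -1]
            idx = min(hits, default=-1)
            if idx == -1:
                break
            sentence, buffer = buffer[: idx + 1].strip(), buffer[idx + 1 :]
            if sentence:
                yield sentence
    # Flush whatever remains once the stream ends.
    tail = buffer.strip()
    if tail:
        yield tail
```

Each yielded sentence could be sent to speech synthesis immediately, so audio and avatar animation for the first sentence can begin while later text is still being generated.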