Envisioning Virtual Human Usage in Medical Training in the Next 5 Years
The demand for higher-quality training for medical professionals is steadily increasing, and the expectation that our medical frontliners be wholly prepared from day one is now higher than ever. The COVID-19 pandemic has inevitably disrupted medical education as well, hindering the clinical exposure of our medical students.
One of the biggest contributors to the preparedness of our medical students is exposure to real patient situations. Sadly, this means that real patients, enduring their ailments, remain a necessary part of the medical training process.
Many professions beyond the medical community face similar issues; hospitality is one example. It is a challenge common to any front-facing profession.
That does not have to be the norm in the near future. We envision virtual humans as the first line of training for the professionals around us, building confidence and experience before the first human being is even introduced into the training process.
Creating the AI of Virtual Humans:
What makes us unique as individuals, and how can we create a realistic simulation of that?
A considerable number of factors determine who we are as individuals: the experiences we gain over the course of our lives, the communities we live in, and our social circles, which in turn shape our mannerisms, our day-to-day choices, our commitments, and so on.
This poses a big challenge to simulating a realistic virtual human. Which factors should we look at? How should they be categorized and labeled, given how interconnected these factors are in shaping an individual's response to a given prompt?
Take, for example, a patient describing chest pain: their upbringing, prior experiences with doctors, and cultural attitudes toward illness may all shape whether they downplay the symptom or describe it in detail.
What is currently used:
Thankfully, we have generated copious amounts of conversational data on the internet, in sources such as Reddit and Quora, which allows us to create NLP models that can simulate human conversation. Large language models such as GPT-3 and Megatron are trained on such internet data and are able to generate responses eerily similar to those of a real human behind the screen. These models take into account the history of the conversation you are having with them, and they can be pre-trained to respond appropriately in an expected situation.
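As a concrete illustration, conditioning such a model on a conversation usually amounts to concatenating a persona description with the running dialogue history and asking the model to continue it. The sketch below shows only that prompt-building step; the `VirtualPatient` class and persona text are hypothetical, and the actual call to a hosted language model is left out.

```python
# Minimal sketch of feeding conversation history to a large language model.
# Class name and persona text are illustrative, not a real API.

class VirtualPatient:
    def __init__(self, persona: str):
        self.persona = persona   # background "pre-conditioning" text
        self.history = []        # (speaker, utterance) pairs

    def build_prompt(self, user_input: str) -> str:
        """Concatenate persona and dialogue history into a single prompt."""
        self.history.append(("Student", user_input))
        lines = [self.persona]
        lines += [f"{speaker}: {text}" for speaker, text in self.history]
        lines.append("Patient:")  # cue the model to answer in character
        return "\n".join(lines)

patient = VirtualPatient(
    "You are a 58-year-old patient presenting with chest pain."
)
prompt = patient.build_prompt("Hello, what brings you in today?")
# `prompt` would now be sent to a hosted language model such as GPT-3.
```

Because every turn is appended to `history`, the model always sees the full conversation so far, which is what lets it stay consistent across a multi-turn exchange.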
Current issues with this approach:
Pre-training such language models to suit your needs is not for the faint of heart. Not only is it costly, but substantial trial and error is required to get the models to perform as you expect. This is difficult, both theoretically and practically, in situations where you need a large number of pre-trained models to simulate multiple conversations quickly and cost-effectively, all responding exactly as intended.
How we envision this changing in 5 years:
As we better understand how each factor contributes to the type of response a human will give, we will in turn be able to pre-train more efficiently and create more realistic models.
Methods of training-data collection and manipulation will also improve, allowing developers to create more realistic virtual human conversational models from a substantially smaller pool of data.
How we interact with virtual humans:
A variety of novel systems for interacting with virtual humans will be invented:
Interaction systems are generally too limited for realistic Virtual Human interactions:
Interactions with virtual humans are still largely limited to audio input, namely speech. Granted, other interaction systems are utilized as well, such as computer vision for face tracking, but they are limited in technical maturity.
Speech-focused conversational AI pipelines:
Companies such as Nvidia recently showcased their virtual avatar framework at GTC 2022, which is widely regarded as the cutting edge of virtual avatar interaction. Such frameworks generally follow a pipeline architecture: speech input is transcribed to text, a language model generates a response, and text-to-speech and facial animation drive the avatar's reply.
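A speech-focused pipeline of this kind can be sketched as a chain of stages. The function names below are illustrative stand-ins for real ASR, dialogue, and text-to-speech services, not Nvidia's actual API; each stage is stubbed so the data flow is visible.

```python
# Illustrative speech-to-speech pipeline for a virtual avatar.
# Every stage is a stub standing in for a real service.

def transcribe(audio: bytes) -> str:
    """Automatic speech recognition: audio in, text out (stubbed)."""
    return "Hello, how are you feeling today?"

def generate_response(text: str, history: list) -> str:
    """Dialogue model: produce a reply from text plus history (stubbed)."""
    history.append(text)
    return "I've had a persistent cough for three days."

def synthesize(text: str) -> bytes:
    """Text-to-speech: reply text in, audio for the avatar out (stubbed)."""
    return text.encode("utf-8")

def avatar_turn(audio_in: bytes, history: list) -> bytes:
    """One conversational turn: ASR -> dialogue model -> TTS."""
    user_text = transcribe(audio_in)
    reply_text = generate_response(user_text, history)
    return synthesize(reply_text)

history = []
audio_out = avatar_turn(b"<microphone audio>", history)
```

In a production system each stub would be a network call to a dedicated model, but the turn-by-turn structure stays the same.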
Encompassing more of non-verbal communications:
Of course, we do not converse using speech alone; a large portion of communication is non-verbal. For realistic conversations to be properly replicated, such inputs must be accounted for.
The growth of models targeting non-verbal communications:
In the years to come, we envision that more models specifically targeting non-verbal communication will be created, and that hardware supporting the recording of such communicative cues will become more widespread.
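One plausible way such non-verbal models could feed into the conversational pipeline is by annotating each utterance with detected cues before it reaches the language model. The cue label and function below are purely hypothetical, a sketch of the fusion step rather than any existing system.

```python
# Hypothetical fusion of verbal and non-verbal channels: a cue detected by
# (say) face tracking is appended to the utterance so the dialogue model
# can condition on it.

from typing import Optional

def annotate_utterance(text: str, nonverbal_cue: Optional[str]) -> str:
    """Attach a detected non-verbal cue to the transcribed text."""
    if nonverbal_cue is None:
        return text
    return f"{text} [{nonverbal_cue}]"

annotated = annotate_utterance("I'm fine.", "avoids eye contact")
# The dialogue model would then see: "I'm fine. [avoids eye contact]"
```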
How we experience virtual humans
The computer mediums used to experience virtual human interactions will evolve:
We are currently limited by how we experience virtual human interactions:
When interacting with a virtual human, the user is constantly reminded that they are interacting through a computer medium, and at a much lower level of interactivity than in real life.
Most interactions with the closest representation of virtual humans, virtual avatars, are experienced through our smartphones, smart booths, and personal IT devices.
Improving the way we simulate visuals, voice and touch:
Interactivity is improving across the three corresponding senses: sight, sound, and touch. VR/AR technology has improved by leaps and bounds when it comes to sight and sound, and many promising technologies for simulating touch are being brought to market.
We envision the future of virtual humans to be highly realistic, able to respond to a wide variety of prompts and inputs. In the next 5 years, there will be large strides in language models, in the interactivity of virtual humans, and in their level of fidelity.
However, despite all the progress in deep learning, advanced simulation, and AR/VR platforms, it is still far from easy to create medically accurate virtual humans for scenario training.
Here at MediVR, we are striving to make this process easier, while improving upon the quality of our virtual humans. So do join us in this exciting journey, and envision the change in this industry together.
Interested to join us in building virtual humans for medical training? We're hiring!