Amazon has announced that its digital assistant Alexa is now technically able to mimic any voice based on a few clips totaling about one minute of recording.
If it works as advertised, it would be a technical milestone, since computer-generated voices don’t really fool humans and are still a poor choice for video voice-overs, for example. You can watch the live demo in the following video (timecode 1:02:38):
In fact, I found that it transforms the pressure/tone of the voice from that of the typical machine-generated agent to match the target human voiceprint. It’s not easy to quantify the demonstration’s success without knowing the original voice, but it seemed reasonably successful, although it still sounded a bit robotic.
The sentence was undoubtedly well chosen for the demo, as it lends itself to slow, almost robotic reading. The technology is similar to the AIs used to turn your images into Picasso paintings, but applied to an audio stream.
It might sound fun to have Alexa speak in the voice of your favorite celebrity, friend, or family member. However, the internet quickly turned its attention to the use of voice clips of deceased family members. This is the use case presented by an Amazon executive in the video above.
On the one hand, hearing the voice of a loved one who is no longer with us may sound like a healing experience. However, it is also a potentially slippery slope with unintended consequences. Many people began to question whether the technology could be misused to impersonate living people and whether we have the right to use voices without consent.
The answer is probably “it depends” on the situation and the users. However, one thing is certain: these technologies exist and are only getting better. It’s only a matter of time before synthesized voices become indistinguishable from human ones.