Author Topic: How to get consecutive voice clips to sound natural  (Read 276 times)
Foolish Mortals
Level 0

« on: September 26, 2019, 11:35:58 AM »

In the game I'm working on I have to play several short consecutive voice-clips to form a complete sentence. Example (each <> bracket is a different voice clip):

"<Bob here,> <we're at>  <some town> <and are on our way to> <some city>."

Stitching together different voice-clips like this makes the sentence sound stilted and disconnected. This is because there are unnatural pauses when switching clips, and the pitch and tone of the speaker change from clip to clip.

My current efforts include two methods for removing the unnatural pauses:

1. starting the next clip early if a silence is detected at the end of the preceding clip
2. skipping the first few milliseconds of the new clip up to the first detected 'sound'.

These work OK at removing the unnatural pauses, but deciding what counts as 'silence' is difficult, especially when dealing with multiple voice-actors and microphones.

How could I make stitching together voice-clips sound more natural? Any advice would be appreciated. This has to be done in real-time inside the game (I'm using Unity if that matters), and can't be pre-processed or done ahead of time.
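For reference, the two pause-removal methods described above can be sketched roughly as follows. This is only an illustration in Python on plain lists of float samples, not Unity code; the function names (`find_voiced_span`, `schedule_clips`), the frame size, and the -30 dB value are all assumptions made for the example. The one idea it adds is measuring silence relative to the loudest frame of each clip rather than against a fixed level, which may help when clips come from different actors and microphones.

```python
import math

def frame_rms(samples, start, size):
    """Root-mean-square level of one frame of samples."""
    frame = samples[start:start + size]
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))

def find_voiced_span(samples, frame_size=256, rel_threshold_db=-30.0):
    """Return (start, end) sample indices bounding the audible part of a clip.

    The threshold is relative to the loudest frame in this clip, so a quiet
    recording and a loud recording are treated the same way.
    frame_size and rel_threshold_db are illustrative guesses, not tuned values.
    """
    starts = range(0, len(samples), frame_size)
    levels = [frame_rms(samples, s, frame_size) for s in starts]
    peak = max(levels, default=0.0)
    if peak == 0.0:
        return (0, 0)  # the clip is entirely silent
    threshold = peak * 10 ** (rel_threshold_db / 20.0)
    voiced = [i for i, level in enumerate(levels) if level >= threshold]
    first, last = voiced[0], voiced[-1]
    return (first * frame_size, min((last + 1) * frame_size, len(samples)))

def schedule_clips(clips, sample_rate, gap_seconds=0.0):
    """Return one (start_time, skip_seconds) pair per clip.

    start_time is when the clip should begin playing, and skip_seconds is how
    far into the clip playback should begin (method 2). Each clip starts as
    soon as the audible part of the previous one ends (method 1).
    """
    schedule = []
    cursor = 0.0
    for samples in clips:
        start, end = find_voiced_span(samples)
        schedule.append((cursor, start / sample_rate))
        cursor += (end - start) / sample_rate + gap_seconds
    return schedule
```

In a real game the spans would probably be computed once when each clip is loaded, and a small positive `gap_seconds` may sound more natural than none at all.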

Level 3

Iron Synth Chef & Voltage Architect

« Reply #1 on: September 30, 2019, 04:40:25 PM »

Interesting question...
Are you stuck with the current audio recordings?
I could imagine that if the lines are recorded/spoken in a certain way, with deliberate pauses etc., it could sound natural, but unfortunately I don't have any advice beyond that.

Richard Kain
Level 10

« Reply #2 on: October 15, 2019, 10:22:41 AM »

The tone isn't something that you can really edit extensively. But the pitch and the timing are both manageable. Also, the delay you are experiencing when switching samples might be due to decompression. For speech that needs to avoid delays it is common to store those files as WAV files, with as little compression as possible. That is the easiest way to avoid delays when loading an audio file up for playback.

If you record your audio samples at a reasonably consistent pitch, you can pick an "average" pitch value and apply it to those samples during playback. Most modern audio systems allow you to adjust the pitch of audio samples at run-time. If there is too much variance between the pitch of each file, you could also apply a per-sample pitch adjustment. This would require a bit more setup for each instance, but should improve your results.
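As a rough sketch of the per-sample adjustment, assuming you already have an estimated speaking pitch in Hz for each clip (measured offline or by some pitch detector, neither of which is covered here), the multipliers could be computed like this. The function name `pitch_multipliers` and the 10% clamp are assumptions for illustration, and Python is used only to show the arithmetic.

```python
import math

def pitch_multipliers(clip_pitches_hz, max_shift=0.1):
    """Return (target_hz, multipliers) that pull each clip toward a shared pitch.

    The target is the geometric mean of the clip pitches, because pitch is
    perceived on a ratio scale. Each multiplier is clamped to
    1.0 +/- max_shift so that no clip is shifted far enough to sound
    artificial. max_shift=0.1 is an illustrative guess, not a tuned value.
    """
    logs = [math.log(p) for p in clip_pitches_hz]
    target = math.exp(sum(logs) / len(logs))
    multipliers = []
    for pitch in clip_pitches_hz:
        m = target / pitch
        m = max(1.0 - max_shift, min(1.0 + max_shift, m))
        multipliers.append(m)
    return target, multipliers
```

In Unity each multiplier would be assigned to `AudioSource.pitch` before the clip plays. Note that this property changes playback speed along with pitch, so a clip shifted up also gets shorter, and any timing you computed for it needs to be divided by the same multiplier.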