A matter of approach: procedural sound design in video games

Igor Dall'Avanzi
Oct 12, 2016

Realism and immersion are two common goals that have marked the history of video game development (Collins, 2008, p. 133/134). This quest can be observed on a graphical level, thanks to the photorealism reached by modern AAA titles, and on an interface level, thanks to the advancement of virtual reality, which promises to “Step [players] into incredible virtual worlds and experience entertainment in new and extraordinary ways” (Sony, 2016).

Could we say the same of the sonic elements of video games then?

The way sounds are handled today is still largely based on sample playback and processing (Farnell, 2007, p. 14/15; Marks, 2009, p. 6), an approach that dates back to the mid-1990s, when games started being released on CDs (Collins, 2008, p. 68/69).

A different approach

Let’s think for a moment about how the interactions and movement of models are handled in video games. For each thing inside the game world a model is built. This model is based on a skeleton that can be moved at its joints in order to create animations.

A programmer will use this skeleton to create different animation assets, like “walk”, “run”, “jump” or “attack”. The animations are then programmed to react to game inputs and interpolated so that they switch seamlessly. Moreover, different textures can be put on the same skeleton to obtain different characters (e.g. different types of humanoids or bears; pictures 1 and 2 show two different textures (meshes) on the same skeleton).

When the game is played, the engine moves the player character and the NPCs inside the game world, knowing where they are and what they are doing, working out how they interact, and handling how light falls on them and the shadows they cast.

In the following two GIFs a run animation is shown on the two characters. Once it’s programmed, it can be used for all the assets that share the same skeleton. Also, notice that the movement of the shadow is not implemented by hand, but handled entirely by the engine (Unreal Engine 4).

If we were to treat animations the way audio is handled, we would probably have a movie file for each character animation, played back when a trigger is received (a “walk” movie for walk, a “run” movie for run, etc.). We would probably need one file for each type of character or object in the game, and then we would apply some randomized visual effects to make them less repetitive, for example a continuous small variation in the frame rate. We would have to build visual crossfades to move from one animation to another, and to get proper 3D modeling we would need a lot of video files, at least one for each perspective, and a way to merge them.

The difference between the two approaches is that the first focuses on imitating the process inside the engine (how movement happens), while the second captures the process in the real world and plays it back in the game. The second is exactly how audio is implemented in games: the sounds needed are recorded, and then variable playback and digital signal processing are applied to avoid repetition and save memory space.
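To make the sample-based approach concrete, here is a minimal Python sketch of the kind of randomized playback described above. It is only an illustration under my own assumptions: the function name `varied_playback` and the spread values are hypothetical and are not taken from any engine or middleware.

```python
import random


def varied_playback(sample, rng, pitch_spread=0.05, gain_spread=0.1):
    """Return a copy of a recorded sample with slightly random pitch and gain.

    `sample` is a list of floats; the spreads are hypothetical defaults.
    """
    # A playback rate other than 1.0 shifts pitch (and length) a little.
    rate = 1.0 + rng.uniform(-pitch_spread, pitch_spread)
    gain = 1.0 + rng.uniform(-gain_spread, gain_spread)
    out = []
    pos = 0.0
    while pos < len(sample) - 1:
        i = int(pos)
        frac = pos - i
        # Linear interpolation between the two neighbouring recorded values.
        out.append(gain * ((1.0 - frac) * sample[i] + frac * sample[i + 1]))
        pos += rate
    return out
```

Every call reads the same recording back slightly differently, which hides repetition but never produces a sound that was not already in the file.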

Procedural sound design, instead, is more like creating a “model” for each type of sound, so that the game can create sounds itself in real time and add the variations they need (Kastbauer, 2010).

In the following video Graham Gatheral shows a procedural sound design approach. Instead of recording a lot of samples for each possible interaction with the barrel, the sound is generated by a synthesizer that takes values like height or acceleration to modulate its sound.

In this way, the synthesizer that handles the sounds of the barrel could be part of the asset, like its skeleton or its textures, saving recording and programming time when the actor is spawned inside the game world or new interactions are added.
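As a rough idea of what such a synthesizer could look like, here is a small Python sketch of an impact sound built from a few decaying sine waves (a simple form of modal synthesis). It is not Gatheral's implementation: the mode table `BARREL_MODES`, the function names and the velocity scaling are all assumptions made up for illustration.

```python
import math

SAMPLE_RATE = 44100

# Hypothetical modal data for a metal barrel:
# (frequency in Hz, decay rate in 1/s, relative amplitude)
BARREL_MODES = [(180.0, 6.0, 1.0), (410.0, 9.0, 0.6), (935.0, 14.0, 0.35)]


def impact_velocity(drop_height_m, gravity=9.81):
    """Speed of an object after falling freely from the given height."""
    return math.sqrt(2.0 * gravity * drop_height_m)


def synthesize_impact(velocity, duration_s=0.5, modes=BARREL_MODES):
    """Sum of decaying sines; harder hits are louder and brighter."""
    # Assumed mapping: 10 m/s or more counts as the hardest possible hit.
    hardness = min(velocity / 10.0, 1.0)
    out = []
    for i in range(int(duration_s * SAMPLE_RATE)):
        t = i / SAMPLE_RATE
        value = 0.0
        for index, (freq, decay, amp) in enumerate(modes):
            # Higher modes are attenuated more strongly on soft hits.
            brightness = hardness ** index
            value += (amp * brightness * math.exp(-decay * t)
                      * math.sin(2.0 * math.pi * freq * t))
        out.append(hardness * value)
    return out
```

Calling `synthesize_impact(impact_velocity(2.0))` gives a louder, brighter hit than `synthesize_impact(impact_velocity(0.2))`, with no extra recordings involved: the game value drives the sound directly.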

The semiotic problem

Before going deeper into some examples of how procedural sound design is used in video games, a definition is needed. Because this is a recent field, the terms procedural, algorithmic and generative are applied to audio in different ways. Farnell (2007) lists and explains all types of linear and non-linear audio, classifying procedural as “non-linear, often synthetic sound, created in real time according to a set of programmatic rules and live input”. The concept is reinforced by Weir (2016), sound designer and audio director at Hello Games, for whom procedural audio is “real time sound creation using synthesis techniques such as physical modeling with deep links into game system”.

While Farnell’s definition also includes non-synthetic systems, in this article I will talk about examples of procedural audio based on real-time synthesis and how they could affect the industry.

Procedural sound design in the industry

GTA V

In GTA V, 30% of the audio assets were made procedurally, thanks to the new real-time synthesis toolkit implemented in RAGE, the proprietary audio middleware developed by Rockstar and used in games like GTA IV and Red Dead Redemption. The main aims of using synthesis were to make assets respond to game events better than would have been possible with samples, and to save memory in the process.

The system was used to create sounds like door slams, air conditioning units, bicycles and even “a procedural dubstep generator” that “can create infinite dubstep” (MacGregor, 2014). The tool was also really useful for handling noisy sounds, MacGregor said, for which the compression used to save memory led to unwanted artifacts.

It is no surprise that this approach came from one of the biggest companies in the industry, which invests budget and resources in audio and has a dedicated team made up of both programmers and sound designers. One of the major factors against procedural audio is the skill set it requires, which includes both programming and synthesis techniques, as well as a solid understanding of how sounds are generated in the physical world (Farnell, 2007, p. 13; Nair, 2012).

No Man’s Sky’s Vocalien plugin

No Man’s Sky’s Vocalien has been discussed in one of my previous posts. To give voice to the procedurally generated creatures of No Man’s Sky, the audio team at Hello Games created a custom plugin for Wwise that creates a wave and processes it according to the traits of the creature itself, like its size or type, giving a different but appropriate voice to every one of them.

This is a perfect example of how real-time synthesis can cover a large number of unpredictable assets while saving memory space.
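The internals of Vocalien are not described here, so the following Python sketch only illustrates the general idea of deriving a voice from creature traits. Every name, trait and formula in it (for instance the rule that pitch falls as size grows) is an assumption of mine, not the plugin's actual behaviour.

```python
import math
import random

SAMPLE_RATE = 22050


def voice_parameters(size_m, creature_seed):
    """Map hypothetical creature traits to synthesis parameters."""
    # Seeding from the creature means the same creature always
    # gets the same voice, without storing any audio for it.
    rng = random.Random(creature_seed)
    # Assumed rule of thumb: bigger bodies produce lower voices.
    base_pitch_hz = 600.0 / max(size_m, 0.1)
    return {
        "pitch_hz": base_pitch_hz * rng.uniform(0.9, 1.1),
        "duration_s": 0.2 + 0.1 * size_m,
        "vibrato_hz": rng.uniform(4.0, 9.0),
    }


def synthesize_call(params):
    """Render one call as a sine wave with vibrato and a smooth envelope."""
    n = int(params["duration_s"] * SAMPLE_RATE)
    out = []
    phase = 0.0
    for i in range(n):
        t = i / SAMPLE_RATE
        # Vibrato: the pitch wobbles by up to 5% around its centre.
        wobble = 0.05 * math.sin(2.0 * math.pi * params["vibrato_hz"] * t)
        phase += 2.0 * math.pi * params["pitch_hz"] * (1.0 + wobble) / SAMPLE_RATE
        envelope = math.sin(math.pi * i / n)  # fade in, then fade out
        out.append(envelope * math.sin(phase))
    return out
```

The only data stored per creature is a handful of numbers, which is where the memory saving comes from.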

Audio Gaming’s Audio Weather

Audio Weather is “the first procedural audio for video games weather system” (AudioGaming, 2016), released by Audio Gaming as a plug-in for FMOD. The system is based on physical modeling and generates a continuously changing environmental soundscape through synthesis; elements like wind or rain intensity can be linked to game events.

The company has also built more complex plug-ins for digital audio workstations, like Audiowind or AudioRain. These are examples of how procedural audio can improve the workflow in linear media: instead of going through hours of recorded files to find the sound that best suits a certain scene, sound designers can work directly on the parameters of the sound itself, as they would with a synthesizer.
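To show what working on a parameter such as wind intensity can mean in practice, here is a deliberately simple Python sketch: white noise passed through a one-pole low-pass filter, with one `intensity` value controlling both brightness and level. This is far simpler than the physical modeling used by the commercial tools, and the function name and parameter ranges are my own assumptions.

```python
import math
import random

SAMPLE_RATE = 22050


def synthesize_wind(intensity, duration_s=1.0, seed=0):
    """Filtered noise whose tone and level follow `intensity` (0 to 1)."""
    intensity = max(0.0, min(1.0, intensity))
    rng = random.Random(seed)
    # Assumed mapping: stronger wind sounds brighter and louder.
    cutoff_hz = 200.0 + 1800.0 * intensity
    level = 0.1 + 0.9 * intensity
    # Smoothing coefficient of a one-pole low-pass filter.
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / SAMPLE_RATE)
    state = 0.0
    out = []
    for _ in range(int(duration_s * SAMPLE_RATE)):
        noise = rng.uniform(-1.0, 1.0)
        state += alpha * (noise - state)  # low-pass the white noise
        out.append(level * state)
    return out
```

A game event (or a sound designer moving a slider) only has to change `intensity`; there is no library of recordings to search through.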

Conclusions

Procedural sound design could allow a faster, more realistic and more memory-efficient integration of audio in video games (Farnell, 2007; Kastbauer, 2010; Nair, 2012; Nair, 2014), but it is more demanding in terms of CPU usage and development time; moreover, it is still difficult to reach the “cinematic realism” often used in video games (Collins, 2008, p. 134; Farnell, 2007, p. 24).

Another limit to its adoption is that it is a very recent field that merges different types of knowledge, in particular computer science and sound design (Farnell, 2007, p. 25). Specific learning resources are rare and, although visual programming languages for music like Max or Pure Data are valid tools for modeling synthesis and a big step forward, integrating them into game engines is still difficult (Nair, 2014; Kastbauer, 2011). An easier way to make these programs communicate (like the work being done with Open Sound Control) could help this type of audio implementation develop and spread within the industry.

For now, the most realistic scenario for procedural sound design in the industry is a hybrid approach, in which basic interactions with objects (as in the barrel video) are handled procedurally, while more meaningful sounds, like bosses, weapons or dialogue, are designed and implemented linearly.

Bibliography

AudioGaming (2016) Game Audio [Online]. Available from: <https://lesound.io/products/gameaudio/> [Accessed October 12th 2016]

Collins, K. (2008) Game Sound: an introduction to the history, theory, and practice of video game music and sound design. MIT Press.

Farnell, A. (2007) An Introduction to procedural audio and its application in computer games [Online]. Available from: <http://cs.au.dk/~dsound/DigitalAudio.dir/Papers/proceduralAudio.pdf> [Accessed October 12th 2016]

Gatheral, G. (n.d.) Graham Gatheral - Audio for games [Online]. Available from: <http://www.gatheral.co.uk> [Accessed October 12th 2016]

Hello Games (2016) No Man's Sky [multi-platform]. Guildford: Hello Games

Kastbauer, D. (2010) Audio Implementation Greats #8: Procedural Audio Now [Online]. Available from: <http://designingsound.org/2010/09/audio-implementation-greats-8-procedural-audio-now/> [Accessed October 12th 2016]

Marks, A. (2009) The Complete Guide to Game Audio: For Composers, Musicians, Sound Designers, Game Developers [Kindle e-book]. Burlington: Focal Press. Available from: <http://www.amazon.it> [Accessed February 25th 2016]

Sony (2016) Explore PlayStation VR [Online]. US: Sony Interactive Entertainment LLC. Available from: <https://www.playstation.com/en-us/explore/playstation-vr/> [Accessed October 12th 2016]

MacGregor, A. (2014) The Sound of Grand Theft Auto V. In: Game Developers Conference 2014, March 17-21 2014, San Francisco, CA

Nair, V. (2012) Procedural Audio: Interview with Andy Farnell [Online]. Available from: <http://designingsound.org/2012/01/procedural-audio-interview-with-andy-farnell/> [Accessed October 12th 2016]

Nair, V. (2014) What’s The Deal With Procedural Game Audio? [Online]. Available from: <http://designingsound.org/2014/10/whats-the-deal-with-procedural-game-audio/> [Accessed October 12th 2016]

Rockstar Games (2014) GTA V [multi-platform]. Rockstar Games

Weir, P. (2016) Encouraging Chaos, the use of generative sound in No Man's Sky. In: Sónar 2016, 16/06/2016, Barcelona, E