XAudio2

1.

2.

XAudio2 Performance Tips
Tom Mathews
Lead Developer
Advanced Technology Group
Microsoft

3.

Overview
XAudio2 overview
Voice & Graph optimization
xAPO optimization
Voice reuse
Compression
Streaming
Debugging / Performance analysis

4.

What Is XAudio2
Low-level cross-platform game audio API
Play hundreds of sounds at once
Loop, start, stop, adjust sounds at any time
Volume, pitch, filter, reverb, DSP
Identical code on both platforms (Windows and Xbox 360)
Building block for higher-level sound design tools such as the XACT3 engine
Replaced XAudio1
Replaced DirectSound for gaming purposes

5.

Features
Flexible channel routing
Any channel can be sent to any other channel with attenuation/amplification
Multistage submixing
For example, each car can have a submix (exhaust, transmission, engine, etc.), and each car’s mix can then be fed into another submix for environmental effects

6.

Advanced Features
Deferred commands
Most operations (Start, SetParameter, SetOutputVoice, SetEffectChain) can be grouped and applied as atomic, sample-accurate operations
xAPOs (DSPs)
In-box APOs (Reverb, notch, etc.)
Create custom equalizers, compressors, limiters, monitors, phase shifters, attenuators, delays, …
And they can be cross-platform, like the in-box APOs.
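The grouping behind deferred commands can be sketched without the XAudio2 headers. In XAudio2 itself, calls take an OperationSet argument and the batch is applied with IXAudio2::CommitChanges. The DeferredQueue class below is a hypothetical stand-in that only models the idea: changes queued under one ID take effect together on commit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <utility>
#include <vector>

// Hypothetical model of deferred commands; not part of XAudio2.
class DeferredQueue {
public:
    // Queue a change under an operation-set ID. Nothing is applied yet.
    void Defer(std::uint32_t operationSet, std::function<void()> change) {
        assert(change);  // an empty callable cannot be applied later
        pending_[operationSet].push_back(std::move(change));
    }

    // Apply every change queued under the ID, in submission order,
    // then forget the set. Returns the number of changes applied.
    std::size_t Commit(std::uint32_t operationSet) {
        auto it = pending_.find(operationSet);
        if (it == pending_.end()) return 0;
        std::vector<std::function<void()>> batch = std::move(it->second);
        pending_.erase(it);
        for (auto& change : batch) change();
        return batch.size();
    }

private:
    std::map<std::uint32_t, std::vector<std::function<void()>>> pending_;
};
```

With this model, a volume change and a start queued under the same ID stay invisible until Commit is called for that ID, then both apply in one step.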

7.

XAudio2: Minimum CPU
Vectorized signal processing
XAudio2 requires at least SSE
Available since 1999 for PCs
Makes extensive use of it in processing code
Your processing code may do the same
XAudio2 also makes use of SSE2/FTZ/DAZ
Available since 2001 for PCs
On Xbox 360, XAudio2 makes use of XMA hardware-accelerated decode and VMX instructions
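As an illustration of "your processing code may do the same", here is a minimal sketch of a vectorized gain loop using SSE intrinsics, with a scalar fallback for builds without SSE. ApplyGain and the SKETCH_HAS_SSE macro are hypothetical names, and the sketch assumes a 16-byte-aligned buffer whose length is a multiple of four floats.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define SKETCH_HAS_SSE 1
#else
#define SKETCH_HAS_SSE 0
#endif

// Scale `count` floats in place by `gain`.
// Assumption: `samples` is 16-byte aligned and `count` is a multiple of 4.
inline void ApplyGain(float* samples, std::size_t count, float gain) {
    assert(reinterpret_cast<std::uintptr_t>(samples) % 16 == 0);
    assert(count % 4 == 0);
#if SKETCH_HAS_SSE
    const __m128 g = _mm_set1_ps(gain);           // broadcast gain to 4 lanes
    for (std::size_t i = 0; i < count; i += 4) {
        const __m128 v = _mm_load_ps(samples + i); // aligned load of 4 samples
        _mm_store_ps(samples + i, _mm_mul_ps(v, g));
    }
#else
    for (std::size_t i = 0; i < count; ++i) samples[i] *= gain;
#endif
}
```

The SSE path handles four samples per iteration, which is where the saving over the scalar loop comes from.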

8.

Audio Flow
[Diagram: source voices (XMA2 and xWMA inputs) each run pitch/SRC + filter and an effect chain (Effect1 … EffectN). They feed submix voices (sample rate conversion, filter, Effect1 … EffectN), which feed the mastering voice (sample rate conversion, Effect1 … EffectN). Rate/channel labels in the figure: 32k mono, 24k mono, 32k 5.1, 44k 5.1, 48k 5.1.]

9.

Graph Optimization
SUBMIX!
Apply FX to many voices at once for the price of one
Make use of lower-rate sub-graphs
Lower rate == fewer samples == less CPU
Run expensive global send FX at a lower rate/channels than the final mix
Provides for more detailed control of performance characteristics
Allows for smooth crossfades between disparate FX
e.g. Environmental reverb crossfade
[Diagram: submix voice with sample rate conversion, filter, Effect1 … EffectN]
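"Lower rate == fewer samples == less CPU" can be put in numbers. The sketch below compares how many samples per second an effect has to touch at two rate/channel settings. It assumes cost scales linearly with sample count, which is only a first approximation for real effects, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Samples an effect must process per second for a given format.
constexpr std::uint64_t SamplesPerSecond(std::uint32_t sampleRateHz,
                                         std::uint32_t channels) {
    return static_cast<std::uint64_t>(sampleRateHz) * channels;
}

// How many times more samples format A carries than format B.
inline double CostRatio(std::uint64_t samplesA, std::uint64_t samplesB) {
    assert(samplesB > 0);
    return static_cast<double>(samplesA) / static_cast<double>(samplesB);
}

// 48k 5.1 (6 channels) versus 32k mono.
static_assert(SamplesPerSecond(48000, 6) == 288000, "48k 5.1");
static_assert(SamplesPerSecond(32000, 1) == 32000, "32k mono");
```

Under that assumption, an effect fed 32k mono input processes one ninth of the samples it would see on a 48k 5.1 mix.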

10.

Source Voices
Setting up for best performance
Use XAUDIO2_VOICE_NOPITCH and XAUDIO2_VOICE_NOSRC when possible
Minimize MaxFrequencyRatio when pitch is used
Stopped voices are not touched by the real-time processing thread
Voice Pooling
Much faster than repeated allocation/free
SetFrequencyRatio may be applied to reuse voices for data of a different sampling rate
[Diagram: source voice (XMA2, 32k mono) with pitch + filter, Effect1 … EffectN, then sample rate conversion]
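Reusing a voice for data of a different sampling rate comes down to one ratio. The sketch below computes the value one would pass to SetFrequencyRatio and checks it against the MaxFrequencyRatio the voice was created with. The helper names are hypothetical, and the code does not call XAudio2.

```cpp
#include <cassert>
#include <cstdint>

// Ratio that makes a voice created for `voiceRateHz` play data recorded
// at `dataRateHz` at its natural pitch.
inline float ReuseFrequencyRatio(std::uint32_t dataRateHz,
                                 std::uint32_t voiceRateHz) {
    assert(voiceRateHz > 0);
    return static_cast<float>(dataRateHz) / static_cast<float>(voiceRateHz);
}

// The ratio is only usable if it stays within the voice's MaxFrequencyRatio.
inline bool FitsVoice(float ratio, float maxFrequencyRatio) {
    return ratio > 0.0f && ratio <= maxFrequencyRatio;
}
```

For example, 24k data on a voice created for 32k needs a ratio of 0.75, while 48k data needs 1.5, which only fits if the voice was created with a MaxFrequencyRatio of at least 1.5. This is the tension behind "minimize MaxFrequencyRatio": a smaller maximum narrows the range of data a pooled voice can take.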

11.

Voice Pooling
Create pools of Voices
Each pool is unique on source content (xWMA, XMA, ADPCM) and channel count
When you need a new voice
Identify a lower-priority voice in the pool
Call Stop(), then FlushSourceBuffers()
With the February XDK, you no longer have to wait for the next Process() before reusing
If needed: call SetSourceSampleRate()
Remember: stopped voices are CPU-free
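The pooling steps above can be sketched as a small self-contained model. PooledVoice stands in for a wrapper around a real source voice, and VoicePool, Format, and the priority scheme are hypothetical. The point is the shape: one pool per (format, channel count), prefer an idle voice, otherwise steal the lowest-priority one after Stop() and FlushSourceBuffers().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

enum class Format { XMA, XWMA, ADPCM };

// Hypothetical stand-in for a wrapper around a source voice.
struct PooledVoice {
    int priority = 0;       // higher means more important
    bool playing = false;
    int queuedBuffers = 0;
    void Stop() { playing = false; }
    void FlushSourceBuffers() { queuedBuffers = 0; }
};

class VoicePool {
public:
    using Key = std::pair<Format, std::uint32_t>;  // format + channel count

    // Create all voices up front; pointers handed out by Acquire are only
    // stable if Add is not called again for the same key afterwards.
    void Add(Format format, std::uint32_t channels, std::size_t count) {
        assert(count > 0);
        auto& pool = pools_[Key(format, channels)];
        pool.resize(pool.size() + count);
    }

    // Returns an idle voice, or steals the lowest-priority playing voice
    // if it is less important than the request. Returns nullptr otherwise.
    PooledVoice* Acquire(Format format, std::uint32_t channels, int priority) {
        auto it = pools_.find(Key(format, channels));
        if (it == pools_.end() || it->second.empty()) return nullptr;
        PooledVoice* victim = nullptr;
        for (auto& v : it->second) {
            if (!v.playing) { victim = &v; break; }
            if (!victim || v.priority < victim->priority) victim = &v;
        }
        if (victim->playing && victim->priority >= priority) return nullptr;
        victim->Stop();
        victim->FlushSourceBuffers();
        victim->priority = priority;
        victim->playing = true;
        return victim;
    }

private:
    std::map<Key, std::vector<PooledVoice>> pools_;
};
```

A request that is less important than everything already playing gets nullptr, so the caller can drop the sound instead of interrupting a more important one.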

12.

FX Optimization
XAPO_BUFFER_SILENT
Indicates silent data should be assumed
Actual memory may be uninitialized
Buffers are 16-byte aligned and interleaved per channel
Use VMX128 instructions
Use in-place processing
In-place: input buffer == output buffer
Use EnableEffect/DisableEffect
More convenient than destroying and recreating the voice/FX
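A minimal sketch of how an effect might honor the silent flag while processing in place. The types below are local stand-ins modeled loosely on the XAPO process-buffer parameters (a data pointer, a silent/valid flag, and a valid frame count). They are not the real XAPO declarations.

```cpp
#include <cassert>
#include <cstddef>

// Stand-ins for XAPO_BUFFER_SILENT / XAPO_BUFFER_VALID.
enum class BufferFlags { Silent, Valid };

// Hypothetical process buffer: interleaved float samples.
struct ProcessBuffer {
    float* data;
    BufferFlags flags;
    std::size_t validFrames;
};

// In-place gain: the same buffer serves as input and output.
inline void ProcessGainInPlace(ProcessBuffer& io, std::size_t channels,
                               float gain) {
    assert(io.data != nullptr);
    assert(channels > 0);
    if (io.flags == BufferFlags::Silent) {
        // The memory may be uninitialized, so never read it. Silence
        // scaled by any gain is still silence: leave the flag as-is.
        return;
    }
    const std::size_t sampleCount = io.validFrames * channels;
    for (std::size_t i = 0; i < sampleCount; ++i) io.data[i] *= gain;
}
```

The early return is only valid for effects without a tail. A reverb or delay can still produce output after its input goes silent, so it would have to write real samples and mark the buffer valid.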

13.

XAudio2 Memory Pool
All internal XAudio2 allocations pooled
Allows for efficient parameter passing without imposing cumbersome parameter scope requirements
XAudio2 allocates sooner, rather than later
Pool reset when last IXAudio2 instance released
Gives applications control of memory pool lifespan
Possible uses include reclaiming memory between levels
Remember this?
Memory is pooled for many things, including SRCs and pitch shifting

14.

Compression
Always use compression to minimize disk/memory/cache footprint
Reduce XMA/xWMA quality per sound for optimal quality/size tradeoff
Seek tables:
Allow the caller to skip past unwanted packets without having to load the data itself

15.

Compression - Tradeoffs
PCM
Not compressed, so highest fidelity
ADPCM (Windows only)
Slight compression (~4:1, lossy)
XMA (360 only)
Hardware-accelerated decode (316 concurrent streams)
Good compression (~6+:1)
xWMA
Software decode (mono/stereo ≈ 0.6-1.2% of a 360 core)
Excellent compression (~20+:1)
Good for voices/music, no seamless looping
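To make the ratios concrete, the sketch below estimates the footprint of one minute of 48k stereo 16-bit audio at the approximate ratios quoted above. The helper names are hypothetical, and real ratios vary with content and encoder quality settings.

```cpp
#include <cassert>
#include <cstdint>

// Size in bytes of uncompressed 16-bit PCM.
constexpr std::uint64_t PcmBytes(std::uint32_t seconds,
                                 std::uint32_t sampleRateHz,
                                 std::uint32_t channels) {
    return static_cast<std::uint64_t>(seconds) * sampleRateHz * channels * 2u;
}

// Approximate compressed size for an N:1 compression ratio.
inline std::uint64_t CompressedBytes(std::uint64_t pcmBytes,
                                     std::uint32_t ratio) {
    assert(ratio > 0);
    return pcmBytes / ratio;
}

// One minute of 48k stereo PCM is 11,520,000 bytes.
static_assert(PcmBytes(60, 48000, 2) == 11520000, "PCM baseline");
```

At ~4:1 (ADPCM) that minute is about 2.9 MB, at ~6:1 (XMA) about 1.9 MB, and at ~20:1 (xWMA) about 0.6 MB.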

16.

Streaming
Cycle a circular queue of buffers to submit new data to XAudio2
Submit new data within the voice’s OnBufferEnd callback
Increasing read-ahead before starting the voice decreases the chance of glitching, but can increase perceptible latency depending on implementation
Consider streaming several buffers into the engine before throttling
XMA2 block size should be a multiple of 32K to mirror DVD I/O patterns
17.

xWMA Streaming
Each xWMA file contains a list of offsets (DPDS chunk)
Each submit needs a modified form of this list
DPDS chunk: 1000, 2000, 3000, 5000, 7000, 12000
1st submit list: 1000, 2000, 3000
2nd submit list: 2000 (5000-3000), 4000 (7000-3000), 9000 (12000-3000)
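The per-submit adjustment of the DPDS list is a rebase: subtract the cumulative value that precedes the first packet of the buffer being submitted. In XAudio2 the resulting list is passed with the buffer through XAUDIO2_BUFFER_WMA (pDecodedPacketCumulativeBytes and PacketCount). The helper below is a hypothetical sketch of the arithmetic only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the cumulative decoded-byte list for packets
// [firstPacket, firstPacket + packetCount), rebased so that it counts
// from the start of the submitted buffer rather than the start of the file.
inline std::vector<std::uint32_t> RebaseDpds(
        const std::vector<std::uint32_t>& dpds,
        std::size_t firstPacket,
        std::size_t packetCount) {
    assert(firstPacket + packetCount <= dpds.size());
    // Decoded bytes that precede this buffer in the file.
    const std::uint32_t base = (firstPacket == 0) ? 0u : dpds[firstPacket - 1];
    std::vector<std::uint32_t> rebased;
    rebased.reserve(packetCount);
    for (std::size_t i = 0; i < packetCount; ++i) {
        rebased.push_back(dpds[firstPacket + i] - base);
    }
    return rebased;
}
```

With the list 1000, 2000, 3000, 5000, 7000, 12000 split into two submits of three packets, the first call yields 1000, 2000, 3000 and the second yields 2000, 4000, 9000.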

18.

Blocking Calls – XAudio2 Thread
The XAudio2 realtime thread can be blocked by:
StopEngine and IXAudio2::Release()
DestroyVoice()
Thus, the need for voice reuse
XAudio2 callbacks
Check time spent in effect chain
Your code can be blocked by any XAudio2 API call, waiting on internal realtime thread locks.

19.

Debugging
Use the debug versions of XAudio2, X3DAudio, XAPOBase, etc.
SetDebugConfiguration may be used to control debug behavior for XAudio2
VolumeMeter xAPO is useful for detecting clipping
PIX counters are available to track CPU, memory, and voice statistics
Similar data is available via IXAudio2::GetPerformanceData
Watch for other threads on the core that may be slowing down XAudio2

20.

Audio performance analysis with PIX

21.

A Case Study
[Diagram: source voices in mono, stereo, quad, and 5.1, each with pitch/SRC + filter and Effect1 … EffectN; a stage with filter, reverb, and effects; sample rate conversion on the way to the final mix]

22.

PIX

23.

Timing Capture

24.

OnProcessingPassEnd Callback
Use callbacks to notify Hardware Thread 5 that it
can resume execution

25.

xbPerfView
w/ Sampling Capture

26.

A Case Study
Adding submixes
[Diagram: the case-study graph (mono, stereo, quad, and 5.1 source voices with pitch/SRC + filter and Effect1 … EffectN) with submix stages added for filter, reverb, and sample rate conversion]

27.

xbPerfView
w/ Submixing

28.

A Case Study
SRC & Reverb
Change to Mono->5.1 Reverb
[Diagram: the case-study graph (mono, stereo, quad, and 5.1 source voices) with 32k and 48k rate labels around the reverb and sample rate conversion stages]

29.

xbPerfView
Final Numbers
Component     Start CPU%    Final CPU%    % Freed
MatrixMix     17.48%        4.25%         13.23%
Reverb        6.37%         4.94%         1.43%
Resampling    14.74%        11.41%        3.33%
Total         38.59%        20.60%        17.99%
Idle          27.95%        48.47%        20.52%

30.

With Processing to Spare…

31.

Summary
SUBMIX!
Use OnBufferEnd callbacks to stream data
Intentionally choose your compression methods
Carefully manage your voice interactions
Watch for Blocking Calls
Pool voices where possible
Use EnableEffect/DisableEffect
Profile your title to focus your efforts

32.

www.microsoftgamefest.com
© 2009-2010 Microsoft Corporation. All rights reserved.
This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.