Recent advances in computing make it possible to scale real-time 3D virtual audio scenes to hundreds of simultaneous sound sources, rendered for large numbers of audio outputs. Our Spatial Audio Toolkit for Immersive Environments (SATIE) allows us to render these dense audio scenes to large multi-channel (e.g. 32 or more) loudspeaker systems in real time, under the control of external software such as 3D scenegraph software. As we describe here, SATIE is designed for improved scalability: it minimizes dependencies between nodes in the audio DSP graph to enable parallel audio computation, and it supports group control of sound objects and load balancing of geometry computation, which reduce the number of messages needed to control a large number of sound sources simultaneously. The paper presents SATIE along with example use case scenarios. Our initial work demonstrates SATIE's flexibility and has provided us with novel sonic sensations such as ``audio depth of field'' and real-time sound swarming.