An apparatus comprising means for: in response to user input, selecting at least one sound source of a spatial audio scene, comprising multiple sound sources, the spatial audio scene being defined by spatial audio content; selecting at least one related contextual sound source based on the at least one selected sound source; and causing rendering of an audio preview, representing the spatial audio content, that can be selected by a user, wherein the audio preview comprises a mix of sound sources including at least the at least one selected sound source and the at least one related contextual sound source but not all of the multiple sound sources of the spatial audio scene, and wherein selection of the audio preview causes an operation on at least the selected sound source.
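The selection and preview steps recited above can be illustrated with a minimal sketch. It is not the claimed implementation: the `SoundSource` type, the 2D positions, the use of spatial proximity as the criterion for a "related contextual" source, and the equal-weight mix are all assumptions made for illustration, since the text does not specify how relatedness is determined or how the mix is rendered.

```python
from dataclasses import dataclass
from math import dist


@dataclass(frozen=True)
class SoundSource:
    """Hypothetical sound source; fields are illustrative assumptions."""
    name: str
    position: tuple[float, float]  # assumed 2D position in the scene
    samples: tuple[float, ...]     # assumed mono sample values


def select_contextual(scene, selected, max_related=1):
    """Pick contextual sources for the selected ones.

    Assumption: "related" means spatially closest to any selected source.
    """
    candidates = [s for s in scene if s not in selected]
    candidates.sort(
        key=lambda s: min(dist(s.position, t.position) for t in selected)
    )
    # Leave at least one scene source out, so the preview never
    # contains all of the sources of the scene.
    limit = min(max_related, len(scene) - len(selected) - 1)
    return candidates[:max(limit, 0)]


def build_preview(scene, selected, max_related=1):
    """Return the sources in the preview and their equal-weight mix."""
    sources = list(selected) + select_contextual(scene, selected, max_related)
    length = max(len(s.samples) for s in sources)
    mix = [
        sum(s.samples[i] for s in sources if i < len(s.samples)) / len(sources)
        for i in range(length)
    ]
    return sources, mix


# Example scene: the user selects the singer; the nearby guitar is
# added as context, and the distant crowd is left out of the preview.
scene = [
    SoundSource("singer", (0.0, 0.0), (1.0, 1.0)),
    SoundSource("guitar", (1.0, 0.0), (0.5, 0.5)),
    SoundSource("crowd", (10.0, 10.0), (0.2, 0.2)),
]
preview_sources, preview_mix = build_preview(scene, [scene[0]])
preview_names = [s.name for s in preview_sources]
```

In this sketch the preview contains the selected source and one contextual source but fewer sources than the full scene. The operation triggered when the user selects the preview is not modeled.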