All sensory information is arbitrarily interchangeable. A camera analyzes the live visual image and translates it into sound frequencies.
I actually recommend staging this at a just barely audible level. This "barely audible" idea is something Glenn Gould talks about, regarding rehearsing as a kid with his mother running the vacuum cleaner. He could only partially hear the piano,
but because he was intentionally causing the music, in his mind it sounded ideal. In his mind he got to practice on an ideal Steinway in an acoustically perfect concert hall.