Researchers have developed a technique for embedding data in music and transmitting it to a smartphone.
Since the data are imperceptible to the human ear, they don’t affect the listening experience. This could have interesting applications in hotels, museums, and department stores. For example, background music could contain the access data for the local Wi-Fi network, and a mobile phone’s built-in microphone could receive this data.
“That would be handy in a hotel room,” says Simon Tanner, a doctoral student in the Computer Engineering and Networks Laboratory at ETH Zurich, “since guests would get access to the hotel Wi-Fi without having to enter a password on their device.”
Musical data storage
To store the data, Tanner, doctoral student Manuel Eichelberger, and master’s student Gabriel Voirol make minimal changes to the music. The researchers state that, in contrast to other scientists’ attempts in recent years, their new approach allows higher data transfer rates with no audible effect on the music.
“Our goal was to ensure that there was no impact on listening pleasure,” Eichelberger says.
The researchers’ tests show that, in ideal conditions, their technique can transfer up to 400 bits per second without the average listener noticing the difference between the source music and the modified version (see also the audio sample). Under realistic conditions, a degree of redundancy is necessary to guarantee transmission quality, so the transfer rate will more likely be about 200 bits per second, or around 25 letters per second.
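The rate figures above can be checked with a little arithmetic. The sketch below assumes one letter is one 8-bit character, which is consistent with 200 bits corresponding to about 25 letters. The function names and the sample password are hypothetical and serve only as illustration.

```python
BITS_PER_LETTER = 8  # assumption: one letter is one 8-bit character


def letters_per_second(bit_rate):
    """Convert a bit rate in bits per second into letters per second."""
    return bit_rate / BITS_PER_LETTER


def seconds_to_send(text, bit_rate):
    """Time needed to send `text` once at the given bit rate (no redundancy)."""
    return len(text) * BITS_PER_LETTER / bit_rate


print(letters_per_second(200))                    # 25.0
print(seconds_to_send("hotel-guest-2024", 200))   # 0.64 (hypothetical 16-letter password)
```

At the realistic rate, a short Wi-Fi password would therefore fit into well under a second of music.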
Storing the data only marginally affects the music itself. People can hardly hear the differences. For example, here is an audio sample of a performance by the ETH Big Band (© Henning Eckels (comp./arr.)):
Data have been embedded in this music stream at a rate of 300 bits per second. Specifically, the short URL of this news article is repeated every 0.7 seconds. The algorithm for receiving the data is not yet publicly available as a smartphone app.
By way of comparison, here is the unchanged original version:
“In theory, it would be possible to transmit data much faster. But the higher the transfer rate, the sooner the data becomes perceptible as interfering sound, or data quality suffers,” Tanner adds.
The researchers use the dominant notes in a piece of music, overlaying each of them with two slightly lower and two slightly higher notes that are quieter than the dominant note. They also make use of the harmonics (one or more octaves higher) of the strongest note, inserting slightly lower and higher notes there, too. These additional notes carry the data. While a smartphone can receive and analyze this data via its built-in microphone, the human ear doesn’t perceive the additional notes.
“When we hear a loud note, we don’t notice quieter notes with a slightly higher or lower frequency,” Eichelberger says. “That means we can use the dominant, loud notes in a piece of music to hide the acoustic data transfer.” It follows that the best music for this kind of data transfer has lots of dominant notes—pop songs, for instance. Quiet music is less suitable.
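The overlay idea described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the researchers’ algorithm: the article gives no numeric frequency offsets or loudness levels, and it does not say how bits map onto the added notes. Here each quiet side tone is simply switched on for a 1 bit and off for a 0 bit, and `OFFSETS`, `SIDE_LEVEL`, and both function names are hypothetical.

```python
import math

# Hypothetical parameters: the article gives no numeric offsets or levels.
OFFSETS = (-0.04, -0.02, 0.02, 0.04)  # relative frequency offsets (assumed)
SIDE_LEVEL = 0.05                     # side-tone amplitude relative to the dominant note (assumed)


def carrier_frequencies(dominant_hz, octaves=(0, 1)):
    """Side-tone frequencies around the dominant note and its octave harmonics."""
    freqs = []
    for k in octaves:
        base = dominant_hz * 2 ** k  # k octaves above the dominant note
        freqs.extend(base * (1 + off) for off in OFFSETS)
    return freqs


def embed_frame(dominant_hz, bits, n_samples=2048, sample_rate=44100):
    """Synthesize one frame: a loud dominant tone plus a quiet side tone per 1 bit."""
    freqs = carrier_frequencies(dominant_hz)
    if len(bits) != len(freqs):
        raise ValueError("need exactly one bit per side tone")
    frame = []
    for i in range(n_samples):
        t = i / sample_rate
        sample = math.sin(2 * math.pi * dominant_hz * t)  # the loud, masking note
        for bit, freq in zip(bits, freqs):
            if bit:  # assumed mapping: tone present means 1, absent means 0
                sample += SIDE_LEVEL * math.sin(2 * math.pi * freq * t)
        frame.append(sample)
    return frame


frame = embed_frame(440.0, [1, 0, 1, 1, 0, 0, 1, 0])
```

With these assumed settings, a 440 Hz dominant note carries eight side tones: four around 440 Hz and four around its octave at 880 Hz.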
To tell the decoder algorithm in the smartphone where to look for data, the scientists use very high notes that the human ear can barely register. They replace the music in the 9.8–10 kHz frequency range with an acoustic data stream that tells the decoder when, and where in the rest of the music’s frequency spectrum, to find the transmitted data.
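A receiver would first have to detect energy in that narrow band. The article does not describe how the control stream is modulated or decoded, so the sketch below shows only one plausible building block: measuring the signal power at a single frequency with the Goertzel algorithm, a standard single-frequency detector. The 9.9 kHz probe frequency, the sample rate, and the test signals are assumptions chosen for illustration.

```python
import math


def goertzel_power(samples, freq_hz, sample_rate):
    """Power of `samples` at `freq_hz`, computed with the Goertzel recurrence."""
    coeff = 2 * math.cos(2 * math.pi * freq_hz / sample_rate)
    s1 = s2 = 0.0
    for x in samples:
        s0 = x + coeff * s1 - s2
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2


SAMPLE_RATE = 44100   # assumed sample rate
N = 4410              # 0.1 s of audio
PROBE_HZ = 9900.0     # assumed probe frequency inside the 9.8-10 kHz band

# Stand-in for music: a single loud 440 Hz tone.
music = [math.sin(2 * math.pi * 440.0 * i / SAMPLE_RATE) for i in range(N)]
# The same tone with a quiet 9.9 kHz control tone added.
marked = [m + 0.1 * math.sin(2 * math.pi * PROBE_HZ * i / SAMPLE_RATE)
          for i, m in enumerate(music)]

power_without = goertzel_power(music, PROBE_HZ, SAMPLE_RATE)
power_with = goertzel_power(marked, PROBE_HZ, SAMPLE_RATE)
print(power_with > power_without)  # True
```

Goertzel is used here because it evaluates one frequency cheaply, which suits a phone listening for a known band. A real decoder would need to cover the whole band and cope with room noise.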
The transmission principle behind this technique is fundamentally different from the well-known RDS system used in car radios to transmit the radio station’s name and details of the music that is playing.
“With RDS, the data is transmitted using FM radio waves. In other words, data is sent from the FM transmitter to the radio device,” Tanner explains. “What we’re doing is embedding the data in the music itself—transmitting data from the loudspeaker to the mic.”
Source: ETH Zurich