Researchers at the University of Washington have developed a groundbreaking smart speaker system that utilizes robotic “acoustic swarms” to isolate and manage sounds in busy environments, offering enhanced audio control and privacy. This innovative system uses self-deploying microphones powered by deep-learning algorithms to track individual speakers and separate overlapping conversations, even if the voices are similar.
The ability to locate and control sound, such as isolating one person talking from a specific location in a crowded room, has been a challenge for researchers, particularly without visual cues from cameras. However, the team at the University of Washington has made a breakthrough with their shape-changing smart speaker system.
The system consists of self-deploying microphones that divide rooms into speech zones and track the positions of individual speakers. Each microphone, about an inch in diameter, automatically deploys from and returns to a charging station, much like a fleet of Roombas, so the system can be moved and set up automatically in different environments. Compared with a single central microphone, such as one placed on a conference-room table, this gives far finer control over in-room audio.
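The paper's localization pipeline is neural-network based, but the end result of "speech zones" can be pictured as a simple geometric assignment: once a speaker's position has been estimated, it is mapped to the nearest zone. The sketch below is a hypothetical illustration of that last step only; the zone names, coordinates, and `max_radius` threshold are invented for the example.

```python
from math import hypot

def assign_zone(speaker_xy, zone_centers, max_radius=1.0):
    """Map an estimated speaker position (metres) to the nearest
    speech zone, or None if no zone centre is within max_radius."""
    best, best_d = None, float("inf")
    for name, (zx, zy) in zone_centers.items():
        d = hypot(speaker_xy[0] - zx, speaker_xy[1] - zy)
        if d < best_d:
            best, best_d = name, d
    return best if best_d <= max_radius else None

# Hypothetical room layout with two speech zones.
zones = {"couch": (1.0, 2.0), "desk": (3.5, 0.5)}
print(assign_zone((1.2, 2.1), zones))  # couch
```

In the real system the zones emerge from where speakers actually sit and the microphones' self-chosen positions; this nearest-centre rule is only a stand-in for that learned behaviour.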
The team tested their prototype in various settings, including offices, living rooms, and kitchens, with groups of three to five people speaking. The system was able to discern different voices within 1.6 feet (50 centimeters) of each other 90% of the time, without prior information about the number of speakers. It could process three seconds of audio in an average of 1.82 seconds, making it suitable for live streaming, although a bit slow for real-time communication like video calls.
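The latency figure above can be expressed as a real-time factor, a standard way to judge whether an audio pipeline keeps up with its input. A factor below 1.0 means processing is faster than the audio itself, which is why the system works for streaming even though the per-chunk delay is too long for a conversation:

```python
def real_time_factor(processing_s, audio_s):
    """Real-time factor: processing time divided by audio duration.
    Below 1.0 means the pipeline keeps up with the incoming audio."""
    return processing_s / audio_s

# The reported figures: 1.82 s to process a 3 s chunk.
rtf = real_time_factor(1.82, 3.0)
print(f"{rtf:.2f}")  # 0.61
```

The pipeline runs at roughly 0.61x real time, but because audio is handled in three-second chunks, each chunk still arrives almost two seconds late, a delay that is acceptable for live streaming yet noticeable on a video call.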
As the technology progresses, researchers believe that acoustic swarms could be deployed in smart homes to differentiate people talking with smart speakers, allowing only those in an “active zone” to control devices like a TV vocally.
To address privacy concerns, the researchers built in several safeguards. First, the microphones navigate using sound rather than cameras, and they blink visibly whenever they are active, making them easy to spot. Second, the swarm processes all audio locally on the device; unlike most smart speakers, it sends nothing to the cloud, which acts as a built-in privacy constraint. The system can also create “mute zones”, bubbles in which conversations are never recorded, offering privacy guarantees beyond those of current smart speakers.
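A mute zone can be thought of as a spatial filter applied after the swarm has separated and localized each voice: any stream whose estimated source lies inside a bubble is discarded before anything else happens to it. The sketch below assumes this post-separation filtering model; the stream dictionaries, bubble format, and positions are illustrative, not the authors' actual data structures.

```python
from math import dist  # Python 3.8+

def filter_mute_zones(streams, bubbles):
    """Drop separated audio streams whose estimated source position
    falls inside any mute bubble, given as (centre_xy, radius_m)."""
    kept = []
    for stream in streams:
        pos = stream["position"]
        if not any(dist(pos, centre) <= radius for centre, radius in bubbles):
            kept.append(stream)
    return kept

streams = [
    {"speaker": "A", "position": (0.2, 0.1)},
    {"speaker": "B", "position": (2.5, 1.0)},
]
bubbles = [((0.0, 0.0), 0.5)]  # 0.5 m privacy bubble at the origin
print([s["speaker"] for s in filter_mute_zones(streams, bubbles)])  # ['B']
```

Because the check happens on-device before any audio leaves the system, speech inside a bubble is simply never retained.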
While this technology may evoke concerns about surveillance, the team emphasizes that it can serve the opposite purpose: protecting privacy. Users can designate certain areas as off-limits for recording, and the system creates a bubble around them in which no audio is captured, even when conversations close by are being recorded.
Overall, the development of this shape-changing smart speaker system using robotic acoustic swarms represents a significant advancement in audio control and privacy. With further improvements and potential applications in smart homes, this technology has the potential to revolutionize the way we interact with and control audio devices.
Note:
1. Source: Coherent Market Insights, Public sources, Desk research
2. We have leveraged AI tools to mine information and compile it