A serious vulnerability disclosed in Group FaceTime in January 2019 allowed an attacker to call a target and force the call to connect without any interaction from the target, letting the attacker listen to the target’s surroundings without their knowledge or consent. The bug prompted Google Project Zero security researcher Natalie Silvanovich to look for similar flaws in other messaging apps.
The bug was remarkable in both its impact and mechanism. The ability to force a target device to transmit audio to an attacker device without gaining code execution was an unusual and possibly unprecedented impact of a vulnerability.
The vulnerability was a logic bug in the FaceTime calling state machine that could be exercised using only the user interface of the device.
Silvanovich found similar logic bugs in the Signal, Google Duo, Facebook Messenger, JioChat, and Mocha messaging apps, all of which have since been fixed.
WebRTC and State Machines
WebRTC is an open standard for adding real-time communication capabilities to an application. It supports sending video, voice, and generic data between peers, allowing developers to build powerful voice and video communication solutions.
Hence, the majority of video conferencing applications are implemented using WebRTC.
WebRTC connections are created by exchanging call set-up information in Session Description Protocol (SDP) between peers, a process which is called signalling.
In a typical connection, the caller begins by sending an SDP offer, and then the callee responds with an SDP answer. These messages contain information that is needed to transmit and receive media, including codec support, encryption keys and much more. After the offer/answer exchange, peers can send SDP candidates to other peers.
Candidates are potential network paths that the two peers can use to connect, and SDP candidates contain information such as IP addresses and TURN servers. Peers usually send more than one candidate to a peer, and candidates can be sent at any time during a connection.
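The ordering described above can be sketched as a small check over a list of signalling messages. This is only an illustration: the message shapes, the `followsTypicalOrder` function, and the placeholder strings are assumptions made for this example, not part of WebRTC or of any app mentioned here, and real applications define their own signalling format.

```typescript
// Hypothetical message shapes; the payload strings are placeholders, not real SDP.
type Peer = "caller" | "callee";

type SignallingMessage =
  | { kind: "offer"; from: Peer; sdp: string }
  | { kind: "answer"; from: Peer; sdp: string }
  | { kind: "candidate"; from: Peer; candidate: string };

// Returns true when the messages follow the typical order: an offer from the
// caller, an answer from the callee, then any number of candidates.
function followsTypicalOrder(messages: SignallingMessage[]): boolean {
  let sawOffer = false;
  let sawAnswer = false;
  for (const msg of messages) {
    switch (msg.kind) {
      case "offer":
        if (sawOffer || msg.from !== "caller") return false;
        sawOffer = true;
        break;
      case "answer":
        if (!sawOffer || sawAnswer || msg.from !== "callee") return false;
        sawAnswer = true;
        break;
      case "candidate":
        // Simplification for this sketch: candidates count only after the answer.
        if (!sawAnswer) return false;
        break;
    }
  }
  return sawOffer && sawAnswer;
}

const exchange: SignallingMessage[] = [
  { kind: "offer", from: "caller", sdp: "<caller SDP>" },
  { kind: "answer", from: "callee", sdp: "<callee SDP>" },
  { kind: "candidate", from: "caller", candidate: "<candidate 1>" },
  { kind: "candidate", from: "callee", candidate: "<candidate 2>" },
];

console.log(followsTypicalOrder(exchange)); // prints true
```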
WebRTC connections maintain an internal state that tracks whether an offer or answer has been received and processed. However, applications that use WebRTC usually have to maintain their own state machine to manage the user-facing state of the call.
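An application-level state machine of this kind might look like the following minimal sketch. The state names, event names, and transition table are hypothetical and chosen only for illustration; they do not describe how any of the apps discussed here are implemented. The point is that the call can only reach `connected` after the user has accepted it.

```typescript
// Hypothetical states and events for the callee side of a call.
type CallState = "idle" | "ringing" | "accepted" | "connected" | "ended";
type CallEvent = "incomingOffer" | "userAccepted" | "mediaReady" | "hangUp";

// Each state lists only the events it is willing to act on.
const transitions: Record<CallState, Partial<Record<CallEvent, CallState>>> = {
  idle: { incomingOffer: "ringing" },
  ringing: { userAccepted: "accepted", hangUp: "ended" },
  accepted: { mediaReady: "connected", hangUp: "ended" },
  connected: { hangUp: "ended" },
  ended: {},
};

function nextState(state: CallState, event: CallEvent): CallState {
  // Events that are not listed for the current state are ignored.
  return transitions[state][event] ?? state;
}

let callState: CallState = "idle";
callState = nextState(callState, "incomingOffer"); // ringing
callState = nextState(callState, "mediaReady");    // still ringing: the user has not accepted
callState = nextState(callState, "userAccepted");  // accepted
callState = nextState(callState, "mediaReady");    // connected
console.log(callState); // prints connected
```

Ignoring unlisted events is one possible design choice; an application could equally log them or end the call.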
Every input device is considered a ‘track’, and each specific track must be added to a specific peer connection by calling addTrack before audio or video is transmitted.
Tracks can also be disabled, which is useful for implementing mute and camera-off features.
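The sketch below illustrates how an application could tie track handling to user consent. To keep it runnable outside a browser, it uses minimal stand-in interfaces (`TrackLike`, `PeerConnectionLike`) instead of the browser's `MediaStreamTrack` and `RTCPeerConnection`. The helpers `attachLocalTracks` and `setMuted` are hypothetical names introduced for this example.

```typescript
// Minimal stand-ins for the browser's MediaStreamTrack and RTCPeerConnection,
// declared here so the sketch runs outside a browser.
interface TrackLike {
  kind: "audio" | "video";
  enabled: boolean;
}

interface PeerConnectionLike {
  addTrack(track: TrackLike): void;
}

// Hypothetical helper: attach local tracks only once the user has accepted the call.
function attachLocalTracks(
  pc: PeerConnectionLike,
  tracks: TrackLike[],
  userAccepted: boolean,
): number {
  if (!userAccepted) return 0; // no track is added before the user accepts
  for (const track of tracks) pc.addTrack(track);
  return tracks.length;
}

// Mute is implemented by disabling the track rather than removing it.
function setMuted(track: TrackLike, muted: boolean): void {
  track.enabled = !muted;
}

const added: TrackLike[] = [];
const fakePc: PeerConnectionLike = { addTrack: (t) => { added.push(t); } };
const mic: TrackLike = { kind: "audio", enabled: true };

attachLocalTracks(fakePc, [mic], false); // ringing: no track is attached
attachLocalTracks(fakePc, [mic], true);  // accepted: the microphone track is attached
setMuted(mic, true);
console.log(added.length, mic.enabled); // prints 1 false
```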
Bugs and Available Patches in the Messaging Apps
Silvanovich revealed that the Signal bug, patched in September 2019, allowed calls to be connected without interaction from the callee: an audio call could be connected by sending the connect message from the caller device to the callee device, instead of the other way around.
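The general class of defence against a misdirected connect message can be sketched as a handler that checks which side of the call the local device is on. This is not Signal's code or its actual fix; the `Role` and `LocalState` types and the `onConnectMessage` function are assumptions made purely for illustration.

```typescript
type Role = "caller" | "callee";
type LocalState = "dialing" | "ringing" | "connected";

// Hypothetical handler for an incoming "connect" signalling message.
// The message is only meaningful on the caller's device, where it signals
// that the callee answered; a callee that is still ringing ignores it.
function onConnectMessage(localRole: Role, state: LocalState): LocalState {
  if (localRole === "caller" && state === "dialing") {
    return "connected";
  }
  return state; // ignore: wrong side of the call or wrong state
}

console.log(onConnectMessage("caller", "dialing")); // prints connected
console.log(onConnectMessage("callee", "ringing")); // prints ringing
```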
The Google Duo bug, a race condition that allowed callees to leak video packets from unanswered calls to the caller, was fixed in December 2020.
Two similar vulnerabilities were discovered in the JioChat and Mocha messengers in July 2020. The JioChat bug allowed audio to be sent without user consent and was fixed in July 2020; the Mocha bug allowed both audio and video to be sent without user consent and was fixed in August 2020.
The Facebook Messenger flaw, which allowed audio calls to connect before the call was answered, was addressed in November 2020.
Silvanovich also looked for similar bugs in other video conferencing apps, including Telegram and Viber, but didn’t find any such issues.