The Secure Real-time Transport Protocol (SRTP) is a security profile that extends the
Real-time Transport Protocol (RTP) with a set of security mechanisms.
WebRTC uses DTLS-SRTP for encryption, message authentication and integrity, as well as protection against replay attacks. It provides confidentiality by encrypting the RTP payload and authenticating each packet. SRTP is one of WebRTC's security components, and it is convenient for developers who want a reliable and secure API. But what is SRTP and how does it work?
What is SRTP?
SRTP enhances RTP security. The protocol was published by the Internet Engineering Task Force (IETF) in
RFC 3711, in March 2004.
SRTP provides confidentiality by encrypting the RTP payload but not the RTP headers. It also supports message authentication, which is widely used as a defense mechanism in RTP. While SRTP can be used with all of its features, individual features can also be enabled or disabled. The main pluggable part of SRTP is key management, since there are many options: DTLS-SRTP, MIKEY in SIP, SDES (Security Descriptions) in SDP, ZRTP, etc.
Encryption
SRTP uses AES (Advanced Encryption Standard) as the default cipher. SRTP defines two AES encryption modes: counter mode (Segmented Integer Counter Mode) and f8 mode. Counter mode is the typical choice, and it matters when traffic crosses an unreliable network with potential packet loss, because each packet can be decrypted independently of the others. The f8 mode is used in 3G mobile networks and is a variant of Output Feedback mode, in which decryption is the same operation as encryption.
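To make the loss tolerance of counter mode concrete, here is a minimal sketch of how the per-packet initialization vector can be built from the session salt, the SSRC and the packet index, following the formula in RFC 3711 (IV = (k_s * 2^16) XOR (SSRC * 2^64) XOR (i * 2^16)). The function names and the sample salt are illustrative assumptions, and the AES step itself is left out because the Python standard library has no AES.

```python
def srtp_packet_index(roc: int, seq: int) -> int:
    """Build the 48-bit packet index: 32-bit rollover counter, then 16-bit RTP sequence number."""
    return ((roc & 0xFFFFFFFF) << 16) | (seq & 0xFFFF)


def aes_cm_iv(session_salt: bytes, ssrc: int, index: int) -> bytes:
    """Compute the 128-bit counter-mode IV for one packet (hypothetical helper name).

    IV = (k_s * 2^16) XOR (SSRC * 2^64) XOR (index * 2^16)
    """
    if len(session_salt) != 14:
        raise ValueError("the session salt is expected to be 112 bits (14 bytes)")
    k_s = int.from_bytes(session_salt, "big")
    iv = (k_s << 16) ^ ((ssrc & 0xFFFFFFFF) << 64) ^ ((index & (2**48 - 1)) << 16)
    # The low 16 bits stay zero: they are the block counter within the packet.
    return iv.to_bytes(16, "big")


# Illustrative values only (not taken from a real session).
salt = bytes.fromhex("f0f1f2f3f4f5f6f7f8f9fafbfcfd")
iv_first = aes_cm_iv(salt, ssrc=0, index=srtp_packet_index(roc=0, seq=0))
iv_later = aes_cm_iv(salt, ssrc=0, index=srtp_packet_index(roc=0, seq=7))
print(iv_first.hex())
print(iv_later.hex())
```

Since the IV depends only on values carried by, or derivable from, the packet itself, a receiver can compute the keystream for packet 7 even if packets 1 to 6 never arrived.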
SRTP also allows developers to disable encryption by using the NULL cipher. The NULL cipher performs no encryption: it copies the input stream to the output unchanged.
Using the NULL cipher in WebRTC is not recommended, since data security matters to end users. In fact, valid WebRTC implementations
MUST support encryption, currently with DTLS-SRTP.
Integrity
To preserve message integrity, SRTP computes an authentication tag over the RTP header and the payload and appends it to the packet. The receiver uses this tag to validate the header and payload, which prevents data from being forged or altered in transit.
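As a rough illustration of the tag computation, the sketch below uses HMAC-SHA1 truncated to 80 bits, which is the default transform in RFC 3711. The tag covers the RTP header, the encrypted payload and the rollover counter (ROC). The function names, key and packet bytes are made up for the example.

```python
import hashlib
import hmac

TAG_LEN = 10  # 80-bit tag, the default length for HMAC-SHA1 in SRTP


def srtp_auth_tag(auth_key: bytes, header: bytes, encrypted_payload: bytes, roc: int) -> bytes:
    """Compute the authentication tag over header || encrypted payload || ROC."""
    message = header + encrypted_payload + roc.to_bytes(4, "big")
    return hmac.new(auth_key, message, hashlib.sha1).digest()[:TAG_LEN]


def srtp_verify(auth_key: bytes, header: bytes, encrypted_payload: bytes,
                roc: int, tag: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    expected = srtp_auth_tag(auth_key, header, encrypted_payload, roc)
    return hmac.compare_digest(expected, tag)


# Illustrative inputs only: a made-up 20-byte key, 12-byte header and payload.
key = bytes(range(20))
header = bytes.fromhex("800f1234decafbadcafebabe")
payload = b"encrypted-media-bytes"

tag = srtp_auth_tag(key, header, payload, roc=0)
print(srtp_verify(key, header, payload, 0, tag))         # untouched packet
print(srtp_verify(key, header, payload + b"x", 0, tag))  # modified payload
```

The ROC is not transmitted in the packet; both sides track it locally, which is why it is passed in separately here.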
Authentication is also the basis for defending against replay attacks. To block them, each packet is assigned a
sequential index. A packet is accepted only if its index is ahead of, or inside, the receiver's replay window and has not yet been received. The indexes can be trusted because of the integrity protection described just above; without it, an attacker could substitute the index.
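The replay check can be sketched as a sliding window over packet indexes. This is a simplified model, not a full SRTP implementation: the class and method names are hypothetical, and a real receiver would update the window only after the packet's authentication tag has been verified. The window size of 64 is the minimum that RFC 3711 requires.

```python
class ReplayWindow:
    """Sliding-window replay check over packet indexes (illustrative sketch)."""

    def __init__(self, size: int = 64):
        self.size = size
        self.highest = -1  # highest index accepted so far
        self.bitmap = 0    # bit k set means index (highest - k) was already seen

    def check_and_update(self, index: int) -> bool:
        """Return True if the packet is fresh; record it as seen."""
        if index > self.highest:
            # Packet is ahead of the window: slide the window forward.
            shift = index - self.highest
            self.bitmap = ((self.bitmap << shift) | 1) & ((1 << self.size) - 1)
            self.highest = index
            return True
        offset = self.highest - index
        if offset >= self.size:
            return False  # too old, falls behind the window
        if self.bitmap & (1 << offset):
            return False  # already received: replay
        self.bitmap |= 1 << offset
        return True


window = ReplayWindow()
results = [window.check_and_update(i) for i in (0, 5, 3, 3, 5, 100, 30, 99)]
print(results)
```

Note that index 3 is accepted on first arrival even though it comes after index 5: reordering is tolerated, while duplicates and packets that fall behind the window are rejected.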
Although WebRTC mainly uses the HMAC-SHA1 algorithm for SRTP integrity, it is strongly recommended to prefer PFS (Perfect Forward Secrecy) cipher suites over non-PFS ones and AEAD (Authenticated Encryption with Associated Data) over non-AEAD. Recent WebRTC implementations use DTLS v1.2 with the TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256 cipher suite.
Keys
SRTP uses a key derivation function (KDF) to derive session keys from a master key. All keys in a session are derived from that master key. Because each session has its own unique keys, a compromise of one session leaves the others protected. The master key itself is supplied by a key management protocol, usually ZRTP or MIKEY, although there are other options.
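The derivation step can be sketched as follows. In RFC 3711, each session key is produced by running AES in counter mode under the master key, with an input built from a one-byte label, the packet index divided by the key derivation rate, and the master salt. The sketch below only builds that input block, since the Python standard library has no AES. The function name and the salt value are illustrative assumptions.

```python
# Labels that select which session key is being derived.
LABEL_ENCRYPTION = 0x00
LABEL_AUTHENTICATION = 0x01
LABEL_SALT = 0x02


def kdf_prf_input(master_salt: bytes, label: int, index: int, kdr: int = 0) -> bytes:
    """Build the 128-bit input block for the SRTP key derivation PRF.

    key_id = label || r, where r = index DIV kdr (r = 0 when kdr is 0)
    x      = key_id XOR master_salt
    input  = x * 2^16
    """
    if len(master_salt) != 14:
        raise ValueError("the master salt is expected to be 112 bits (14 bytes)")
    r = index // kdr if kdr else 0
    key_id = ((label & 0xFF) << 48) | (r & (2**48 - 1))
    x = key_id ^ int.from_bytes(master_salt, "big")
    return (x << 16).to_bytes(16, "big")


# Illustrative master salt, not tied to any real session.
master_salt = bytes.fromhex("0ec675ad498afeebb6960b3aafe6")
for name, label in (("encryption", LABEL_ENCRYPTION),
                    ("authentication", LABEL_AUTHENTICATION),
                    ("salt", LABEL_SALT)):
    print(name, kdf_prf_input(master_salt, label, index=0).hex())
```

Each label yields a different input block, so the encryption key, authentication key and session salt all differ even though they come from the same master key and master salt.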
Streams in WebRTC are protected by one of two protocols: SRTP or DTLS (Datagram Transport Layer Security). DTLS encrypts data streams, and SRTP protects media streams. For the SRTP key exchange, however, DTLS-SRTP is used so that man-in-the-middle attacks can be detected. This is detailed in the IETF documents
WebRTC security and
WebRTC security architecture.
SRTCP (Secure Real-time Transport Control Protocol)
SRTP has a sister protocol, the Secure Real-time Transport Control Protocol (SRTCP). SRTCP extends RTCP (Real-time Transport Control Protocol) with the same security features that SRTP adds to RTP, including encryption and authentication. As with SRTP, almost all SRTCP security features can be disabled, with one exception: message authentication is mandatory for SRTCP.
SRTP Pitfalls
SRTP encrypts the RTP packet payload, but not the header extension. This is a weakness, since the header extension may carry sensitive information, for example the audio level of each packet in the media stream. An attacker could use it to infer that two people are talking on the network, so the privacy of the conversation may be violated. This issue was addressed in IETF
Request for Comments: 6904, which defines how SRTP implementations can encrypt header extensions.
In some cases, for example a conference with many participants, an intermediary in the form of an SFM (Selective Forwarding Mixer) may be needed to optimize RTP parameters while forwarding streams. Such an intermediary breaks the principle of end-to-end encryption used in peer-to-peer systems; in other words, the endpoints must “trust” another participant. To get around this limitation, Privacy Enhanced RTP Conferencing (PERC, one of the IETF working groups; translator's note) is working on solutions
like double encryption procedures in SRTP. PERC provides guarantees backed by hop-by-hop and end-to-end encryption in two separate but related contexts. We will cover this in upcoming posts!