Sunday, 11 October 2015

Video Signaling Attributes


To start with an important concept: in video calls, each endpoint advertises its receive capabilities, which the sending endpoint uses to encode. Therefore, video streams can be asymmetric. This differs from audio calls, where both endpoints need to agree on common audio stream parameters (codec, DTMF, etc.).

Video Stream Negotiation

Several attributes are negotiated in the SDP for a video stream.

Bandwidth Attribute

The bandwidth attribute appears in the SDP body as b=<modifier>:<value>. It specifies the maximum receive bandwidth supported by the endpoint.

The bandwidth attribute can be present in the session section and/or media section of the SDP body.
 
There are three types of bandwidth modifier that can appear in the bandwidth line of the SDP body:

  • Transport Independent Application Specific (TIAS), in bps: bandwidth does NOT include the lower layers (e.g. RTP bandwidth only)
  • Application Specific (AS), in kbps: bandwidth includes the lower layers (e.g. TCP/UDP and IP)
  • Conference Total (CT), in kbps: maximum bandwidth that a conference session will use

Example:

o=CiscoSystemsCCM-SIP 161095 1 IN IP4 10.58.9.6
s=SIP Call
b=TIAS:6000000                                    Transport Independent Application Specific bandwidth (RTP) in bits/sec
b=AS:6000                                         Application Specific bandwidth (RTP/UDP/IP) in kbps
t=0 0
m=audio 16444 RTP/AVP 102 103 104 9 105 106 0 8 101
b=TIAS:64000
… attributes of multiple audio codecs in the offer
m=video 16446 RTP/AVP 98 99
b=TIAS:6000000

For this endpoint, the maximum media stream bandwidths that can be received are:

= 6 Mbps for all voice and video streams including UDP and IP headers (AS session bandwidth)
= 64 kbps for voice RTP traffic – not including UDP and IP headers (TIAS audio)
= 6 Mbps for video RTP traffic – not including UDP and IP headers (TIAS video)
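
The limits listed above can be read out of an SDP body mechanically. The following is a minimal sketch, assuming the SDP is available as a plain string with the inline annotations removed. The function name `parse_bandwidth` and the layout of the returned dictionary are illustrative only and do not come from any SDP library. Every value is normalized to bits per second (TIAS is already in bps, AS and CT are in kbps).

```python
def parse_bandwidth(sdp: str) -> dict:
    """Collect b= lines per SDP section, normalized to bits per second.

    Keys are "session" or the media type taken from the m= line
    (a hypothetical layout chosen for this sketch).
    """
    # TIAS is expressed in bps; AS and CT are expressed in kbps.
    scale = {"TIAS": 1, "AS": 1000, "CT": 1000}
    result = {"session": {}}
    section = "session"
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            # "m=video 16446 RTP/AVP 98 99" -> section "video"
            section = line[2:].split()[0]
            result.setdefault(section, {})
        elif line.startswith("b="):
            modifier, _, value = line[2:].partition(":")
            # Unknown modifiers are left unscaled (an assumption of this sketch).
            bps = int(value.split()[0]) * scale.get(modifier, 1)
            result[section][modifier] = bps
    return result


example = """\
o=CiscoSystemsCCM-SIP 161095 1 IN IP4 10.58.9.6
s=SIP Call
b=TIAS:6000000
b=AS:6000
t=0 0
m=audio 16444 RTP/AVP 102 103 104 9 105 106 0 8 101
b=TIAS:64000
m=video 16446 RTP/AVP 98 99
b=TIAS:6000000
"""
limits = parse_bandwidth(example)
print(limits)
```

For the example SDP, the session-level AS value of 6000 kbps and the session-level TIAS value both come out as 6,000,000 bps, while the audio TIAS stays at 64,000 bps.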

Video Codec Attributes

The video codecs advertised by each endpoint are its desired receive codecs. In the SDP body below, multiple codecs are advertised in order of preference (98 is H264, followed by 99, which is H263).

m=video 16446 RTP/AVP 98 99
c=IN IP4 10.58.9.86
b=TIAS:6000000
a=rtpmap:98 H264/90000                            H.264/ Sampling Rate 90000 Hz
a=fmtp:98 profile-level-id=428016;packetization-mode=1;max-mbps=245000;max-fs=9000;max-cpb=200;max-br=5000;max-rcmd-nalu-size=3456000;max-smbps=245000;max-fps=6000
a=rtpmap:99 H263-1998/90000                       H.263 version 2/ Sampling Rate 90000 Hz
a=fmtp:99 QCIF=1;CIF=1;CIF4=1;CUSTOM=352,240,1
a=rtcp-fb:* nack pli
a=rtcp-fb:* ccm tmmbr

H264

The H264 codec has two layers: the Video Coding Layer (VCL) and the Network Abstraction Layer (NAL).

The VCL creates a coded representation of the video image by partitioning each video frame into slices, with each slice partitioned into macroblocks (rectangular blocks of pixels). The slices are carried in NAL units (NALUs). Depending on the packetization mode advertised by the endpoints, a single RTP packet can encapsulate one or multiple NALUs.
Note: Multiple RTP packets can represent a single frame.

Let's go deeper in H264.

a=rtpmap:98 H264/90000
a=fmtp:98 profile-level-id=428016;packetization-mode=1;max-mbps=245000;max-fs=9000;max-cpb=200;max-br=5000;max-rcmd-nalu-size=3456000;max-smbps=245000;max-fps=6000

profile-level-id=428016             The Profile-Level-ID describes the minimum set of features/capabilities that are supported by this endpoint

packetization-mode=1                These parameters describe the features and capabilities beyond those of the profile-level-id that are supported by this endpoint
max-mbps=245000
max-fs=9000
max-cpb=200
max-br=5000
max-rcmd-nalu-size=3456000
max-smbps=245000
max-fps=6000

Profile-Level-ID consists of 6 hex digits. The first 4 hex digits define the profile-id (the first two are the profile indicator and the next two are constraint flags), while the last 2 hex digits define the level. In our case the profile-id is 4280 while the level is 16. The Profile-Level-ID must be symmetrical for the call.

Profile-ID describes the subset of coding tools that the endpoint's codec supports. The profile ID 4280 represents the Baseline profile (BP, 66; hex 42 is 66 in decimal), which supports encoding features such as Flexible Macroblock Ordering, Arbitrary Slice Ordering, and Redundant Slices.

The level describes the resolution, frame rate and bit rate that the endpoint can support. Level 16 in hex is 22 in decimal, which represents level 2.2, for example 352 x 480 pixels at 30 frames per second.
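
The hex arithmetic above can be checked with a few lines of code. This is a minimal sketch, assuming the value is always exactly 6 hex digits. The function name `decode_profile_level_id` and the returned field names are illustrative. It splits the first four digits into the profile indicator and the constraint-flag byte, and it does not handle special cases such as level 1b.

```python
def decode_profile_level_id(value: str) -> dict:
    """Split a 6-hex-digit profile-level-id into its three bytes."""
    if len(value) != 6:
        raise ValueError("profile-level-id must be 6 hex digits")
    profile_idc = int(value[0:2], 16)   # 0x42 -> 66 (Baseline)
    constraints = int(value[2:4], 16)   # 0x80 -> constraint flag bits
    level_idc = int(value[4:6], 16)     # 0x16 -> 22
    return {
        "profile_idc": profile_idc,
        "constraint_flags": constraints,
        # The level number is the decimal value divided by 10 (22 -> 2.2).
        # Level 1b is a special case that this sketch ignores.
        "level": level_idc / 10,
    }


decoded = decode_profile_level_id("428016")
print(decoded)
```

For 428016 this yields profile indicator 66 and level 2.2, matching the values discussed above.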

packetization-mode=1
Values (0, 1, 2):
0 = a single NALU is sent in each RTP packet; no fragments
1 = multiple NALUs can be sent in decoding order; fragments allowed
2 = multiple NALUs can be sent out of decoding order; fragments allowed
The negotiated packetization mode for the call must be symmetrical.
max-mbps=245000
Max Decoding speed = Max Macroblocks/sec = 245000 (Baseline profile level 2.2 value = 20250)
max-fs=9000
Max Frame Size = 9000 Macroblocks (Baseline profile level 2.2 value = 1620)
max-cpb=200
Max Coded Picture Buffer size = 200 kbits (Baseline profile level 2.2 value = 4000 kbits)
max-br=5000
Max video bit rate = 5000 kbps, Baseline profile level 2.2 value = 4000 kbps
max-rcmd-nalu-size=3456000
Max NALU packet size (bytes) that the receiver can handle
max-smbps=245000
Max Static Macroblock processing rate – macroblocks/second
max-fps=6000
Max Frames Per Second in 1/100s of a frame/second = 60 fps (Baseline profile level 2.2 value = 30 fps)
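
To see what these limits mean in practice, the sketch below parses an fmtp line and checks whether a given resolution and frame rate stays within max-fs and max-mbps. It is a simplification under stated assumptions: the helper names `parse_fmtp` and `fits` are made up for this example, only max-fs and max-mbps are checked, and the other parameters (max-br, max-cpb, max-smbps) are ignored.

```python
def parse_fmtp(line: str) -> dict:
    """Parse 'a=fmtp:<pt> k=v;k=v;...' into a dict of string values."""
    _, _, params = line.strip().partition(" ")
    pairs = (p.strip() for p in params.split(";"))
    return dict(p.split("=", 1) for p in pairs if "=" in p)


def fits(params: dict, width: int, height: int, fps: int) -> bool:
    """Check a resolution and frame rate against max-fs and max-mbps."""
    # A macroblock covers 16x16 pixels; partial macroblocks round up.
    macroblocks = ((width + 15) // 16) * ((height + 15) // 16)
    return (macroblocks <= int(params["max-fs"])
            and macroblocks * fps <= int(params["max-mbps"]))


fmtp = ("a=fmtp:98 profile-level-id=428016;packetization-mode=1;"
        "max-mbps=245000;max-fs=9000;max-cpb=200;max-br=5000;"
        "max-rcmd-nalu-size=3456000;max-smbps=245000;max-fps=6000")
params = parse_fmtp(fmtp)
max_fps = int(params["max-fps"]) / 100   # value is in 1/100 frames per second

print(max_fps)
print(fits(params, 1920, 1080, 30))
print(fits(params, 1920, 1080, 60))
```

With the values above, 1920 x 1080 at 30 fps fits (8160 macroblocks per frame, 244800 macroblocks per second), while the same resolution at 60 fps exceeds max-mbps.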

Offer (H.264 and H.263 Offered)

a=rtpmap:98 H264/90000
a=fmtp:98 profile-level-id=428016;packetization-mode=1;max-mbps=245000;max-fs=9000;max-cpb=200; max-br=5000; max-rcmd-nalu-size=3456000;max-smbps=245000;max-fps=6000
a=rtpmap:99 H263-1998/90000
a=fmtp:99 QCIF=1;CIF=1;CIF4=1;CUSTOM=352,240,1

Answer (H.264 selected; carries both symmetric and asymmetric attributes)

a=rtpmap:98 H264/90000
a=fmtp:98 profile-level-id=428016;packetization-mode=1;max-mbps=108000;max-fs=3600;max-cpb=200; max-br=5000; max-rcmd-nalu-size=1382400;max-smbps=108000;max-fps=6000
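
One way to see which attributes stayed the same and which changed between the offer and the answer is to compare the two fmtp lines parameter by parameter. The sketch below does this for the H.264 lines shown above. The helper names `fmtp_to_dict` and `compare` are hypothetical. Note that a parameter with equal values on both sides is not necessarily one that must be symmetric; it may simply happen to match.

```python
def fmtp_to_dict(line: str) -> dict:
    """Parse 'a=fmtp:<pt> k=v; k=v; ...' into a dict of string values."""
    _, _, params = line.strip().partition(" ")
    pairs = (p.strip() for p in params.split(";"))
    return dict(p.split("=", 1) for p in pairs if "=" in p)


def compare(offer: dict, answer: dict) -> tuple:
    """Return (parameters with equal values, parameters that differ)."""
    shared = [k for k in offer if k in answer]
    same = sorted(k for k in shared if offer[k] == answer[k])
    differ = sorted(k for k in shared if offer[k] != answer[k])
    return same, differ


offer = fmtp_to_dict(
    "a=fmtp:98 profile-level-id=428016;packetization-mode=1;max-mbps=245000;"
    "max-fs=9000;max-cpb=200; max-br=5000; max-rcmd-nalu-size=3456000;"
    "max-smbps=245000;max-fps=6000")
answer = fmtp_to_dict(
    "a=fmtp:98 profile-level-id=428016;packetization-mode=1;max-mbps=108000;"
    "max-fs=3600;max-cpb=200; max-br=5000; max-rcmd-nalu-size=1382400;"
    "max-smbps=108000;max-fps=6000")

same, differ = compare(offer, answer)
print("same:  ", same)
print("differ:", differ)
```

Here profile-level-id and packetization-mode match, as required, while max-mbps, max-fs, max-rcmd-nalu-size and max-smbps differ between the two directions.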

RTCP Attributes

Video endpoints use RTCP packets as a feedback mechanism for rate adaptation when packet loss or congestion is encountered. The RTCP feedback mechanism is negotiated during call establishment, as part of video negotiation.

Looking at the SDP body, we can see the following RTCP feedback attributes:

a=rtcp-fb:* nack pli
“rtcp-fb” RTP Control Protocol (RTCP) - Feedback
“*” RTCP-Feedback for any of the offered video codecs
NACK – Negative Acknowledgement – indicates the loss of one or more RTP packets
PLI – Picture Loss Indication
a=rtcp-fb:* ccm tmmbr
“rtcp-fb” RTCP-Feedback
“*” RTCP-Feedback for any of the offered video codecs
“ccm” indicates support of codec control using RTCP feedback messages
“tmmbr” indicates support of the Temporary Maximum Media Stream Bit Rate Request/Notification

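The structure of the rtcp-fb lines (payload type, feedback type, optional parameter) can be extracted as follows. This is a minimal sketch assuming at most one parameter per line. The function name `parse_rtcp_fb` and the shape of the result are illustrative only.

```python
def parse_rtcp_fb(sdp: str) -> dict:
    """Map payload type (or '*') to a list of (feedback type, parameter)."""
    prefix = "a=rtcp-fb:"
    feedback = {}
    for raw in sdp.splitlines():
        line = raw.strip()
        if not line.startswith(prefix):
            continue
        fields = line[len(prefix):].split()
        if len(fields) < 2:
            continue  # malformed line: no feedback type present
        payload_type, fb_type = fields[0], fields[1]
        # The parameter (e.g. "pli", "tmmbr") is optional.
        parameter = fields[2] if len(fields) > 2 else None
        feedback.setdefault(payload_type, []).append((fb_type, parameter))
    return feedback


fb = parse_rtcp_fb("a=rtcp-fb:* nack pli\na=rtcp-fb:* ccm tmmbr\n")
print(fb)
```
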
BFCP Video Attributes

BFCP video stream negotiation is the same as main video stream negotiation. However, two attributes in the SDP body distinguish BFCP video from main video: content and label.

For the main video stream the content attribute is main, while for the BFCP video stream it is slides (desktop sharing).

a=content:main
a=label:11

Or

a=label:12
a=content:slides

The label attribute of the BFCP video stream is very important: it is mapped to the floor ID in the BFCP attributes (the mstrm value on the a=floorid line).

a=rtpmap:99 H263-1998/90000
a=fmtp:99 QCIF=1;CIF=1;CIF4=1;CUSTOM=352,240,1
a=label:12
a=content:slides
a=rtcp-fb:* nack pli
a=rtcp-fb:* ccm tmmbr
m=application 5070 UDP/BFCP *
c=IN IP4 10.58.9.86
a=floorctrl:c-s
a=floorid:2 mstrm:12
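
The label-to-floor mapping can be followed programmatically: find the mstrm value on the a=floorid line, then look up the media section whose a=label matches it. The sketch below is illustrative only. The function name `map_floors` is hypothetical, the m=video lines in the sample are assumed for the example (the second video port, 16448, is made up), and only a single stream per floor is handled.

```python
def map_floors(sdp: str) -> dict:
    """Return {floor id: content value of the media stream it controls}."""
    sections = [{}]   # one attribute dict per section; index 0 is session level
    floors = {}       # floor id -> label of the controlled stream
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            sections.append({})
        elif line.startswith("a=floorid:"):
            # "a=floorid:2 mstrm:12" -> floor "2" controls the stream labelled "12"
            floor, _, stream = line[len("a=floorid:"):].partition(" mstrm:")
            floors[floor] = stream.strip()
        elif line.startswith(("a=label:", "a=content:")):
            key, _, value = line[2:].partition(":")
            sections[-1][key] = value
    content_by_label = {s["label"]: s.get("content")
                        for s in sections if "label" in s}
    return {floor: content_by_label.get(label)
            for floor, label in floors.items()}


# Sample SDP: the m=video lines are assumed for illustration.
sample = """\
m=video 16446 RTP/AVP 98
a=content:main
a=label:11
m=video 16448 RTP/AVP 99
a=label:12
a=content:slides
m=application 5070 UDP/BFCP *
a=floorctrl:c-s
a=floorid:2 mstrm:12
"""
floor_map = map_floors(sample)
print(floor_map)
```

In the sample, floor 2 points at label 12, which is the stream carrying content slides.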
