Using the Network Bandwidth Profile API

Overview

The Network Bandwidth Profile API makes it possible to specify how the downlink bandwidth of a Group Room Participant should be distributed among its subscribed tracks. Using this API developers can assign higher bandwidth to higher priority tracks, protect audio quality and keep the consumed network and battery resources under control.

The Network Bandwidth Profile API is only available for Group Rooms.

Beta

The Network Bandwidth Profile API is currently available as a BETA product. Some features are not yet implemented and others may be changed before the product is declared as Generally Available. BETA products are not covered by a Twilio SLA. See this article for more information on Beta product support.

What is the Network Bandwidth Profile API

Twilio Group Rooms are based on the SFU (Selective Forwarding Unit) architecture, where Participants commonly publish a maximum of two video tracks (e.g. webcam and/or screen share) but may subscribe to many. The number of subscribed tracks typically grows in proportion to N-1, where N is the total number of Participants in the Room. When scalable video codecs (such as VP8 Simulcast) are used, each track can be forwarded at multiple qualities, so the SFU needs to decide which quality is assigned to each track. In other words, it needs to determine how the available downlink bandwidth is allocated among those tracks. Doing so requires meeting the following goals:

  • To guarantee that the network does not get overloaded, which may happen if all tracks are subscribed with their maximum quality.
  • To guarantee that videos rendered using larger UI areas are allocated more bandwidth than thumbnail videos.
  • To avoid mobile devices consuming more bandwidth or battery than what is required for each specific use-case.
  • To limit the number of displayed video tracks to what is appropriate for every use-case.
  • To preserve audio quality in case of severe congestion.

The Network Bandwidth Profile API enables developers to control how the available downlink bandwidth of a participant is split among its subscribed tracks.

The Network Bandwidth Profile API has been created to address these issues. The following sections explain how to use it.

SDK Compatibility

The Network Bandwidth Profile API is only available in Group Rooms (including Small Group Rooms). The following table illustrates current support:

Twilio Video SDK | Network Bandwidth Profile API Support (Group Rooms only)
JavaScript | 2.0.0-beta14+
Android | Coming soon
iOS | Coming soon

Network Bandwidth Profiles default behavior

By default, Bandwidth Profiling is disabled. This means that if you are not interested in Network Bandwidth Profiles, or if you don’t use the Network Bandwidth Profile API, you should not experience any changes in your application behavior. To activate Bandwidth Profiling for a Participant, you just need to specify a bandwidthProfile at connect time. The following section explains how to do it.

Specifying a Network Bandwidth Profile

A Network Bandwidth Profile specifies how a Participant’s downlink bandwidth is consumed. Notice that it is per-Participant, meaning that different Participants may have different Network Bandwidth Profiles. This also means that Bandwidth Profiling can be active for some Participants and inactive for others in the same Room. Developers specify a Network Bandwidth Profile as a connect option.

JavaScript SDK (v2.0.0-beta14+)

// Specifying a Network Bandwidth Profile at connect time.
// Assumes `token` holds a valid Access Token and that this code runs inside
// an async function (or a module that supports top-level await).
const { connect } = require('twilio-video');

const room = await connect(token, {
  name: "my-new-room",
  bandwidthProfile: {
    video: {
      mode: 'collaboration',
      maxSubscriptionBitrate: 2400000,
      dominantSpeakerPriority: 'high',
      maxTracks: 3,
      renderDimensions: {
        high: {width: 1080, height: 720},
        standard: {width: 640, height: 480},
        low: {width: 320, height: 240}
      }
    }
  }
});

Android SDK

Coming soon

iOS SDK

Coming soon

The following table summarizes the meaning of the different bandwidthProfile parameters:

Parameter (bandwidthProfile.video) | Meaning
mode | Specifies the algorithm that controls the bandwidth allocation. Possible values are: grid, collaboration and presentation. Defaults to grid. See "Understanding mode" below.
maxSubscriptionBitrate | Specifies the maximum downlink video bitrate this Participant may consume, in bps. For mobile devices it defaults to 2,400,000; for desktop browsers it defaults to 8,000,000 in Group Rooms and 4,000,000 in Small Group Rooms. See "Understanding maxSubscriptionBitrate" below.
dominantSpeakerPriority | Specifies the minimum priority that will be assigned to the video tracks published by the dominant speaker. Possible values are high, standard and low. Defaults to standard. See "Understanding dominantSpeakerPriority" below.
maxTracks | Specifies the maximum number of visible video tracks. Track filtering is based on priority (first) and on an N-Loudest policy (second). Defaults to 0 (i.e. unlimited). See "Understanding maxTracks" below.
renderDimensions | Specifies the desired UI display dimensions used to render video tracks. Dimensions are specified per priority, in pixels. Defaults are:
  • HD (1280x720) for high.
  • VGA (640x480) for standard.
  • QCIF (176x144) for low.
See "Understanding renderDimensions" below.
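
As a rough illustration of the defaults listed above, the following sketch fills in any bandwidthProfile.video field the caller leaves out. The helper withVideoDefaults and the constant VIDEO_DEFAULTS are hypothetical names invented for this example and are not part of the Twilio SDK. The maxSubscriptionBitrate default is deliberately left out because it depends on the platform and Room type.

```javascript
// Hypothetical helper (not part of twilio-video): applies the documented
// defaults to a partial bandwidthProfile.video object.
const VIDEO_DEFAULTS = {
  mode: 'grid',
  dominantSpeakerPriority: 'standard',
  maxTracks: 0, // 0 means "unlimited"
  renderDimensions: {
    high: { width: 1280, height: 720 },    // HD
    standard: { width: 640, height: 480 }, // VGA
    low: { width: 176, height: 144 }       // QCIF
  }
};

function withVideoDefaults(video = {}) {
  return {
    ...VIDEO_DEFAULTS,
    ...video,
    // Merge per-priority dimensions so a partial override keeps the rest.
    renderDimensions: {
      ...VIDEO_DEFAULTS.renderDimensions,
      ...(video.renderDimensions || {})
    }
  };
}

const videoProfile = withVideoDefaults({ mode: 'collaboration', maxTracks: 3 });
// videoProfile.mode is 'collaboration'; dominantSpeakerPriority stays 'standard'.
```

The resulting object could then be passed as bandwidthProfile.video in the connect options.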

Understanding Track Switch-Offs

When a Participant’s downlink bandwidth is insufficient, congestion and packet loss may appear, causing significant degradation of both audio and video quality. To avoid congestion, the Network Bandwidth Profile API algorithms monitor and estimate the available downlink bandwidth and reduce video track quality accordingly. However, any real-time video track has a lower limit below which its bandwidth cannot be reduced further. In the Twilio Video engine this limit is around 30 kbps for VP8 and H.264 and approximately 60 kbps for VP8 Simulcast. Hence, if bandwidth keeps degrading, at some point all video tracks will reach that limit. Beyond that point, congestion will be severe enough that the application becomes unusable. To avoid this problem, Twilio bandwidth allocation algorithms may completely switch-off the less relevant video tracks.

Track Switch-offs

What track switching-off means

When a video track is switched-off for a Participant, Twilio’s SFU stops sending all of that track’s downlink traffic. This means that, for that specific Participant, the track will consume zero bandwidth. The following information may be useful for managing track switch-offs in your application:

  • Track switch-offs are always subscriber side. In other words, only the remote tracks received at a Participant can be switched-off.
  • Track switch-offs happen “per-Participant”. This means that a track may be off for a given Participant (e.g. one having poor bandwidth) but on for others.
  • Switched-off tracks will appear as frozen video in the UI (i.e. will show the last received frame). Hence, a switched-off track will not be rendered as a black video.

How Track switch-offs work

The algorithms that control track switch-off enforce the following rules:

  • Audio tracks are never switched-off.
  • The Network Bandwidth Profile mode influences the switch-off behavior. Check the mode documentation below for further details.
  • Tracks with higher priority will always be switched off last. Hence, for example, if a track with priority high is switched off, then it is guaranteed that all tracks with priority standard or low have been switched off previously.
  • Among tracks with the same priority, the track associated with the participant with higher speaking activity will be switched off last. Hence, for example, if two standard tracks are candidates to be switched off, the one associated with the participant who speaks less will be switched off first.
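
The ordering rules above can be sketched as a simple sort. This is only an illustration of the documented ordering, not Twilio's implementation; the function switchOffOrder and the numeric speakingActivity field are hypothetical, and the influence of the mode parameter is ignored.

```javascript
// Hypothetical ranking of the three documented priorities.
const SWITCH_OFF_RANK = { low: 0, standard: 1, high: 2 };

// Returns the video tracks in the order they would be switched off: lowest
// priority first and, within a priority, the quietest publisher first.
// Audio tracks are never switched off, so only video tracks belong here.
function switchOffOrder(videoTracks) {
  return [...videoTracks].sort((a, b) =>
    SWITCH_OFF_RANK[a.priority] - SWITCH_OFF_RANK[b.priority] ||
    a.speakingActivity - b.speakingActivity
  );
}

const offOrder = switchOffOrder([
  { name: 'screen', priority: 'high', speakingActivity: 0.1 },
  { name: 'bob-cam', priority: 'standard', speakingActivity: 0.7 },
  { name: 'carl-cam', priority: 'standard', speakingActivity: 0.2 }
]).map(track => track.name);
// offOrder is ['carl-cam', 'bob-cam', 'screen']
```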

Track Switch-ons

After a track has been switched-off, it may eventually be switched on again. Track switch-on happens based on the following principles:

  • As with the switch-off process, track switch-ons happen “per-Participant”.
  • To avoid oscillations, the off-to-on switching process has hysteresis. This means that it’s not enough for the available bandwidth to recover: the recovery must be sustained for a reasonable period of time (typically around 20 seconds) before the track is switched back on.
  • When multiple tracks are off, the track with highest priority will be switched on first.
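
The hysteresis described above can be pictured with a small timer-based gate. This is a minimal sketch under stated assumptions: the class SwitchOnGate, its parameters, and the 60,000 bps threshold are hypothetical, and the real logic runs inside Twilio's media servers rather than in your application.

```javascript
// Illustrative only: a track may come back on once the bandwidth estimate has
// stayed at or above the required level for holdMs without interruption.
class SwitchOnGate {
  constructor(requiredBps, holdMs = 20000) {
    this.requiredBps = requiredBps;
    this.holdMs = holdMs;
    this.recoveredSince = null; // time at which bandwidth first recovered
  }

  // Feed one bandwidth estimate; returns true when the track may switch on.
  update(estimatedBps, nowMs) {
    if (estimatedBps < this.requiredBps) {
      this.recoveredSince = null; // recovery interrupted, restart the clock
      return false;
    }
    if (this.recoveredSince === null) this.recoveredSince = nowMs;
    return nowMs - this.recoveredSince >= this.holdMs;
  }
}

const gate = new SwitchOnGate(60000);
gate.update(100000, 0);     // false: recovery has just started
gate.update(100000, 10000); // false: only 10 s of sustained recovery
gate.update(100000, 20000); // true: 20 s of sustained recovery
```

A dip below the threshold at any point resets the clock, which is what prevents rapid on/off oscillation.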

Programming with Video Track Switch-offs/ons

Each time a track is switched off or on for a Participant, the Twilio Media Server sends a notification. Developers can listen for these events, which are emitted by the RemoteVideoTrack object, as the following code snippets illustrate:

JavaScript SDK (v2.0.0-beta14+)

const assert = require('assert');

// remoteTrackPublication is assumed to be a RemoteTrackPublication obtained
// from a RemoteParticipant in the connected Room.
remoteTrackPublication.on('subscribed', remoteTrack => {
  remoteTrack.on('switchedOff', () => {
    // You may update your UI accordingly.
    // You can also check whether a particular RemoteTrack is switched off.
    assert.equal(remoteTrack.isSwitchedOff, true);
    console.log(`The RemoteTrack ${remoteTrack.name} was switched off`);
  });

  remoteTrack.on('switchedOn', () => {
    // You may update your UI accordingly.
    // Once the track is back on, isSwitchedOff is false again.
    assert.equal(remoteTrack.isSwitchedOff, false);
    console.log(`The RemoteTrack ${remoteTrack.name} was switched on`);
  });
});

Android SDK

Coming soon

iOS SDK

Coming soon

Working with Network Bandwidth Profiles

Understanding mode

The main objective of the Network Bandwidth Profile API is to split the available downlink bandwidth among a Participant’s subscribed video tracks. This is performed by an algorithm that determines how much bandwidth is allocated to each track as a function of the track priority and of the specified Network Bandwidth Profile. The mode parameter controls the behavior of the algorithm. Currently, mode can take the following values: grid (default), collaboration and presentation.

Using grid mode

Grid mode should be used in use-cases where all subscribed video tracks are equally important. When this mode is used, the available bandwidth is split uniformly among all the tracks regardless of their priorities. This makes grid mode useful for UIs displaying all video tracks in a matrix, so that none of them is enhanced over the rest.

Grid mode should also be used when scalable video codecs cannot be used. In other words: if most participants in a room do not use VP8 Simulcast then, most probably, grid mode will deliver a better quality of experience than any other mode.

When a participant is configured to use grid mode, the available downlink bandwidth is split uniformly among all the received tracks, which are treated as equals independently of their priority. Hence, when this mode is used, developers will typically use the same priority for all video tracks, or set the same renderDimensions for all the priorities in use.

Grid mode is useful in use-cases where all the video tracks are displayed in a matrix so that none of them is enhanced over the rest.

The following rules of thumb may help you understand how grid mode works:

  • If bandwidth is large enough, all video tracks will be assigned their maximum bandwidth. This maximum depends on the renderDimensions and more specifically, is proportional to the display area (i.e. width x height).
  • If bandwidth decreases, the available amount will be split equally among all the tracks. Hence, all tracks drop their bandwidth by the same fraction at the same time.
  • If bandwidth decreases further, video tracks will start switching off in lower to higher priority order. Notice that the track with highest priority will never be switched off.
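
The first two rules of thumb above can be approximated with a few lines of code. This is a simplified sketch, not Twilio's algorithm: the function allocateGrid and the per-track maxBps field are hypothetical, leftover bandwidth is not redistributed, and switch-offs are not modeled.

```javascript
// Illustrative grid-mode split: every track gets an equal share of the
// available bandwidth, capped at the maximum it needs for its render size.
function allocateGrid(availableBps, tracks) {
  const share = availableBps / tracks.length;
  return tracks.map(track => ({
    name: track.name,
    bps: Math.min(track.maxBps, share)
  }));
}

const gridTracks = [
  { name: 'alice', maxBps: 1000000 },
  { name: 'bob', maxBps: 1000000 },
  { name: 'carl', maxBps: 1000000 }
];

const plenty = allocateGrid(4000000, gridTracks);   // each track gets 1,000,000 bps
const congested = allocateGrid(1500000, gridTracks); // each track gets 500,000 bps
```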

Using collaboration mode

Collaboration mode is for applications where some video tracks are more important than others but we still want to keep the rest of the video tracks visible. An example is a videoconferencing meeting with the typical UI layout where the dominant speaker is rendered in the central area and the rest of the participants are rendered as thumbnails in a bottom row.

When a participant is configured to use collaboration mode, the available bandwidth is split in proportion to the render area (i.e. renderDimensions) assigned to each track. Hence, more relevant tracks are going to be allocated higher bandwidth. Due to this, collaboration mode requires Simulcast to be used by most video publishers (or at least by higher relevance publishers). In other words, unless there is a good reason for it, developers should avoid using collaboration mode together with plain VP8 or H.264 codecs.

Collaboration mode is recommended for videoconferencing meetings where higher priority tracks are depicted with higher relevance in the UI.

The following information may help you understand how collaboration mode works:

  • If bandwidth is large enough, all video tracks will be assigned their maximum bandwidth. This maximum depends on the renderDimensions and more specifically, is proportional to the display area (i.e. width x height).
  • If bandwidth decreases, the available amount will be split among tracks in proportion to the renderDimensions so that tracks having double area will be assigned double bandwidth.
  • If bandwidth decreases even further, the allocated amount will go down in that proportion while trying to keep all tracks visible. This means that, if necessary, higher priority tracks will decrease their quality to avoid lower priority switch-offs. If bandwidth continues decreasing, at some point all tracks will get to their minimum and switch-offs will start in lower to higher priority order. Notice that the track with highest priority will never be switched off.
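
The proportional split described above can be sketched as follows. This is only an approximation of the documented behavior: the function allocateCollaboration and the maxBps field are hypothetical, bandwidth left over by capped tracks is not redistributed, and minimum bitrates and switch-offs are ignored.

```javascript
// Illustrative collaboration-mode split: each track's share is proportional
// to its render area (width x height), capped at its maximum bitrate.
function allocateCollaboration(availableBps, tracks) {
  const totalArea = tracks.reduce((sum, t) => sum + t.width * t.height, 0);
  return tracks.map(t => {
    const share = availableBps * (t.width * t.height) / totalArea;
    return { name: t.name, bps: Math.min(t.maxBps, share) };
  });
}

const collabResult = allocateCollaboration(500000, [
  { name: 'speaker', width: 640, height: 480, maxBps: 2000000 },
  { name: 'thumbnail', width: 320, height: 240, maxBps: 2000000 }
]);
// The speaker has 4x the area of the thumbnail, so it gets 4x the bandwidth:
// speaker 400,000 bps, thumbnail 100,000 bps.
```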

Using presentation mode

Presentation mode is for use-cases where some video tracks are critical for the end-user experience and must be preserved at any cost. An example is a webinar where a speaker shares a presentation with an audience of viewers. In this case, the viewers' webcams should rather be switched-off if the quality of the webinar screen share is at risk.

When a participant is configured to use presentation mode, higher priority tracks are assigned all the bandwidth they require: lower priority tracks are allocated bandwidth only once all the tracks of the higher priority are at their maximum. As with collaboration, this mode requires most participants to publish tracks (and specifically the highest priority track) with a scalable video codec such as VP8 Simulcast.

Note also that, if bandwidth is large enough, collaboration and presentation behave equivalently and assign the maximum bandwidth to each track in proportion to its renderDimensions. However, when it's not possible to allocate all that bandwidth, the two modes diverge. In that case, presentation aims to preserve the quality of the highest priority video track, while collaboration gives higher priority tracks higher quality only if the continuity of the rest of the tracks is not compromised.

When presentation mode is used, if the available bandwidth decreases the allocation algorithm tries to preserve highest priority video track quality, even if it’s at the cost of completely switching-off lower priority tracks.

The following may help you understand how presentation mode works:

  • If bandwidth is large enough, all video tracks will be assigned their maximum bandwidth. This maximum depends on the renderDimensions and more specifically, is proportional to the display area (i.e. width x height).
  • If bandwidth decreases, higher priority video tracks will remain at their maximum while lower priority tracks will decrease, even if that decrease causes them to switch-off.
  • If bandwidth decreases even further, lower priority tracks will be switched-off before the bandwidth of higher priority tracks is reduced. Only when all lower priority tracks are switched-off may the higher priority tracks decrease their allocation.
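
The behavior described above resembles a greedy allocation in priority order, which the following sketch illustrates. It is a simplification under stated assumptions, not Twilio's algorithm: the function allocatePresentation and the maxBps field are hypothetical, and a track is treated as switched off only when nothing at all is left for it, whereas the real engine has a minimum bitrate per track.

```javascript
// Hypothetical ranking of the three documented priorities.
const PRESENTATION_RANK = { low: 0, standard: 1, high: 2 };

// Illustrative presentation-mode split: serve tracks from highest to lowest
// priority, giving each one its maximum until the bandwidth runs out.
function allocatePresentation(availableBps, tracks) {
  let remaining = availableBps;
  return [...tracks]
    .sort((a, b) => PRESENTATION_RANK[b.priority] - PRESENTATION_RANK[a.priority])
    .map(track => {
      const bps = Math.min(track.maxBps, remaining);
      remaining -= bps;
      return { name: track.name, bps }; // bps === 0 models a switched-off track
    });
}

const presentationResult = allocatePresentation(2300000, [
  { name: 'viewer-1', priority: 'standard', maxBps: 500000 },
  { name: 'screen-share', priority: 'high', maxBps: 2000000 },
  { name: 'viewer-2', priority: 'standard', maxBps: 500000 }
]);
// screen-share keeps its full 2,000,000 bps, viewer-1 gets the remaining
// 300,000 bps, and viewer-2 gets 0 (switched off).
```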

Understanding maxSubscriptionBitrate

Sometimes developers want to limit the total downlink bandwidth consumed by their applications. There may be many reasons for this:

  • To minimize the battery consumption on mobile devices.
  • To save costs.
  • To reserve higher qualities to premium users.
  • etc.

Using maxSubscriptionBitrate developers can limit the total downlink bandwidth. This may be useful in applications that want to control the consumed network or battery resources.

When maxSubscriptionBitrate is set for a Participant, that Participant's video downlink will never consume more than the specified value, expressed in bps. Note, however, that it may consume less whenever the actual available network bandwidth is below that value.

By design, the maximum downlink bandwidth is capped at 8 Mbps in Group Rooms and at 4 Mbps in Small Group Rooms. Hence, values of maxSubscriptionBitrate over those limits will have no effect. The default values are:

  • maxSubscriptionBitrate: 0 (meaning “no limit”), in desktop browsers. Hence, desktop browsers default to the above mentioned caps.
  • maxSubscriptionBitrate: 2,400,000 in mobile SDKs.

It is important to note that using maxSubscriptionBitrate to set a bandwidth constraint may cause switch-offs in your video tracks. For example, if you work in presentation mode with a high priority video track with default renderDimensions and set maxSubscriptionBitrate to, say, 500,000, then you will probably experience permanent track switch-offs for all the tracks except the main one.
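
The interaction between maxSubscriptionBitrate and the Room caps described above can be summarized in a small helper. The function effectiveMaxSubscriptionBitrate and the 'group' / 'group-small' labels are hypothetical names used only for this sketch; the cap values are the ones stated in this section.

```javascript
// Downlink caps stated above, in bps. The keys are labels for this sketch.
const DOWNLINK_CAP_BPS = { 'group': 8000000, 'group-small': 4000000 };

// Returns the video downlink limit that would actually apply, in bps.
function effectiveMaxSubscriptionBitrate(requestedBps, roomType) {
  const cap = DOWNLINK_CAP_BPS[roomType];
  // 0 means "no limit", so only the Room cap applies.
  return requestedBps === 0 ? cap : Math.min(requestedBps, cap);
}

effectiveMaxSubscriptionBitrate(0, 'group');              // 8000000
effectiveMaxSubscriptionBitrate(10000000, 'group-small'); // 4000000
effectiveMaxSubscriptionBitrate(2400000, 'group');        // 2400000
```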

Understanding dominantSpeakerPriority

The Dominant Speaker refers to the Participant having highest audio activity at a given time. You can activate dominant speaker detection following Twilio’s official documentation. This feature is useful when you want to enhance the dominant speaker in your UI: for example, by rendering their video in the central area and at a larger size. To do this appropriately, the Network Bandwidth Profile API should be used to set a higher priority for the dominant speaker's video tracks. However, as the dominant speaker changes dynamically, it may be hard to figure out which tracks should be prioritized at any given time. To solve this problem, the Network Bandwidth Profile API exposes the dominantSpeakerPriority parameter. Using it, developers can set the minimum priority that should be automatically assigned to the dominant speaker's video tracks.

To understand how dominantSpeakerPriority works imagine a Group Room with dominant speaker detection activated where a Network Bandwidth Profile is defined with dominantSpeakerPriority: 'high'. Imagine that in that Room all video tracks are set to their default priority (i.e. standard). In that case, when Participant Alice becomes dominant speaker, automatically all Alice’s video tracks will become high priority. If later Bob becomes dominant speaker, automatically Alice’s video tracks will go back to standard, as she isn’t dominant any longer, and Bob’s video tracks will be upgraded to high.

Dominant speaker priority makes it possible to change dynamically the priority of the dominant speaker's video tracks.

Like the rest of the parameters in the Network Bandwidth Profile API, dominantSpeakerPriority works per-Participant. This means that the priority upgrades only happen on the subscriptions of the specific participant for which that Network Bandwidth Profile has been defined. This also means that the dominant speaker is relative to those subscriptions. Let’s illustrate that with an example. Imagine the following Room:

  • Participant Alice subscribes to Bob’s and Carl’s tracks
  • Participant Bob subscribes to all the Room tracks.
  • Participant Carl subscribes only to Bob’s and to Dave’s tracks.
  • Participant Dave subscribes only to Alice’s tracks.

Imagine also that, at a given time, the speaking activity of the Participants goes in this order:

  • Alice (with highest speaking activity), Dave, Bob and Carl (with lowest speaking activity).

As the dominant speaker is the Participant with highest speaking activity among all subscriptions, then the following will hold for that specific instant:

  • For Alice: the dominant speaker is Bob.
  • For Bob: the dominant speaker is Alice.
  • For Carl: the dominant speaker is Dave.
  • For Dave: the dominant speaker is Alice.
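
The example above can be reproduced with a short sketch. The data structures and the function dominantSpeakerFor are hypothetical and exist only to illustrate that the dominant speaker is computed relative to each Participant's subscriptions.

```javascript
// Who each Participant subscribes to, as in the example Room above.
const subscriptions = {
  Alice: ['Bob', 'Carl'],
  Bob: ['Alice', 'Carl', 'Dave'], // all the Room tracks except his own
  Carl: ['Bob', 'Dave'],
  Dave: ['Alice']
};

// Participants ordered from highest to lowest speaking activity.
const bySpeakingActivity = ['Alice', 'Dave', 'Bob', 'Carl'];

// The dominant speaker for a subscriber is the most active Participant
// among those the subscriber is subscribed to.
function dominantSpeakerFor(subscriber) {
  const subscribed = new Set(subscriptions[subscriber]);
  return bySpeakingActivity.find(name => subscribed.has(name)) || null;
}

dominantSpeakerFor('Alice'); // 'Bob'
dominantSpeakerFor('Bob');   // 'Alice'
dominantSpeakerFor('Carl');  // 'Dave'
dominantSpeakerFor('Dave');  // 'Alice'
```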

In addition, please notice the following:

  • This property indicates the minimum priority of the dominant speaker video tracks. In other words, dominantSpeakerPriority can only upgrade the priority of the dominant speaker but never downgrade it. For example, if a developer sets dominantSpeakerPriority: 'standard' and the dominant speaker has a video track published with priority high, that video track will stay as high.
  • By default Network Bandwidth Profiles have dominantSpeakerPriority: 'standard'.
  • If dominant speaker detection has not been activated using the Dominant Speaker Detection API, then setting dominantSpeakerPriority will have no effect.
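
The "minimum priority" rule above amounts to taking the higher of two priorities. The following sketch illustrates it; the function effectivePriority and its parameters are hypothetical and are not part of the SDK.

```javascript
// Priorities ordered from lowest to highest.
const PRIORITY_LEVELS = ['low', 'standard', 'high'];

// Returns the priority that applies to a video track for one subscriber.
// dominantSpeakerPriority can upgrade the published priority, never downgrade it.
function effectivePriority(publishedPriority, isDominantSpeaker,
                           dominantSpeakerPriority = 'standard') {
  if (!isDominantSpeaker) return publishedPriority;
  const rank = Math.max(
    PRIORITY_LEVELS.indexOf(publishedPriority),
    PRIORITY_LEVELS.indexOf(dominantSpeakerPriority)
  );
  return PRIORITY_LEVELS[rank];
}

effectivePriority('standard', true, 'high');  // 'high': upgraded
effectivePriority('high', true, 'standard');  // 'high': never downgraded
effectivePriority('low', false, 'high');      // 'low': not the dominant speaker
```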

Understanding maxTracks

This parameter may be useful in large Group Rooms. For example, in a Group Room with 40 Participants it is not particularly useful to render 39 video tracks on each Participant’s UI. Doing so unnecessarily increases the consumed bandwidth and battery. Note that in a typical scenario the 40 Participants will not be talking at the same time. Instead, experience shows that in any time interval lasting less than 5 seconds there are rarely more than 4 speakers. Hence, in most use-cases, a conversation can be followed by keeping on screen just the 4 Participants with highest speaking activity.

When setting maxTracks to N, Twilio guarantees that at any given time no more than N video tracks will be on. Hence, Twilio keeps only the N most relevant tracks for you based on the following:

  • Higher priority tracks are preferred.
  • Within a priority, tracks having higher speaking activity are preferred. This is sometimes called an N-Loudest policy. Note that N-Loudest (i.e. the N participants with the highest speaking activity at a given time) is not the same as the policy called Last-N (i.e. the last N dominant speakers). Twilio Network Bandwidth Profiles do not support Last-N for maxTracks.

Just for illustration, if you have a videoconferencing application with:

  • A screen share track with priority high.
  • 40 webcam tracks with priority standard.
  • You now set maxTracks: 5.

Then, the algorithm will only allow the following 5 tracks to be on:

  • The screen share track.
  • The 4 webcam tracks corresponding to the 4-loudest participants (i.e. the 4 participants with the highest speaking activity at that time).

Note also that maxTracks: 0 should be read as “unlimited”, meaning that Twilio will try to send all the subscribed tracks to the Participant. This is the default value.
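
The selection policy described in this section (priority first, then N-Loudest) can be sketched as follows. The function selectVisibleTracks and the numeric speakingActivity field are hypothetical, and the sketch only illustrates the documented policy rather than Twilio's implementation.

```javascript
// Hypothetical ranking of the three documented priorities.
const MAX_TRACKS_RANK = { low: 0, standard: 1, high: 2 };

// Keeps at most maxTracks video tracks: higher priority first and, within a
// priority, higher speaking activity first. maxTracks === 0 means unlimited.
function selectVisibleTracks(tracks, maxTracks) {
  const ranked = [...tracks].sort((a, b) =>
    MAX_TRACKS_RANK[b.priority] - MAX_TRACKS_RANK[a.priority] ||
    b.speakingActivity - a.speakingActivity
  );
  return maxTracks === 0 ? ranked : ranked.slice(0, maxTracks);
}

// One high priority screen share plus 40 standard priority webcams, where
// cam-40 is (arbitrarily) the loudest and cam-1 the quietest.
const roomTracks = [
  { name: 'screen', priority: 'high', speakingActivity: 0 },
  ...Array.from({ length: 40 }, (_, i) => ({
    name: `cam-${i + 1}`,
    priority: 'standard',
    speakingActivity: i + 1
  }))
];

const visible = selectVisibleTracks(roomTracks, 5).map(track => track.name);
// visible is ['screen', 'cam-40', 'cam-39', 'cam-38', 'cam-37']
```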

Understanding renderDimensions

Video bandwidth depends on resolution. For example, if a track is to be rendered in Full HD (1920x1080) it may make sense to reserve 4 Mbps for it. However, for a thumbnail of CIF (352x288) size that amount would not be justified.

Using renderDimensions developers can specify the display size, in pixels, they plan to use to render video tracks on a Participant’s UI. Thanks to this, Twilio can determine the maximum bandwidth that should be allocated to each video track. This means that, even if there is bandwidth to spare, our algorithms will never allocate more than what is needed to render the track at the specified size. This has multiple positive side effects:

  • Network resources are preserved and used only when it makes sense.
  • Bandwidth is only allocated to tracks that really need it.
  • Battery life is enhanced.

In the Network Bandwidth Profile API, renderDimensions are specified per priority:

high: Specifies the render width and height (in pixels) of high priority tracks. Defaults to HD resolution:

high: {
  width: 1280,
  height: 720
}

standard: Specifies the render width and height (in pixels) of standard priority tracks. Defaults to VGA resolution:

standard: {
  width: 640,
  height: 480
}

low: Specifies the render width and height (in pixels) of low priority tracks. Defaults to QCIF resolution:

low: {
  width: 176,
  height: 144
}

It is important to remark that, from the practical perspective, our algorithms use renderDimensions to compute the maximum video Track bandwidth. This has multiple implications you should be aware of:

  • The renderDimensions must be understood as a hint about the desired rendering dimensions of video Tracks. That means that renderDimensions don’t need to be exact. In other words, you don’t need to invest much development time figuring out how to update your renderDimensions every time the display size changes. Concentrate instead on providing approximate hints that keep the actual proportions of your UI. That means that if a video track A is to be rendered at double the size of video track B, then A’s priority should have double the renderDimensions of B’s.
  • The renderDimensions don’t need to match with the actual video track dimensions. The video track dimensions will depend on aspects such as your capture constraints and the status of the network. Remember that renderDimensions are just a hint that you provide to help the bandwidth allocation algorithms and not a specification on actual video width and height.
  • The Network Bandwidth Profile API implicitly assumes that all video tracks within a given priority are to be rendered using the same size. Experience shows that in most common use-cases this assumption is true. However, if you have a use-case where you really need different dimensions our recommendation is to set renderDimensions so that the resulting area (width x height) is the average of all the render areas of the tracks on that priority.
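
The averaging recommendation in the last bullet can be sketched as follows. The function averageRenderDimensions is hypothetical, and the choice of a fixed 4:3 aspect ratio for the result is an assumption made for this example; any width and height with roughly the same area would serve as the hint.

```javascript
// Computes renderDimensions whose area (width x height) approximates the
// average render area of all tracks sharing one priority.
function averageRenderDimensions(sizes, aspectRatio = 4 / 3) {
  const meanArea =
    sizes.reduce((sum, s) => sum + s.width * s.height, 0) / sizes.length;
  const height = Math.round(Math.sqrt(meanArea / aspectRatio));
  return { width: Math.round(height * aspectRatio), height };
}

// Two standard priority tracks rendered at different sizes in the UI.
const standardHint = averageRenderDimensions([
  { width: 640, height: 480 },
  { width: 320, height: 240 }
]);
// standardHint could be used as renderDimensions.standard.
```

Since renderDimensions are only a hint, rounding errors in the result are harmless.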

Video Codecs and the Network Bandwidth Profile

As stated above, collaboration and presentation modes are only recommended when scalable video codecs such as VP8 Simulcast are used. Using them together with plain VP8 and H.264 will sometimes generate undesired behaviors caused by the interdependencies that appear among the different track subscribers. This may lead to problems like the following:

  • The subscribed bitrate of a non-scalable track may be significantly below the available bandwidth of that subscriber, even if the publisher has a high speed network. For example, you may have a publisher and a subscriber connected to a gigabit cable network but have them exchanging low quality video at 50 kbps due to a third Participant connecting from a poor network.
  • When Participants publish multiple non-scalable video tracks, the bandwidth allocation for all of them will drop to that of the lowest priority track. This may generate priority inversions in the allocation. For example, if a Participant is publishing a VP8 webcam with priority low and a VP8 screen-share with priority high, from the perspective of the bandwidth allocation, the screen-share will behave as if it had priority low. Notice that switch-offs will still preserve priority order, though.

The Network Bandwidth Profile API vs. the Track Subscription API

The Track Subscription API and the Network Bandwidth Profile API both deal with what happens in the downstream link between Twilio’s SFU and Twilio’s SDKs. Other than that, they have different (but complementary) functions and are orthogonal, meaning that they should not have side effects on each other. You can work safely with them as long as you understand the following:

  • The Track Subscription API determines which specific video tracks a given participant is subscribed to.
  • The Network Bandwidth Profile API determines how the bandwidth is allocated among such subscribed video tracks.

Hence, with the Track Subscription API you can decide WHICH tracks each participant receives while with the Network Bandwidth Profile API you can decide HOW such tracks behave in terms of quality and bandwidth allocation.

Luis Lopez