The purpose of this section is to provide an introduction to digital TV transmission issues for those of you who are interested. None of the material in here is really needed for an MHP developer, or even for someone developing MHP middleware. If that kind of material is what you're after, take a look at the introduction to MPEG and digital TV systems.
Instead, this is an introduction to how data gets from the camera or tape to the viewer's screen. This is mainly about broadcast engineering, and the elements of the system that software developers don't usually see. This is by no means complete, and is nothing more than a sketch to show you some of what goes on in the rest of the system.
A typical digital TV transmission setup looks something like this:
This equipment is normally all connected together using high-speed connections like SDI (Serial Digital Interface) or ASI (Asynchronous Serial Interface), which are standard in the TV field. In addition to this, all of the equipment will be connected via Ethernet to a control system and monitoring equipment to make sure that nothing goes wrong (or that if something does go wrong, the viewer doesn't see it). There will normally be several of some of these components, including some redundant spares in the event of problems. A typical head-end will contain many MPEG encoders and multiplexers, for instance. Now that we've seen how it's put together, let's examine each of these components in more detail.
The encoder is used to take an analog signal and convert it to MPEG-2. This is more commonly used in live shows - for other shows, we may have a selection of pre-encoded MPEG streams that we can play out from a dedicated playout system. This playout system is usually a highly customized PC or workstation with a large high-speed disk array and a number of digital interfaces for transmitting the data to the rest of the transmission system.
An encoder can generate two types of MPEG stream. Constant bit-rate streams always have the same bit-rate, no matter what the complexity of the scene they contain. If the signal is too complex to be coded at the specified bit-rate, the quality of the encoding will be reduced. If the scene takes less data to code than the specified bit-rate, it will be stuffed with null packets until the correct bit-rate is reached. This makes later stages of the processing easier, because a bit-rate that never changes is easy to predict, but it does waste bandwidth.
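As a rough illustration of the stuffing described above, here is a minimal Python sketch. It assumes the stream is already packetized into 188-byte transport packets and that a fixed number of packets must be sent per time interval. The function names and the interval model are hypothetical and chosen for illustration only; the null packet layout (sync byte 0x47, PID 0x1FFF) follows the MPEG-2 transport stream format.

```python
TS_PACKET_SIZE = 188
NULL_PID = 0x1FFF  # PID reserved for null (stuffing) packets


def null_packet() -> bytes:
    """Build one null transport packet: 4-byte header plus filler payload."""
    header = bytes([
        0x47,                    # sync byte
        (NULL_PID >> 8) & 0x1F,  # top 5 bits of the PID, flag bits all zero
        NULL_PID & 0xFF,         # low 8 bits of the PID
        0x10,                    # not scrambled, payload only, counter 0
    ])
    return header + bytes([0xFF] * (TS_PACKET_SIZE - len(header)))


def stuff_to_constant_rate(packets, packets_per_interval):
    """Pad one interval's worth of packets with null packets (hypothetical helper)."""
    if len(packets) > packets_per_interval:
        # A real encoder would lower the picture quality instead of failing.
        raise ValueError("scene needs more than the configured bit-rate")
    padding = packets_per_interval - len(packets)
    return list(packets) + [null_packet() for _ in range(padding)]


# Three packets of real data in an interval that must carry ten packets.
real_packets = [bytes([0x47, 0x01, 0x00, 0x10]) + bytes(184) for _ in range(3)]
interval = stuff_to_constant_rate(real_packets, 10)
```

Seven of the ten packets in this interval carry nothing useful, which is the wasted bandwidth that the text mentions.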
Most encoders can now produce variable bit-rate MPEG streams as well. In this case, the bit-rate of the stream can be adjusted dynamically, as more or less bandwidth is needed to encode the images with a given picture quality. Since some scenes take significantly more bandwidth to encode than others, this lets the picture quality be maintained throughout a show while the bandwidth changes. The fact that the bit-rate of the stream can change doesn't mean that it will reach higher levels than a constant bit-rate encoding of the same stream, of course: the operator can usually set the maximum bit-rate that the encoder can use, and the encoder will reduce the quality of the encoded output if necessary to stay within it.
Most broadcasters today use variable bit-rate encoding because it offers better quality while using lower bandwidth. In particular, variable bit-rate encoding lets us make maximum use of the available bandwidth at the multiplexing stage.
One MPEG stream on its own isn't much use to us as a TV broadcast. Even several MPEG streams aren't terribly useful, because we have no way of associating them with each other. What we really need is a single stream containing all the MPEG streams needed for a single service, or ideally multiple services. A transport stream, in other words.
The multiplexer takes one or more MPEG streams and converts them into a single transport stream. The input streams may be individual elementary streams, transport streams or even raw MPEG data - most multiplexers can handle a range of input types.
The multiplexer actually does a number of jobs - multiplexing the data is one of the more complex of these, for a variety of reasons. Each transport stream typically has a fixed bandwidth available to it, which depends on the transmission medium and the way the transmission network is set up. One of the jobs of the multiplexer is to fit a set of services into this bandwidth. The easy way of doing this is to use constant bit-rate MPEG streams, because then the operator knows exactly how much bandwidth each stream will take, and setting up the multiplexer is easy. This gets pretty inefficient, though, since some streams may be using less than their share of the bandwidth, while others may need to reduce the picture quality in order to fit in their allocated share. This wasted space is a real problem, since the transmission costs are high enough (especially in a satellite environment) that you want to make maximum use of your bandwidth.
The way round this is to use variable bit-rate MPEG streams and a technique known as statistical multiplexing. This system takes advantage of the statistical properties of the multiplexed stream when compared to the properties of the several independent streams. While the bit-rate of each individual stream can vary considerably, these variations are smoothed out when we consider ten or fifteen streams (video plus audio for five to seven services) multiplexed together. Each stream will have different bit-rate needs at each point in time, and these differences will partially cancel one another out at any given time. Some streams will need a higher bit-rate than average at that time, but others will probably need less than average. This makes the bit-rate problems easier to handle, since they are now less severe. By maintaining a separate buffer model for each stream, the multiplexer can decide how to order packets in the most efficient way, while making sure that there are no glitches in any of the services.
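The smoothing effect can be seen in a small simulation. The sketch below is only an illustration under invented assumptions: each stream's bit-rate demand is drawn independently and uniformly between 2 and 8 Mbit/s in every time slot, which is far simpler than real video. It compares the bandwidth needed if every stream is given its own peak rate with the peak of the combined demand.

```python
import random


def simulate_demand(num_streams=10, num_slots=1000, low=2.0, high=8.0, seed=1):
    """Return per-stream bit-rate demand in Mbit/s for each time slot (toy model)."""
    rng = random.Random(seed)
    return [[rng.uniform(low, high) for _ in range(num_slots)]
            for _ in range(num_streams)]


def peak_comparison(demand):
    """Compare reserving each stream's own peak with the peak of the total."""
    sum_of_peaks = sum(max(stream) for stream in demand)
    peak_of_sum = max(sum(slot) for slot in zip(*demand))
    return sum_of_peaks, peak_of_sum


sum_of_peaks, peak_of_sum = peak_comparison(simulate_demand())
print(f"bandwidth if every stream gets its own peak: {sum_of_peaks:.1f} Mbit/s")
print(f"highest combined demand in any one slot:     {peak_of_sum:.1f} Mbit/s")
```

The combined peak can never exceed the sum of the individual peaks, and with independent streams it is usually well below it. That gap is the bandwidth a statistical multiplexer can reclaim.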
At some points, the streams being multiplexed may have a bit-rate that is higher than the available bandwidth. A statistical multiplexer will use another one of the statistical features of MPEG streams to handle this situation. Since most MPEG streams only reach their peak bandwidths at fairly wide intervals for fairly short periods, delaying one or more of the streams will move the peak to a point where the bandwidth is available to accommodate it. This is another reason to maintain a buffer model for each stream - to ensure that these peaks are not moved to a point where they would cause a glitch in the service.
In some older statistical multiplexing systems, the multiplexer and encoders are connected and can communicate with one another. In particular, the multiplexer can provide feedback to the encoders and set the bit-rate that they encode their streams at. The feedback from the multiplexer means that if one stream needs more bandwidth than it's currently getting, the bandwidth for that stream can be increased temporarily at the expense of the others. This doesn't use true variable bit-rate encoding, since in many cases the streams are actually constant bit-rate streams, where the bit-rate used to encode them changes from time to time.
Despite appearances, this system is less flexible than true statistical multiplexing, because if the total bit-rate of the streams is higher than the available bandwidth, then the quality of one of the streams must be reduced. This isn't necessary in the case of the latest generation of statistical multiplexers, where these peaks can often be moved slightly to accommodate them. The other place where flexibility is lost is in the need for a connection between the encoder and the multiplexer. In practical terms, this means that the multiplexer and encoder have to be on the same site, or at least that the encoder feeds only one multiplexer at a time. In these days of remote processing, that can cause problems. Without this need, a network operator can handle streams where they have no control over the encoder, such as streams from remote sites, from other networks or from a playout system. This offers some big advantages in terms of bandwidth saving.
Since we may not want to give our content away for free, we need some way of encrypting our services. This is handled by the conditional access (or CA) system. The algorithm that's used for this is proprietary to each CA vendor, although there are some open (but not publicly-known) algorithms such as the DVB Common Scrambling Algorithm. Manufacturers are understandably nervous about disclosing the algorithms they use, because the costs of having the algorithm cracked are huge - in some European markets, as much as 30% of subscribers were believed to be using hacked smart cards at one point. Even the DVB Common Scrambling Algorithm requires STB manufacturers to sign a non-disclosure agreement before they can use it.
In a DVB system, scrambling can work at either the level of the entire transport stream, or on the level of individual elementary streams. There's no provision for scrambling a service in its own right, but the same effect is achieved by scrambling all of the elementary streams in a service. In the case of scrambled elementary streams, not all of the data is actually scrambled - the packet headers are left unscrambled so that the decoder can work out their contents and handle them correctly. In the case of transport stream scrambling, only the headers of the transport packets are left unencrypted - everything else is scrambled.
As well as encrypting the data that's supposed to be encrypted, the CA system adds two types of data to the stream. These are known as CA messages, and consist of Entitlement Control Messages (ECM) and Entitlement Management Messages (EMM). Together, these control the ability of individual users (or groups of users) to watch scrambled content. The scrambling (and descrambling) process relies on three pieces of information: the control word, the service key and the user key.
The control word is encrypted using the service key, providing the first level of scrambling. This service key may be common to a group of users, and typically each encrypted service will have one service key. This encrypted control word is broadcast in an ECM approximately once every two seconds, and is what the decoder actually needs to descramble a service.
Next, we have to make sure that authorized users (i.e. those who have paid) can decrypt the control word, but that only authorized users can decrypt it. To do this, the service key is itself encrypted using the user key. Each user key is unique to a single user, and so the service key must be encrypted with the user key for each user that is authorized to view the content. Once we've encrypted the service key, it is broadcast as part of an EMM. Since there is a lot more information to be broadcast (the encrypted service key must be broadcast for each user), these are broadcast less frequently - each EMM is broadcast approximately every ten seconds.
One thing to note is that the encryption algorithms used may not be symmetrical. To make things easier to understand we're assuming that the same key is used for encryption and decryption in the case of the service and user keys, but this may not be the case.
When the receiver gets a CA message, it's passed to the CA system. In the case of an EMM, the receiver will check whether the EMM is intended for that receiver (usually by checking the CA serial number or smart card number), and if it is, it will use its copy of the user key to decrypt the service key.
The service key is then used to decrypt any ECMs that are received for that service and recover the control word. Once the receiver has the correct control word, it can use this to initialize the descrambling hardware and actually descramble the content.
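The key hierarchy can be sketched in a few lines of Python. This is purely illustrative: real CA algorithms are proprietary and not public, so the sketch substitutes a toy XOR "cipher" that offers no security at all. The key values, the `toy_cipher` function and the message layout are all invented for the example, and (as in the text) the same key is assumed for encryption and decryption.

```python
import hashlib


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with a key-derived byte stream. A stand-in only, NOT real crypto."""
    stream = hashlib.sha256(key).digest()
    return bytes(b ^ stream[i % len(stream)] for i, b in enumerate(data))


# Head-end side (all values are made up for this example).
control_word = bytes.fromhex("0123456789abcdef")  # scrambles the content
service_key = b"service-key-channel-1"            # one per encrypted service
user_key = b"user-key-card-0001"                  # unique to one subscriber

ecm = toy_cipher(service_key, control_word)  # broadcast roughly every 2 seconds
emm = toy_cipher(user_key, service_key)      # broadcast roughly every 10 seconds


def recover_control_word(user_key: bytes, emm: bytes, ecm: bytes) -> bytes:
    """Receiver side: EMM gives the service key, ECM gives the control word."""
    recovered_service_key = toy_cipher(user_key, emm)
    return toy_cipher(recovered_service_key, ecm)
```

A receiver holding the right user key gets the control word back, while any other user key yields a wrong service key and therefore a useless control word.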
While not all CA systems use the same algorithms (and it's impossible to know, because technical details of the CA algorithms aren't made public), they all work in basically the same way. There may be some differences, and the EMMs may for instance be used for other CA-related tasks besides decrypting service keys, such as controlling the pairing of a smart card and an STB so that the smart card will work correctly in that receiver.
In order to generate the EMMs correctly, the CA system needs to know some information about which subscribers are entitled to watch which shows. The Subscriber Management System, or SMS, is used to set which channels (or shows) an individual subscriber can watch. This is typically a large database of all the subscribers that is connected to the billing system and to the CA system, and is used to control the CA system and decide which entitlements should be generated for which users. The SMS and CA system are usually part of the same package from the CA vendor, and are tied together pretty closely.
The ECMs and EMMs are broadcast as part of the service (see the introduction to MPEG if you're unclear on the concept of a service). The PIDs for the CA data are listed in the Conditional Access Table (CAT), and different PIDs can be used for ECMs and EMMs. This makes it easier for remultiplexing, where some of the CA data (the ECMs) may be kept, while other data (the EMMs) may be replaced.
While NDS and Nagravision are the two most common CA systems out there, other CA systems are provided by Conax, Irdeto Access, Philips (the CryptoWorks system), and France Telecom (the Viaccess system), for example. There are other CA systems from companies like Motorola and GI, but these are not often used in DVB systems. DVB systems can offer pluggable encryption modules using the DVB common interface (CI), which uses a PCMCIA card to contain the encryption hardware and software. This means that the user can switch encryption systems (for instance, if they change their cable company) without having to replace the entire STB. This is a big advantage for open standards, and really enables the move from a vertical market to a horizontal one.
Some companies (NDS for instance) are not convinced of the security of the DVB CI system, and so not all CA systems are available as CI modules. You have to remember, though, that these interfaces aren't necessary for a system to be DVB-compliant. They're a useful feature, but not required.
ATSC uses a similar, though slightly more secure, mechanism called the POD (Point Of Deployment) module, known as CableCARD in OpenCable systems. These are more widely deployed in US markets, and all OCAP receivers will include a CableCARD slot.
Before we can transmit our signal we need to make sure that it will be received correctly. This means some way of identifying and correcting errors in the stream. To do this we add some extra error correction data to the MPEG packets, in order to allow us to correct errors. The most common requirement in DTV systems is for an MPEG stream to be quasi-error free (QEF), which means a bit error rate of approximately 1x10^-10 to 1x10^-11. At 1x10^-11, that is about one erroneous bit per hour of video for a 30 Mbit/s stream. Since we have to be able to correct the errors in real-time, without asking for the data to be sent again, the process is called Forward Error Correction (FEC).
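The link between a bit error rate and how often errors actually show up is simple arithmetic, sketched below in Python. The function name is made up for the example, and the calculation assumes errors are spread evenly over time.

```python
def bit_errors_per_hour(bit_rate_bps: float, bit_error_rate: float) -> float:
    """Expected number of erroneous bits in one hour of a stream."""
    bits_per_hour = bit_rate_bps * 3600
    return bits_per_hour * bit_error_rate


# A 30 Mbit/s stream carries 1.08e11 bits per hour.
at_1e_minus_11 = bit_errors_per_hour(30e6, 1e-11)  # about 1 error per hour
at_1e_minus_10 = bit_errors_per_hour(30e6, 1e-10)  # about 11 errors per hour
```

So for a 30 Mbit/s stream, roughly one erroneous bit per hour corresponds to a bit error rate close to 1x10^-11, and a rate of 1x10^-10 gives around ten times as many.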
Different transmission mechanisms (cable, satellite or terrestrial) all have different characteristics including different noise levels. A satellite signal for instance can have a lot of errors introduced by conditions in the atmosphere. A terrestrial signal may have errors introduced by reflections from buildings, or by the receiving aerial not being aligned correctly. These different conditions mean that very efficient error correction mechanisms are needed. DVB and ATSC systems all use Reed-Solomon encoding to add a first layer of protection. This adds a number of parity bytes to each packet. Typically, 16 parity bytes are added to a 188-byte packet, which means that up to 8 erroneous bytes in the packet can be corrected. Larger errors can be detected but not corrected.
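The relationship between parity bytes and correction capability follows the general Reed-Solomon rule that every two parity bytes allow one erroneous byte to be corrected. A small Python sketch (the function name is hypothetical) makes the numbers for the 188-byte case explicit:

```python
def rs_capability(codeword_len: int, data_len: int):
    """Return (parity bytes, correctable byte errors, overhead) for an RS code."""
    parity = codeword_len - data_len
    correctable = parity // 2      # two parity bytes per correctable byte
    overhead = parity / data_len   # extra bytes sent per byte of payload
    return parity, correctable, overhead


# A 188-byte packet plus 16 parity bytes gives a 204-byte codeword.
parity, correctable, overhead = rs_capability(204, 188)
print(f"{parity} parity bytes, corrects up to {correctable} bytes, "
      f"overhead {overhead:.1%}")
```

The price of being able to repair 8 bytes per packet is an overhead of about 8.5% on every packet sent.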
Once this is done, a further layer of error correction coding is added to improve things still further. Common coding mechanisms at this stage are trellis coding and Viterbi coding. These exploit the fact that data is not sent one bit at a time, but is instead sent as 'symbols' that can carry several bits of data. In trellis coding, symbols are grouped together to form 'trellises.' For a group of three symbols, a modulation scheme with eight possible values per symbol (three bits) can represent 8 x 8 x 8 = 512 separate values. By using a subset of these as 'valid' values, the network operator can introduce some extra redundancy into the signal. The effect of this is that each symbol may carry fewer bits of data, but for every group of three symbols, it's possible to correct one erroneous symbol by choosing the value for that symbol that gives a valid trellis. This is the approach used by US digital terrestrial systems. DVB systems use Viterbi coding instead, which is a modification of trellis coding that uses a slightly different algorithm to find the best matching trellis.
To strengthen the error correction, another technique called interleaving may be added. This helps avoid situations where a burst of noise (for example, a lightning strike causing electrical interference) can corrupt data past the point where FEC can fix it. After the data has FEC added, but before it is transmitted, the data is written to a RAM buffer and then read out in a different order. For instance, if we assume that our RAM buffer is a two-dimensional array with ten rows and ten columns, the data may be written to the buffer row by row, starting at row 1 and working down to row 10, then read out column by column, starting at the top of column 10 and working back to column 1. This means that bytes from the same packet (which will share error correction) are spread over a longer transmission period and are less vulnerable to burst noise.
At the receiver, the process is reversed, and the original order of the bytes can be restored. The interleaving scheme described here isn't the only possible one, and other (more memory-efficient) techniques will often be used instead.
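The ten-by-ten example can be written out as a short Python sketch. It implements only the simple block scheme described here (write by rows, read by columns from the last column back to the first), not the more memory-efficient schemes that real systems tend to use. The function names are made up for the example.

```python
def interleave(data: bytes, rows: int = 10, cols: int = 10) -> bytes:
    """Write row by row, then read each column top to bottom, last column first."""
    if len(data) != rows * cols:
        raise ValueError("data must exactly fill the buffer")
    return bytes(data[r * cols + c]
                 for c in range(cols - 1, -1, -1)
                 for r in range(rows))


def deinterleave(data: bytes, rows: int = 10, cols: int = 10) -> bytes:
    """Reverse interleave(): put every byte back in its original position."""
    if len(data) != rows * cols:
        raise ValueError("data must exactly fill the buffer")
    out = bytearray(rows * cols)
    position = 0
    for c in range(cols - 1, -1, -1):
        for r in range(rows):
            out[r * cols + c] = data[position]
            position += 1
    return bytes(out)


original = bytes(range(100))      # byte value equals its original position
sent = interleave(original)
burst = sent[:10]                 # ten consecutive bytes hit by a noise burst
rows_hit = sorted(b // 10 for b in burst)
```

Here `rows_hit` is `[0, 1, ..., 9]`: a burst that destroys ten consecutive transmitted bytes damages only one byte in each row, instead of wiping out a whole row.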
Once we've added error correction, we need to do one more thing before it can be prepared for transmission. If the digital bitstream contains a large run of 1's, then there will be a (small) current flowing in the transmission and reception equipment. This is a Bad Thing, and so some randomization is needed to make sure that there is never a long run of 1's or 0's in the bitstream and to disperse the energy in the signal across all of its bandwidth. To do this, a simple randomizer is used, as shown in the diagram below. The process is symmetrical, so the same hardware is used to de-randomize the signal in the receiver.
Every eight transport packets, the randomizer is reset and its register is loaded with the bit sequence 100101010000000. Of course, the randomizer and the de-randomizer must both reset themselves at the same point in the stream, or the input can't be recreated. This is done using the sync bytes from the transport packets. These are not scrambled, so the start of a packet can always be identified, and at every eighth packet, the value of the sync byte is inverted (from 0x47 to 0xB8). This is the signal for the de-randomizer to reset itself, making sure that both the randomizer and the de-randomizer are synchronized correctly.
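A minimal Python sketch of such a randomizer is shown below. The text here gives only the initial bit sequence. The feedback arrangement (a 15-stage shift register whose output is the XOR of stages 14 and 15, fed back into stage 1) is taken from my reading of EN 300 421, so treat it as an assumption to check against the standard. The sketch also ignores the sync bytes and the eight-packet reset, and simply restarts the sequence on every call.

```python
INIT_SEQUENCE = "100101010000000"  # loaded into stages 1..15 at each reset


def prbs_bytes(count: int) -> bytes:
    """Generate `count` bytes of the pseudo-random sequence, MSB first."""
    register = [int(bit) for bit in INIT_SEQUENCE]  # register[0] is stage 1
    out = bytearray()
    for _ in range(count):
        value = 0
        for _ in range(8):
            bit = register[13] ^ register[14]   # XOR of stages 14 and 15
            register = [bit] + register[:-1]    # shift, feeding the bit back in
            value = (value << 1) | bit
        out.append(value)
    return bytes(out)


def randomize(payload: bytes) -> bytes:
    """XOR the payload with the sequence; the same call de-randomizes it."""
    return bytes(p ^ k for p, k in zip(payload, prbs_bytes(len(payload))))


run_of_zeros = bytes(16)
scrambled = randomize(run_of_zeros)   # no longer one long run of zeros
restored = randomize(scrambled)       # applying it twice restores the input
```

Because the operation is a plain XOR with a known sequence, running the same function a second time recovers the original data, which is why the receiver can use identical hardware.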
By doing this, we make sure that the energy is dispersed across the signal spectrum. While it's not strictly necessary, DVB does require that the transmitter randomizes the stream before transmission and that the receiver reverses the process. The randomizer and its inputs are standardized by DVB in the standards for satellite (EN 300 421), cable (EN 300 429) and terrestrial transmission (EN 300 744). ATSC defines its own randomizer, which can be seen in the ATSC digital TV specification (ATSC A/53c).
Now we have a digital stream that is almost ready for broadcast. However, we can't directly broadcast digital data - first we have to modulate it - convert it to an analog signal so that we can broadcast it using radio signals or electrical voltages in a cable.
As we've already seen, each of the different transmission mechanisms has different characteristics, and different strengths and limitations. So, each type of signal uses a different modulation scheme. The modulation scheme is just the way of converting digital information into an analog signal so that it can be transmitted. I'm not going to examine these in too much detail, because it's really not interesting to us as MHP developers. The table below describes which modulation scheme is used by each of the transmission mechanisms in a DVB environment.
Cable and satellite use a similar modulation scheme (it's actually the same scheme, with different parameters). The main difference is that satellite signals are more prone to errors and so use a less efficient way of sending the data that provides a bigger difference between symbols, making correct demodulation easier. Terrestrial broadcasts use a different scheme in order to provide a much stronger resistance to errors caused by reflected signals.
The USA uses several different schemes for modulation, as shown in the table below. While the modulation schemes in use are relatively straightforward, the situation for other parts of the system is messy at best. Some satellite providers follow the DVB standards, while other satellite networks and many cable networks use proprietary systems such as DigiCipher II (both cable and satellite) and DSS (satellite only). In all cases they appear to use the same modulation scheme, but details of error correction and other parameters may be different. Cool.stf's North American MPEG-2 information page has more details about the modulation in DigiCipher II and DSS on satellite systems, as well as details about the use of DVB standards in the US.
The modulation is carried out by a device called, surprise, surprise, a modulator. This takes the digital transport stream as an input, and produces an analog output that can be passed onto the transmission equipment. The modulator is the last stage in the process that takes a digital input - after this, everything is analog and we're into the world of radio engineering.
Typically, signals are modulated to a lower frequency than they are broadcast at. Since the broadcast frequencies can be very high (up to 30GHz in the case of satellite transmissions, and up to 950MHz for cable signals), modulating the signals at these frequencies can be hard. So, what happens instead is that the signals are modulated at a lower frequency, which is then converted to a higher frequency before transmission. This is done using an upconverter. Basically, this does nothing except convert the signal from one frequency to another, much higher, frequency. In this case, that other frequency is the one used by the network that you're broadcasting on. Each transport stream will be broadcast on a different frequency, and so the upconverter will have different settings for each transport stream that it handles.
If you want to know more details, "Digital Television - MPEG-1, MPEG-2 and principles of the DVB systems" by Herve Benoit is a good but expensive read. A cheaper alternative is to read Agilent Application Note 1298, "Digital Modulation in Communication Systems - An Introduction", which will provide you with all the detail you need.
Once you have a modulated signal, the signal is ready for transmission. All you need then is a transmitter, an antenna (in the case of terrestrial or satellite) or a cable network, and an audience...