What Is a Color Space?

A color space allows people and software to communicate colors unambiguously using a numeric representation.

A triplet of code values such as [0.506, 0.266, 0.266] by itself is not enough to specify a color. Those code values must be interpreted with respect to a particular color space. The color represented by those three numbers will be different in different color spaces.
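For instance, here is a minimal sketch (in Python) of just one aspect of that ambiguity: decoding the same triplet with the standard sRGB transfer function versus a pure 2.2 gamma curve yields different linear intensities, and the primaries and white point of each space would shift the result further.

    def srgb_decode(v):
        # Standard sRGB electro-optical decoding (piecewise curve).
        return v / 12.92 if v <= 0.04045 else ((v + 0.055) / 1.055) ** 2.4

    def gamma22_decode(v):
        # A plain power-law "gamma 2.2" curve.
        return v ** 2.2

    code_values = [0.506, 0.266, 0.266]
    print([round(srgb_decode(v), 4) for v in code_values])     # [0.2196, 0.0575, 0.0575]
    print([round(gamma22_decode(v), 4) for v in code_values])  # [0.2234, 0.0543, 0.0543]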

There are two categories of color space:

Device-independent Color Spaces

To fully match a triplet of code values to a specific color, a device-independent color space must define the following characteristics:

  • The meaning of the three primary values in terms of CIE colorimetry.
  • One or more data types and encodings.
  • The image state.
  • The associated viewing conditions.

Examples of device-independent color spaces include ACES and the ICC Profile Connection Space.

Primary Values

The primary values can be thought of as the coordinate axes used to define a color "point" in a color space. Device-independent color spaces define their primaries with reference to CIE colorimetry values — in that respect, CIE colorimetry provides a kind of universal reference frame or "world" coordinate system for converting between color spaces (see the sketch after the list below). In a given viewing environment, two colors with the same colorimetry will look the same to a typical human observer.

Some examples of different systems of primaries include:

  • The coordinates of red, green, and blue specified by ITU-R BT.709 (also known as "Rec. 709") for HD video. These primaries are also used for sRGB (which uses a different transfer curve, or "gamma").
  • The coordinates specified by ITU-R BT.601 (Rec. 601) for SD video.
  • The "P3" primaries specified by DCI and SMPTE for digital cinema projectors.

Data Type & Encoding

To interpret the numeric code values, it is necessary to know the data type of the numbers, for example, whether they are meant to be 8-bit, 10-bit, 12-bit, or 16-bit integers, or floating-point values. In addition, it's necessary to know the values' encoding, that is, whether the code values represent intensities on a linear scale or a logarithmic scale, and whether gamma has been applied.
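As a minimal sketch of why both pieces of information matter, the function below interprets a 10-bit integer code value that carries a simple power-law gamma encoding (the bit depth and gamma here are illustrative assumptions, not a particular standard):

    def decode_10bit_gamma(code, gamma=2.2):
        # Normalize the 10-bit integer to [0.0, 1.0], then remove the
        # gamma encoding to recover a linear intensity.
        normalized = code / 1023.0
        return normalized ** gamma

    print(decode_10bit_gamma(518))  # ~0.224: meaningful only once type and encoding are known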

Image State

The notion of "image state" is a standard framework (ISO 22028-1) for grouping color spaces that share similar characteristics and require similar processing. There are three main image states.

  • Scene-referred images are high-dynamic-range images. They use code values that are proportional to the luminance or radiance in the scene, whether that is a live set or a virtual scene in a 3D application like Maya. No tone-mapping has been performed, and code values greater than 1.0 are allowed. If the code values are encoded on a linear scale, then the images are called scene-linear. Most OpenEXR files are scene-linear.
  • Output-referred images are normal-dynamic-range images. They have been tone-mapped, for example, using an S-shaped curve to compress super-whites and to increase contrast to compensate for viewing conditions. The maximum code value is 1.0 (after normalization, in the case of integers), and the values are not proportional to luminance in the original scene. Output-referred images are theoretically ready for display. However, this does not necessarily mean that they are ready for display on a specific device — for example, they may have been tone-mapped but still require a specific gamma for display on a particular monitor. Examples of output-referred encodings include sRGB, HD video, digital cinema (DCI), and so on. Output-referred images are also called "display-referred".
  • Intermediate-referred images are somewhere between scene-referred and output-referred. They have had some color processing performed, so the code values are not proportional to scene luminance, but they are still not ready for display. Examples of intermediate-referred images include log encodings like Cineon-style film scans, Academy Density Exchange (ADX), some digital cinema camera outputs, and so on.

There is some confusion about how to convert between image states in the context of a "linear workflow" for CG rendering and compositing. Much of the confusion comes from the word "linear" — there are actually two different kinds of linear encoding: scene-referred and output-referred. It is extremely important to understand the difference between scene-linear images and output-linear images (also called "linearized output-referred").

In both cases, the encoding is proportional to luminance — in other words, no gamma encoding has been applied. However, in scene-linear images the values are proportional to the luminance of the scene, while in output-linear images the values are proportional to the luminance of the display. The mathematics used to render computer graphics assume a linear color space, and this almost always means a scene-linear color space rather than an output-linear one.

To prepare a scene-linear image for display, you need to do more than simply apply a gamma encoding. Because the scene-linear image has a high dynamic range and will be viewed on a device with a limited dynamic range in a different viewing environment, you need to apply a tone map before the gamma encoding to produce an image that looks correct.

Conversely, to convert a video image to scene-linear, it is not sufficient to simply remove the gamma encoding. You also need to apply an inverse tone map to restore the luminance values of the original scene.
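The following sketch shows both directions of that conversion. A simple Reinhard-style curve stands in for a real tone map, which in practice is production- or standard-specific; the curve and the 2.2 gamma here are illustrative assumptions. The point is the order of operations: tone map, then gamma-encode, on the way to the display; gamma-decode, then inverse tone map, on the way back to scene-linear.

    def tone_map(x):
        # Illustrative Reinhard-style curve: compresses super-whites into [0, 1).
        return x / (1.0 + x)

    def inverse_tone_map(y):
        # Inverse of the curve above (valid for y < 1.0).
        return y / (1.0 - y)

    def display_encode(scene_linear, gamma=2.2):
        # Scene-linear -> display code values: tone map first, then gamma.
        return tone_map(scene_linear) ** (1.0 / gamma)

    def scene_linearize(video_code, gamma=2.2):
        # Display code values -> scene-linear: remove the gamma (giving
        # output-linear values), then apply the inverse tone map.
        output_linear = video_code ** gamma
        return inverse_tone_map(output_linear)

    encoded = display_encode(4.0)    # a super-white scene value, safely encoded
    print(scene_linearize(encoded))  # round-trips back to ~4.0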

There is one notable exception: if an output-referred image is used as a texture to control diffuse reflectance or a similar property, then it might not be suitable to apply an inverse tone map. See Color Managing Textures and Maps.

To make matters more confusing, video images are also sometimes called "linear" (as opposed to "log"). Video images are actually output-referred with gamma, so they must have their gamma encoding removed to become output-linear, and then must be "untone-mapped" to become scene-linear.

Viewing Conditions

Because human vision is adaptive, the appearance of color stimuli depends on the viewing environment. For example, a piece of paper will appear to be "white" under both bright daylight and a dimmer tungsten light bulb, even though it is lit by different amounts and hues of light.

Aspects of the viewing environment that control the appearance of a color include:
  • The absolute luminance level of the image or scene. For example, a white shirt seen outdoors might have a luminance of 30,000 candelas per square meter, but its reproduction in a cinema might be only about 30 candelas per square meter — roughly 1000 times dimmer.
  • The "surround", that is, the color and brightness of objects in the field of view around the image. For example, the surround is dark in the case of cinema in a theater, dim for television in a home environment (or rather, it should ideally be dim), and normal (or none) for a real-world scene instead of an image.
  • The adaptive white point. This is the color that is considered "white" after an observer has adapted to a given viewing environment.

The huge difference in absolute luminance level and surround between a typical outdoor daylit scene and a cinema or television viewing environment is one of the reasons that tone-mapping must include a contrast boost to scene-linear colors to make them look good on a projector or display.

The adaptive white point can be specified in one of several ways. One way is to refer to the chromaticity of a standard illuminant, such as illuminant A or the D series (D50, D55, D65, and D75), all specified by the CIE. Another way is to refer to the correlated color temperature (CCT) as measured in kelvins (K). A third way is to specify the chromaticity coordinates directly — for example, the DCI/SMPTE calibration white for digital cinema is CIE {x = 0.314, y = 0.351}.
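Given an (x, y) chromaticity, the corresponding XYZ tristimulus values (normalized so that Y = 1.0) follow directly from the definition of chromaticity coordinates, as in this sketch:

    def xy_to_XYZ(x, y, Y=1.0):
        # X = x*Y/y and Z = (1 - x - y)*Y/y, from the definition x = X/(X+Y+Z), etc.
        X = x * Y / y
        Z = (1.0 - x - y) * Y / y
        return (X, Y, Z)

    # The DCI/SMPTE digital cinema calibration white from the text:
    print(xy_to_XYZ(0.314, 0.351))  # ~(0.8946, 1.0, 0.9544)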

To compensate for differences in the adaptive white point between two environments, a chromatic adaptation transform is used to preserve the color appearance. For example, a chromatic adaptation that converts colors intended for a D65 display to the equivalent colors for a 9300K display must increase the saturation of the reds.
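A common way to implement such a transform is von Kries scaling in a cone-like response space, for example using the Bradford matrix. The sketch below builds the 3x3 adaptation matrix from two white points; the D65 values are standard, while the 9300K white here is an approximate, illustrative value.

    import numpy as np

    # Bradford cone-response matrix (XYZ -> cone-like responses).
    BRADFORD = np.array([
        [ 0.8951,  0.2664, -0.1614],
        [-0.7502,  1.7135,  0.0367],
        [ 0.0389, -0.0685,  1.0296],
    ])

    def adaptation_matrix(src_white_XYZ, dst_white_XYZ):
        # Von Kries scaling: rescale cone responses so the source white
        # maps exactly onto the destination white.
        src_cone = BRADFORD @ src_white_XYZ
        dst_cone = BRADFORD @ dst_white_XYZ
        scale = np.diag(dst_cone / src_cone)
        return np.linalg.inv(BRADFORD) @ scale @ BRADFORD

    d65 = np.array([0.9505, 1.0, 1.0888])
    white_9300k = np.array([0.953, 1.0, 1.413])  # approximate 9300K (D93) white
    M = adaptation_matrix(d65, white_9300k)
    print(M @ np.array([0.5, 0.4, 0.3]))  # an XYZ color re-balanced for the 9300K display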

Device-dependent Color Spaces

Device-dependent color spaces rely on the characteristics of a particular camera, monitor, projector, printer, or other device. Sending the same numeric code values to a digital cinema projector and to a motion picture film recorder will result in different colors.

However, devices can be characterized. Characterization involves precisely measuring a device's response in terms of absolute colorimetry. In this way, characterization provides a means to convert between device-dependent and device-independent color spaces. sRGB and AdobeRGB are essentially virtual device-dependent spaces that have been characterized well enough to be used as if they were device-independent.
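As a minimal sketch of what characterization means in practice, the code below fits a 3x3 matrix mapping a device's linearized RGB responses to measured CIE XYZ values. The patch data is purely illustrative; a real characterization measures many patches with a colorimeter or spectroradiometer.

    import numpy as np

    device_rgb = np.array([    # linearized device code values for test patches
        [1.0, 0.0, 0.0],
        [0.0, 1.0, 0.0],
        [0.0, 0.0, 1.0],
        [1.0, 1.0, 1.0],
    ])
    measured_xyz = np.array([  # illustrative colorimeter readings for those patches
        [0.41, 0.21, 0.02],
        [0.36, 0.72, 0.12],
        [0.18, 0.07, 0.95],
        [0.95, 1.00, 1.09],
    ])

    # Least-squares fit of M such that measured_xyz ~ device_rgb @ M.T
    M, *_ = np.linalg.lstsq(device_rgb, measured_xyz, rcond=None)
    M = M.T  # rows now map device RGB to X, Y, and Z

    print(M @ np.array([0.5, 0.5, 0.5]))  # a device color expressed in CIE XYZ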

In order for the characterization to remain valid, a device must be calibrated. Calibration involves adjusting the device to meet the "aim" (that is, the intended primaries, white point, and gamma) corresponding to that characterization. This process must be repeated periodically because devices' responses drift with use over time. For more information, see Calibrating Your Monitor.