Content Invalidation Jobs

Content Invalidation Jobs, or simply “jobs” as they are sometimes known, are a way of forcing cache servers to treat content as no longer valid, bypassing their normal caching policies.

In general, this should be unnecessary, because a well-behaved Origin sets its HTTP caching headers properly, so that content is only considered valid for an appropriate time interval. Occasionally, however, an Origin will be too optimistic with its caching instructions, and when content needs to be updated, cache servers need to be informed that they must check back with the Origin. Content Invalidation Jobs allow this to be done for specific patterns of assets, so that cache servers will check back in with the Origin and verify that the content they have cached is still valid.

The model for Content Invalidation Jobs as API objects is given in Content Invalidation Job as a TypeScript interface.

#8 Content Invalidation Job as a TypeScript interface.
/** This is the form used to create a new Content Invalidation Job */
interface ContentInvalidationJobCreationRequest {
    deliveryService: string;
    invalidationType: "REFRESH" | "REFETCH";
    regex: `/${string}` | `\\/${string}`; // must also be a valid RegExp
    startTime: Date; // RFC3339 string
    ttlHours: number;
}

/**
 * This is the form used to return representations of Content Invalidation
 * Requests to clients.
 */
interface ContentInvalidationJob {
    assetUrl: string;
    createdBy: string;
    deliveryService: string;
    id: number;
    invalidationType: "REFRESH" | "REFETCH";
    startTime: Date; // RFC3339 string
    ttlHours: number;
}
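
As a minimal sketch of how a client might assemble a creation request from the interface above: the helper name `newRefreshJob`, the Delivery Service name `demo1`, and the sample pattern are assumptions made for illustration; only the field names come from the interface. The example relies on `JSON.stringify` serializing a `Date` as an ISO 8601 timestamp, which is also a valid RFC3339 string.

```typescript
// Field names mirror the ContentInvalidationJobCreationRequest interface above.
interface ContentInvalidationJobCreationRequest {
    deliveryService: string;
    invalidationType: "REFRESH" | "REFETCH";
    regex: `/${string}` | `\\/${string}`;
    startTime: Date; // serialized as an RFC3339 string
    ttlHours: number;
}

// Hypothetical helper: builds a REFRESH job that starts `delayMinutes` after `now`.
function newRefreshJob(
    deliveryService: string,
    regex: `/${string}` | `\\/${string}`,
    delayMinutes: number,
    ttlHours: number,
    now: Date = new Date(),
): ContentInvalidationJobCreationRequest {
    return {
        deliveryService,
        invalidationType: "REFRESH",
        regex,
        startTime: new Date(now.getTime() + delayMinutes * 60 * 1000),
        ttlHours,
    };
}

// JSON.stringify calls Date.prototype.toJSON, which emits an ISO 8601 timestamp.
const requestBody: string = JSON.stringify(
    newRefreshJob("demo1", "/images/.+\\.jpg", 5, 48, new Date("2030-01-01T00:00:00Z")),
);
console.log(requestBody);
```

The `now` parameter exists only so that the start time can be made deterministic; a real client would normally let it default to the current time.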

Asset URL

This property only appears in responses from the Traffic Ops API (and in the bodies of PUT requests to jobs, where the scheme and host/authority sections of the URL are held immutable). The Asset URL is constructed from the Regular Expression used in the creation of a Content Invalidation Job and the Origin Server Base URL of the Delivery Service for which it was created. It is a URL that has a valid regular expression as its path (and may not be “percent-encoded” where a normal URL typically would be). Requests from CDN clients for content that matches this pattern will trigger Content Invalidation behavior.
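
The construction described above can be sketched as a simple concatenation. The helper name `buildAssetUrl`, the example origin, and the trailing-slash handling are assumptions for illustration; this is not necessarily how Traffic Ops itself joins the two parts.

```typescript
// Hypothetical helper: joins an Origin Server Base URL and a job's regex.
// Stripping trailing slashes from the base is an assumption made here to
// avoid a doubled "/" in the result.
function buildAssetUrl(originBaseUrl: string, regex: string): string {
    const base = originBaseUrl.replace(/\/+$/, "");
    // The regex is appended verbatim; it is NOT percent-encoded.
    return base + regex;
}

const assetUrl = buildAssetUrl("https://origin.example.com/", "/images/.+\\.jpg");
console.log(assetUrl); // https://origin.example.com/images/.+\.jpg
```

Because the path is a regular expression rather than a literal path, passing the result through a generic URL parser may alter it; treating it as an opaque string is the safer choice.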

Created By

The username of the user who created the Content Invalidation Job is stored as the Created By property of the Content Invalidation Job.

Delivery Service

A Content Invalidation Job can only act on content for a single Delivery Service; invalidating content for multiple Delivery Services requires multiple Content Invalidation Jobs. The Delivery Service property of a Content Invalidation Job holds the xml_id of the Delivery Service on which it operates.

Changed in version 4.0: In earlier API versions, this property was allowed to be either the integral, unique identifier of the target Delivery Service or its xml_id. This is no longer the case, but it should be safe to use the xml_id in any API version.

ID

The integral, unique identifier for the Content Invalidation Job, assigned to it upon its creation.

Invalidation Type

Invalidation Type defines how a cache server should go about ensuring that its cache is valid.

The normal operating mode for a Content Invalidation Job is to force the cache server to send a request to the Origin to verify that its cached content is still valid. If it is, no extra work is done and business as usual resumes. However, some Origins are misconfigured and do not respond as required by the HTTP specification. In this case, it is strongly advised to fix the Origin so that it properly implements HTTP. If correcting the Origin is not an option, that is, if the Origin sends cache-able responses but cannot be trusted to validate cached content based on cache-controlling HTTP headers (e.g. If-Modified-Since), returning responses like 304 Not Modified even when the content has in fact been modified, then the cache server may be forced to pretend that the content it has was actually invalidated by the Origin and must be completely re-fetched.

The two values allowed for a Content Invalidation Job’s Invalidation Type are:

REFRESH

A REFRESH Content Invalidation Job instructs cache servers to behave normally: when matching content is requested, send an upstream request to (eventually) the Origin with cache-controlling HTTP headers, and trust the Origin’s response. The vast majority of Content Invalidation Jobs should most likely use this Invalidation Type.

REFETCH

Rather than treating the cached content as “stale”, the cache servers processing a REFETCH Content Invalidation Job should fetch the cached content again, regardless of what the Origin has to say about the validity of their caches. These types of Content Invalidation Jobs cannot be created without a proper “semi-global” refetch_enabled Parameter.

Caution

A “REFETCH” Content Invalidation Job should be used only when the Origin is not properly configured to support HTTP caching, and will return invalid or incorrect responses to conditional requests as described in section 4.3.2 of RFC 7234. In any other case, this will cause undue load on both the Origin and the requesting cache servers, and “REFRESH” should be used instead.
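
The difference between the two Invalidation Types can be illustrated with a small conceptual model of the upstream request a cache server would send. This is only a sketch: the types and the function `upstreamRequestFor` are hypothetical and do not reflect how any particular cache server implements invalidation.

```typescript
type InvalidationType = "REFRESH" | "REFETCH";

// Hypothetical model of what the cache knows about a stored object.
interface CachedObject {
    lastModified: string; // Last-Modified value stored alongside the object
}

// Hypothetical model of the request sent toward the Origin.
interface UpstreamRequest {
    path: string;
    headers: Record<string, string>;
}

function upstreamRequestFor(
    type: InvalidationType,
    path: string,
    cached: CachedObject,
): UpstreamRequest {
    if (type === "REFRESH") {
        // Conditional request: the Origin may answer 304 Not Modified,
        // in which case the cached copy is kept and served.
        return { path, headers: { "If-Modified-Since": cached.lastModified } };
    }
    // REFETCH: no validators are sent, so the Origin has no chance to
    // (incorrectly) claim the cached copy is still valid.
    return { path, headers: {} };
}

const cachedCopy: CachedObject = { lastModified: "Tue, 01 Jan 2030 00:00:00 GMT" };
console.log(upstreamRequestFor("REFRESH", "/images/cat.jpg", cachedCopy));
console.log(upstreamRequestFor("REFETCH", "/images/cat.jpg", cachedCopy));
```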

Regular Expression

The Regular Expression of a Content Invalidation Job defines the content on which it acts. It is used to match URL paths (including the query string - but not including document fragments, which are not sent in HTTP requests) of content to be invalidated, and is combined with the Origin Server Base URL of the Delivery Service for which the Content Invalidation Job was created to obtain a final pattern that is made available as the Asset URL.

Note

While the Traffic Ops API and Traffic Portal both require the Regular Expression to begin with / (so that it matches URL paths), the Traffic Ops API allows optionally escaping this leading character with a “backslash” \, while Traffic Portal does not. As / is not syntactically important to regular expressions, the use of a leading \ should be avoided where possible, and is only allowed for legacy compatibility reasons.
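
A client-side pre-check of these rules might look like the following sketch. The function name `validateJobRegex` is an assumption, and JavaScript's `RegExp` is used only as a stand-in: the regular expression dialect used by the cache servers may differ, so passing this check does not guarantee acceptance by Traffic Ops.

```typescript
// Hypothetical pre-check: the pattern must start with "/" (or the legacy
// escaped form "\/") and must compile as a regular expression.
function validateJobRegex(regex: string): RegExp {
    if (!regex.startsWith("/") && !regex.startsWith("\\/")) {
        throw new Error("regex must begin with '/' (optionally escaped as '\\/')");
    }
    // new RegExp throws a SyntaxError if the pattern is malformed.
    return new RegExp(regex);
}

const jpgPattern = validateJobRegex("/images/.+\\.jpg");
// The pattern is matched against the path including the query string.
console.log(jpgPattern.test("/images/cat.jpg?size=large")); // true
console.log(jpgPattern.test("/videos/cat.mp4")); // false
```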

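Table 45 Aliases/Synonyms

Name        Use(s)                                                                            Type
----------  --------------------------------------------------------------------------------  -----------------------------
Path Regex  In Traffic Portal forms                                                           unchanged (String, str, etc.)
regex       In raw Traffic Ops API requests and responses, internally in multiple components  unchanged (String, str, etc.)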

Start Time

Content Invalidation Jobs are planned in advance, by setting their Start Time to some point in the future (the Traffic Ops API will refuse to create Content Invalidation Jobs with a Start Time in the past). Content Invalidation Jobs will have no effect until their Start Time.
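
A client can mirror the "not in the past" rule before submitting a job. The sketch below is an assumption about how such a check could be written; the function name `validateStartTime` is hypothetical, and JavaScript's `Date` parser accepts more formats than RFC3339 alone, so it is not a strict format validator.

```typescript
// Hypothetical client-side check mirroring the rule that a Start Time
// must lie in the future.
function validateStartTime(startTime: string, now: Date = new Date()): Date {
    const parsed = new Date(startTime);
    if (Number.isNaN(parsed.getTime())) {
        throw new Error(`'${startTime}' is not a parseable timestamp`);
    }
    if (parsed.getTime() <= now.getTime()) {
        throw new Error("Start Time must be in the future");
    }
    return parsed;
}

const referenceNow = new Date("2030-01-01T00:00:00Z");
const plannedStart = validateStartTime("2030-01-01T00:05:00Z", referenceNow);
console.log(plannedStart.toISOString()); // 2030-01-01T00:05:00.000Z
```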

TTL

The TTL of a Content Invalidation Job defines how long a Content Invalidation Job should remain in effect. This is generally expressed as an integer number of hours.
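
Taken together with the Start Time, the TTL defines a window during which the job is in effect. A minimal sketch of that arithmetic follows; the function name `isJobActive` is hypothetical, and treating the window as inclusive of the start and exclusive of the end is an assumption.

```typescript
const MILLISECONDS_PER_HOUR = 60 * 60 * 1000;

// Hypothetical helper: is a job in effect at the instant `at`?
// Assumes the window is [startTime, startTime + ttlHours).
function isJobActive(startTime: Date, ttlHours: number, at: Date): boolean {
    const start = startTime.getTime();
    const end = start + ttlHours * MILLISECONDS_PER_HOUR;
    return at.getTime() >= start && at.getTime() < end;
}

const jobStart = new Date("2030-01-01T00:00:00Z");
console.log(isJobActive(jobStart, 48, new Date("2030-01-02T12:00:00Z"))); // true
console.log(isJobActive(jobStart, 48, new Date("2030-01-03T00:00:00Z"))); // false
```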

Table 46 Aliases/Synonyms

Name        Use(s)                                     Type
----------  -----------------------------------------  ----------------------------------------------------------
parameters  In legacy Traffic Ops API versions         A string, containing the TTL in the format TTL:Actual TTLh
ttlHours    In Traffic Ops API requests and responses  Unchanged (unsigned integer number of hours)
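
Converting between the two representations can be sketched as below. This assumes the legacy format is the literal prefix `TTL:`, followed by a non-negative integer and the suffix `h` (for example `TTL:48h`); that reading of the format, and both function names, are assumptions for illustration.

```typescript
// Hypothetical parser for the legacy "parameters" string, e.g. "TTL:48h".
function parseLegacyTtl(parameters: string): number {
    const match = /^TTL:(\d+)h$/.exec(parameters);
    if (match === null) {
        throw new Error(`'${parameters}' is not in the form TTL:<hours>h`);
    }
    return Number.parseInt(match[1], 10);
}

// Hypothetical inverse: renders ttlHours in the legacy form.
function toLegacyTtl(ttlHours: number): string {
    if (!Number.isInteger(ttlHours) || ttlHours < 0) {
        throw new Error("ttlHours must be an unsigned integer");
    }
    return `TTL:${ttlHours}h`;
}

console.log(parseLegacyTtl("TTL:48h")); // 48
console.log(toLegacyTtl(48)); // TTL:48h
```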