Front-end delivery in the era of high-frequency releases needs caching and compression to be designed together

As resources become more fragmented and releases more frequent, what spins out of control first is often not the compression ratio but the way cache keys, dictionary versions, and origin-fetch costs follow the release cadence.

Once front-end resources enter a high-frequency release cadence, performance problems quickly stop being as simple as “turn on Brotli”. First-screen rendering slows down, origin-fetch traffic climbs, and edge-node CPU usage jitters. On the surface, compression looks insufficiently aggressive. Looking deeper, caching and compression have usually been optimized separately and end up undermining each other along the release pipeline.

This kind of problem rarely shows up in the first release. Early on, the team sees only scattered signals: a small change causes the static-resource hit rate to collapse, edge compression CPU rises abnormally on the eve of a major promotion, and response sizes observed during the canary stage do not match those seen under full traffic. Keep digging and the clues usually converge on the same thing: although the resource content changed only slightly, the cache keys, chunk splitting, and compression input have all become a different set of objects, and the transport layer is forced to absorb the entire cost again.

Until resource hashes are stable, compression gains cannot hold

Once a front-end project ships many pages, many routes, and many teams in parallel, the most easily overlooked factor is file name stability. If chunk splitting drifts even slightly, a change to nothing more than the copy on one button can rewrite the hashes of a whole series of shared bundles in the build output. The caching system sees a batch of brand-new objects, and the compression system sees a batch of inputs it has never encountered.

At that point, no compression ratio, however high, can rescue the hit rate from collapse. The old files still sit on the edge nodes while the new files carry different keys; the browser’s local cache no longer applies, and the CDN has to refetch from origin, recompress, and redistribute. A small business change is amplified into repeated work across the entire transmission link.

The genuinely useful move is usually not to keep tuning the compression level but to first stabilize the build output:

  • Split shared dependencies into their own layer, so that business changes do not drag the hashes of base packages along with them.
  • Avoid mixing high-frequency values such as timestamps and build numbers directly into the build output.
  • Let code that belongs to the same route land in stable chunks as far as possible, instead of being reshuffled on every build.

Only once resource identity is stable can the cache keep being reused and compression results accumulate value.
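The effect of mixing a timestamp into build output can be shown with a minimal sketch. It assumes file names are derived from a SHA-256 digest of the content; `hashedName`, `vendor`, and `withStamp` are hypothetical names used for illustration, not part of any particular bundler.

```javascript
const { createHash } = require('node:crypto');

// Hypothetical helper: derive a file name from the content only.
function hashedName(baseName, content) {
  const digest = createHash('sha256').update(content).digest('hex').slice(0, 8);
  return `${baseName}.${digest}.js`;
}

// Assumed example source; `vendor` stands in for a shared dependency layer.
const vendor = 'export const runtime = "framework";';

// Anti-pattern: a build timestamp embedded in the content changes the
// hash on every build even though the code itself is identical.
const withStamp = (ts) => `/* built ${ts} */\n${vendor}`;

const a = hashedName('vendor', withStamp('2024-01-01T00:00:00Z'));
const b = hashedName('vendor', withStamp('2024-01-01T00:05:00Z'));

// Without the timestamp, two builds of the same code share one name.
const c = hashedName('vendor', vendor);
const d = hashedName('vendor', vendor);

console.log(a === b); // false: same code, different cache key
console.log(c === d); // true: same code, same cache key
```

The first pair behaves like two unrelated objects to the cache and to the compressor; the second pair is one object that can be reused across releases.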

High-frequency releases turn the compression problem into a dictionary version problem

As resources become more fragmented, single-file Brotli or gzip still matters, but it is no longer the whole story. The real cost shifts toward repeated fragments: framework runtime code, style templates, interface type declarations, and wrapper code generated by bundlers are often highly similar from one release to the next. At a fast release tempo, these fragments get transferred over and over again.

The problem is that a compression dictionary easily turns from an optimization into a source of disruption if it is not managed together with the release cadence. Switch the dictionary too early, and pages from the old release end up mismatched with the new dictionary. Split the dictionary too finely, and the number of versions that edge nodes must maintain grows sharply. Let dictionary updates fall out of sync with resource rollout, and objects that should have hit fall back to full transfer.
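The mismatch cases above suggest a conservative rule: use a dictionary only when the client, the asset, and the edge all agree on the same version, and otherwise fall back to ordinary compression. The following is a minimal sketch under that assumption. `edgeDictionaries`, `chooseEncoding`, the `train-NN` ids, and the `'dictionary'` label are all hypothetical; they do not correspond to a specific protocol or header.

```javascript
// Hypothetical registry: dictionary ids the edge currently holds,
// one per release train.
const edgeDictionaries = new Set(['train-41', 'train-42']);

// Decide how to encode a response.
// clientDictionary: id the client reports having, or null.
// assetDictionary:  id the asset was compressed against.
function chooseEncoding(clientDictionary, assetDictionary) {
  const usable =
    clientDictionary !== null &&
    clientDictionary === assetDictionary &&
    edgeDictionaries.has(clientDictionary);

  // Fall back to plain Brotli rather than send bytes the client
  // cannot decode with the dictionary it actually holds.
  return usable
    ? { encoding: 'dictionary', dictionary: clientDictionary }
    : { encoding: 'br', dictionary: null };
}

console.log(chooseEncoding('train-42', 'train-42')); // dictionary path
console.log(chooseEncoding('train-41', 'train-42')); // old page, new asset: fallback
console.log(chooseEncoding(null, 'train-42'));       // first visit: fallback
```

Keeping the set of live dictionary ids small, for example one per release train, is what keeps the edge from having to hold many versions at once.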

This is also a very practical shift in recent front-end delivery: caching strategy and compression protocol can no longer be maintained in isolation by different teams. Resource versions, dictionary versions, and the cache key space are essentially the same release problem.

A tiered approach like the following is usually more stable than “uniform aggressive compression across the whole site”:

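// Delivery policy per resource class. The first two values are Cache-Control
// headers; the third reads as a label for the versioning scheme, not a header.
const policy = {
  immutableAssets: 'public, max-age=31536000, immutable',           // hashed static files
  releaseManifest: 'public, max-age=60, stale-while-revalidate=30', // short-lived manifest
  sharedDictionary: 'versioned-by-release-train'                    // dictionary follows the release train
};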

The point is not these three lines of configuration themselves but the constraint they express: long-lived resources, the short-lived manifest, and the compression dictionary version must evolve together on the same release cadence.
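One way to apply the two Cache-Control values from the policy above is to pick a header per request path. This is only a sketch: the file name pattern, the `manifest.json` name, and the `no-cache` default for everything else are assumptions, and `cacheControlFor` is a hypothetical helper.

```javascript
// Cache-Control values copied from the policy sketch above.
const cachePolicy = {
  immutableAssets: 'public, max-age=31536000, immutable',
  releaseManifest: 'public, max-age=60, stale-while-revalidate=30',
};

// Assumption: hashed assets look like name.<8 or more hex chars>.<ext>.
const HASHED_ASSET = /\.[0-9a-f]{8,}\.(js|css|woff2)$/;

function cacheControlFor(path) {
  if (HASHED_ASSET.test(path)) return cachePolicy.immutableAssets;
  if (path.endsWith('manifest.json')) return cachePolicy.releaseManifest;
  // Assumption: HTML and other unhashed files are revalidated every time.
  return 'no-cache';
}

console.log(cacheControlFor('/assets/app.3f9a1c2e.js'));
console.log(cacheControlFor('/release/manifest.json'));
console.log(cacheControlFor('/index.html'));
```

The long lifetime is only safe because the hashed name changes whenever the content does, which is why file name stability has to be settled first.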

Origin-fetch pressure often comes not from files being too large but from invalidation being too coarse

Another very common misjudgment is to attribute rising bandwidth directly to page weight. Pages are certainly getting heavier, but the more dangerous amplifier in production is usually the way caches are invalidated.

If every release purges by directory, by prefix, or even across the entire site, the cache layer loses its memory instantly. Even if file sizes stop growing, the origin-fetch peak is pushed up by the purge alone. Once requests fall through to origin, the edge recompresses, objects are rewarmed, and browsers download again. The release window turns from a small incremental change into a full-site relocation.

In this kind of scenario, the most valuable property is a controllable invalidation radius:

  • Invalidate only the manifest, HTML, and mutable resources that have actually changed.
  • Avoid purging hashed static files where possible, and let new references switch over to them naturally.
  • Split the release into ordered steps, “upload new resources first, then switch references, then recycle old resources”, instead of clearing everything at once.

Transfer cost is sensitive not only to file size but also to how the system decides which content must be fetched again.
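The three-step order and the narrow invalidation radius can be sketched as a diff between two release manifests. The manifest shape (`path`, `hashed`, `digest`) and the function `planRelease` are assumptions made for illustration; entries dropped from the new manifest are not handled in this sketch.

```javascript
// Hypothetical manifest shape: logical name -> { path, hashed, digest? }.
function planRelease(oldManifest, newManifest) {
  const uploadFirst = [];      // step 1: new hashed assets
  const switchReferences = []; // step 2: mutable entries to update and purge
  const recycleLater = [];     // step 3: old hashed assets, removed after a grace period

  for (const [name, next] of Object.entries(newManifest)) {
    const prev = oldManifest[name];
    if (next.hashed) {
      // A changed hashed path is a new object: upload it, never purge it.
      if (!prev || prev.path !== next.path) {
        uploadFirst.push(next.path);
        if (prev) recycleLater.push(prev.path);
      }
    } else if (!prev || prev.digest !== next.digest) {
      // Mutable paths are touched only when their content changed.
      switchReferences.push(next.path);
    }
  }
  return { uploadFirst, switchReferences, recycleLater };
}

const oldManifest = {
  html: { path: '/index.html', hashed: false, digest: 'aaa' },
  app: { path: '/assets/app.11111111.js', hashed: true },
  vendor: { path: '/assets/vendor.22222222.js', hashed: true },
};
const newManifest = {
  html: { path: '/index.html', hashed: false, digest: 'bbb' },
  app: { path: '/assets/app.33333333.js', hashed: true },
  vendor: { path: '/assets/vendor.22222222.js', hashed: true },
};

console.log(planRelease(oldManifest, newManifest));
```

In this example only `/index.html` is purged and only the new `app` chunk is uploaded; the unchanged `vendor` bundle never leaves the cache.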

Where this applies depends on both resource scale and release frequency

Not every site needs this kind of co-design. For projects with a small number of static pages, small resource bundles, and a weekly or even monthly release frequency, traditional hashed file names plus Brotli pre-compression are usually stable enough.

Once the following characteristics appear, caching and compression together quickly become delivery infrastructure:

  • Releases happen multiple times a day, with canary rollouts, rollbacks, or region-by-region launches.
  • The front-end build output is large, has many shared dependencies, and has complex chunk relationships.
  • CDN, object storage, edge compression, and browser caching all participate in the transmission link at the same time.
  • Traffic is high enough that the cache hit rate and the origin-fetch peak show up directly in cost and stability.

Once front-end delivery reaches this stage, compression is no longer just “making files smaller”, and caching is no longer just “storing more copies of content”. Together they decide whether a small business change means sending one more chunk or running the entire transmission link again. The more frequently you release, the more expensive that difference becomes.
