China's open-source models are more likely to be slowed down than blocked

What actually becomes brittle is distribution, updates, and dependency chains

When this discussion lands in real engineering work, it converges on a colder statement: an open-source model is hard to erase completely. What turns brittle first is the pipeline built around the model. If any one of the model files, container images, checksums, inference environment, or evaluation scripts breaks, what the team feels is not “does this model still exist in the world” but “can this upgrade be reproduced.”

What usually gets choked is entry points and updates

Official hosting is the easiest thing to shut down first. Web pages, APIs, download pages, mirror sites: wherever the entry point is centralized, payment, legal, CDN, regional restrictions, and account policies can all narrow it. The same holds for cloud inference. Once a business outsources model capability to a single hosting provider, a blockade does not need to delete the model from the world. Tightening access, quotas, payment, and regional restrictions is enough to make the system wobble.

But once the weights have dispersed, the situation changes. An open-source model does not live only on some homepage; it also lives on local disks, in build caches, in image registries, and in the artifact stores teams have built. What can be controlled is the speed at which distribution continues, more than the copies that already exist. Put plainly, the biggest impact is often not “can a given version still be downloaded” but “can you reliably get the same set of tokenizer, chat template, quantized packages, and dependency specifications in the future.”

This is also the most underestimated part. Once the model runs the first time, the risk seems to be over; the real trouble usually comes the second time. You want to roll back, and the image is gone. You want to reproduce a result, and the quantization format has changed. You want to upgrade, and the inference code no longer matches the weight version. You want to verify, and the evaluation set and preprocessing scripts have been modified. On the surface only a download link is missing; in fact what is missing is a complete, repeatable supply chain.
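The "second time" problem can be made concrete with a small sketch: record what the first run actually used, then compare it with what the second run is about to use. The `ReleaseRecord` fields and the `drift` helper below are hypothetical names chosen for illustration, not part of any existing tool.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ReleaseRecord:
    """Hypothetical snapshot of everything one model release depended on."""
    weights_revision: str
    runtime_image: str
    quantization_format: str
    inference_code_version: str
    eval_set_digest: str
    preprocess_script_digest: str


def drift(first: ReleaseRecord, second: ReleaseRecord) -> dict[str, tuple[str, str]]:
    """Map each field that differs to its (first, second) pair of values."""
    before, after = asdict(first), asdict(second)
    return {key: (before[key], after[key]) for key in before if before[key] != after[key]}
```

An empty result means the second run sees the same supply chain as the first. Any non-empty result names the link that moved, which is more useful than discovering it from a failed rollback.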

So this kind of “ban” works more like deceleration than deletion. What can be significantly weakened is the speed of propagation, cloud access, version synchronization, and confidence in the ecosystem; what is hard to erase are the copies of the weights, the local deployment capability, and the redistribution capability that have already spread. Once an open-source model is on enough machines, the risk shifts from “can it exist” to “can it evolve stably.”

This is also where domestic teams most easily go wrong. After integrating a model into a product, it is easy to look only at the first round of results and forget that the model is a dependency. When a dependency has a single entry point, that point becomes a control point; when it has no version pinning, upgrades become random events; when it has no offline copy, the so-called “in-house capability” is exposed the moment a mirror fails.
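Treating the model as a pinned dependency with more than one entry point might look like the sketch below. It assumes each source is wrapped in a plain callable that returns bytes, returns `None`, or raises `OSError`; `fetch_pinned` and the source labels are hypothetical, and a real setup would stream large files rather than hold them in memory.

```python
import hashlib
from typing import Callable, Iterable, Optional

# A source is any zero-argument callable that returns the artifact bytes,
# returns None when it has no copy, or raises OSError when it is unreachable.
Fetcher = Callable[[], Optional[bytes]]


def fetch_pinned(
    expected_sha256: str,
    sources: Iterable[tuple[str, Fetcher]],
) -> tuple[str, bytes]:
    """Try sources in order; accept the first whose bytes match the pinned checksum."""
    failures = []
    for label, fetch in sources:
        try:
            data = fetch()
        except OSError as exc:
            failures.append(f"{label}: {exc}")
            continue
        if data is None:
            failures.append(f"{label}: not available")
            continue
        if hashlib.sha256(data).hexdigest() != expected_sha256:
            failures.append(f"{label}: checksum mismatch")
            continue
        return label, data
    raise LookupError("no source produced the pinned artifact: " + "; ".join(failures))
```

Because the checksum is pinned, it does not matter which entry point answers: an offline copy, an internal mirror, and upstream are interchangeable as long as the bytes match, and a source that silently serves different bytes is rejected rather than trusted.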

The more robust approach is not to assume there will be no blockade but to break the blockade into several manageable problems in advance: store weights and runtime separately, save each download address together with its checksum, make the inference environment rebuildable offline, archive evaluation results by version, and keep the rollback path as clear as the release path. Then, even if upstream suddenly shuts down, the product loses only one entry point rather than having the whole capability go offline at once.
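One of these small problems, keeping the download address and the checksum together, can be sketched as a lock file written next to the local copies. The function names and the JSON layout here are assumptions made for illustration; the only requirement taken from the text is that address and checksum travel as a pair.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so large weight files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_lock(artifact_dir: Path, sources: dict[str, str], lock_path: Path) -> dict:
    """Record each artifact's download address and checksum side by side."""
    lock = {
        name: {"url": url, "sha256": sha256_of(artifact_dir / name)}
        for name, url in sorted(sources.items())
    }
    lock_path.write_text(json.dumps(lock, indent=2))
    return lock


def verify_lock(artifact_dir: Path, lock_path: Path) -> list[str]:
    """Return the names of artifacts that are missing or no longer match the lock."""
    lock = json.loads(lock_path.read_text())
    problems = []
    for name, entry in lock.items():
        path = artifact_dir / name
        if not path.is_file() or sha256_of(path) != entry["sha256"]:
            problems.append(name)
    return problems
```

Running `verify_lock` before every release and every rollback turns "is our offline copy still intact" into a check that either returns an empty list or names the files that need attention.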

The real moat of open-source models has never been “nobody dares to regulate them” but “by the time anyone does, it is already hard to regulate them effectively.” Many entry points can be tightened; copies that have already spread are hard to recall.