Back home

The real breakthrough of China’s open source model is the collaboration network

The weight can be implemented, and updates, reviews and consensus will be more fragile.

When talking about “whether it will be sealed” in the open source model, the easiest thing to look at is to regard the weight file as everything.

After the weights are downloaded, the model itself often does not disappear so easily. What is easier to break first is the network that revolves around it: mirror sites, evaluation sets, inference templates, fine-tuning scripts, problem fixes, default deployment parameters, and the consensus in the community that “this version can run, and that version should not be touched.”

The part that can hit the ground is the least afraid of breaking.

As long as an open source model has entered a local warehouse, object storage or intranet image, no matter how tightened the external world is, the file will usually still be there. Offline copies, internal caches, and historical build products will all delay the question of “whether it can still be used” for a long time.

This is also the biggest difference between the open source model and pure cloud services. Once a cloud service is blocked, the entrance is often gone; even if the upstream service of the open source model is stopped, the weights, tokenizer, and inference image in hand can continue to run. The question is not “do you have it?” but “can you continue to use it in the same way as others?”

What is really crisp is the synchronization relationship

Just because the model can continue to run, does not mean that the team can continue to keep up with it.

The first things to loosen up are usually synchronization relationships:

  • The upstream released a new version, but the internal mirror did not keep up in time.
  • The evaluation set has been revised, and the regression results can no longer be aligned with the old records.
  • The chat template or tokenizer has been moved a little, but the output style has changed a lot.
  • A certain fix only entered the community PR, not the corporate intranet image
  • The default quantization, default context length, and default sampling parameters are each drifted apart.

These things don’t look big on their own, but stacking them together will break the “same model” into several parts.

At this stage, the real harm caused by external restrictions is not to erase a weighted document from the world, but to break up the fact that “everyone is looking at the same thing.” The team is still talking about the same model name, but what they actually get is a combination package with different versions, different templates, and different parameters.

Reviews, fixes and experience will be broken together

Once an open source model enters the real workflow, the real value is usually not the weight itself, but the judgment accumulated around the weight.

Which version is more stable, which tokenizer will break long text, which set of sampling parameters is more suitable for customer service scenarios, which fine-tuning script will increase the illusion, these experiences all rely on continuous exchange. As long as the collaboration network remains, everyone can still tinker around the same baseline; once the collaboration network is broken, each team will slowly develop its own private version.

Private versions are not a bad thing, but the price creeps up:

  • Return to baseline becomes increasingly difficult to reuse
  • Accident review becomes increasingly difficult to align
  • Fix patch becoming increasingly difficult to sync
  • The same problem will appear repeatedly in different teams

At this time, it looks like “the model is still there”, but in fact it has become “many local copies that are barely usable”, and there is no common update path between them.

What is really worth worrying about is not blocking, but forking

The open source model is difficult to be completely sealed like an online API because the replicability is there. What we should really be wary of is that after external pressure breaks up distribution, repair, and collaboration, the model begins to diverge along the rhythms of different organizations.

Once there are more forks, it is no longer a question of “can it be downloaded?” but “who can guarantee that this is still the same type of thing?” This matter will directly increase the access cost: new reviews need to be redone, old faults need to be re-explained, version differences need to be rearranged, and the team has to make up its own rollback and freezing strategies for each forked line.

The resilience of the open source model is indeed stronger than that of pure cloud services; but its vulnerability is also very clear, not whether the weight has been taken away, but whether the collaboration network can continue to maintain the same name as the same thing.