Repository layering and state consistency issues
The real difficulty is that the local cache, in-memory state, remote responses, and UI-derived state are all quietly writing the “truth”
When state in an Android project gets confusing, the first reaction is often to add more layers.
ViewModel -> UseCase -> Repository -> LocalDataSource -> RemoteDataSource: once this chain is laid out, the code does look neater. The problem is that neatness and consistency are not the same thing. Many teams make the Repository more and more of a “unified entry point”, only to find that page state becomes harder to reason about: the list and detail pages disagree, the favorite status flips back and forth, the UI does not change after a request succeeds, and stale data shows up after the process is recreated.
My position is: **the value of Repository layering lies in clear state sources and write boundaries. As long as the memory cache, local database, remote responses, and UI-derived state can all change the same value, neater layering only makes state consistency harder to maintain.**
The real problem is that there is more than one truth
A common claim is that a Repository can “unify data management”. That is only half correct.
A Repository can certainly wrap the network, the local cache, and disk persistence, but if nobody goes on to ask “who is the source of truth?”, it merely wraps multiple states behind one class name.
The most common path out of control is this:
- The page reads the local database first and shows the old value immediately;
- A remote request starts at the same time, and the memory cache is updated when the response arrives;
- For smooth interaction, the UI state is changed directly as an optimistic update;
- Another page reads a different value from a singleton field of the Repository;
- Finally, the asynchronous database write completes and pushes the old value back to the page.
On the surface the architecture looks fully layered. In fact, four sets of state are already competing for the final say.
They answer different questions respectively:
- The database wants to answer “Can it be restored next time it is started?”;
- The memory cache wants to answer “Is this access fast?”;
- The remote response wants to answer “What did the server just say?”;
- UI state wants to answer “how should the interface be rendered at this moment”.
These things are all important, but being important does not mean that they can all be the source of the truth.
If there is no clear definition of “who is responsible for persistent truth values, who is only responsible for derived presentation, and who can only read but not write”, the Repository will slowly degenerate into a state transfer station. It captures all the complexity but eliminates none of it.
Repository is most easily abused as a coordinator that “can change anything”
The problem with a lot of code is not that the Repository is too thin, but that it is too powerful.
A typical Repository often does these things at the same time:
- Make network requests;
- Read and write Room;
- Maintain an in-memory map;
- Assemble the fields the UI needs;
- Roll back optimistic updates on failure;
- Send events along the way to tell other modules to refresh.
It seems very concentrated, but in fact it is the “data access layer”, “state coordination layer”, “caching strategy layer” and “domain rule layer” rolled into a ball.
Once Repository is responsible for both “read aggregation” and “multi-source write coordination”, it will naturally enter an awkward state: anyone can change data through it, but no one can quickly tell which observers a change will ultimately affect and which writeback path will be triggered.
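Take a favorite toggle as an example. Many implementations look like this:

    suspend fun toggleFavorite(id: String) {
        // 1. Flip the in-memory value so the UI reacts immediately.
        memory[id] = !(memory[id] ?: false)
        // 2. Persist the new value locally right away.
        dao.updateFavorite(id, memory[id]!!)
        // 3. Only now ask the server; nothing handles a failure or timeout here.
        api.toggleFavorite(id)
    }
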
This code is conveniently short, but it mixes three levels of semantics:
- The UI wants immediate feedback, so memory is changed first;
- Local consistency should be preserved, so the database is written right away;
- The server is the real arbiter, yet its result arrives last.
The problem is not that “change local first” must be wrong, but that the failure semantics are not defined.
How should state converge if the request times out but the server actually succeeded? If the local write succeeds but the remote write fails, who rolls back? If two pages toggle the favorite at the same time, which one wins?
Unless these questions are designed for explicitly, the Repository just hides the race condition inside a seemingly clean method.
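One way to make those failure semantics explicit is to have the toggle return a named outcome instead of nothing. The sketch below is plain, non-suspending Kotlin so that it stays self-contained; `ApiResult`, `FavoriteApi`, `FavoriteStore`, `ToggleOutcome`, and `toggleFavoriteExplicit` are hypothetical names, not part of any library. It assumes the server is the arbiter and that a timeout means “unknown” rather than “failed”.

```kotlin
// Hypothetical types: stand-ins for a real API client and DAO, not library APIs.
enum class ApiResult { OK, REJECTED, TIMEOUT }

fun interface FavoriteApi {
    fun setFavorite(id: String, value: Boolean): ApiResult
}

sealed class ToggleOutcome {
    data class Confirmed(val value: Boolean) : ToggleOutcome()   // server accepted the new value
    data class RolledBack(val value: Boolean) : ToggleOutcome()  // server rejected, old value restored
    data class Unresolved(val shown: Boolean) : ToggleOutcome()  // timeout: server state is unknown
}

class FavoriteStore {
    private val rows = mutableMapOf<String, Boolean>()
    fun get(id: String): Boolean = rows[id] ?: false
    fun set(id: String, value: Boolean) { rows[id] = value }
}

fun toggleFavoriteExplicit(id: String, store: FavoriteStore, api: FavoriteApi): ToggleOutcome {
    val before = store.get(id)
    val target = !before
    store.set(id, target) // optimistic local write, done in exactly one place
    return when (api.setFavorite(id, target)) {
        ApiResult.OK -> ToggleOutcome.Confirmed(target)
        ApiResult.REJECTED -> {
            store.set(id, before) // this function, and nobody else, owns the rollback
            ToggleOutcome.RolledBack(before)
        }
        // Do not guess on timeout: keep the shown value and flag it for later reconciliation.
        ApiResult.TIMEOUT -> ToggleOutcome.Unresolved(target)
    }
}
```

The point is not this particular policy but that each of the three questions now has a visible answer in the return type. Concurrent toggles from two pages are not handled here and would still need a decision.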
Flow can propagate state, but that does not mean it guarantees consistency
In recent years, Android projects have been fond of wiring Flow, StateFlow, and SharedFlow into the Repository and exposing a “reactive data source” upstream. This is certainly better than callbacks everywhere, but it often creates an illusion: once the data is a stream, the consistency problem will disappear.
It won’t.
A reactive stream solves how changes are propagated, not who decides the changes.
The following pattern is very common:

    val userFlow = combine(
        dao.observeUser(id),
        memoryStateFlow,
        remoteRefreshStateFlow
    ) { local, memory, remote ->
        mergeUser(local, memory, remote)
    }

The biggest risk of this code is not that it looks ugly, but that mergeUser() often quietly introduces business decisions:
- The name comes from the remote response;
- Online status comes from memory;
- Read status is decided locally;
- The loading state is attached separately by the UI.
What you end up with is a “stitched result that can barely render the page at this moment.”
This kind of stitching is very convenient on the read path, but it easily gets out of control on the write path, because these questions become hard to answer:
- At which layer should a given field be changed?
- After one layer changes, do the other layers need to be synchronized?
- After the process is recreated, which fields can still be rebuilt?
- Which fields will be overwritten with new values during offline recovery?
So many projects show a strange pattern: the more elegantly the data flow is written, the more mysterious the state bugs become. The root cause is that the system has no single accountable source of state.
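A small sketch of what “accountable” could mean for the merged user above: write the per-field ownership down as data instead of leaving it implicit inside mergeUser(). `Layer`, `UserField`, `mayWrite`, and `restorableFields` are hypothetical names, and the `survivesProcessDeath` values are assumptions made for illustration (for instance, that remote names are persisted after a refresh).

```kotlin
// Hypothetical sketch: the ownership rules that mergeUser() usually hides, written down as data.
enum class Layer { REMOTE, MEMORY, LOCAL_DB, UI }

enum class UserField(val owner: Layer, val survivesProcessDeath: Boolean) {
    NAME(Layer.REMOTE, true),     // assumption: the remote name is persisted after each refresh
    ONLINE(Layer.MEMORY, false),  // lives in memory only
    READ(Layer.LOCAL_DB, true),   // decided and stored locally
    LOADING(Layer.UI, false)      // pure display state
}

// A write is legal only when it comes from the layer that owns the field.
fun mayWrite(layer: Layer, field: UserField): Boolean = field.owner == layer

// Fields that can still be rebuilt after the process is recreated.
fun restorableFields(): List<UserField> =
    UserField.values().filter { it.survivesProcessDeath }
```

With a table like this, “which layer should change this field” becomes a lookup rather than a reading of the merge function.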
What should really be controlled is the write boundary
The most important constraint in Repository design is “who has write permission.”
If a business object can be modified by a UI optimistic update, modified by the Repository memory cache, pushed back by a database observer, and overwritten by a remote response, sooner or later it will run into ordering inconsistencies.
Rather than continuing to add abstractions, I recommend clarifying the write boundaries first:
1. Choose the source of truth first
Not every scenario requires “the local database is the only source of truth”, but a primary source must be chosen.
- For offline-first scenarios and lists that must be restorable, the local database is usually the right choice;
- For strongly real-time scenarios where stale values are unacceptable, the remote result may be the source;
- Pure interaction state, such as expansion, selection, and input, should be kept explicitly in the UI state and not poured back into the Repository.
The key is to avoid letting the database decide half of the fields, memory decide the other half, and the UI patch the holes when something goes wrong.
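As a minimal sketch of the “local store is the primary source” option, the example below routes a remote refresh into the store and lets every reader observe only the store. A tiny hand-written observable stands in for Room and a Flow-returning DAO so the example stays self-contained; `Profile`, `ProfileStore`, `ProfileRepository`, and `fetchRemote` are hypothetical names.

```kotlin
// Hypothetical sketch: a tiny observable store stands in for Room plus an observable DAO query.
data class Profile(val id: String, val name: String)

class ProfileStore {
    private val rows = mutableMapOf<String, Profile>()
    private val observers = mutableListOf<(Profile) -> Unit>()

    fun observe(observer: (Profile) -> Unit) {
        observers.add(observer)
    }

    fun upsert(profile: Profile) {
        rows[profile.id] = profile
        observers.forEach { it(profile) } // every reader hears about the change from here
    }
}

class ProfileRepository(
    private val store: ProfileStore,
    private val fetchRemote: (String) -> Profile // stand-in for a network call
) {
    // Readers never see the remote response directly; they only see the store.
    fun observe(observer: (Profile) -> Unit) = store.observe(observer)

    // Refresh returns nothing on purpose: the result travels through the store.
    fun refresh(id: String) {
        store.upsert(fetchRemote(id))
    }
}
```

Because `refresh` hands nothing back to its caller, a list page and a details page cannot end up holding different copies of the same response.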
2. Separate “derived state” and “persistent state”
A lot of the confusion comes from writing temporary display state back to the persistence layer.
For example:
- isLoading
- isRefreshing
- isExpanded
- pendingRetryCount
These states can determine how the UI is drawn, but they should not be mixed with business truth values in the same entity and spread around.
Once derived state is put into the Repository’s shared model, it gets reused by mistake across different pages and different lifecycles. In the end, nobody even notices that a field still holds the value left behind by the previous page.
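One possible way to keep the two apart is to give them separate types, so display flags have nowhere to live in the shared model. This is a sketch with hypothetical names (`Article`, `ArticleScreenState`, `withArticle`); the flag names are the ones listed above.

```kotlin
// Hypothetical names; a sketch of keeping business truth and display state in separate types.

// Business truth: the only thing the Repository persists and shares between pages.
data class Article(val id: String, val title: String, val favorite: Boolean)

// Display state: owned by one screen, recreated with it, never written to the database.
data class ArticleScreenState(
    val article: Article?,
    val isLoading: Boolean = false,
    val isRefreshing: Boolean = false,
    val isExpanded: Boolean = false,
    val pendingRetryCount: Int = 0
)

// A new truth value replaces the article and clears loading;
// the other flags of this screen are left untouched.
fun ArticleScreenState.withArticle(updated: Article): ArticleScreenState =
    copy(article = updated, isLoading = false)
```

Each screen builds its own `ArticleScreenState`, so a second page can never inherit the first page’s `isExpanded`.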
3. Keep write paths fewer than read paths
Reads can be aggregated; writes should be funneled.
When reading, you can combine the database, memory, and remote refresh signals to give the page a sufficient model. When writing, it is best to take a single controlled path and let it decide:
- Whether to write locally first;
- Whether compensation is required;
- Whether older versions may be overwritten;
- Whether to include a version number or timestamp;
- What semantics the UI should see after a failure.
The more write entry points the system allows, the more consistency depends on “don’t make mistakes”. That is not design; it is luck.
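A single controlled write path can be sketched as one gateway function that every writer must go through. The example assumes each record carries a monotonically increasing version supplied by the server; `Versioned`, `WriteResult`, and `FavoriteWriteGateway` are hypothetical names.

```kotlin
// Hypothetical sketch: one gateway decides every write; versions are assumed to come from the server.
data class Versioned(val id: String, val favorite: Boolean, val version: Long)

enum class WriteResult { APPLIED, STALE_IGNORED }

class FavoriteWriteGateway {
    private val rows = mutableMapOf<String, Versioned>()

    @Synchronized
    fun current(id: String): Versioned? = rows[id]

    // The only function allowed to change a row. Synchronized so that concurrent
    // callers cannot interleave the version check and the write.
    @Synchronized
    fun write(incoming: Versioned): WriteResult {
        val existing = rows[incoming.id]
        if (existing != null && incoming.version <= existing.version) {
            return WriteResult.STALE_IGNORED // an older response arrived late; drop it
        }
        rows[incoming.id] = incoming
        return WriteResult.APPLIED
    }
}
```

The overwrite rule and the failure semantics now live in one place, and the return value tells the caller what actually happened.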
A common counter-example: change first and sort it out later, for the sake of a “silky smooth” experience
State consistency most often goes bad through a small decision: “this interaction is simple, let’s just change it locally first”.
Likes, favorites, follows, and read markers are all too easily treated as “update the UI first, deal with failure later”. The problem is that once they span pages, lists, and cache tiers, they are no longer small decisions.
Failure cases usually look like this:
- Click Favorite on the details page, and the button will light up immediately;
- The list page also monitors the same Repository memory state, so it lights up synchronously;
- The interface times out and the Repository triggers rollback;
- But the list page has already received the old value from the database observer, so its rollback order differs from the details page;
- The user navigates back and sees the two pages in inconsistent states;
- After the process is killed and restarted, a third result appears.
The most annoying thing about this kind of problem is that it does not reproduce reliably, so the team easily attributes it to “Flow timing issues”, “Compose recomposition issues”, or “sporadic network fluctuations”.
In fact, the root cause is simpler: **multiple layers are all allowed to write the final result at the same time.**
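One possible way to keep the instant feedback without giving the UI a right to write the final result is to hold the optimistic value in a discardable overlay on top of the confirmed value. This is only a sketch of that idea, not the only fix; `FavoriteState` and its methods are hypothetical names, and it ignores threading and persistence.

```kotlin
// Hypothetical sketch: optimistic values live in an overlay that is never the truth.
class FavoriteState {
    private val confirmed = mutableMapOf<String, Boolean>() // written only from server-confirmed results
    private val pending = mutableMapOf<String, Boolean>()   // optimistic intent, safe to discard

    // Every page reads through this one function, so all pages get the same answer.
    fun shown(id: String): Boolean = pending[id] ?: confirmed[id] ?: false

    fun requestToggle(id: String) {
        pending[id] = !shown(id)
    }

    // Success promotes the pending value to truth.
    fun onServerConfirmed(id: String) {
        pending.remove(id)?.let { confirmed[id] = it }
    }

    // Failure just drops the overlay; there is no old value to push back.
    fun onServerFailed(id: String) {
        pending.remove(id)
    }
}
```

Rollback here is a deletion rather than a second write, so the list page and the details page cannot roll back in different orders.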
Layering should serve accountability, not formal neatness
I’m not against Repository layering. Without a Repository, many Android projects would be even messier.
But what a Repository should really provide is answers to these questions:
- Where does the read path get its data, and can that be explained?
- Which decisions does the write path pass through, and can they be traced?
- Who wins when an error occurs, and can the state be recovered?
- Is what pages share business truth or temporary display state, and can the two be separated?
If these questions cannot be answered, no matter how beautiful the layering is, it will only be a visual order.
It makes the code look more like an architecture diagram, but it doesn’t necessarily make the state look more like a system.
Applicable boundaries
This article mainly applies to projects that:
- Have a local cache or Room;
- Share state across multiple pages;
- Pursue first-screen speed, offline recovery, and instant interactive feedback at the same time;
- Use Repository + Flow/StateFlow to organize data reads and writes.
If the application is very light, data is almost always fetched by one-off requests, pages are used once and discarded, and no cross-page synchronization is needed, then even a crude Repository will not make state consistency problems particularly prominent.
The real trouble is with projects that are “medium to large but not yet big enough to be fully platformized”: features keep growing and data sources keep getting more complex, but the team still holds things together with the early approach of “wrap it in a Repository first and figure it out later”. This is the stage where systems that are tidy in structure and chaotic in behavior most often appear.
Summary
The most common misunderstanding about Repository layering in Android is to mistake a “unified access point” for “naturally unified state”.
A unified entry point only reduces confusion at the call site; only by settling the source of truth, funneling the write boundary, and defining failure semantics in advance can state conflicts really be reduced.
Otherwise, what you get is a neatly layered system that is merely harder to hold accountable.