Swift Concurrency Series 06｜Common problems in Swift concurrency: race conditions, repeated requests and state confusion

Swift Concurrency · July 26, 2024 at 04:30 AM · 5 min read

The real trouble is that these problems usually show up as sporadic business glitches rather than explicit failures.


The most frustrating thing about concurrency bugs is that they often don’t feel like bugs.

In production it more often surfaces as vague reports like these:

Users say “sometimes it flickers”
QA says “old data occasionally appears”
Product says “I just switched the filter, why did it jump back?”
There is no clear crash in the log, but the page state is simply wrong.

In other words, many concurrency problems look more like “occasional business exceptions” than “obviously technically broken”.

So in this article I don’t want to just define terms. Instead, I will take a realistic list page scenario and break down the three most common types of problems:

Race conditions
Repeated requests
State confusion

And how they grow in real code.

1. First, look at a page that is about as realistic as it gets

Suppose there is an article list page that supports these operations:

Automatic loading when the page first appears
Pull down to refresh
Switch categories
Enter keyword search
Click “Retry”

Many projects are written like this at the beginning:

@MainActor
final class ArticlesViewModel: ObservableObject {
  @Published var items: [Article] = []
  @Published var isLoading = false
  @Published var errorMessage: String?
  @Published var selectedCategory: String = "all"
  @Published var keyword: String = ""

  let repository: ArticlesRepository

  init(repository: ArticlesRepository) {
    self.repository = repository
  }

  func onAppear() {
    Task {
      await load()
    }
  }

  func refresh() {
    Task {
      await load()
    }
  }

  func retry() {
    Task {
      await load()
    }
  }

  func categoryChanged(to value: String) {
    selectedCategory = value
    Task {
      await load()
    }
  }

  func keywordChanged(to value: String) {
    keyword = value
    Task {
      await load()
    }
  }

  func load() async {
    isLoading = true
    errorMessage = nil

    do {
      items = try await repository.fetchArticles(
        category: selectedCategory,
        keyword: keyword
      )
    } catch {
      errorMessage = error.localizedDescription
    }

    isLoading = false
  }
}

When this code is first written, everyone usually thinks it reads “quite smoothly”:

It uses async/await
The code is straightforward
Every entry point works

But once the page is actually used, concurrency problems soon appear.

2. The first type of problem: a race condition is an assumed order that does not exist

Take the same code. Its core problem is not that it creates many Task instances, but that it assumes these things happen in the order you want:

The request sent first returns first
When an old request returns, the current filter conditions have not changed
Each start of loading is paired one-to-one with its end

But asynchronous systems guarantee none of these orderings.

For example, the user operates as follows:

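The user enters the page, and request A is sent
The user immediately switches to the “iOS” category, and request B is sent
The user then types the keyword swift, and request C is sent

Suppose the responses then arrive in this order:

C returns first
A returns next
B returns last

With the current code, all three results are written to items in turn. In other words, what the page finally shows depends on which response arrives last, not on which one matches the user’s current intent. Here the page ends up showing B (the “iOS” category with no keyword) even though the user is looking at C. The loading indicator is wrong too: isLoading is set to false as soon as C returns, while A and B are still in flight.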

This is the most typical race condition:

The code secretly relies on order, but the order is not constrained at all.
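
To make the A/B/C scenario concrete, here is a minimal deterministic sketch. It does not use real tasks; it only replays responses in a chosen arrival order, and the `Response` type, `finalItems` function, and item names are hypothetical and not part of the ViewModel above.

```swift
// Hypothetical model: each response remembers which request produced it.
struct Response {
    let request: String   // "A", "B", or "C"
    let items: [String]
}

// Mirrors the unguarded `items = try await ...` assignment:
// every arrival overwrites whatever was there before.
func finalItems(arrivalOrder: [Response]) -> [String] {
    var items: [String] = []
    for response in arrivalOrder {
        items = response.items
    }
    return items
}

let a = Response(request: "A", items: ["all-1", "all-2"])
let b = Response(request: "B", items: ["ios-1"])
let c = Response(request: "C", items: ["ios-swift-1"])

// Sent in the order A, B, C; arrived in the order C, A, B.
let shown = finalItems(arrivalOrder: [c, a, b])
print(shown) // prints ["ios-1"]
```

The page ends up with B’s items only because B happened to arrive last. Nothing in the code ties the displayed list to the latest request.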

3. The second type of problem: the root cause of repeated requests is usually that the entry points are not consolidated

Looking at the ViewModel above, at least five entry points trigger load():

onAppear
refresh
retry
categoryChanged
keywordChanged

Each entry point creates its own Task. This is perfectly legal syntax, but from an engineering perspective it means:

Similar tasks have no unified scheduling point
Nobody knows whether a similar task is already running
When a new task appears, the fate of the old one is undefined

Repeated requests are then no longer accidental, but a natural product of the structure.

So when dealing with concurrency, I rarely ask:

“Why is there an extra request here?”

I more often ask:

“How many entry points does this type of task have? Does a new task replace an old one?”

If you cannot answer these two questions, repeated requests are almost inevitable.
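
One way to force answers to those two questions is to write them down as data. The sketch below is a synchronous bookkeeping model, not a real scheduler. `LoadTrigger`, `Replacement`, and `LoadGate` are hypothetical names, and the policy itself (filter changes replace a running load, everything else lets it finish) is an assumption you would adjust per product.

```swift
// Hypothetical names throughout; the article only lists the five entry points.
enum LoadTrigger {
    case onAppear, refresh, retry, categoryChanged, keywordChanged
}

enum Replacement {
    case replaceInFlight  // parameters changed: cancel the running load, start a new one
    case keepInFlight     // same parameters: let the running load finish
}

// Assumed policy: only filter changes invalidate a load that is already running.
func replacement(for trigger: LoadTrigger) -> Replacement {
    switch trigger {
    case .categoryChanged, .keywordChanged:
        return .replaceInFlight
    case .onAppear, .refresh, .retry:
        return .keepInFlight
    }
}

// Synchronous bookkeeping model of a single scheduling point (no real Task involved).
struct LoadGate {
    private(set) var isInFlight = false
    private(set) var started = 0
    private(set) var cancelled = 0

    mutating func handle(_ trigger: LoadTrigger) {
        if isInFlight {
            switch replacement(for: trigger) {
            case .keepInFlight:
                return              // duplicate request avoided
            case .replaceInFlight:
                cancelled += 1      // the old load is superseded
            }
        }
        isInFlight = true
        started += 1
    }

    mutating func finish() {
        isInFlight = false
    }
}

var gate = LoadGate()
gate.handle(.onAppear)        // starts load 1
gate.handle(.refresh)         // ignored: load 1 is still running
gate.handle(.categoryChanged) // cancels load 1, starts load 2
print(gate.started, gate.cancelled) // prints "2 1"
```

With five independent Task blocks the same three events would start three fetches. Routing them through one gate makes both the count and the fate of the old load explicit.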

4. The third type of problem: state confusion, often because expired results are still allowed to be written

A common assumption is that as long as a request returns successfully, its result should be accepted.

This is usually fine in synchronous systems, but often wrong in concurrent systems.

Because the most critical question in a concurrent scenario is:

**Is this result still valid for the current page?**

For example:

The current page has already switched to keyword = "swift"
The result comes from the old request with keyword = ""

The result is real, successful, and correctly formatted, but it has expired. If it is still allowed to write to the UI, the state will be wrong.

Therefore, in a concurrent system, “the result is correct” and “the result is valid” are two different things. On the surface, many page problems look like wrong results, but the actual problem is that the code cannot judge whether a result is still eligible to be applied.
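
The distinction can be sketched in a few lines. This assumes each result carries the keyword it was requested with; `KeywordResult`, `isStillValid`, and the sample titles are hypothetical and only illustrate the check.

```swift
// Hypothetical wrapper: a result that remembers the keyword it was requested with.
struct KeywordResult {
    let keyword: String
    let titles: [String]
}

// "Correct" is about the payload; "valid" is about whether it still matches the page.
func isStillValid(_ result: KeywordResult, currentKeyword: String) -> Bool {
    result.keyword == currentKeyword
}

let currentKeyword = "swift"
let stale = KeywordResult(keyword: "", titles: ["Kotlin basics", "Swift actors"])
let fresh = KeywordResult(keyword: "swift", titles: ["Swift actors"])

// The stale result arrives last, but it is filtered out before it can write.
var shownTitles: [String] = []
for result in [fresh, stale] where isStillValid(result, currentKeyword: currentKeyword) {
    shownTitles = result.titles
}
print(shownTitles) // prints ["Swift actors"]
```

Both payloads are well formed and came from successful requests. Only one of them is still entitled to reach the UI.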

5. Don’t reach for complex tools first. The first step is to consolidate similar tasks

What the code above needs most is one very simple thing first:

**Give similar tasks a single entry point.**

For example, start by loading the list like this:

@MainActor
final class ArticlesViewModel: ObservableObject {
  @Published private(set) var state: ViewState = .idle
  @Published private(set) var items: [Article] = []
  @Published var selectedCategory: String = "all"
  @Published var keyword: String = ""

  private let repository: ArticlesRepository
  private var loadTask: Task<Void, Never>?

  init(repository: ArticlesRepository) {
    self.repository = repository
  }

  func reload() {
    let request = RequestContext(
      category: selectedCategory,
      keyword: keyword
    )

    loadTask?.cancel()
    loadTask = Task {
      await performLoad(request: request)
    }
  }

  private func performLoad(request: RequestContext) async {
    state = .loading

    do {
      let result = try await repository.fetchArticles(
        category: request.category,
        keyword: request.keyword
      )

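      // Drop the result if this task was cancelled while the fetch was in flight.
      guard !Task.isCancelled else { return }
      // Drop the result if the filters changed after the request was sent.
      guard request.category == selectedCategory,
            request.keyword == keyword else { return }

      items = result
      state = .loaded
    } catch is CancellationError {
      // Cancelled: do not update the page.
    } catch {
      // Some APIs report cancellation with their own error type, so check again.
      guard !Task.isCancelled else { return }
      state = .failed(error.localizedDescription)
    }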
  }
}

This code does several key things:

Similar loading tasks have a single holder, loadTask
When a new task arrives, the old task is cancelled first
The “current context” is frozen into a RequestContext when the request is sent
After the result returns, it is checked against the current page state before being applied

Note that what’s really important here is that the task relationships start to become clear.
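
The listing above uses ViewState, RequestContext, Article, and ArticlesRepository without showing their definitions. The following is one plausible shape for them, inferred only from how they are used. The fields of `Article` and the choice of a protocol for the repository are assumptions.

```swift
// All definitions below are assumptions; the article uses these names without showing them.

struct Article: Equatable, Sendable {
    let id: Int          // hypothetical field
    let title: String    // hypothetical field
}

// Matches the call `repository.fetchArticles(category:keyword:)` in the listing.
protocol ArticlesRepository {
    func fetchArticles(category: String, keyword: String) async throws -> [Article]
}

// Immutable snapshot of the parameters a request was sent with.
struct RequestContext: Equatable, Sendable {
    let category: String
    let keyword: String
}

// The four cases are the ones the listing assigns: idle, loading, loaded, failed.
enum ViewState: Equatable {
    case idle
    case loading
    case loaded
    case failed(String)
}

let sent = RequestContext(category: "ios", keyword: "swift")
let current = RequestContext(category: "ios", keyword: "swiftui")
print(sent == current) // prints "false"
```

Because RequestContext is Equatable in this sketch, the two-field guard in performLoad could also be written as a single comparison against a context built from the current selectedCategory and keyword. The five entry points from the first version then shrink to thin wrappers that update their parameter, if any, and call reload().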

6. Why “freezing the request context” is so critical

Many concurrency articles talk about task cancellation but put too little emphasis on the “context snapshot”. In page-level business logic, it matters a great deal.

For example, suppose the request is sent with:

selectedCategory = "ios"
keyword = "swift"

Once the request is in flight, the task should keep using these two values rather than re-reading the latest values from the ViewModel. Otherwise you will often end up in a very strange state:

One set of parameters is used when sending the request
A different set of parameters is used when validating the result

So a very practical principle is:

When initiating an asynchronous task, freeze the business context that the task really depends on.

In this way, there will be a clear basis for judging “whether this result is still the current result” later.
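
A small synchronous sketch of why the snapshot matters. `Filters` is a hypothetical stand-in for the mutable properties on the ViewModel, and RequestContext is redefined here only so the block is self-contained. The point is that a check which reads live values on both sides compares the page with itself and can never fail.

```swift
// Hypothetical stand-in for the mutable filter properties on the ViewModel.
final class Filters {
    var category = "ios"
    var keyword = "swift"
}

// Same shape as the RequestContext used in the article; the definition is assumed.
struct RequestContext: Equatable {
    let category: String
    let keyword: String
}

func snapshot(of filters: Filters) -> RequestContext {
    RequestContext(category: filters.category, keyword: filters.keyword)
}

let filters = Filters()
let frozen = snapshot(of: filters)   // taken when the request is sent

filters.keyword = "swiftui"          // the user keeps typing while the request is in flight

// When the response arrives:
let liveNow = snapshot(of: filters)
let liveVersusLive = (snapshot(of: filters) == liveNow)  // compares the page with itself
let frozenVersusLive = (frozen == liveNow)               // compares the request with the page

print(liveVersusLive, frozenVersusLive) // prints "true false"
```

Only the frozen value can reveal that the response belongs to an older keyword. A struct works well here because it is copied at capture time and later mutations of the ViewModel cannot reach it.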

7. Many concurrency bugs come down to “too many places that write state”

A common reaction to a concurrency problem is to think immediately of:

Should I add a lock?
Should I make it an actor?
Should I switch threads?

These are sometimes important, but in page-level scenarios the more common problems are actually:

Too many places write items
Too many places change isLoading
Too many entry points send requests directly

Once state writes are scattered, the page can end up in a wrong combination of states even when there is no actual data race.

So when I troubleshoot this kind of problem, I usually ask these questions first:

Which code is allowed to change this state
Which tasks are allowed to end the current loading
Which results are allowed to overwrite the current list

If these questions have no clear answer, it is usually only a matter of time before bugs appear.
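
One possible way to encode answers to those three questions is to funnel every write through a pair of methods and hand each load a ticket. This is a sketch using a monotonically increasing ticket number, which is a different technique from the context comparison shown earlier in the article. `ListStore`, `LoadError`, and the method names are hypothetical.

```swift
struct LoadError: Error {
    let message: String
}

// Hypothetical single owner of list state: nothing outside can assign phase or items.
struct ListStore {
    enum Phase: Equatable {
        case idle, loading, loaded, failed(String)
    }

    private(set) var phase: Phase = .idle
    private(set) var items: [String] = []
    private var activeTicket = 0

    /// Starts a load and returns the ticket the eventual result must present.
    mutating func beginLoad() -> Int {
        activeTicket += 1
        phase = .loading
        return activeTicket
    }

    /// Only the ticket from the most recent beginLoad() may end loading or replace items.
    @discardableResult
    mutating func finishLoad(ticket: Int, result: Result<[String], LoadError>) -> Bool {
        guard ticket == activeTicket, phase == .loading else { return false }
        switch result {
        case .success(let newItems):
            items = newItems
            phase = .loaded
        case .failure(let error):
            phase = .failed(error.message)
        }
        return true
    }
}

var store = ListStore()
let first = store.beginLoad()
let second = store.beginLoad()   // supersedes the first load

// The older load finishes late and is rejected; the newer one is applied.
let staleApplied = store.finishLoad(ticket: first, result: .success(["old"]))
let freshApplied = store.finishLoad(ticket: second, result: .success(["new"]))
print(staleApplied, freshApplied, store.items) // prints false true ["new"]
```

Here the compiler enforces the first question (only ListStore can write), and the ticket check answers the other two: only the latest load may end the loading phase or overwrite the list.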

8. An evolution order closer to real projects

If you really want to solve this kind of problem, I suggest evolving in this order instead of introducing too many mechanisms at the start:

1. Consolidate the entry points of similar tasks

First, give “list loading” a single unified entry point, instead of letting each UI event send its own request.

2. Clarify the task replacement relationship

Decide which tasks may run concurrently and which should cancel the old task and keep only the latest one.

3. Freeze the request context

Collect the key business parameters the request depends on into one explicit object.

4. Add a validity check to the result

Not every successfully returned result is eligible to change the current page.

5. Finally, consider more complex shared state isolation

For needs such as a cache shared across pages or resource coordination across modules, then look at actors, a unified coordinator, and similar solutions.

This order is more stable because it sorts out the business-level concurrency relationships first, rather than introducing more complex technical machinery first.

9. Conclusion: The essence of most business concurrency problems is “task relationships were never modeled”

Race conditions, repeated requests, and state confusion look like three separate problems, but their root causes are often very close:

Which tasks count as the same task is not modeled
What happens to old tasks when a new one arrives is not modeled
Whether a result is still valid is not modeled
Where state may be written is not restricted

So to put this article more briefly, I would say:

Most concurrency issues in business code look like a poor grasp of concurrency syntax, but they are really closer to a failure to clearly model task relationships, result validity, and state write permissions.

Once these three things start to become clear, a lot of “accidental confusion” goes away more easily than you might think.
