Avoiding race conditions in SharedArrayBuffers with Atomics

Original: Avoiding race conditions in SharedArrayBuffers with Atomics

This article is translated from Lin Clark’s cartoon intro series on memory management:

  1. A crash course in memory management
  2. A cartoon intro to ArrayBuffers and SharedArrayBuffers
  3. Avoiding race conditions in SharedArrayBuffers with Atomics

In the last article, I talked about how using SharedArrayBuffers could result in race conditions. This makes working with SharedArrayBuffers hard. We don’t expect application developers to use SharedArrayBuffers directly.

But library developers who have experience with multithreaded programming in other languages can use these new low-level APIs to create higher-level tools. Then application developers can use these tools without touching SharedArrayBuffers or Atomics directly.

SharedArrayBuffer and Atomics as the foundation, with JS libraries and WebAssembly threading built on top

Even though you probably shouldn’t work with SharedArrayBuffers and Atomics directly, I think it’s still interesting to understand how they work. So in this article, I’ll explain what kinds of race conditions concurrency can bring, and how Atomics help libraries avoid them.

But first, what is a race condition?

Drawing of two threads racing for the same memory

Race conditions: an example you’ve seen before

A pretty straightforward example of a race condition can happen when you have a variable that is shared between two threads. Let’s say one thread wants to load a file and the other thread checks whether it exists. They share a variable, fileExists, to communicate.

Initially, fileExists is set to false.

Code for the two threads: thread 1 loads the file if fileExists is true, and thread 2 sets fileExists

As long as the code in thread 2 runs first, the file will be loaded.

Diagram showing thread 2 running first, so the file loads successfully

But if the code in thread 1 runs first, then it will log an error to the user, saying that the file does not exist.

Diagram showing thread 1 running first, so the load fails

But that’s not the problem. It’s not that the file doesn’t exist. The real problem is the race condition.

Many JavaScript developers have run into this kind of race condition, even in single-threaded code. You don’t have to understand anything about multithreading to see why this is a race.

However, there are some kinds of race conditions which aren’t possible in single-threaded code, but that can happen when you’re programming with multiple threads and those threads share memory.

Different kinds of race conditions, and how Atomics help

Let’s explore some of the different kinds of race conditions you can have in multithreaded code and how Atomics help prevent them. This doesn’t cover all possible race conditions, but should give you some idea why the API provides the methods that it does.

Before we start, I want to say again: you shouldn’t use Atomics directly. Writing multithreaded code is a known hard problem. Instead, you should use reliable libraries to work with shared memory in your multithreaded code.

Warning sign

With that out of the way…

Race conditions in a single operation

Let’s say you had two threads that were incrementing the same variable. You might think that the end result would be the same regardless of which thread goes first.

Diagram showing two threads taking turns to increment a variable

But even though, in the source code, incrementing a variable looks like a single operation, when you look at the compiled code, it is not a single operation.

At the CPU level, incrementing a value takes three instructions. That’s because the computer has both long-term memory and short-term memory. (I talk more about how this all works in another article.)

Drawing of the CPU and RAM

All of the threads share the long-term memory. But the short-term memory—the registers—are not shared between threads.

Each thread needs to pull the value from memory into its short-term memory. After that, it can run the calculation on that value in short-term memory. Then it writes that value back from its short-term memory to the long-term memory.

Diagram showing a variable being loaded from memory into a register, operated on, and then stored back to memory
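
In JavaScript terms, you can picture that three-step dance with a sketch like this (the temp variable stands in for the register; sabView and index are the same assumed names used in the Atomics example later on):

```js
const sabView = new Int32Array(new SharedArrayBuffer(4)); // view shared between threads
const index = 0;

// What a plain `sabView[index]++` boils down to at the CPU level:
let temp = sabView[index]; // 1. pull the value from memory into a register
temp = temp + 1;           // 2. run the calculation on that value
sabView[index] = temp;     // 3. write the value back to memory
```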

If all of the operations in thread 1 happen first, and then all the operations in thread 2 happen, we will end up with the result that we want.

Flow chart showing the instructions running sequentially on one thread, then on the other

But if they are interleaved in time, the value that thread 2 has pulled into its register gets out of sync with the value in memory. This means that thread 2 doesn’t take thread 1’s calculation into consideration. Instead, it just clobbers the value that thread 1 wrote to memory with its own value.

Flow chart showing the instructions interleaved between the threads
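
Here is a sketch of that unlucky interleaving. Thread 1’s and thread 2’s steps are shown in one scope purely to make the ordering visible; in real code each would run in its own worker:

```js
const sabView = new Int32Array(new SharedArrayBuffer(4)); // sabView[0] starts at 0

let t1 = sabView[0]; // thread 1 pulls 0 into its register
let t2 = sabView[0]; // thread 2 pulls 0 into its register
t1 = t1 + 1;         // thread 1 calculates 1
t2 = t2 + 1;         // thread 2 calculates 1
sabView[0] = t1;     // thread 1 writes 1 back to memory
sabView[0] = t2;     // thread 2 also writes 1, clobbering thread 1's increment
console.log(sabView[0]); // 1, even though the value was incremented twice
```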

One thing atomic operations do is take these operations that humans think of as being single operations, but which the computer sees as multiple operations, and makes the computer see them as single operations, too.

This is why they’re called atomic operations. It’s because they take an operation that would normally have multiple instructions—where the instructions could be paused and resumed—and it makes it so that they all happen seemingly instantaneously, as if it were one instruction. It’s like an indivisible atom.

Instructions wrapped up into an atom

Using atomic operations, the code for incrementing would look a little different.

Atomics.add(sabView, index, 1)

Now that we’re using Atomics.add, the different steps involved in incrementing the variable won’t be mixed up between threads. Instead, one thread will finish its atomic operation and prevent the other one from starting. Then the other will start its own atomic operation.

Flow chart showing atomic execution of the instructions
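
Here is a rough sketch of what that might look like in practice. The worker.js file name, the message-passing setup, and the loop count are assumptions made for illustration, not code from the article:

```js
// --- main.js (sketch) ---
// Create one shared 32-bit counter and hand it to two workers.
const sab = new SharedArrayBuffer(Int32Array.BYTES_PER_ELEMENT);
const worker1 = new Worker("worker.js");
const worker2 = new Worker("worker.js");
worker1.postMessage(sab);
worker2.postMessage(sab);

// --- worker.js (sketch) ---
// Each worker bumps the shared counter. Atomics.add makes every
// read-modify-write a single indivisible step, so no increments are lost.
self.onmessage = (event) => {
  const sabView = new Int32Array(event.data);
  for (let i = 0; i < 1000; i++) {
    Atomics.add(sabView, 0, 1);
  }
};
```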

The Atomics methods that help avoid this kind of race are:

  * Atomics.add
  * Atomics.sub
  * Atomics.and
  * Atomics.or
  * Atomics.xor
  * Atomics.exchange

You’ll notice that this list is fairly limited. It doesn’t even include things like division and multiplication. A library developer could create atomic-like operations for other things, though.

To do that, the developer would use [Atomics.compareExchange](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Atomics/compareExchange). With this, you get a value from the SharedArrayBuffer, perform an operation on it, and only write it back to the SharedArrayBuffer if no other thread has updated it since you first checked. If another thread has updated it, then you can get that new value and try again.
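
For instance, a library might build an “atomic multiply” on top of it with a retry loop like the sketch below (atomicMultiply is a made-up name, not part of the Atomics API):

```js
// Multiply sabView[index] by factor as if it were a single atomic operation.
function atomicMultiply(sabView, index, factor) {
  let oldValue, newValue;
  do {
    oldValue = Atomics.load(sabView, index); // read the current value
    newValue = oldValue * factor;            // compute the new value locally
    // compareExchange only writes newValue if the slot still holds oldValue.
    // It returns the value it found, so anything else means another thread
    // got there first and we need to retry with the fresh value.
  } while (Atomics.compareExchange(sabView, index, oldValue, newValue) !== oldValue);
  return newValue;
}
```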

Race conditions across multiple operations

So those Atomic operations help avoid race conditions during “single operations”. But sometimes you want to change multiple values on an object (using multiple operations) and make sure no one else is making changes to that object at the same time. Basically, this means that during every pass of changes to an object, that object is on lockdown and inaccessible to other threads.

The Atomics object doesn’t provide any tools to handle this directly. But it does provide tools that library authors can use to handle this. What library authors can create is a lock.

Diagram showing two threads and a lock

If code wants to use locked data, it has to acquire the lock for the data. Then it can use the lock to lock out the other threads. Only it will be able to access or update the data while the lock is active.

To build a lock, library authors would use Atomics.wait and Atomics.wake, plus other ones such as Atomics.compareExchange and Atomics.store. If you want to see how these would work, take a look at this basic lock implementation.
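
As a very rough sketch of the idea (not the linked implementation, which handles more details): one slot of an Int32Array over the SharedArrayBuffer holds the lock state, 0 for unlocked and 1 for locked. Atomics.notify is the newer name for Atomics.wake, and in browsers Atomics.wait can only block inside a worker, not on the main thread.

```js
const UNLOCKED = 0;
const LOCKED = 1;

function lock(sabView, index) {
  // Try to flip the slot from UNLOCKED to LOCKED. If another thread already
  // holds the lock, sleep until an unlock notifies us, then try again.
  while (Atomics.compareExchange(sabView, index, UNLOCKED, LOCKED) !== UNLOCKED) {
    Atomics.wait(sabView, index, LOCKED); // returns right away if the value has already changed
  }
}

function unlock(sabView, index) {
  Atomics.store(sabView, index, UNLOCKED); // release the lock
  Atomics.notify(sabView, index, 1);       // wake one waiting thread
}
```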

In this case, thread 2 would acquire the lock for the data and set the value of locked to true. This means thread 1 can’t access the data until thread 2 unlocks.

Thread 2 acquires the lock and locks up the shared data

If thread 1 needs to access the data, it will try to acquire the lock. But since the lock is already in use, it can’t. The thread would then wait—so it would be blocked—until the lock is available.

Thread 1 waits until the lock is unlocked

Once thread 2 is done, it would call unlock. The lock would notify one or more of the waiting threads that it’s now available.

Thread 1 is notified when the lock becomes available

That thread could then scoop up the lock and lock up the data for its own use.

Thread 1 uses the lock

A lock library would use many of the different methods on the Atomics object, but the methods that are most important for this use case are:

  * Atomics.wait
  * Atomics.wake (later renamed to Atomics.notify)

Race conditions caused by instruction reordering

There’s a third synchronization problem that Atomics take care of. This one can be surprising.

You probably don’t realize it, but there’s a very good chance that the code you’re writing isn’t running in the order you expect it to. Both compilers and CPUs reorder code to make it run faster.

For example, let’s say you’ve written some code to calculate a total. You want to set a flag when the calculation is finished.

subTotal = price + fee;
total += subTotal;
isDone = true

To compile this, we need to decide which register to use for each variable. Then we can translate the source code into instructions for the machine.

Diagram showing the source code translated into machine instructions

So far, everything is as expected.

What’s not obvious if you don’t understand how computers work at the chip level (and how the pipelines that they use for executing code work) is that line 2 in our code needs to wait a little bit before it can execute.

Most computers break down the process of running an instruction into multiple steps. This makes sure all of the different parts of the CPU are busy at all times, so it makes the best use of the CPU.

Here’s one example of the steps an instruction goes through:

  1. Fetch the next instruction from memory
  2. Figure out what the instruction is telling us to do (aka decode the instruction), and get the values from the registers
  3. Execute the instruction
  4. Write the result back to the register

Pipeline stage 1: fetch the instruction
Pipeline stage 2: decode the instruction and fetch values from the registers
Pipeline stage 3: execute the instruction
Pipeline stage 4: write the result back to the register

So that’s how one instruction goes through the pipeline. Ideally, we want to have the second instruction following directly after it. As soon as it has moved into stage 2, we want to fetch the next instruction.

The problem is that there is a dependency between instruction #1 and instruction #2.

Diagram of a data hazard in the pipeline

We could just pause the CPU until instruction #1 has updated subTotal in the register. But that would slow things down.

To make things more efficient, what a lot of compilers and CPUs will do is reorder the code. They will look for other instructions which don’t use subTotal or total and move those in between those two lines.

Diagram showing the third line of assembly being moved between the first and second lines

This keeps a steady stream of instructions moving through the pipe.

Because line 3 didn’t depend on any values in line 1 or 2, the compiler or CPU figures it’s safe to reorder like this. When you’re running in a single thread, no other code will even see these values until the whole function is done, anyway.

But when you have another thread running at the same time on another processor, that’s not the case. The other thread doesn’t have to wait until the function is done to see these changes. It can see them almost as soon as they are written back to memory. So it can tell that isDone was set before total.

If you were using isDone as a flag that the total had been calculated and was ready to use in the other thread, then this kind of reordering would create race conditions.
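
In shared-memory terms, the buggy pattern looks something like the sketch below. The TOTAL and IS_DONE slot indices are made up for illustration, and the two threads’ code is shown in one file only to keep the example short:

```js
const sabView = new Int32Array(new SharedArrayBuffer(8));
const TOTAL = 0;   // made-up slot standing in for `total`
const IS_DONE = 1; // made-up slot standing in for `isDone`

// Thread 1: plain, non-atomic writes. The compiler or CPU may make the
// IS_DONE write visible to other threads before the TOTAL write.
const price = 10;
const fee = 2;
sabView[TOTAL] += price + fee;
sabView[IS_DONE] = 1;

// Thread 2: plain, non-atomic reads. It can observe IS_DONE === 1 while
// TOTAL still holds the old value.
if (sabView[IS_DONE] === 1) {
  console.log(sabView[TOTAL]); // may not be the finished total
}
```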

Atomics attempt to solve some of these bugs. When you use an Atomic write, it’s like putting a fence between two parts of your code.

Atomic operations aren’t reordered relative to each other, and other operations aren’t moved around them. In particular, two operations that are often used to enforce ordering are:

  * Atomics.load
  * Atomics.store

All variable updates above Atomics.store in the function’s source code are guaranteed to be done before Atomics.store is done writing its value back to memory. Even if the non-Atomic instructions are reordered relative to each other, none of them will be moved below a call to Atomics.store which comes below in the source code.

And all variable loads after Atomics.load in a function are guaranteed to be done after Atomics.load fetches its value. Again, even if the non-atomic instructions are reordered, none of them will be moved above an Atomics.load that comes above them in the source code.

Diagram showing how Atomics.store and Atomics.load maintain ordering
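
Reusing the sabView, TOTAL, and IS_DONE slots from the sketch above, the flag pattern would look something like this:

```js
// Thread 1: do the plain writes first, then set the flag with Atomics.store.
// Everything written above the Atomics.store is guaranteed to be visible
// before the flag is.
sabView[TOTAL] += price + fee;
Atomics.store(sabView, IS_DONE, 1);

// Thread 2: spin until the flag is set, then read the total.
while (Atomics.load(sabView, IS_DONE) === 0) {
  // busy-wait; see the note below about why a spinlock like this is a bad idea
}
console.log(sabView[TOTAL]); // guaranteed to see the finished total
```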

Note: The while loop I show here is called a spinlock and it’s very inefficient. And if it’s on the main thread, it can bring your application to a halt. You almost certainly don’t want to use that in real code.

Once again, these methods aren’t really meant for direct use in application code. Instead, libraries would use them to create locks.

Conclusion

Programming multiple threads that share memory is hard. There are many different kinds of race conditions just waiting to trip you up.

Drawing of shared memory with a dragon and "Here be dragons" above

This is why you don’t want to use SharedArrayBuffers and Atomics in your application code directly. Instead, you should depend on proven libraries by developers who are experienced with multithreading, and who have spent time studying the memory model.

It is still early days for SharedArrayBuffer and Atomics. Those libraries haven’t been created yet. But these new APIs provide the basic foundation to build on top of.