
How JavaScript works: inside the V8 engine and 5 tips for writing optimized code

Original article: How JavaScript works: inside the V8 engine + 5 tips on how to write optimized code

A couple of weeks ago we started a series aimed at digging deeper into JavaScript and how it actually works: we believe that by knowing the building blocks of JavaScript and how they work together, you’ll be able to write better code and apps.

The first post of the series focused on providing an overview of the engine, the runtime, and the call stack. This second post dives into the internals of Google’s V8 JavaScript engine. We’ll also provide a few quick tips on how to write better JavaScript code: best practices our development team at SessionStack follows when building the product.

Overview

A JavaScript engine is a program or an interpreter that executes JavaScript code. A JavaScript engine can be implemented as a standard interpreter, or as a just-in-time compiler that compiles JavaScript to bytecode in some form.

Popular projects that implement a JavaScript engine include:

  • V8: open source, developed by Google, written in C++
  • Rhino: managed by the Mozilla Foundation, open source, developed entirely in Java
  • SpiderMonkey: the first JavaScript engine, which originally powered Netscape Navigator and today powers Firefox
  • JavaScriptCore: open source, marketed as Nitro and developed by Apple for Safari
  • KJS: KDE’s engine, originally developed by Harri Porten for the KDE project’s Konqueror web browser
  • Chakra (JScript9): Internet Explorer
  • Chakra (JavaScript): Microsoft Edge
  • Nashorn: open source as part of OpenJDK, written by the Oracle Java Languages and Tool Group
  • JerryScript: a lightweight engine for the Internet of Things

Why was the V8 Engine created?

The V8 engine, built by Google, is open source and written in C++. It is used inside Google Chrome. Unlike the rest of the engines, however, V8 is also used for the popular Node.js runtime.

V8 was first designed to increase the performance of JavaScript execution inside web browsers. To obtain speed, V8 translates JavaScript code into more efficient machine code instead of using an interpreter. It compiles JavaScript code into machine code at execution time by implementing a JIT (Just-In-Time) compiler, as many modern JavaScript engines do, such as SpiderMonkey or Rhino (Mozilla). The main difference is that V8, as originally designed (before version 5.9), doesn’t produce bytecode or any intermediate code.

V8 used to have two compilers

Before version 5.9 of V8 came out (in early 2017), the engine used two compilers:

  • full-codegen: a simple and very fast compiler that produced simple and relatively slow machine code.
  • Crankshaft: a more complex (Just-In-Time) optimizing compiler that produced highly optimized code.

The V8 engine also uses several threads internally:

  • The main thread does what you would expect: it fetches your code, compiles it, and then executes it
  • A separate thread handles compilation, so that the main thread can keep executing while this thread optimizes the code
  • A profiler thread tells the runtime which methods consume a lot of time so that Crankshaft can optimize them
  • A few threads handle garbage collector sweeps

When first executing the JavaScript code, V8 leverages full-codegen, which directly translates the parsed JavaScript into machine code without any transformation. This allows it to start executing machine code very quickly. Note that in this pipeline V8 does not use an intermediate bytecode representation, which removes the need for an interpreter.

When your code has run for some time, the profiler thread has gathered enough data to tell which method should be optimized.

Next, Crankshaft optimizations begin in another thread. It translates the JavaScript abstract syntax tree to a high-level static single-assignment (SSA) representation called Hydrogen and tries to optimize that Hydrogen graph. Most optimizations are done at this level.

Inlining

The first optimization is inlining as much code as possible in advance. Inlining is the process of replacing a call site (the line of code where the function is called) with the body of the called function. This simple step allows the subsequent optimizations to be more meaningful.

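To make the idea concrete, here is a hand-written illustration of what inlining does conceptually. The engine performs this on its internal representation, not on your source code, and the function names below are hypothetical, chosen only for this sketch.

```javascript
// Hypothetical helper used only for illustration.
function square(n) {
  return n * n;
}

// Before inlining: every loop iteration performs a call to square().
function sumOfSquares(values) {
  let total = 0;
  for (const v of values) {
    total += square(v);
  }
  return total;
}

// After inlining (written by hand here): the call site is replaced by
// the body of square(), so the optimizer sees the multiplication directly.
function sumOfSquaresInlined(values) {
  let total = 0;
  for (const v of values) {
    total += v * v;
  }
  return total;
}

console.log(sumOfSquares([1, 2, 3]));        // 14
console.log(sumOfSquaresInlined([1, 2, 3])); // 14
```

Both versions compute the same result; the inlined form simply exposes more of the computation to later optimization passes.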

Hidden class

JavaScript is a prototype-based language: there are no classes, and objects are created using a cloning process. JavaScript is also a dynamic programming language, which means that properties can easily be added to or removed from an object after its instantiation.

Most JavaScript interpreters use dictionary-like structures (hash function based) to store the location of object property values in memory. This structure makes retrieving the value of a property in JavaScript more computationally expensive than it would be in a non-dynamic programming language like Java or C#. In Java, all of the object properties are determined by a fixed object layout before compilation and cannot be dynamically added or removed at runtime (C# has the dynamic type, but that is another topic). As a result, the values of properties (or pointers to those properties) can be stored as a contiguous buffer in memory with a fixed offset between each. The length of an offset can easily be determined based on the property type, whereas this is not possible in JavaScript, where a property type can change at runtime.

Since using dictionaries to find the location of object properties in the memory is very inefficient, V8 uses a different method instead: hidden classes. Hidden classes work similarly to the fixed object layouts (classes) used in languages like Java, except they are created at runtime. Now, let’s see what they actually look like:

function Point(x, y) {
  this.x = x;
  this.y = y;
}
var p1 = new Point(1, 2);

Once the new Point(1,2) invocation happens, V8 will create a hidden class called C0.

No properties have been defined for Point yet, so C0 is empty.

Once the first statement this.x = x is executed (inside the Point function), V8 will create a second hidden class called C1 that is based on C0. C1 describes the location in memory (relative to the object pointer) where the property x can be found. In this case, x is stored at offset 0, which means that when viewing a point object in memory as a contiguous buffer, the first offset corresponds to property x. V8 will also update C0 with a class transition which states that if a property x is added to a point object, the hidden class should switch from C0 to C1. The hidden class for the point object is now C1.

Every time a new property is added to an object, the old hidden class is updated with a transition path to the new hidden class. Hidden class transitions are important because they allow hidden classes to be shared among objects that are created the same way. If two objects share a hidden class and the same property is added to both of them, transitions will ensure that both objects receive the same new hidden class and all the optimized code that comes with it.

This process is repeated when the statement this.y = y is executed (again, inside the Point function, after the this.x = x statement).

A new hidden class called C2 is created, a class transition is added to C1 stating that if a property y is added to a Point object (that already contains property x) then the hidden class should change to C2, and the point object’s hidden class is updated to C2.

Hidden class transitions are dependent on the order in which properties are added to an object. Take a look at the code snippet below:

function Point(x, y) {
  this.x = x;
  this.y = y;
}
var p1 = new Point(1, 2);
p1.a = 5;
p1.b = 6;
var p2 = new Point(3, 4);
p2.b = 7;
p2.a = 8;

Now, you would assume that for both p1 and p2 the same hidden classes and transitions would be used. Well, not really. For p1, first the property a will be added and then the property b. For p2, however, first b is being assigned, followed by a. Thus, p1 and p2 end up with different hidden classes as a result of the different transition paths. In such cases, it’s much better to initialize dynamic properties in the same order so that the hidden classes can be reused.

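Hidden classes cannot be observed directly from JavaScript, so the following is only a toy model of the transition mechanism described above, not V8’s real data structures. The HiddenClass class and the classFor helper are hypothetical names invented for this sketch.

```javascript
// Toy model of hidden classes; NOT V8's actual implementation.
class HiddenClass {
  constructor(offsets = new Map()) {
    this.offsets = offsets;        // property name -> offset in the object
    this.transitions = new Map();  // property name -> next HiddenClass
  }

  // Follow an existing transition, or create one on first use.
  addProperty(name) {
    if (!this.transitions.has(name)) {
      const offsets = new Map(this.offsets);
      offsets.set(name, offsets.size);
      this.transitions.set(name, new HiddenClass(offsets));
    }
    return this.transitions.get(name);
  }
}

const C0 = new HiddenClass(); // empty root class

// Returns the hidden class reached after adding properties in order.
function classFor(propertyNames) {
  return propertyNames.reduce((hc, name) => hc.addProperty(name), C0);
}

const p1Class = classFor(["x", "y", "a", "b"]);
const p2Class = classFor(["x", "y", "b", "a"]);
const p3Class = classFor(["x", "y", "a", "b"]);

console.log(p1Class === p3Class); // true: same order, shared class
console.log(p1Class === p2Class); // false: different transition paths
console.log(p1Class.offsets.get("a"), p2Class.offsets.get("a")); // 2 3
```

In this model, objects built in the same property order end up on the same class, while swapping the order of a and b produces a different class with different offsets.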

Inline caching

V8 takes advantage of another technique for optimizing dynamically typed languages called inline caching. Inline caching relies on the observation that repeated calls to the same method tend to occur on the same type of object. An in-depth explanation of inline caching can be found here.

We’re going to touch upon the general concept of inline caching (in case you don’t have the time to go through the in-depth explanation above).

So how does it work? V8 maintains a cache of the type of objects that were passed as a parameter in recent method calls and uses this information to make an assumption about the type of object that will be passed as a parameter in the future. If V8 is able to make a good assumption about the type of object that will be passed to a method, it can bypass the process of figuring out how to access the object’s properties, and instead, use the stored information from previous lookups to the object’s hidden class.

So how are the concepts of hidden classes and inline caching related? Whenever a method is called on a specific object, the V8 engine has to perform a lookup to the hidden class of that object in order to determine the offset for accessing a specific property. After two successful calls of the same method to the same hidden class, V8 omits the hidden class lookup and simply adds the offset of the property to the object pointer itself. For all future calls of that method, the V8 engine assumes that the hidden class hasn’t changed, and jumps directly into the memory address for a specific property using the offsets stored from previous lookups. This greatly increases execution speed.

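The relationship between hidden classes and inline caching can be sketched with a toy model. Everything here is an assumption made for illustration: objects are represented as { shape, slots }, where shape stands in for a hidden class, and makeAccessSite is a hypothetical helper. For simplicity the sketch caches after the first lookup rather than after two calls.

```javascript
// Toy model only: shape maps property names to slot offsets.
const pointShape = new Map([["x", 0], ["y", 1]]);
const makePoint = (x, y) => ({ shape: pointShape, slots: [x, y] });

// Creates one "access site" that reads a given property name.
function makeAccessSite(name) {
  let cachedShape = null;
  let cachedOffset = -1;
  const stats = { hits: 0, misses: 0 };

  function load(obj) {
    if (obj.shape === cachedShape) {
      stats.hits++;                     // fast path: reuse the stored offset
      return obj.slots[cachedOffset];
    }
    stats.misses++;                     // slow path: full lookup, then cache
    cachedShape = obj.shape;
    cachedOffset = obj.shape.get(name);
    return obj.slots[cachedOffset];
  }
  return { load, stats };
}

const site = makeAccessSite("x");
const points = [makePoint(1, 2), makePoint(3, 4), makePoint(5, 6)];
const xs = points.map((p) => site.load(p));
// xs is [1, 3, 5]; the site has recorded 1 miss and 2 hits.

// An object with a different shape misses the cache, even though it
// also has an x property.
const otherShape = new Map([["y", 0], ["x", 1]]);
const otherX = site.load({ shape: otherShape, slots: [20, 10] });
console.log(xs, otherX, site.stats);
```

The fast path skips the name lookup entirely, which is why objects that share a shape benefit and objects with a different shape fall back to the slow path.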

Inline caching is also the reason why it’s so important that objects of the same type share hidden classes. If you create two objects of the same type and with different hidden classes (as we did in the example earlier), V8 won’t be able to use inline caching because even though the two objects are of the same type, their corresponding hidden classes assign different offsets to their properties.

The two objects are basically the same, but the `a` and `b` properties were created in a different order.

Compilation to machine code

Once the Hydrogen graph is optimized, Crankshaft lowers it to a lower-level representation called Lithium. Most of the Lithium implementation is architecture-specific. Register allocation happens at this level.

In the end, Lithium is compiled into machine code. Then something else happens, called OSR: on-stack replacement. Before V8 started compiling and optimizing an obviously long-running method, it was likely already running it. V8 is not going to discard the work it has just slowly executed and start again with the optimized version. Instead, it transforms all the context it has (stack, registers) so that it can switch to the optimized version in the middle of execution. This is a very complex task, bearing in mind that, among other optimizations, V8 has already inlined the code. V8 is not the only engine capable of doing this.

There are safeguards, called deoptimization, that make the opposite transformation and revert to the non-optimized code in case an assumption the engine made no longer holds (for example, the assumption that a hidden class has not changed).

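As a rough illustration, the snippet below shows the kind of code where an engine’s assumption can be invalidated. Whether and when V8 actually optimizes or deoptimizes add() depends on the engine version and its heuristics, so treat this as a sketch of the situation rather than a guaranteed trigger.

```javascript
function add(a, b) {
  return a + b;
}

// Warm-up: the engine only ever sees small integers here, so an
// optimizing compiler may specialize add() for integer addition.
let total = 0;
for (let i = 0; i < 100000; i++) {
  total = add(total, 1);
}
console.log(total); // 100000

// This call breaks the "arguments are integers" assumption. The result
// is still correct; the engine may simply fall back to slower,
// non-optimized code for add().
console.log(add("foo", "bar")); // prints foobar
```

The program’s behavior is identical either way; deoptimization only affects how fast the code runs.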

Garbage collection

For garbage collection, V8 uses a traditional generational approach, with mark-and-sweep to clean the old generation. The marking phase is supposed to stop JavaScript execution. In order to control GC costs and make execution more stable, V8 uses incremental marking: instead of walking the whole heap and trying to mark every possible object, it walks only part of the heap, then resumes normal execution. The next GC stop continues from where the previous heap walk stopped. This allows for very short pauses during normal execution. As mentioned before, the sweep phase is handled by separate threads.

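The idea of incremental marking can be sketched with a toy heap. This is a simplified model under stated assumptions, not V8’s collector: the heap is a plain object mapping ids to the ids they reference, createIncrementalMarker is a hypothetical helper, and the sketch ignores the fact that a real program can modify the heap between marking steps.

```javascript
// Toy heap: each object lists the objects it references.
const heap = {
  root: ["a", "b"],
  a: ["c"],
  b: [],
  c: [],
  garbage: ["c"], // not reachable from root
};

function createIncrementalMarker(heap, rootId) {
  const marked = new Set([rootId]);
  const worklist = [rootId];

  // Processes at most `budget` objects, then returns so that normal
  // execution can resume. Returns true once marking is complete.
  function step(budget) {
    while (budget-- > 0 && worklist.length > 0) {
      const id = worklist.pop();
      for (const child of heap[id]) {
        if (!marked.has(child)) {
          marked.add(child);
          worklist.push(child);
        }
      }
    }
    return worklist.length === 0;
  }
  return { step, marked };
}

const marker = createIncrementalMarker(heap, "root");
let steps = 1;
while (!marker.step(2)) {
  steps++; // JavaScript execution would resume here between steps
}

const unreachable = Object.keys(heap).filter((id) => !marker.marked.has(id));
console.log(steps, unreachable); // 2 steps; only "garbage" is unreachable
```

Each call to step() resumes from the saved worklist, which mirrors the description of the next GC stop continuing from where the previous heap walk stopped.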

Ignition and TurboFan

With the release of V8 5.9 earlier in 2017, a new execution pipeline was introduced. This new pipeline achieves even bigger performance improvements and significant memory savings in real-world JavaScript applications.

The new execution pipeline is built on top of Ignition, V8’s interpreter, and TurboFan, V8’s newest optimizing compiler.

You can check out the blog post from the V8 team about the topic here.

Since version 5.9 of V8 came out, full-codegen and Crankshaft (the technologies that had served V8 since 2010) are no longer used by V8 for JavaScript execution, because the V8 team struggled to keep pace with new JavaScript language features and the optimizations needed for these features.

This means that overall V8 will have a much simpler and more maintainable architecture going forward.

Benchmarks of performance improvements on the Web and in Node.js

These improvements are just the start. The new Ignition and TurboFan pipeline paves the way for further optimizations that will boost JavaScript performance and shrink V8’s footprint in both Chrome and Node.js in the coming years.

Finally, here are some tips and tricks on how to write well-optimized, better JavaScript. You can easily derive these from the content above; here’s a summary for your convenience:

How to write optimized JavaScript

1. Order of object properties: always instantiate your object properties in the same order so that hidden classes, and subsequently optimized code, can be shared.

2. Dynamic properties: adding properties to an object after instantiation will force a hidden class change and slow down any methods that were optimized for the previous hidden class. Instead, assign all of an object’s properties in its constructor.

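The first two tips can be illustrated with a small sketch. LoosePoint and StablePoint are hypothetical constructors invented for this example, and the key order printed by Object.keys is only a visible proxy: hidden classes themselves are not observable from JavaScript.

```javascript
// Less friendly: properties are added after construction, in varying order.
function LoosePoint(x, y) {
  this.x = x;
  this.y = y;
}
const a = new LoosePoint(1, 2);
a.label = "a";
a.visible = true;
const b = new LoosePoint(3, 4);
b.visible = false; // different order than for `a`
b.label = "b";

// Friendlier: every property is assigned in the constructor,
// always in the same order.
function StablePoint(x, y, label = "", visible = true) {
  this.x = x;
  this.y = y;
  this.label = label;
  this.visible = visible;
}
const c = new StablePoint(1, 2, "c");
const d = new StablePoint(3, 4, "d", false);

console.log(Object.keys(a), Object.keys(b)); // label/visible order differs
console.log(Object.keys(c), Object.keys(d)); // identical property order
```

With the second constructor, every instance goes through the same sequence of property additions, which is the condition the tips above ask for.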

3. Methods: code that executes the same method repeatedly will run faster than code that executes many different methods only once (due to inline caching).

4. Arrays: avoid sparse arrays where keys are not incremental numbers. Sparse arrays, which don’t have every element inside them, are stored as a hash table. Elements in such arrays are more expensive to access. Also, try to avoid pre-allocating large arrays; it’s better to grow as you go. Finally, don’t delete elements in arrays, since that makes the keys sparse.

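A short sketch of the array advice, using only standard language behavior. How the engine stores each array internally is not visible from JavaScript, so the comments describe the observable effect (holes) rather than the internal representation.

```javascript
// Dense: consecutive indices starting at 0, grown as needed.
const dense = [];
for (let i = 0; i < 5; i++) {
  dense.push(i * 10);
}

// Sparse: writing to a far-away index leaves holes in between.
const sparse = [];
sparse[0] = 0;
sparse[1000] = 10;
console.log(sparse.length, Object.keys(sparse).length); // 1001 2

// delete leaves a hole behind; the length does not change.
const holed = [1, 2, 3];
delete holed[1];
console.log(holed.length, 1 in holed); // 3 false

// splice removes the element and keeps the indices contiguous.
const compact = [1, 2, 3];
compact.splice(1, 1);
console.log(compact.length, 1 in compact); // 2 true
```

Using push and splice keeps the keys incremental, which is the situation the tip recommends.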

5. Tagged values: V8 represents objects and numbers with 32 bits. It uses one bit to tell whether a value is an object (flag = 1) or an integer (flag = 0); such an integer is called an SMI (SMall Integer) because it has only 31 bits. If a numeric value needs more than 31 bits, V8 will box the number, turning it into a double and creating a new object to put the number inside. Try to use 31-bit signed numbers whenever possible to avoid the expensive boxing operation into a JS object.

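To get a feel for the range involved, here is a small helper that checks whether a number fits in a 31-bit signed integer. The bounds are derived from the "31 bit signed" figure in the text; fitsIn31Bits is a hypothetical helper, and the actual small-integer width in a given V8 build may differ by platform.

```javascript
// Bounds of a 31-bit signed integer (assumption based on the text above).
const SMI_MIN = -(2 ** 30);
const SMI_MAX = 2 ** 30 - 1;

// Returns true if n is an integer within the 31-bit signed range.
// Negative zero is excluded because it cannot be stored as a plain integer.
function fitsIn31Bits(n) {
  return (
    Number.isInteger(n) &&
    n >= SMI_MIN &&
    n <= SMI_MAX &&
    !Object.is(n, -0)
  );
}

console.log(fitsIn31Bits(42));         // true
console.log(fitsIn31Bits(1073741823)); // true  (2^30 - 1)
console.log(fitsIn31Bits(1073741824)); // false (2^30)
console.log(fitsIn31Bits(1.5));        // false (not an integer)
```

Counters, indices, and flags usually stay well within this range; values such as millisecond timestamps do not.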

We at SessionStack try to follow these best practices in writing highly optimized JavaScript code. The reason is that once you integrate SessionStack into your production web app, it starts recording everything: all DOM changes, user interactions, JavaScript exceptions, stack traces, failed network requests, and debug messages. 

With SessionStack, you can replay issues in your web apps as videos and see everything that happened to your user. And all of this has to happen with no performance impact for your web app.

There is a free plan that allows you to get started.

Resources