Understanding the V8 JavaScript Engine

I'm Lydia Hallie,
and I'm going to give you a pretty high-level walk-through of basically everything that happens going from our human-friendly JavaScript file
all the way down to something that computers can understand.
There are so many parts to this process, but for now I'm only going to focus on two things, namely the browser side of things and the V8 side of things.
V8 is the JavaScript engine used in Chromium-based browsers and in Node as well.
First let's go all the way back to the beginning.
So we're trying to load a website that uses a small calc.js script.
And as we're trying to load the website,
the HTML parser encounters a script tag and tries to fetch the calc.js file from either the network, or maybe the cache, or a service worker that prefetched the file.
Either way, a stream of bytes gets returned, and it gets sent to the byte stream decoder.
This is the part of the parser that takes care of decoding the stream of bytes and generating tokens based on the data it received.
For example, it sees that the bytes decode to "function",
and it generates a token, saying something like,
hey, I know this, function is a keyword in JavaScript. It creates a token based on that, and it'll just continue to do so for the rest of the stream as well.
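As a rough illustration of that decode-then-tokenize step, here is a minimal sketch. It is not V8's scanner: the `tokenize` helper, the token objects, and the short keyword list are all made up for this example, and the bytes are assumed to be UTF-8.

```javascript
// Hypothetical sketch: decode a byte stream and split it into tokens.
// The keyword list and token format are invented for illustration.
const KEYWORDS = new Set(["function", "return", "const", "let", "var"]);

function tokenize(bytes) {
  // Decode the raw bytes (assumed UTF-8) into source text.
  const source = new TextDecoder("utf-8").decode(bytes);
  const tokens = [];
  // Match identifiers/keywords, numbers, or single punctuation characters.
  const pattern = /[A-Za-z_$][\w$]*|\d+(?:\.\d+)?|[^\s\w]/g;
  for (const [text] of source.matchAll(pattern)) {
    let type = "punctuator";
    if (KEYWORDS.has(text)) type = "keyword";
    else if (/^[A-Za-z_$]/.test(text)) type = "identifier";
    else if (/^\d/.test(text)) type = "number";
    tokens.push({ type, text });
  }
  return tokens;
}

// These bytes spell out "function calc".
const bytes = new Uint8Array([
  0x66, 0x75, 0x6e, 0x63, 0x74, 0x69, 0x6f, 0x6e, 0x20, 0x63, 0x61, 0x6c, 0x63,
]);
const tokens = tokenize(bytes);
```

Here `tokens` holds a keyword token for `function` followed by an identifier token for `calc`.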
And as it's generating these tokens,
it's actually sending them all down to the parser, and the parser then goes ahead and creates nodes based on the tokens
that match a certain syntax rule in JavaScript, for example, a variable declaration or a function statement.
And based on these nodes, the parser generates an abstract syntax tree that represents our program.
Now, this one is very much simplified, because in real life it also contains some extra information about our program,
but for now, this will suffice.
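To make the idea of a tree of nodes concrete, here is a small sketch of what such a tree could look like for the statement `const x = 10;`. The node names follow the ESTree convention used by many JavaScript tools, which is an assumption made for readability; V8's internal AST uses its own node classes. The `collectTypes` walker is a hypothetical helper.

```javascript
// Hypothetical, ESTree-style tree for the statement: const x = 10;
// V8's internal AST differs; this is only an illustration.
const ast = {
  type: "Program",
  body: [
    {
      type: "VariableDeclaration",
      kind: "const",
      declarations: [
        {
          type: "VariableDeclarator",
          id: { type: "Identifier", name: "x" },
          init: { type: "Literal", value: 10 },
        },
      ],
    },
  ],
};

// Walk the tree depth-first and collect every node type in visit order.
function collectTypes(node, out = []) {
  if (Array.isArray(node)) {
    node.forEach((child) => collectTypes(child, out));
  } else if (node && typeof node === "object") {
    if (typeof node.type === "string") out.push(node.type);
    Object.values(node).forEach((child) => collectTypes(child, out));
  }
  return out;
}

const nodeTypes = collectTypes(ast);
```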
And also,
while it's doing that,
it's checking for syntax errors, because the tokens themselves may be valid, but they may not actually match a certain syntax rule.
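A quick way to see this from JavaScript itself is sketched below. Every token in `let x = ;` is valid on its own, but together they do not form a valid declaration. The `hasSyntaxError` helper is a made-up name; it relies on the `Function` constructor, which parses its source text without running it.

```javascript
// Returns true when the engine's parser rejects the given source text.
function hasSyntaxError(source) {
  try {
    // The Function constructor parses the body but does not execute it.
    new Function(source);
    return false;
  } catch (err) {
    return err instanceof SyntaxError;
  }
}

const results = [
  hasSyntaxError("let x = 10;"), // valid tokens, valid syntax
  hasSyntaxError("let x = ;"),   // valid tokens, invalid syntax
];
```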
Finally, it's time for the JavaScript engine to do its work, because this AST is actually sent down to V8's Ignition interpreter.
This interpreter is responsible for generating the bytecode based on the AST that it received.
And we can actually see the bytecode that gets generated with the --print-bytecode flag in Node.
So for example,
the bytecode for our calc function,
if we invoke it with an object containing an x, y, and z key, would look something like this.
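The transcript does not include the source of calc.js, so the version below is a guess: its body is inferred from the numbers used in the walk-through (10, 20, and 30 going in, 500 and 600 as intermediate values, 1100 coming out). Treat it as a stand-in for experimenting with the flag, not as the speaker's actual file.

```javascript
// calc.js
// Hypothetical reconstruction: the body is inferred from the walk-through's
// numbers (10 * 50 = 500, 20 * 30 = 600, 500 + 600 = 1100).
function calc(obj) {
  return obj.x * 50 + obj.y * obj.z;
}

const result = calc({ x: 10, y: 20, z: 30 });
console.log(result); // 1100

// To inspect the bytecode Ignition generates for this function, run e.g.:
//   node --print-bytecode --print-bytecode-filter=calc calc.js
```

The exact bytecode printed depends on the Node and V8 version, so it may not match the listing shown in the talk.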
And this may seem like a lot of data, but there are actually only two parts here that are really important.
So Ignition uses registers in order to execute the bytecode.
There are registers like r0 and r1, but there's also an accumulator register that the bytecodes use for their input, their output, or both.
And then there are also registers like a0 that are used for the values that got passed to the function.
And this makes more sense as we're walking through the bytecode, don't worry.
So in this case, we passed an object containing an x, y, and z key to the function.
So this is where the second part of the generated output is important, because a0 points
to a shape table that contains information on where to find those properties on the object that we passed to the function.
Alright, so now let's see what those bytecodes actually do.
So in the very first line, we see the LdaNamedProperty bytecode.
Lda specifies that a value gets loaded into the accumulator,
and that value is the named property from the object that we passed to the function, stored in a0,
and the property itself can be found at index 0.
We see that the value at index 0 maps to x,
so we load the value of the x property of the object that we passed to the function,
the numeric value 10 in this case.
Then we multiply the current value in the accumulator by a small integer, 50 in this case, which gives us 500.
And then Star r0 specifies that the current value of the accumulator has to get stored in register r0.
Then again we load a property and store it in the accumulator, but this time it's from the second index, which points to y.
And y has a value of 20, so the value of the accumulator is now 20.
And Star r2 again specifies that the current value of
the accumulator has to get stored in a register, this time r2.
We again load a named property from a0 into the accumulator,
the value at the third index this time, which maps to z and has a value of 30.
So we multiply the current value of the accumulator by the value that's currently stored in register r2, which gives us 600.
So one more step: we have to add the value stored in register r0 to the current value of the accumulator.
So this means that we're adding 500 plus 600, which is 1100.
And finally, we return the value of the accumulator, which is 1100.
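The walk-through can be replayed step by step with a tiny accumulator machine. This is only a sketch: `runCalcBytecode` is a made-up function, the comments use simplified mnemonics and not real `--print-bytecode` output, and the constant 50 is inferred from the fact that 10 has to become 500.

```javascript
// Hypothetical mini-interpreter mirroring the walk-through.
// Register names (a0, r0, r2) follow the transcript.
function runCalcBytecode(a0) {
  const names = ["x", "y", "z"]; // index -> property name
  const r = {};                  // general-purpose registers
  let acc;                       // accumulator

  acc = a0[names[0]]; // LdaNamedProperty: load a0.x      -> 10
  acc = acc * 50;     // MulSmi: multiply by small int 50 -> 500
  r.r0 = acc;         // Star r0: store accumulator in r0
  acc = a0[names[1]]; // LdaNamedProperty: load a0.y      -> 20
  r.r2 = acc;         // Star r2: store accumulator in r2
  acc = a0[names[2]]; // LdaNamedProperty: load a0.z      -> 30
  acc = acc * r.r2;   // Mul r2: accumulator * r2         -> 600
  acc = r.r0 + acc;   // Add r0: r0 + accumulator         -> 1100
  return acc;         // Return the accumulator
}

const total = runCalcBytecode({ x: 10, y: 20, z: 30 });
```

The value comments apply to the sample input `{ x: 10, y: 20, z: 30 }`, for which `total` is 1100.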
Now, the bytecode that is generated by the bytecode generator also goes through some smaller optimizations,
and then the bytecode actually gets executed, and it's possible to run this on our machines.
So finally, we have something that our machines can work with.
Now, you may have noticed that I skipped some things in the bytecode.
So let's see what's up with them.
This is actually part of V8's optimizations.
Because when we pass an object to V8,
such as the x, y, and z object in this case, it creates a shape for that specific object structure.
And if you're reading documentation or blog posts,
this is also referred to as a hidden class or a map,
but that's kind of confusing, because we also have classes in JavaScript and we have maps in JavaScript, and it's not a JavaScript class
or a JavaScript map.
So shape is the way to go, because we don't have those natively in JavaScript.
So a shape is basically just the structure of that object.
And this shape contains pointers to the offsets at which we can find the values of the properties on the object,
because even though we only specified the x,
y, and z properties, there are many, many more built-in properties on objects that also all have their location stored somewhere in memory.
So when we're trying to access a property on the object,
for example x, it can now just get it quicker by checking: okay, does this object have the same shape?
Yeah, it does. Okay, cool,
now I know the offset.
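A toy model can make the shape idea more tangible. In this sketch, objects created with the same keys in the same order share one shape, and the shape maps each property name to an offset into a flat list of values. All names here (`shapeFor`, `makeObject`, `getProperty`) are hypothetical, and real V8 shapes carry far more information than this.

```javascript
// Hypothetical model of shapes: a shape maps property names to offsets.
const shapes = new Map(); // key signature -> shape

function shapeFor(keys) {
  const signature = keys.join(",");
  if (!shapes.has(signature)) {
    const offsets = new Map(keys.map((key, index) => [key, index]));
    shapes.set(signature, { signature, offsets });
  }
  return shapes.get(signature);
}

function makeObject(props) {
  const keys = Object.keys(props);
  // Values live in a flat array; the shape says which slot holds which property.
  return { shape: shapeFor(keys), slots: keys.map((key) => props[key]) };
}

function getProperty(obj, name) {
  const offset = obj.shape.offsets.get(name);
  return offset === undefined ? undefined : obj.slots[offset];
}

const p1 = makeObject({ x: 10, y: 20, z: 30 });
const p2 = makeObject({ x: 1, y: 2, z: 3 });
const p3 = makeObject({ x: 1 });
```

In this model `p1` and `p2` share a shape, so the offset learned for one can be reused for the other, while `p3` gets a shape of its own.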
Shapes are really useful for an optimization technique that V8 uses, namely inline caching.
With inline caching,
we basically store the results from previous operations, so that the next time we perform the exact same operation, we already know the result.
Now, each time we do a property lookup,
V8 can simply store the offset it found the last time it did that lookup.
So in the future, when we're trying to perform the exact same action, it can simply get the result from the inline cache.
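Here is a minimal sketch of an inline cache for a single property-load site, assuming a simplified object layout where each object carries a `shape` (a plain object with name-to-offset entries) and a `slots` array. `createLoadSite` and its `stats` counter are invented for this example.

```javascript
// Hypothetical inline cache for one property-load site.
function createLoadSite(name) {
  let cachedShape = null;
  let cachedOffset = -1;
  const stats = { hits: 0, misses: 0 };

  function load(obj) {
    if (obj.shape === cachedShape) {
      stats.hits++; // fast path: same shape as last time, reuse the offset
      return obj.slots[cachedOffset];
    }
    stats.misses++; // slow path: do the full lookup, then remember it
    cachedShape = obj.shape;
    cachedOffset = obj.shape.offsets[name];
    return obj.slots[cachedOffset];
  }

  return { load, stats };
}

const xyzShape = { offsets: { x: 0, y: 1, z: 2 } };
const loadX = createLoadSite("x");
const values = [
  loadX.load({ shape: xyzShape, slots: [10, 20, 30] }), // miss, then cached
  loadX.load({ shape: xyzShape, slots: [11, 21, 31] }), // hit
];
```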
Now, these inline caches are not only beneficial for the interpreter, they also generate really valuable feedback for the TurboFan optimizer.
So finally,
we can go back to the bytecode example,
because these values are actually references to a feedback vector slot, where V8 stores information about the execution of the function.
And this includes information from, for example, arithmetic operations, such as the fact that so far we've only added numbers, which results in a numeric value.
One useful example of this is the fact that in JavaScript you can also concatenate strings with the plus operator,
which would have to be handled very differently internally.
But so far it knows: okay, I've only had numerical values, that's fine.
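The kind of feedback described here can be imitated with a wrapper that records which operand types a `+` site has seen. This is a loose analogy under invented names (`createAddSite`, `seenTypes`), not how V8's feedback vector is laid out.

```javascript
// Hypothetical feedback slot for one "+" site: it records operand types seen.
function createAddSite() {
  const seenTypes = new Set();
  function add(a, b) {
    seenTypes.add(typeof a);
    seenTypes.add(typeof b);
    return a + b; // numeric addition or string concatenation, decided at run time
  }
  return { add, seenTypes };
}

const site = createAddSite();
site.add(500, 600);
// So far only numbers have been added at this site.
const numbersOnly = site.seenTypes.size === 1 && site.seenTypes.has("number");

// The same operator concatenates when given strings, and the feedback changes.
const concatenated = site.add("500", "600");
```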
Now, let's say that we're invoking the calc function hundreds of times.
The function is now considered hot.
Although the bytecode is already really fast, V8 actually uses the TurboFan optimizer in order to generate machine code.
So based on the bytecode and the generated feedback for specific code blocks,
it can generate optimized, architecture-specific machine code that can run directly on your machine.
So the next time that we invoke the function, it can just skip over the bytecode and immediately execute the machine code instead.
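The switch from bytecode to optimized code can be pictured as a call counter that swaps implementations once a function gets hot. Everything below is a sketch: `withTiering` is a made-up helper, `HOT_THRESHOLD` is an invented number and not a V8 setting, and the "optimized" function is just a stand-in for machine code.

```javascript
// Hypothetical tier-up: count calls, then switch to the "optimized" version.
const HOT_THRESHOLD = 100; // invented for illustration

function withTiering(interpreted, optimized) {
  let calls = 0;
  const state = { tier: "bytecode" };

  function invoke(...args) {
    calls++;
    if (state.tier === "bytecode" && calls > HOT_THRESHOLD) {
      state.tier = "optimized"; // the function is now considered hot
    }
    return state.tier === "optimized" ? optimized(...args) : interpreted(...args);
  }

  return { invoke, state };
}

const slowCalc = (obj) => obj.x * 50 + obj.y * obj.z; // stand-in for bytecode
const fastCalc = (obj) => obj.x * 50 + obj.y * obj.z; // stand-in for machine code
const tiered = withTiering(slowCalc, fastCalc);

let last;
for (let i = 0; i < 150; i++) {
  last = tiered.invoke({ x: 10, y: 20, z: 30 });
}
```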
However, there is one problem in JavaScript, namely that it's dynamically typed.
So we can invoke the calc function with the same object
hundreds or thousands of times, but there's absolutely no guarantee that this will always be the case in the future.
For example,
we can also invoke the calc function with an empty object, or with just an x key, or with just an x and a y key.
I don't know why you would do it, but it's possible.
So for all those different types of objects, V8 generates a new shape that contains the new, different properties.
So previously, we saw that the inline cache contained a field with the shape of the object and then the corresponding offset.
However, if we passed multiple objects, so that multiple shapes got generated,
we also have to update the inline cache in order for it to point to multiple shapes and their offsets.
So now, when we're trying to load a property from a specific object, it first has to go through those shapes
in order to find the one that matches the object and contains that specific property,
which could result in a linear search, which is not very optimal.
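A sketch of a cache that holds several shapes shows where the linear search comes from. As before, the object layout (`shape.offsets` plus a `slots` array) and the names `createPolymorphicSite` and `getComparisons` are assumptions made for this example.

```javascript
// Hypothetical polymorphic inline cache: a list of (shape, offset) entries
// that is scanned linearly on every load.
function createPolymorphicSite(name) {
  const entries = []; // [{ shape, offset }]
  let comparisons = 0;

  function load(obj) {
    for (const entry of entries) {
      comparisons++;
      if (entry.shape === obj.shape) return obj.slots[entry.offset];
    }
    // Miss: do the full lookup and remember this shape for next time.
    const offset = obj.shape.offsets[name];
    entries.push({ shape: obj.shape, offset });
    return offset === undefined ? undefined : obj.slots[offset];
  }

  return { load, entries, getComparisons: () => comparisons };
}

const shapeXYZ = { offsets: { x: 0, y: 1, z: 2 } };
const shapeYX = { offsets: { y: 0, x: 1 } };
const loadXPoly = createPolymorphicSite("x");
const seen = [
  loadXPoly.load({ shape: shapeXYZ, slots: [10, 20, 30] }), // first shape
  loadXPoly.load({ shape: shapeYX, slots: [7, 8] }),        // second shape
  loadXPoly.load({ shape: shapeYX, slots: [5, 6] }),        // found after scanning past the first entry
];
```

The more shapes a site has seen, the more entries each load may have to scan before it finds a match.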
Now, previously we generated machine code for the calc function when it had only
been invoked with one type of object, namely the object with the x, y, and z keys.
However, if we call the calc function again, but with a different shape, TurboFan's shape check fails, in which case we can no longer use this optimized machine code, and we actually have to
deoptimize back to the generated bytecode.
And this is a pretty expensive operation that you mostly want to avoid.
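The guard-then-bail-out pattern can be sketched as follows. This is an analogy only: the shape check is approximated by comparing key order, the calc body is the inferred one (`x * 50 + y * z`), and `createCalc` with its `state` object is a made-up helper, not a V8 API.

```javascript
// Hypothetical guard-and-deoptimize sketch. The "optimized" path assumes the
// keys are exactly x, y, z in that order; anything else bails out for good.
function createCalc() {
  const state = { optimized: true, deopts: 0 };

  // Stand-in for the generic bytecode path.
  const generic = (obj) => obj.x * 50 + obj.y * obj.z;

  function calc(obj) {
    if (state.optimized) {
      const keys = Object.keys(obj);
      const shapeMatches =
        keys.length === 3 && keys[0] === "x" && keys[1] === "y" && keys[2] === "z";
      if (shapeMatches) return obj.x * 50 + obj.y * obj.z; // fast path
      state.optimized = false; // shape check failed: discard the fast path
      state.deopts++;
    }
    return generic(obj);
  }

  return { calc, state };
}

const engine = createCalc();
const first = engine.calc({ x: 10, y: 20, z: 30 }); // takes the fast path
const second = engine.calc({ x: 10, y: 20 });       // fails the check, deoptimizes
```

Note that `second` is `NaN` here, because `z` is missing and `20 * undefined` is not a number; the point of the sketch is the fallback, not the result.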
So the inline cache of the calc function is also updated, as it's actually got multiple shapes now.
Now, the calc function can again get hot and optimized, even after deoptimization,
although TurboFan has to handle it a little bit differently now that it has encountered multiple shapes.
And these inline caches actually also have multiple states, because if an inline cache has only seen one type of object, it's considered monomorphic,
which is pretty much the best-case scenario,
because in that case we can just generate optimized machine code and assume that in the future this function will just get invoked
with the same object shape.
Now, if a cache has seen two to four different shapes, it's considered polymorphic, and if we're continuously just invoking it with whatever random types, it's considered megamorphic, in which case it's like: you know what, never
mind, I won't try to optimize it.
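The three states can be summarized in a small classifier. It uses the counts mentioned in the talk (one shape, a handful of shapes up to four, anything beyond that) and approximates an object's shape by its keys in insertion order, which is a simplification. `cacheState` is a hypothetical helper.

```javascript
// Hypothetical classifier: 1 shape -> monomorphic, 2 to 4 -> polymorphic,
// more than 4 -> megamorphic.
function cacheState(objects) {
  // Approximate an object's shape by its keys in insertion order.
  const shapesSeen = new Set(objects.map((obj) => Object.keys(obj).join(",")));
  if (shapesSeen.size === 0) return "uninitialized";
  if (shapesSeen.size === 1) return "monomorphic";
  if (shapesSeen.size <= 4) return "polymorphic";
  return "megamorphic";
}

const mono = cacheState([{ x: 1, y: 2, z: 3 }, { x: 4, y: 5, z: 6 }]);
const poly = cacheState([{ x: 1, y: 2, z: 3 }, {}, { x: 1 }, { x: 1, y: 2 }]);
const mega = cacheState([{ a: 1 }, { b: 1 }, { c: 1 }, { d: 1 }, { e: 1 }]);
```

One practical takeaway is to create objects with the same properties in the same order wherever possible, so that call sites stay monomorphic.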
So, as you can see, although it's sometimes pretty nice for us as developers that JavaScript is dynamically typed, it's not so great for the compiler.
It can really only work with speculations, and just assume that in the future our code will keep matching those speculations.