[Paper Review] The Introduction of 3D Gaussian Splatting

I reviewed the 3D Gaussian Splatting paper, which has recently become a hot topic.
It consists of two videos.
First, I will introduce what 3D Gaussian Splatting is in this video.
Second, I will discuss the algorithm in detail in the next video.
The problem this paper wants to solve is the same as NeRF's.
When about 100 images are given together with their camera poses, it synthesizes an image for an unseen camera pose.
When solving this task, speed and quality are the dominant concerns.
Speed can be divided into training speed and inference speed.
From a quality perspective, Mip-NeRF 360, which Google unveiled at CVPR 2022, shows the best performance.
In terms of training speed, Instant-NGP, released by NVIDIA at SIGGRAPH 2022, shows the best performance.
Its training time is impressively short.
The two have different strengths, and the difference becomes clearer once you also take inference time into consideration.
If you think of releasing a NeRF as a service, Mip-NeRF 360 has good quality,
but its training takes too long.
More recently, Google published a follow-up paper at ICCV 2023 called Zip-NeRF, which has good quality and a fast training speed.
But even this takes about an hour on 8 V100 GPUs.
Instant-NGP has a very fast training speed of only about 7 minutes, but in comparison, its quality is not good.
From a real-time service perspective, Instant-NGP's inference speed is the best at around 100 ms.
But that is measured on an A6000 GPU.
When measured on a lower-spec GPU, it exceeds 1 second and cannot be served in real time.
Coming back to Gaussian Splatting and looking at the 30K-iteration model, it has similar quality to Mip-NeRF 360,
trains in about 30 minutes on one A6000, and renders in real time at about 7 ms.
Looking at the 7K-iteration model, the quality is somewhat lower than the 30K model, but the training is much faster.
It also renders in real time at about 6 ms,
and since rendering stays well under 1 second even on a lower-spec GPU, real-time rendering is possible beyond high-end hardware.
You can also explore the rendered results interactively in the viewer.
I selected four scenes.
You can see each scene from different camera poses in real time.
Now, let's look at the comparisons based on the results published on the official website.
The first comparison is with Mip-NeRF 360, which has the best performance from a quality perspective.
Gaussian Splatting shows high quality on the leaves and the spokes of the bike.
The next comparison is with Instant-NGP, which has the best training speed; Instant-NGP fails to reproduce high-frequency details in most areas.
The next is Plenoxels, an explicit model that increases speed by removing the neural network from NeRF.
Gaussian Splatting shows far higher image quality compared with Plenoxels.
The last comparison shows how close it is to the ground truth.
I will try to explain the concepts of the paper starting from the title itself.
First, the 3D Gaussian.
The familiar Gaussian distribution is the 1D Gaussian, which forms a bell curve from a mean and a variance.
Its probability density function has a well-known formula. Moving beyond the 1D case,
the formula is defined for any dimension k as the multivariate Gaussian.
Since two or more dimensions are involved, a covariance matrix is used in the formula instead of a scalar variance.
Setting k = 3 in the multivariate probability density function gives the 3D Gaussian,
and it takes the form of a 3D ellipsoid. I couldn't find a figure that matches it exactly,
but in 3D it would look like this.
The mean is the center of the Gaussian, and the covariance expresses the size and shape of the Gaussian.
The density value is highest at the mean, and the further you move away from the mean, the lower the density value becomes.
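Written out explicitly (standard textbook notation, not copied from the paper's own equations), the densities described above are:

```latex
% 1D Gaussian with mean \mu and variance \sigma^2
f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}
       \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)

% k-dimensional multivariate Gaussian with mean vector \boldsymbol{\mu}
% and covariance matrix \Sigma; setting k = 3 gives the 3D Gaussian
f(\mathbf{x}) = \frac{1}{\sqrt{(2\pi)^k \lvert\Sigma\rvert}}
       \exp\!\left(-\tfrac{1}{2}\,(\mathbf{x}-\boldsymbol{\mu})^{\top}
       \Sigma^{-1}(\mathbf{x}-\boldsymbol{\mu})\right)
```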
Next, splatting.
Splatting means that something containing water is scattered when it hits a surface.
You can imagine these 3D ellipsoids being splatted onto one scene like this.
Next, I trained a model with the released code.
I recorded 11 seconds at 30 fps and 720p resolution with an iPhone and trained on the roughly 300 resulting images.
Then let's see the results.
Basically, you can see that images are rendered even for camera poses that were not used during training.
When rendering in ellipsoid mode, you can see that a set of 3D ellipsoids of various colors and shapes represents the scene.
If you reduce the scale of the ellipsoids,
you see that the scene takes the form of a point cloud, revealing as black the areas that were not trained at all.
You can also see needle-like and disk-like ellipsoids.
Next, the radiance field. NeRF stands for Neural Radiance Field and uses a neural network.
We can guess from the title, 3D Gaussian Splatting for Real-Time Radiance Field Rendering, that it models the radiance field without using a neural network.
The radiance in a radiance field refers to the flow of light, in every direction, at every point in space.
You can think of it as modeling the visible light that our eyes can see.
A key feature of the radiance field is that the same position can show different colors when viewed from different angles.
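The view-dependent idea can be sketched as a plain function from a position and a viewing direction to a color and a density. This is a toy stand-in I wrote for illustration, not the paper's code:

```python
import numpy as np

def toy_radiance_field(position, direction):
    """Hypothetical radiance field: color depends on position AND direction."""
    d = direction / np.linalg.norm(direction)
    # Fake view-dependent shading: mix a base color with a highlight
    # that appears only when looking along +z.
    base = np.array([0.2, 0.4, 0.6])
    highlight = max(d[2], 0.0) * np.array([0.8, 0.6, 0.4])
    color = np.clip(base + highlight, 0.0, 1.0)
    density = np.exp(-np.linalg.norm(position))  # denser near the origin
    return color, density

p = np.zeros(3)
c_front, _ = toy_radiance_field(p, np.array([0.0, 0.0, 1.0]))
c_back, _ = toy_radiance_field(p, np.array([0.0, 0.0, -1.0]))
# Same position, two viewing directions, two different colors.
```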
Next, let's compare it with existing techniques from a representation perspective.
Suppose we synthesize an image on an image plane located a unit distance in front of the camera in the world coordinate system.
NeRF marches a ray for each pixel and samples, for example, 128 points at even intervals along the straight ray.
Then, the sampled points are input into the MLP, colors and densities are calculated, and
the pixel color is computed by accumulating the densities using the volume rendering formula.
It is called an implicit representation because color and density can only be obtained by querying the MLP.
Because the points are sampled even in empty space,
it is inefficient in terms of computation.
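The ray-marching pipeline just described can be sketched roughly like this, with a toy function standing in for NeRF's actual MLP:

```python
import numpy as np

def toy_mlp(points):
    """Stand-in for NeRF's MLP: returns (color, density) per sample point."""
    sigma = np.exp(-np.linalg.norm(points, axis=-1))      # density per point
    rgb = np.clip(0.5 + 0.5 * np.tanh(points), 0.0, 1.0)  # color per point
    return rgb, sigma

def render_ray(origin, direction, n_samples=128, near=0.1, far=4.0):
    # 1. Sample points at even intervals along the ray.
    t = np.linspace(near, far, n_samples)
    points = origin + t[:, None] * direction
    # 2. Query the network for color and density at each sample,
    #    even where space is actually empty (the inefficiency noted above).
    rgb, sigma = toy_mlp(points)
    # 3. Volume rendering: alpha from density, front-to-back accumulation.
    delta = np.diff(t, append=t[-1] + (t[1] - t[0]))
    alpha = 1.0 - np.exp(-sigma * delta)
    transmittance = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
    weights = transmittance * alpha
    return (weights[:, None] * rgb).sum(axis=0)  # final pixel color

color = render_ray(np.zeros(3), np.array([0.0, 0.0, 1.0]))
```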
A point cloud is a set of discrete points; to render an image realistically with a point cloud, a very high density of points is needed.
To make it easier to understand, you can think of it as drawing a picture using only dots on drawing paper.
When a scene is represented by that many discrete points,
a huge amount of computation is needed to project each one in order to draw an image.
3D Gaussian Splatting represents a scene as a set of Gaussians.
It is discrete like a point cloud, does not use an MLP, and is an explicit representation.
If the minimum unit of a point cloud is one point, the minimum unit of Gaussian Splatting is one Gaussian.
Compared to a point cloud, the same scene is expressed with a much smaller number of minimum units.
And only the regions with actual density are projected onto the image plane, which reduces the amount of computation.
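As a rough sketch of this explicit representation (field names and layout are my own illustration, not the paper's code), one primitive might be stored like this:

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class Gaussian3D:
    """One primitive of the scene: a 3D Gaussian, not a bare point."""
    mean: np.ndarray       # center position, shape (3,)
    scale: np.ndarray      # per-axis size of the ellipsoid, shape (3,)
    rotation: np.ndarray   # 3x3 rotation matrix orienting the ellipsoid
    opacity: float         # how strongly it occludes what lies behind it
    sh_coeffs: np.ndarray  # spherical-harmonics coefficients for color

    def covariance(self):
        # Covariance built from rotation and scale: Sigma = R S S^T R^T,
        # which keeps it symmetric and positive semi-definite.
        S = np.diag(self.scale)
        return self.rotation @ S @ S.T @ self.rotation.T

g = Gaussian3D(mean=np.zeros(3),
               scale=np.array([1.0, 0.5, 0.1]),   # a flattened ellipsoid
               rotation=np.eye(3),
               opacity=0.8,
               sh_coeffs=np.zeros((4, 3)))
cov = g.covariance()
```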
I'll cover it in detail in the next video.
Lastly, how is it able to achieve such fast rendering speeds?
First of all, it comes from the representation just discussed.
In a point cloud,
it takes a lot of time to render each point densely,
and in NeRF, rendering the continuous space requires a lot of calculation.
Second,
it is designed with an explicit structure.
Each Gaussian is parameterized by position, opacity, and covariance, and its color is represented by spherical harmonics functions.
With spherical harmonics, the view-dependent color is captured by optimizing the coefficients according to the viewing direction.
This reduces both the training time and the rendering time.
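A minimal sketch of view-dependent color from spherical harmonics, using only the degree-0 and degree-1 real SH basis (the paper uses higher degrees; this simplified version is my own illustration):

```python
import numpy as np

# Standard real spherical-harmonics basis constants for degrees 0 and 1.
SH_C0 = 0.28209479177387814
SH_C1 = 0.4886025119029199

def sh_color(coeffs, view_dir):
    """coeffs: (4, 3) array, one RGB coefficient per basis function."""
    x, y, z = view_dir / np.linalg.norm(view_dir)
    basis = np.array([SH_C0, -SH_C1 * y, SH_C1 * z, -SH_C1 * x])
    # Color is a per-channel weighted sum of the basis functions,
    # so it changes smoothly with the viewing direction.
    return np.clip(basis @ coeffs + 0.5, 0.0, 1.0)

coeffs = np.zeros((4, 3))
coeffs[0] = 0.5      # constant (view-independent) term
coeffs[3, 0] = 0.8   # red channel varies along the x direction
c1 = sh_color(coeffs, np.array([1.0, 0.0, 0.0]))   # viewed from +x
c2 = sh_color(coeffs, np.array([-1.0, 0.0, 0.0]))  # viewed from -x
# The red channel differs between the two views; green and blue do not.
```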
The third is tile-based rasterization.
The image is divided into tiles of 16 by 16 pixels.
Each Gaussian is projected and sorted for each tile, and alpha compositing is performed in depth order.
Rendering is then performed in parallel on the GPU on a per-tile basis.
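The per-tile compositing step can be sketched as follows. This is a simplified serial version for one pixel (the real implementation evaluates the projected 2D Gaussians and runs all tiles in parallel in CUDA):

```python
import numpy as np

TILE = 16  # the image is split into 16x16-pixel tiles

def composite_pixel(splats):
    """Alpha-composite the splats covering one pixel, front to back.

    splats: list of (depth, alpha, rgb) tuples; a simplified stand-in
    for evaluating each projected Gaussian at this pixel.
    """
    pixel = np.zeros(3)
    transmittance = 1.0
    for _, alpha, rgb in sorted(splats, key=lambda s: s[0]):  # near to far
        pixel += transmittance * alpha * np.asarray(rgb)
        transmittance *= 1.0 - alpha
        if transmittance < 1e-4:  # early stop once the pixel is opaque
            break
    return pixel

# Two splats: a red one in front (depth 1) and a green one behind (depth 2).
pixel = composite_pixel([(2.0, 0.5, (0.0, 1.0, 0.0)),
                         (1.0, 0.5, (1.0, 0.0, 0.0))])
# The front red splat contributes 0.5; the occluded green one only 0.25.
```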
