Terrain Normal Update Multithreading


Josh

Multithreading is very useful for processes that can be split into many parallel parts, like image and video processing. I wanted to speed up normal updating for the new terrain system, so I added a new thread creation function that accepts any function as its input. That lets me use std::bind with it, the same way I have been using it to send instructions between threads:

shared_ptr<Thread> CreateThread(std::function<void()> instruction);
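
To illustrate the idea outside the engine, here is a minimal sketch of passing a std::bind result through a std::function<void()> to a worker thread. It uses std::thread as a stand-in for the engine's Thread class; Counter, RunOnThread, and AddOnThread are hypothetical names made up for this example and are not part of the engine.

```cpp
#include <functional>
#include <thread>
#include <utility>

// Hypothetical object whose member function we want to run on another thread.
struct Counter
{
	int total = 0;
	void Add(int a, int b) { total += a + b; }
};

// Accepts any void() callable, runs it on a new std::thread, and waits for it to finish.
void RunOnThread(std::function<void()> instruction)
{
	std::thread worker(std::move(instruction));
	worker.join();
}

// std::bind packs the member function, the object, and the arguments into one
// callable that takes no parameters, which is what std::function<void()> expects.
int AddOnThread(int a, int b)
{
	Counter counter;
	RunOnThread(std::bind(&Counter::Add, &counter, a, b));
	return counter.total; // safe to read: join() has already returned
}
```

Because the arguments are stored inside the bound object, the thread function itself never needs to know what it is calling or with which parameters.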

The terrain normal update function has two overloads. One accepts parameters for the exact area to update; if no parameters are supplied, the entire terrain is updated:

virtual void UpdateNormals(const int x, const int y, const int width, const int height);
virtual void UpdateNormals();

This is what the second overloaded function looked like before:

void Terrain::UpdateNormals()
{
	UpdateNormals(0, 0, resolution.x, resolution.y);
}

And this is what it looks like now:

void Terrain::UpdateNormals()
{
	const int MAX_THREADS_X = 4;
	const int MAX_THREADS_Y = 4;
	std::array<shared_ptr<Thread>, MAX_THREADS_X * MAX_THREADS_Y> threads;
	Assert((resolution.x / MAX_THREADS_X) * MAX_THREADS_X == resolution.x);
	Assert((resolution.y / MAX_THREADS_Y) * MAX_THREADS_Y == resolution.y);
	for (int y = 0; y < MAX_THREADS_Y; ++y)
	{
		for (int x = 0; x < MAX_THREADS_X; ++x)
		{
			// The cast selects the four-parameter overload of UpdateNormals.
			auto updateArea = static_cast<void (Terrain::*)(int, int, int, int)>(&Terrain::UpdateNormals);
			threads[y * MAX_THREADS_X + x] = CreateThread(std::bind(
				updateArea, this,
				x * resolution.x / MAX_THREADS_X, y * resolution.y / MAX_THREADS_Y,
				resolution.x / MAX_THREADS_X, resolution.y / MAX_THREADS_Y));
		}
	}
	for (auto thread : threads)
	{
		thread->Resume();
	}
	for (auto thread : threads)
	{
		thread->Wait();
	}
}
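
The same tiling pattern can be tried in isolation with only the standard library. The sketch below is not the engine code: TileGrid, UpdateAll, and CoveredExactlyOnce are hypothetical names, std::thread replaces the engine's Thread class, and the "update" only counts visits per cell so that the tile coverage can be checked.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for the terrain: records how often each cell is visited.
struct TileGrid
{
	int width, height;
	std::vector<int> visits;

	TileGrid(int w, int h) : width(w), height(h), visits(static_cast<std::size_t>(w) * h, 0) {}

	// Region update: each tile writes only to its own cells, so no locking is needed.
	void Update(int x, int y, int w, int h)
	{
		for (int row = y; row < y + h; ++row)
			for (int col = x; col < x + w; ++col)
				++visits[static_cast<std::size_t>(row) * width + col];
	}

	// Whole-grid update: splits the work across tilesX * tilesY threads.
	void UpdateAll(int tilesX, int tilesY)
	{
		// Same precondition as the Assert calls in the post: tiles must divide evenly.
		assert(width % tilesX == 0 && height % tilesY == 0);
		const int tileW = width / tilesX;
		const int tileH = height / tilesY;

		std::vector<std::thread> workers;
		workers.reserve(static_cast<std::size_t>(tilesX) * tilesY);
		for (int ty = 0; ty < tilesY; ++ty)
			for (int tx = 0; tx < tilesX; ++tx)
				workers.emplace_back(&TileGrid::Update, this, tx * tileW, ty * tileH, tileW, tileH);

		// Wait for every tile before returning, like the Wait() loop in the post.
		for (auto& worker : workers) worker.join();
	}
};

// Returns true when every cell was visited exactly once.
bool CoveredExactlyOnce(const TileGrid& grid)
{
	for (int count : grid.visits)
		if (count != 1) return false;
	return true;
}
```

Checking that every cell is touched exactly once is a cheap way to catch tiles that overlap or leave gaps when the tile counts or offsets change.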

Here are the results for a 2048x2048 terrain. You can see that multithreading dramatically reduced the update time. Interestingly, four threads run more than four times faster than a single thread. Sixteen threads looks like the sweet spot, at least on this machine, with a 10x improvement in performance.

[Image: Image1.png]
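
For anyone who wants to repeat this kind of measurement on their own machine, here is a rough sketch of a timing helper. It is not the code used to produce the numbers above: MeasureMilliseconds and HardwareThreadCount are hypothetical names, and the approach (one wall-clock sample with std::chrono::steady_clock) is only an assumption about how such a test could be done.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Returns the wall-clock time, in milliseconds, taken by one call to work.
double MeasureMilliseconds(const std::function<void()>& work)
{
	const auto start = std::chrono::steady_clock::now();
	work();
	const auto stop = std::chrono::steady_clock::now();
	return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Number of hardware threads reported by the standard library.
// hardware_concurrency() may return 0 when the value is unknown, so fall back to 1.
unsigned int HardwareThreadCount()
{
	const unsigned int reported = std::thread::hardware_concurrency();
	return reported == 0 ? 1u : reported;
}
```

A single sample is noisy, so running the update several times per thread count and comparing the best or median time gives a more trustworthy curve. The reported hardware thread count is a reasonable upper bound to test against, since the sweet spot depends on the machine.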

Recommended Comments

Four threads took less than 25% of the single-thread time because some calculations were being skipped. I fixed that and the numbers are a little higher now, but they still form the same curve.
