klepto2 Posted February 14, 2024

The code below simulates the behaviour of a much more complex program and shows the problem in an extreme form. In the production-ready code the memory grows more slowly, but it still slows the program down and eventually leads to an out-of-memory exception.

Things to notice in the program:

- The memory usage increases fast.
- The displayed instance count is much higher than it should be. I let the program instantiate 100 planes, so the actual count should be 101 (with the original plane).
- In the loop the instances are cleared and reinstantiated.

    #include "UltraEngine.h"
    #include "ComponentSystem.h"
    //#include "Steamworks/Steamworks.h"

    using namespace UltraEngine;

    SIZE_T PrintMemoryInfo()
    {
        auto myHandle = GetCurrentProcess();
        //to fill in the process' memory usage details
        PROCESS_MEMORY_COUNTERS pmc;
        //return the usage (bytes), if I may
        if (GetProcessMemoryInfo(myHandle, &pmc, sizeof(pmc)))
            return(pmc.WorkingSetSize);
        else
            return 0;
    }

    int main(int argc, const char* argv[])
    {
    #ifdef STEAM_API_H
        if (not Steamworks::Initialize())
        {
            RuntimeError("Steamworks failed to initialize.");
            return 1;
        }
    #endif

        RegisterComponents();

        auto cl = ParseCommandLine(argc, argv);

        //Load FreeImage plugin (optional)
        auto fiplugin = LoadPlugin("Plugins/FITextureLoader");

        //Get the displays
        auto displays = GetDisplays();

        //Create a window
        auto window = CreateWindow("Ultra Engine", 0, 0, 1280 * displays[0]->scale, 720 * displays[0]->scale, displays[0], WINDOW_CENTER | WINDOW_TITLEBAR);

        //Attach to the parent console, or allocate a new one
        if (!AttachConsole(ATTACH_PARENT_PROCESS))
        {
            if (AllocConsole())
            {
                freopen("conin$", "r", stdin);
                freopen("conout$", "w", stdout);
                freopen("conout$", "w", stderr);
            }
        }
        else
        {
            auto consoleHandleOut = GetStdHandle(STD_OUTPUT_HANDLE);
            auto consoleHandleIn = GetStdHandle(STD_INPUT_HANDLE);
            auto consoleHandleErr = GetStdHandle(STD_ERROR_HANDLE);
            if (consoleHandleOut != INVALID_HANDLE_VALUE)
            {
                freopen("conout$", "w", stdout);
                setvbuf(stdout, NULL, _IONBF, 0);
            }
            if (consoleHandleIn != INVALID_HANDLE_VALUE)
            {
                freopen("conin$", "r", stdin);
                setvbuf(stdin, NULL, _IONBF, 0);
            }
            if (consoleHandleErr != INVALID_HANDLE_VALUE)
            {
                freopen("conout$", "w", stderr);
                setvbuf(stderr, NULL, _IONBF, 0);
            }
        }

        //Create a framebuffer
        auto framebuffer = CreateFramebuffer(window);

        //Create a world
        auto world = CreateWorld();

        auto camera = CreateCamera(world);
        camera->SetClearColor(0.125);
        camera->SetFov(70);
        camera->Move(0, 2, -8);

        //Create light
        auto light = CreateDirectionalLight(world);
        light->SetRotation(45, 35, 0);
        light->SetColor(2);

        world->RecordStats(true);

        shared_ptr<Entity> main_instance = CreatePlane(world, 10.0, 10.0, 256, 256);
        vector<shared_ptr<Entity>> instances;

        //Main loop
        while (window->Closed() == false and window->KeyDown(KEY_ESCAPE) == false)
        {
            //Drop last frame's instances and create 100 new ones
            instances.clear();
            for (int i = 0; i < 100; i++)
            {
                instances.push_back(main_instance->Instantiate(world));
            }

            world->Update();
            world->Render(framebuffer);

    #ifdef STEAM_API_H
            Steamworks::Update();
    #endif

            window->SetText("Instances: " + String(world->renderstats.instances) + " MEM: " + String(PrintMemoryInfo() / 1024) + " kb");
        }

    #ifdef STEAM_API_H
        Steamworks::Shutdown();
    #endif

        return 0;
    }

Windows 10 Pro 64-Bit-Version NVIDIA Geforce 1080 TI
Josh Posted February 14, 2024

The number of instances drawn will normally be higher than the number of instances that exist in view: 1 depth prepass + 1 main pass + 3 for each directional light. That would account for about 500 instances, and your app is reporting 1000. It's probably reporting that number incorrectly based on some stuff that never gets drawn, but I will investigate.

I can confirm the memory increase. It should not be too hard to figure out what is failing to release.

My job is to make tools you love, with the features you want, and performance you can't live without.
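To make the pass arithmetic in the post above concrete, here is a minimal sketch of the estimate. The helper names `passes_per_frame` and `expected_drawn_instances` are hypothetical and not part of Ultra Engine; the per-pass multipliers (one depth prepass, one main pass, three passes per directional light) are taken directly from the post, and the sketch assumes every instance is visible in every pass.

```cpp
// Hypothetical helpers, not an Ultra Engine API. They estimate how many
// instance draws a stats counter would show if each visible instance is
// drawn once per render pass.

constexpr int passes_per_frame(int directional_lights)
{
    // 1 depth prepass + 1 main pass + 3 passes per directional light
    // (multipliers as stated in the post above).
    return 1 + 1 + 3 * directional_lights;
}

constexpr int expected_drawn_instances(int visible_instances, int directional_lights)
{
    return visible_instances * passes_per_frame(directional_lights);
}

// 100 instantiated planes + the original plane, one directional light.
static_assert(passes_per_frame(1) == 5, "five passes with one directional light");
static_assert(expected_drawn_instances(101, 1) == 505, "101 instances x 5 passes");
```

With 101 instances and one directional light the estimate is 505, which is the figure klepto2 reports later in the thread on 0.9.5. The original reading of about 1000 is roughly double that.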
Josh Posted February 15, 2024

Okay, I can confirm that twice as many meshes were being drawn as needed. This was an easy fix.

I am trying to track down the source of the memory increase. It does not appear to have anything to do with the rendering thread.
Josh Posted February 15, 2024

With this simple example that excludes the entities from any world we can see the mem usage is very stable.

    #include "UltraEngine.h"

    using namespace UltraEngine;

    int main(int argc, const char* argv[])
    {
        auto displays = GetDisplays();
        auto window = CreateWindow("Ultra Engine", 0, 0, 1280 * displays[0]->scale, 720 * displays[0]->scale, displays[0], WINDOW_CENTER | WINDOW_TITLEBAR);
        auto framebuffer = CreateFramebuffer(window);
        auto world = CreateWorld();

        //The pivot and its instances are created without a world
        shared_ptr<Entity> main_instance = CreatePivot(nullptr);
        vector<shared_ptr<Entity>> instances;

        while (window->Closed() == false and window->KeyDown(KEY_ESCAPE) == false)
        {
            instances.clear();
            for (int i = 0; i < 100; i++)
            {
                instances.push_back(main_instance->Instantiate(nullptr));
            }

            world->Update();
            world->Render(framebuffer);

            window->SetText("MEM: " + String(GetMemoryUsage() / 1024 / 1024) + " mb");
        }
        return 0;
    }
Josh Posted February 15, 2024

Okay, two more things I found:

- I added a global list of entities, but since it stores weak pointers it won't get cleaned up until the user calls GetEntities() if large numbers of entities are deleted. I added an internal call to this method in the world Update method, so it will always trim the list of dead entities.
- Your example is adding entities faster than the rendering thread can keep up. The entities are being added each frame, but the rendering thread keeps them around a little bit longer, so it is being overwhelmed.

There is a renderable entity limit of 65536 (some entities take up more than one slot, so it can be a little less than this). This is because the engine stores entity IDs as unsigned short integers on the GPU.

There is a tendency for things to grow a little bit when numbers of items fluctuate, due to the nature of how memory resizing is implemented. STL vectors, for example, get bigger when they need to, but they don't release memory if they are resized to a smaller size, with the idea that they may need to grow again. In the same manner, I don't ever make GPU storage buffers smaller. I just let them grow as needed, and if the application needs less space, I keep the extra memory as a reserve. As long as the memory display in Visual Studio looks flat after a few moments, you are good.

I will check to see if there is anything else I can improve and then do another build and upload these fixes.
    #include "UltraEngine.h"

    using namespace UltraEngine;

    int main(int argc, const char* argv[])
    {
        //Get the displays
        auto displays = GetDisplays();

        //Create a window
        auto window = CreateWindow("Ultra Engine", 0, 0, 1280 * displays[0]->scale, 720 * displays[0]->scale, displays[0], WINDOW_CENTER | WINDOW_TITLEBAR);

        //Create a framebuffer
        auto framebuffer = CreateFramebuffer(window);

        //Create a world
        auto world = CreateWorld();

        auto camera = CreateCamera(world);
        camera->SetClearColor(0.125);
        camera->SetFov(70);
        camera->Move(0, 2, -8);

        shared_ptr<Entity> main_instance = CreatePlane(world, 10.0, 10.0, 256, 256);
        //shared_ptr<Entity> main_instance = CreateModel(world);
        vector<shared_ptr<Entity>> instances;

        int n = 0;

        //Main loop
        while (window->Closed() == false and window->KeyDown(KEY_ESCAPE) == false)
        {
            instances.clear();

            //Only create a new batch of instances every 100th frame
            ++n;
            if (n == 100)
            {
                n = 0;
                for (int i = 0; i < 100; i++) instances.push_back(main_instance->Instantiate(world));
            }

            world->Update();
            world->Render(framebuffer);

            window->SetText("MEM: " + String(GetMemoryUsage() / 1024) + " kb");
        }
        return 0;
    }
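Two of the behaviours described in the post above, a list of weak pointers that only shrinks when it is explicitly trimmed and a vector that keeps its allocation after being emptied, can be reproduced with the standard library alone. The following is a minimal sketch under stated assumptions: `Entity`, `EntityRegistry`, `simulate` and `capacity_after_clear` are hypothetical names made up for illustration, and nothing here reflects how Ultra Engine actually implements its entity list.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for an engine entity; not the Ultra Engine class.
struct Entity { int id = 0; };

// Hypothetical registry that tracks entities without owning them.
class EntityRegistry
{
public:
    void track(const std::shared_ptr<Entity>& e) { entries_.push_back(e); }

    // Erase every weak pointer whose entity has already been destroyed.
    // Returns the number of entries removed.
    std::size_t prune()
    {
        const std::size_t before = entries_.size();
        entries_.erase(
            std::remove_if(entries_.begin(), entries_.end(),
                           [](const std::weak_ptr<Entity>& w) { return w.expired(); }),
            entries_.end());
        return before - entries_.size();
    }

    std::size_t size() const { return entries_.size(); }

private:
    std::vector<std::weak_ptr<Entity>> entries_;
};

// Mimics the loop from the thread: each frame drops the previous batch and
// creates a new one. Returns how many entries the registry holds at the end.
std::size_t simulate(int frames, int per_frame, bool prune_each_frame)
{
    EntityRegistry registry;
    std::vector<std::shared_ptr<Entity>> instances;
    for (int f = 0; f < frames; ++f)
    {
        instances.clear();  // destroys last frame's entities
        for (int i = 0; i < per_frame; ++i)
        {
            instances.push_back(std::make_shared<Entity>());
            registry.track(instances.back());
        }
        if (prune_each_frame) registry.prune();  // analogous to trimming in Update
    }
    return registry.size();
}

// clear() empties a vector but keeps its allocation as a reserve.
std::size_t capacity_after_clear(std::size_t n)
{
    std::vector<int> v(n);
    v.clear();
    return v.capacity();
}
```

Under these assumptions, `simulate(10, 100, false)` leaves 1000 entries in the registry because dead weak pointers are never removed, while `simulate(10, 100, true)` leaves only the 100 live ones. `capacity_after_clear(1000)` still reports room for at least 1000 elements, which is the kind of retained reserve the post describes.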
Josh Posted February 15, 2024

An update for this is available now.
klepto2 Posted March 30, 2024

@Josh I need to reopen this. The top sample consumes a lot of memory again (0.9.5), and after a short time it also produces INVALID_VALUE errors. I know the rendering is async, but the instance count always says 505 instances instead of 100 (maybe +1 for the camera).