WIN10 Memory management Stress Test

Vincent Burel
Site Admin
Posts: 2008
Joined: Sun Jan 17, 2010 12:01 pm

Post by Vincent Burel »

WIN10 brought several regressions in overall stability and performance compared to WIN7, especially for real-time processing and consequently for audio processing and audio streaming. The one we are tracking here is the system page fault interruption management, which can generate long delays and create audio glitches during simple memory operations.

To demonstrate this, we created a minimal program simulating an audio thread that copies 128 channels into an internal BUS and into 128 delay lines. This is a very simple task done by any DAW working with 128 I/O or tracks. In terms of memory load, it is like copying 256 KB every 5 ms into 2 + 128 address ranges. This is a very basic task that we were already doing on a Pentium I with PC100 RAM in the 90's.

For each frame, the program measures the time spent performing these “mem-copy” operations and counts an incident whenever that time is above 80% of the real-time budget. We are simulating a real-time stream with 256-sample buffers at 48 kHz, so we must process each buffer in less than 5.3 ms (80% of 5.3 ms gives about 4.2 ms).
[Screenshot: MMStressTest_WIN7_vs_VIN10.gif]
As shown in the screenshot above, WIN10 can generate long interruptions: the biggest one detected in less than 20 minutes lasted 186 ms, which is an eternity for a real-time processing task.

NOTE: The test can also be performed on an optimized Windows configuration, with as many Windows options and services as possible removed (no virtual memory, no OneDrive, no Windows Update, no visual effects, power management set to maximum, no firewall, no Defender…). The DSP peak problem remains.

To run the test on your computer right now, the program (EXE) can be downloaded directly here:
https://download.vb-audio.com/Download_ ... _v1000.zip
Just run it for several hours (12 or 24 hours) or until you get the first incident (it will be displayed in red).

The source code of the Real Time Task performed is very simple:

Code: Select all

void MyAudioCallback(LPT_APPCTX lpapp, void * lpBuffer, int nbSample)
{
	int * lpSource, *lpTarget;
	int vi, fTurn;

	//simple BUS copy (on its own, this is not enough to make the problem appear)
	lpSource = (int*)lpBuffer;
	lpTarget = lpapp->pInternalBUS2;
	for (vi=0;vi<NB_CHANNEL_IN_BUS;vi++)
	{
		memcpy(lpTarget, lpSource, sizeof(int)*nbSample);
		lpSource=lpSource+nbSample;
		lpTarget=lpTarget+nbSample;
	}
	//copy to delay lines (a regular delay line copy, as used in a multi-track recorder, for the DTR process for example).
	fTurn=0;
	lpSource = (int*)lpBuffer;
	for (vi=0;vi<NB_CHANNEL_IN_BUS;vi++)
	{
		lpTarget = lpapp->pDelayLineBuffer[vi];
		lpTarget = lpTarget + (lpapp->pDelayLineBuffer_nu[vi] * BUFFER_SIZE);

		memcpy(lpTarget, lpSource, sizeof(int)*nbSample);
		lpSource=lpSource+nbSample;

		lpapp->pDelayLineBuffer_nu[vi]++;
		if (lpapp->pDelayLineBuffer_nu[vi] >=DELAYLINE_NB_BUFFER) 
		{
			lpapp->pDelayLineBuffer_nu[vi]=0;
			fTurn=1; //count number of memory restart address
		}
	}
	if (fTurn != 0) lpapp->pDelayLineBuffer_turn++;
}
Download Link to get complete Source Code (WIN32 Minimal TEST Program):
https://download.vb-audio.com/Download_ ... _v1000.zip

CONCLUSION: It seems WIN10 has never been able to support audio streaming correctly up to now, since Microsoft changed something in the memory management that can generate a huge time penalty and break any real-time processing (audio or video). The WIN10 RS4 update (February 2018) is expected to fix this problem. The program above will help us to validate it anyway...

WIN10 - Virtual Locked Memory ???

Post by Vincent Burel »

Regarding the problem of page fault interruption management under WIN10, we can maybe expect to avoid the dumb PAGE OUT process of the operating system by using the VirtualAlloc / VirtualLock functions. This very new and strange recommendation has been followed by no one until now, because VirtualAlloc / VirtualLock is not the usual way to allocate memory under Windows (for audio or video game applications as well as for driver interfaces or any DLL components).

However, we made some tests with it, replacing memory that would otherwise have been allocated by a regular malloc or the "new" operator.

You can replace the malloc / free functions with VirtualAlloc / VirtualFree by using the following validated source code:

Code: Select all

#include <windows.h>

#define VBT_INT32 int
static VBT_INT32 G_VirtualInit=0; //set to 1 by VB0_VirtualLockedInit()

void * VBCALLCONV VB0_VirtualLockedMalloc(VBT_INT32 nbByte)
{
	DWORD nErr;
	int rep;
	void * lptr, * lpResult;
	if (G_VirtualInit == 0) return NULL;
	lptr=VirtualAlloc(NULL, nbByte, MEM_RESERVE, PAGE_NOACCESS);
	if (lptr == NULL) return NULL;
	lpResult=VirtualAlloc(lptr, nbByte, MEM_COMMIT, PAGE_READWRITE);
	if (lpResult == NULL)
	{
		VirtualFree(lptr,0, MEM_RELEASE);
		return NULL;
	}
	rep= VirtualLock(lptr,nbByte);
	if (rep == 0) 
	{
		nErr=GetLastError();
		VirtualFree(lptr,0,MEM_RELEASE);
		return NULL;
	}
	return lptr;
}
The matching free function can be written like this:

Code: Select all

void VBCALLCONV VB0_VirtualLockedFree(void * lptr)
{
	BOOL fOk;
	SIZE_T rep;
	MEMORY_BASIC_INFORMATION info;

	memset(&info,0, sizeof(MEMORY_BASIC_INFORMATION));
	rep=VirtualQuery(lptr,&info,sizeof(MEMORY_BASIC_INFORMATION));
	if (rep != 0)
	{
		VirtualUnlock(lptr, info.RegionSize);
		fOk=VirtualFree(lptr, 0,MEM_RELEASE);
	}
	else fOk=VirtualFree(lptr,0,MEM_RELEASE);
}
Before using these functions, we must call SetProcessWorkingSetSize() to define the virtual locked memory quota. This is a problem because you might not know how much memory your application will require in the end (it depends on the user workflow). In reality, this function works as a pre-allocation function that reserves the amount of memory given as the minimum size (MinMB in the function below), which can cause various problems when using external modules (DLL) if too much memory is reserved for virtual locking.

Code: Select all

VBT_INT32 VBCALLCONV VB0_VirtualLockedInit(__int64 MinMB, __int64 MaxMB)
{
	SIZE_T min,max;
	VBT_INT32  rep;
	HANDLE hProcess;
	min = (SIZE_T)(MinMB * 1024 * 1024);
	max = (SIZE_T)(MaxMB * 1024 * 1024);
	hProcess=GetCurrentProcess();
	rep=SetProcessWorkingSetSize(hProcess, min, max);
	if (rep == 0) return GetLastError();
	else 
	{
		G_VirtualInit=1;
		return 0;
	}
}

WIN10 will support audio in April 2018

Post by Vincent Burel »

The memory management bug seems to have been corrected in Insider build 17074, which will be released as an official WIN10 update around April 2018...

This left us time to run some tests on the famous VirtualLocked memory, but they show that it is simply not applicable... and has never been applied. If all memory used in a real-time audio thread had to be “Virtual Locked”, why did Microsoft itself not use it? The audio buffers used or provided by the audio interface are not Virtual Locked. The WASAPI examples do not use VirtualLock on their memory or on any audio buffer, and even audio drivers do not use Virtual Locked memory...

Anyway, we made a VirtualLock test implementation in our MT128, allocating all the memory with VirtualAlloc / VirtualLock. It improved things, but we still got some DSP peaks, because the audio buffers allocated by the audio driver/interface were not virtual locked. This is the condition for using VirtualLocked memory: all components involved in the real-time audio processing must use VirtualLocked memory, including DLLs, plug-ins and drivers. But this has never been the case, and it is not the case today.

The last problem concerns VirtualLock itself, because it needs a kind of pre-allocation or pre-reservation for the entire process (see the SetProcessWorkingSetSize function)… For an audio application, a DAW or a video game, it is already complicated to know how much memory will be needed. So how can we know this for external modules, plug-ins or drivers? This way of managing memory allocation is absolutely not planned for in today’s applications… So where does this recommendation come from?

In conclusion, even if using VirtualLocked memory can fix our current issue (long page fault interruptions in a real-time process), this method cannot be applied today in an end-user application context.