Genius Vision KB - How to handle Out-of-Memory issue
How to handle Out-of-Memory issue
Overview
Memory is an essential component of any working computer system, and an NVR server is no exception. Improper or incorrect system design can exhaust memory and generate an "Out of memory" error. As with all resource-bound issues, there is very little that a software developer (like Genius Vision) can do to resolve this type of issue or prevent it from happening. One must rely on skillful and experienced system designers for that, and their expertise is beyond the scope of a software developer.
This article is not intended to resolve "Out of memory" issues, but rather to provide an architectural overview of Genius Vision NVR memory usage, which can be an important clue for finding the exact cause.
Memory Usage Architecture
The block diagram below illustrates the NVR software memory usage architecture:
This diagram is fairly straightforward and implies the following important clues:
- Recording capacity: For a proper system design, the bandwidth available for recording data must be higher than the total camera video input. If the recording bandwidth is lower than the total camera video input, data inevitably accumulates in the PC's memory and eventually leads to an "Out of memory" error.
- Playback capacity: Since recording and playback share the same disk bandwidth, this must be taken into consideration carefully. If too many people perform playback tasks simultaneously, the hard disk bandwidth can jam, which leads to memory accumulation.
- CPU capacity: If the CPU of the PC is too busy with other tasks, this can also cause memory accumulation.
Memory accumulation
In a technical sense, the term "memory accumulation" refers to the temporary state in which the software receives more input than it can output, so it must extend its memory usage in order to continue working correctly.
Within the safe range of memory usage, "memory accumulation" is an acceptable phenomenon, as long as it stops before the system runs out of memory and the software's memory usage can then return to its normal operating range.
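The behavior described above can be sketched as a small simulation. This is only an illustration and not how the NVR is implemented: the function name `simulate_buffer`, the per-second rates, and the default limit of 1300 MB (borrowed from the 1.3GB heuristic discussed in the 32-bit section of this article) are all assumptions made for the example.

```python
def simulate_buffer(input_mb_per_s, write_mb_per_s, limit_mb=1300.0):
    """Return the buffered amount (MB) after each simulated second.

    Both arguments are per-second sequences. The simulation stops early
    if the backlog reaches limit_mb (an assumed, heuristic limit).
    """
    buffered = 0.0
    history = []
    for incoming, writable in zip(input_mb_per_s, write_mb_per_s):
        # The backlog grows by what arrives and shrinks by what the disk accepts.
        buffered = max(0.0, buffered + incoming - writable)
        history.append(buffered)
        if buffered >= limit_mb:
            break  # this is where an "Out of memory" error would be expected
    return history


# Hypothetical scenario: 5 MB/s of video, and a disk that normally accepts
# 8 MB/s but stalls completely for 10 seconds in the middle of the run.
cameras = [5.0] * 30
disk = [8.0] * 10 + [0.0] * 10 + [8.0] * 10
history = simulate_buffer(cameras, disk)
print(max(history), history[-1])  # prints: 50.0 20.0
```

In this scenario the backlog peaks at 50 MB during the stall and then drains again, which is the acceptable kind of accumulation. If the disk were permanently slower than the input, the backlog would keep growing until it hit the limit.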
32-bit application memory limitation
Modern 32-bit Windows operating systems provide access to a maximum of 4GB of memory and leave only 2GB for user-mode applications. In practice, memory can fragment while the software runs, so in our experience the memory of the NVR server process should not exceed 1.3GB, or you risk encountering an "Out of memory" error.
Note: The aforementioned 1.3GB is not a fixed number but rather a heuristic value. The actual limit depends on operation and environment. Also, 32-bit applications running under a 64-bit operating system are still subject to all 32-bit memory limitations.
Common Issues Encountered
In this section we explain, based on the memory usage architecture described above, some common issues that can cause an "Out of memory" error. Note that this is not an exhaustive list, and there could be other reasons behind this type of error.
Too many cameras recording, or bitrate too high
A total video input bandwidth that exceeds the disk recording bandwidth will inevitably cause "Out of memory".
Too many playback files
Remember that there is a limit to everything.
To prepare the archiver subsystem for fast playback, the memory required is proportional to the number of existing recorded files. The memory is used simply for file indexing.
For example, for the same amount of recorded data, the indexing memory required with a 200MB file size is 5 times the memory required with a 1GB file size. For very large disk arrays, such as 20TB, 5 times the file count could be a major disaster.
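The factor of 5 follows directly from the file counts. The sketch below works through the arithmetic for the 20TB example; it assumes decimal units (1TB = 1,000,000MB, 1GB = 1000MB) and a completely full array, and the function name `recorded_file_count` is made up for illustration. The actual memory used per indexed file is not stated in this article, so only the file counts are computed.

```python
def recorded_file_count(array_size_tb, file_size_mb):
    """Number of recorded files needed to fill the array (decimal units)."""
    total_mb = array_size_tb * 1_000_000  # assumption: 1 TB = 1,000,000 MB
    return total_mb // file_size_mb


small_files = recorded_file_count(20, 200)    # FileSize_MB = 200
large_files = recorded_file_count(20, 1000)   # FileSize_MB = 1000 (1GB)
print(small_files, large_files, small_files / large_files)
# prints: 100000 20000 5.0
```

Since indexing memory is proportional to the number of files, 100,000 files need five times the indexing memory of 20,000 files.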
Too many playback files caused by FileSize_MB too small
Many default values in Storage Config are computed carefully and automatically by the software. Unless you are absolutely sure what you are doing, you should not change any of them. One parameter that deserves extra attention is FileSize_MB. Some users, for unknown reasons, like to change FileSize_MB to a very small value (such as 200MB). This increases memory loading significantly, as explained under "Too many playback files" above.
If you changed FileSize_MB a long time ago, the software may have generated so many recorded files that memory overflows. In this case, you must unload the files to other locations, otherwise you won't be able to run the system properly. By "unload", we mean you must physically move those files out of the live system.
Recovering the moved files and playing them back is relatively easy in Genius Vision NVR. Please refer to the related document about "automatic disaster recovery". Briefly, a live NVR system can automatically detect any recorded files attached to the system by scanning all "gvrec" subdirectories on each drive.
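Before unloading anything, it can help to see how many recorded files each drive actually holds. The following is a minimal sketch for that purpose and is not part of Genius Vision NVR. It assumes that the recording directory is named "gvrec" and sits directly under each given root (as in the "D:\gvrec" example), and that every regular file below it counts as a recorded file; the real on-disk layout may differ. The function name `count_recorded_files` is hypothetical.

```python
from pathlib import Path


def count_recorded_files(drive_roots, dir_name="gvrec"):
    """Count regular files under <root>/<dir_name> for each given root.

    Roots without such a directory are skipped. The directory layout is
    an assumption for illustration, not a documented NVR format.
    """
    counts = {}
    for root in drive_roots:
        rec_dir = Path(root) / dir_name
        if rec_dir.is_dir():
            counts[str(rec_dir)] = sum(
                1 for entry in rec_dir.rglob("*") if entry.is_file()
            )
    return counts
```

For example, `count_recorded_files(["D:\\", "E:\\"])` would return a dictionary mapping each existing "gvrec" directory to its file count, which shows at a glance which drive contributes most to the indexing load.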
If you accidentally changed any Storage Config parameters of a particular recording directory, you can restore the default values by deleting the changed entry (such as "D:\gvrec") and clicking OK. The system will automatically generate the default values again.
Installing more than 4GB on a 32-bit or 64-bit operating system while still using 32-bit NVR software
If you feel memory is insufficient and install 32GB of memory on your 64-bit Windows, remember that you may still be running 32-bit NVR software, which is still subject to the 32-bit application memory limitation regardless of whether it is installed on 32-bit or 64-bit Windows. This is an intrinsic technical constraint of any 32-bit application, and it is easy to verify with an Internet search.
NOTE: If you don't understand this thoroughly, you can get a false sense of security that you have "physically" solved the memory limitation when in reality you haven't. You can install 32GB of physical memory and still be limited by the 2GB available to user-mode applications. Again, this is not a limit imposed by us (we are only a software application developer); it is an intrinsic limit of technologies that evolved over the last decades.
The only product we have that supports native 64-bit application is Genius Vision (x64) NVR Enterprise Edition. The license price & structure of Enterprise Edition is different, and currently it's still under BETA testing.
Disks too old or partially damaged
Hard disks are known to wear out after prolonged usage. Building an NVR server with old disks is particularly risky because old disks may contain bad sectors that can lead to data corruption. Some disks are equipped with a self-repair function that automatically moves data from bad sectors to good sectors, so the damage does not show in an obvious way, but the self-repair mechanism takes time and therefore reduces the overall bandwidth. Damaged read-write heads can also make the disk much slower.
To reduce this risk, you should periodically perform maintenance on your NVR system by measuring disk performance. If the performance has degraded since the last measurement, it could be a danger signal that the disk is about to be irreversibly damaged.
In summary, old or damaged disks can slow down the overall recording bandwidth and eventually cause an "Out of memory" error.
Improper RAID configuration
Some cheap RAID controllers perform poorly when configured as a disk array. Some use memory buffers to show high performance during a sales demonstration but cannot sustain prolonged recording operation (after a long time in operation they may slow down significantly). Remember that running an NVR means prolonged high-bandwidth usage, which some cheap RAID controllers cannot deliver. Careful stress testing must be conducted on the selected RAID controller to prevent this type of issue.
Using RAID-5 or RAID-6 could be a bad idea
Some RAID controllers give very poor performance on RAID-5 or RAID-6. This issue is very common with low-end RAID controllers.
Don't blindly trust the performance of your RAID controller. You have to test its actual performance over prolonged hours. (See Measure actual bandwidths, and Improper RAID configuration.)
Additionally, any RAID system can enter a state of "rebuilding" or "restoring redundancy". During this state, the RAID system is very busy with other tasks. Recording video to the RAID system at the same time could be risky (due to Memory accumulation).
It is also worth noting that because any RAID system accesses all drives at the same time to read or write a single block, the wear on the disks is much higher than in a system without RAID. In a system without RAID, only one disk is accessed for a single block of data. In other words:
- The average lifespan of disks mounted in a RAID is much shorter than that of disks used without RAID.
Too much playback loading while recording
Playback also consumes disk bandwidth, which is shared with the recording bandwidth. Make sure you keep simultaneous playback loading within a safe range by proper calculation or testing. Remember that a disk operates much more slowly when reading and writing at the same time (playback and recording) than when only writing (recording).
Local view loading too high
Local view on NVR servers sometimes consumes a lot of CPU and can cause recording data to accumulate in memory. This must also be watched.
Improper System Configuration
Every computer system has implicit limits that might not be explicitly documented. The following is a list of examples of such cases:
- Too many configuration items added (cameras, channels, users, maps, policies, etc.).
- Too much recording space added (which requires an index table that exceeds available memory).
- Too many network connections, with too heavy a load on each connection.
- Too many tasks performed concurrently.
- Other programs running in the same server that also consume memory.
Strategies
Measure actual disk performance
For the various kinds of capacity, we suggest using professional, purpose-built tools to measure the actual numbers. For example, CrystalDiskMark is a tool commonly used to measure disk performance under various test conditions. You need to set up test cases that resemble the amount of data that will really be used, and let them run for a relatively long time to resemble the loading of a continuously running NVR server.
Remember that disks are known to wear out after prolonged usage, so disk performance will degrade over time. You must carry out periodic maintenance to make sure all disks are in good health.
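As a rough illustration of what such a measurement involves, the sketch below times a sequential write and compares the result against an earlier measurement. It is not a replacement for a dedicated tool such as CrystalDiskMark: it measures only sequential writes, and the function names, the default sizes, and the 20% degradation tolerance are all assumptions chosen for the example.

```python
import os
import time


def measure_write_mb_per_s(path, total_mb=64, block_kb=1024):
    """Write total_mb of random data to path and return the rate in MB/s.

    The data is flushed and fsync'ed before the timer stops, so the
    operating system's write cache does not inflate the result.
    The test file is deleted afterwards.
    """
    block = os.urandom(block_kb * 1024)
    blocks = (total_mb * 1024) // block_kb
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(blocks):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())
    elapsed = time.perf_counter() - start
    os.remove(path)
    return (blocks * block_kb / 1024) / elapsed


def has_degraded(previous_mb_per_s, current_mb_per_s, tolerance=0.2):
    """True if the current rate is more than `tolerance` below the previous one."""
    return current_mb_per_s < previous_mb_per_s * (1.0 - tolerance)
```

To resemble a continuously running NVR server, `total_mb` would have to be far larger than the default and the measurement repeated over hours. Keeping the result of each maintenance run makes it possible to call `has_degraded` against the previous one.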
Measure actual network bandwidth usage
Many readily available tools on the Internet can help you do this. Examples include "iperf" and "NetMeter". They give you an objective view of how much bandwidth you are feeding into the NVR.
Divide and Conquer
To diagnose a problem in any complex system, it is wise to divide the system into smaller pieces that can operate independently. There should be a boundary at which a smaller system works while a larger system doesn't. That boundary is an important clue to your problem.
For example:
- If you have a 100-camera system that has a problem, try reducing it to 50 cameras and see if the problem persists.
- If you have a 50TB recording drive, try reducing it to 25TB or less. If you have a RAID controller, try using the system without it.
- If you have a whole year of recorded video stacked in the NVR, remember it could take a lot of memory just to index those files. Try to "unload" the recorded video to somewhere else.
- If you have 10 users monitoring live video and playback concurrently, try reducing that to 5 or fewer.
- If you are using multiple functions at the same time (intelligent detection, smart search, sync-playback, remote live view), try not to do that. Keep in mind these tasks, albeit convenient, are very resource consuming. A system multiplexing among too many tasks could easily exceed its capacity, no matter how good your hardware is.
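The halving steps in the examples above amount to a binary search for the boundary between a working and a failing system. The sketch below shows that search in the abstract. The function name `largest_working_size` and the `works` callback are hypothetical: in practice, each call to `works` means reconfiguring the system to that size and observing it long enough to be confident, which may take hours or days. The search also assumes that if a given size works, every smaller size works too.

```python
def largest_working_size(full_size, works):
    """Return the largest n in 0..full_size for which works(n) is True.

    Assumes size 0 trivially works and that works() is monotonic:
    once a size fails, all larger sizes fail as well.
    """
    low, high = 0, full_size
    while low < high:
        mid = (low + high + 1) // 2  # round up so the loop always makes progress
        if works(mid):
            low = mid        # mid is fine, the boundary is at mid or above
        else:
            high = mid - 1   # mid fails, the boundary is below mid
    return low


# Hypothetical example: a 100-camera system that is only stable up to 62 cameras.
print(largest_working_size(100, lambda cameras: cameras <= 62))  # prints: 62
```

For 100 cameras this needs at most 7 trials instead of up to 100, which matters when each trial is a long observation period.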
Scientific method
Please refer to this article.
Case study: Out of memory caused by partially damaged disk
Synopsis
Here we present a real-world case that happened right in our lab: an Out of Memory problem that was later diagnosed as being caused by a partially damaged disk.
Symptoms
After running smoothly for months, one of the NVRs in our example system started to experience out of memory problems. The failures were random: sometimes it ran for days consecutively, and sometimes it crashed within a few minutes. NVR system logs revealed that it crashed because of out of memory. We also noticed corrupted recording slots, which usually indicate disk failure. Video data is cached in memory before it is written to disk, so any disk failure can cause video data to accumulate in memory and lead to out of memory. Using Task Manager, we could easily verify our theory.
Investigation
Press CTRL+ALT+DEL to invoke Task Manager and use Resource Monitor for a more detailed report.
The Memory tab shows the typical graph of growing memory usage. Note that the process GvActiveX.exe uses a total of 1.6 GB of memory.
The Disk tab shows a decrease in the write rate. This NVR has four channels installed and usually has a total bitrate of 40 Mbps. Keep in mind that 'B/sec' and 'bps' are different. IP cameras usually use 'bps', which stands for bits per second, while the 'B' in hard disk performance figures means bytes. 1 byte equals 8 bits, so this NVR should be writing at about 5 MB/sec. Resource Monitor constantly shows only 100 to 200 KB/sec, which is obviously too low.
Now look at a single disk. The blue line, which shows Highest Active Time, is always at 100%. The idea is similar to CPU time: it indicates how busy the disk is. A constant 100% Highest Active Time means that the disk cannot complete requests in time. We disabled the disk in the NVR's Storage Config, and memory no longer accumulated.
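The unit conversion above, and a rough idea of how quickly the shortfall fills memory, can be worked out as follows. This is a back-of-the-envelope sketch under stated assumptions: the upper observed figure of 200 KB/sec is taken as 0.2 MB/sec, the 1.3GB heuristic from earlier in this article is taken as 1300 MB, the process is assumed to start with an empty buffer, and the bitrate is assumed constant. The function name is made up for illustration.

```python
def mbps_to_mbytes_per_s(mbps):
    """Convert megabits per second to megabytes per second (8 bits per byte)."""
    return mbps / 8.0


expected_write = mbps_to_mbytes_per_s(40)   # what the disk should absorb
observed_write = 0.2                        # MB/s, upper end of 100 to 200 KB/sec
backlog_rate = expected_write - observed_write    # MB/s piling up in memory
seconds_to_limit = 1300 / backlog_rate            # assumed 1300 MB heuristic limit

print(expected_write, round(seconds_to_limit))  # prints: 5.0 271
```

Under these assumptions the backlog would reach the heuristic limit in roughly four and a half minutes, which is consistent with the crashes that occurred within a few minutes of starting.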
Now we have pinned down the root of the problem to a single hard drive. By replacing that drive, the NVR is back to normal again. Hard drive malfunction is very common for NVR systems, considering the amount of hard drives it uses and the rate of data writing.
Hot swap drive trays make life a lot easier. Trayless is even better.
Conclusion
Such a problem could take months to fix if you don't know how to approach it, but it is really not that difficult. The basic principle of troubleshooting is divide-and-conquer. There are only a handful of components in a PC: disk, memory, CPU, motherboard, network adapter… In our experience, most out-of-memory issues are related to hard drives. Our case is relatively simple. Many NVR systems put a lot more stress on a single PC, and some have complicated storage systems such as RAID. So please keep maintainability in mind when designing NVR systems (or any system). A maintainable system is one that is easy to manage, to diagnose, and to replace or fix.