Hi,
Hope you are doing well.
The /etc/mwg-config entry THRESH_LOAD is evaluated by /usr/bin/mwg-monitor.
More specifically, it is evaluated by a library that the script loads here:
require 'mwg-config/healthmon'
Which can be found here:
/usr/lib/ruby/1.8/mwg-config/healthmon.rb
Searching for THRESH_LOAD in this script you’ll find:
when 'THRESH_LOAD'
@@thresh_load = value.to_f
Searching again for the variable thresh_load, these are the relevant sections in healthmon.rb:
Default definition:
@@thresh_load = 3.0
Evaluation:
def load_check
  if File.readable? '/proc/loadavg'
    proc_loadinfo = File.readlines("/proc/loadavg")
    loads = proc_loadinfo.first.split(' ')
    load_1m = loads[0].to_f / @ncpu
    load_5m = loads[1].to_f / @ncpu
    load_15m = loads[2].to_f / @ncpu
    # puts "%.2f %.2f %.2f => %.2f %.2f %.2f" % [ loads[0].to_f, loads[1].to_f, loads[2].to_f, load_1m, load_5m, load_15m ]
    if load_5m > @@thresh_load and load_1m > load_5m and load_5m > load_15m # load is becoming higher
      message = "5 minute load average exceeds selected limit (%.2f / %.2f)." % [ load_5m, @@thresh_load ]
      Logger.warning(message)
      report(LOAD_INCIDENT,message,WARN)
    else
      Logger.debug("load (#{proc_loadinfo.first.strip}) is in recommended range (%.2f / %.2f)." % [ load_5m, @@thresh_load ])
    end
  end
end # load_check
The evaluation takes the 2nd value reported in /proc/loadavg (the 5 minute load), divides it by the number of CPUs, and compares the result with THRESH_LOAD.
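To make the condition easier to try out with your own numbers, here is a minimal standalone sketch of the same comparison. It is not the shipped code: the method name load_alert? and its parameters are made up for illustration, and ncpu is assumed to correspond to the @ncpu value used in healthmon.rb.

```ruby
# Standalone sketch of the comparison done in load_check (illustration only).
# loadavg_line: first line of /proc/loadavg, e.g. "0.20 0.23 0.67 1/123 4567"
# ncpu:         number of CPUs used for normalization (assumed to match @ncpu)
# threshold:    THRESH_LOAD, 3.0 by default as in healthmon.rb
def load_alert?(loadavg_line, ncpu, threshold = 3.0)
  # Normalize the 1, 5 and 15 minute values by the CPU count
  load_1m, load_5m, load_15m =
    loadavg_line.split(' ').first(3).map { |v| v.to_f / ncpu }
  # Alert only if the 5 minute load is over the threshold AND the load is rising
  load_5m > threshold && load_1m > load_5m && load_5m > load_15m
end

# Rising load on 2 CPUs: 7.0 / 2 = 3.5, which is above 3.0
puts load_alert?("8.00 7.00 6.00 3/250 1234", 2)
# Same raw load on 8 CPUs: 7.0 / 8 = 0.875, which is below 3.0
puts load_alert?("8.00 7.00 6.00 3/250 1234", 8)
# High but falling load on 2 CPUs: the 1 minute value is below the 5 minute value
puts load_alert?("6.00 7.00 8.00 3/250 1234", 2)
```

The first call reports an alert, the other two do not, which shows that both the CPU count and the direction of the load matter.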
I also looked into the following to understand it better:
When you look at /usr/lib/ruby/1.8/mwg-config/healthmon.rb and search for loadavg, you will find that the load is already normalized according to the number of CPUs.
Should THRESH_LOAD be based on the number of physical cores, or on the number of threads (virtual cores)? I think Hyper-Threading is enabled on MLOS, so each physical core equals 2 virtual cores. The answer is below:
The nominal value is the number of logical cores, not the number of physical CPUs (see also the links above).
You can set THRESH_LOAD to a value that fits the number of logical cores.
How many CPU cores are present on your MWG?
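If you are not sure about the core count, a small sketch like the one below can help. It counts the "processor" entries in /proc/cpuinfo, which on Linux lists one entry per logical CPU. The method names are hypothetical, and I am only assuming that this count matches what healthmon.rb uses as @ncpu; the source above does not show how @ncpu is determined.

```ruby
# Count logical CPUs in /proc/cpuinfo-style text (one "processor : N" line each).
def count_logical_cpus(cpuinfo_text)
  cpuinfo_text.lines.grep(/^processor\s*:/).size
end

# Raw (not normalized) 5 minute load at which the normalized value
# reaches the threshold: threshold * number of logical CPUs.
def raw_load_at_threshold(threshold, ncpu)
  threshold * ncpu
end

if File.readable?('/proc/cpuinfo')
  ncpu = count_logical_cpus(File.read('/proc/cpuinfo'))
  puts "logical CPUs: #{ncpu}"
  puts "raw 5 minute load at THRESH_LOAD 3.0: #{raw_load_at_threshold(3.0, ncpu)}"
end
```

For example, with 8 logical CPUs the default threshold of 3.0 corresponds to a raw 5 minute value of 24.0 in /proc/loadavg.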
The default threshold of 3.0 is very conservative, so exceeding it does not necessarily mean the system is overloaded.
For your information, you can change the default load average threshold in the mwg-monitor configuration file:
https://community.mcafee.com/t5/Web-Gateway/how-to-increase-load-average-limit-to-avoid-quot-5-Minut...
The monitor daemon scans the output of /proc/loadavg (it has to do with the number of processes in the run queue and/or waiting for I/O). This shows three values at the beginning of the line, e.g.
0.20 0.23 0.67
The first column is the load for the last minute, the second for the last 5 minutes, and the third for the last 15 minutes.
The alert is displayed if
1. the 5 minute load exceeds the (currently hard-coded) threshold
2. the 1 minute load is greater than the 5 minute load
3. the 5 minute load is greater than the 15 minute load
The latter two are meant to detect a situation of increasing load.
In my opinion the threshold at which the incident is raised is too low. The threshold is "3", which is something you easily hit when some "expensive" tasks are running. Usually this is nothing you should be concerned about; logged values of, for example, "4.5" and "6" are actually not too bad. Load becomes an issue when it is permanently at a high level, or keeps ramping up without coming down again.
Basically, if you do not receive complaints about performance, I am pretty sure there is no need for concern.
Was my reply helpful? If you find this post useful, please give it a Kudos! Also, please don't forget to select "Accept as a solution" if this reply resolves your query!
Regards
Alok Sarda