Former Member
Message 1 of 2

5 minute load average exceeds selected limit

I was wondering if someone could clarify what this load limit is and what it means, and whether there is any cause for concern when it is flagged this many times.
1 Reply
aloksard
Employee
Message 2 of 2

Re: 5 minute load average exceeds selected limit

Hi,

 

Hope you are doing well.

 

The /etc/mwg-config entry THRESH_LOAD is evaluated by /usr/bin/mwg-monitor, or more precisely by a package that it includes:
 
require 'mwg-config/healthmon'
 
Which can be found here:
 
/usr/lib/ruby/1.8/mwg-config/healthmon.rb
 
Searching for THRESH_LOAD in this script, you'll find:
                when 'THRESH_LOAD'
                        @@thresh_load = value.to_f
 
Looking again for the variable thresh_load, these are the relevant sections of healthmon.rb:
 
Default definition:
                @@thresh_load                           = 3.0
 
Evaluation:
                def load_check
                        if File.readable? '/proc/loadavg'
                                proc_loadinfo = File.readlines("/proc/loadavg")
                                loads = proc_loadinfo.first.split(' ')
                                load_1m = loads[0].to_f / @ncpu
                                load_5m = loads[1].to_f / @ncpu
                                load_15m = loads[2].to_f / @ncpu

                                # puts "%.2f %.2f %.2f => %.2f %.2f %.2f" % [ loads[0].to_f, loads[1].to_f, loads[2].to_f, load_1m, load_5m, load_15m ]

                                if load_5m > @@thresh_load and load_1m > load_5m and load_5m > load_15m # load is becoming higher
                                        message = "5 minute load average exceeds selected limit (%.2f / %.2f)." % [ load_5m, @@thresh_load ]
                                        Logger.warning(message)
                                        report(LOAD_INCIDENT,message,WARN)
                                else
                                        Logger.debug("load (#{proc_loadinfo.first.strip}) is in recommended range (%.2f / %.2f)." % [ load_5m, @@thresh_load ])
                                end

                        end
                end # load_check
 

The evaluation compares the threshold with the second value reported in /proc/loadavg (the 5 minute load average), after dividing it by the number of CPUs.
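
To make the normalization concrete, here is a minimal standalone sketch. It is not the healthmon.rb code itself: the method name normalized_loads and the sample line are made up for illustration, and it assumes a reasonably recent Ruby rather than the 1.8 interpreter on the appliance.

```ruby
# Illustrative helper, not part of mwg-config: parse one /proc/loadavg line
# and divide the 1, 5 and 15 minute averages by the number of CPUs.
def normalized_loads(loadavg_line, ncpu)
  raise ArgumentError, 'ncpu must be positive' unless ncpu > 0
  raw = loadavg_line.split(' ').first(3).map { |v| v.to_f }
  raw.map { |v| v / ncpu }
end

# Hypothetical sample: a raw 5 minute load of 8.00 on a system with 4 CPUs.
sample = '10.00 8.00 6.00 3/512 12345'
load_1m, load_5m, load_15m = normalized_loads(sample, 4)
puts '%.2f %.2f %.2f' % [load_1m, load_5m, load_15m]  # prints "2.50 2.00 1.50"
```

In this example the raw 5 minute load of 8.00 becomes 2.00 after normalization, so it stays below the default threshold of 3.0.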
 
 

I also looked at the following to understand this better:

If you open /usr/lib/ruby/1.8/mwg-config/healthmon.rb and search for loadavg, you will find that the load is already normalized by the number of CPUs.
 
 
 
Should THRESH_LOAD be based on the number of physical cores, or on the number of threads (virtual cores)? I think Hyper-Threading is enabled on MLOS, so each physical core equals two virtual cores. Below is the answer:
 
 
 
The nominal value is the number of logical cores, not the number of physical CPUs (see also the links above).
 

You can set the threshold to a value that relates to the number of logical cores.
 
 
 
How many CPU cores are present on your MWG?
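
If it helps, the following rough sketch counts logical CPUs on a Linux system and shows which raw load figure corresponds to a given normalized threshold. Both method names are hypothetical, and how healthmon.rb itself determines its CPU count is not shown in the excerpt above, so treat this only as an approximation of what the monitor sees.

```ruby
# Illustrative helper: count logical CPUs from the text of /proc/cpuinfo.
# On typical x86 Linux systems each logical CPU (hyperthreads included)
# has its own "processor : N" line.
def logical_cpu_count(cpuinfo_text)
  cpuinfo_text.lines.count { |line| line =~ /^processor\s*:/ }
end

# Raw 5 minute load (as shown in /proc/loadavg) at which the normalized
# value reaches the threshold.
def raw_load_at_threshold(thresh_load, ncpu)
  thresh_load * ncpu
end

if File.readable?('/proc/cpuinfo')
  ncpu = logical_cpu_count(File.read('/proc/cpuinfo'))
  puts "logical CPUs: #{ncpu}"
  puts 'raw 5 minute load at threshold 3.0: %.2f' % raw_load_at_threshold(3.0, ncpu)
end
```

For example, with 8 logical CPUs a normalized threshold of 3.0 corresponds to a raw 5 minute load of 24.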
 
 
 
The threshold of 3.0 is very conservative, so exceeding it does not really indicate an overload.


For your information, you can change the default load average threshold in the mwg-monitor configuration file:


https://community.mcafee.com/t5/Web-Gateway/how-to-increase-load-average-limit-to-avoid-quot-5-Minut...


The monitor daemon reads /proc/loadavg, which reflects the number of processes in the run queue and/or waiting for I/O. The line starts with three values, e.g.
0.20 0.23 0.67
The first value is the load average for the last minute, the second for the last 5 minutes, and the third for the last 15 minutes. The alert is raised if:
1. the 5 minute load exceeds the threshold (3.0 by default),
2. the 1 minute load is greater than the 5 minute load, and
3. the 5 minute load is greater than the 15 minute load.


The latter two conditions are meant to detect a load that is still rising.
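
The three conditions can be restated as a small predicate. This is only a sketch for experimenting with values: load_alert? is a made-up name, the inputs are assumed to be already divided by the number of CPUs, and the sample figures are hypothetical.

```ruby
# Illustrative re-statement of the three alert conditions; the inputs are
# load averages already divided by the number of CPUs.
def load_alert?(load_1m, load_5m, load_15m, thresh_load = 3.0)
  load_5m > thresh_load && load_1m > load_5m && load_5m > load_15m
end

puts load_alert?(0.20, 0.23, 0.67)  # false: 5 minute load is below the threshold
puts load_alert?(6.0, 4.5, 3.5)     # true: above the threshold and still rising
puts load_alert?(4.0, 4.5, 5.0)     # false: above the threshold but falling
```

Note that a load which is high but already decreasing does not trigger the warning, which is why the message tends to show up during ramp-up phases.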


In my opinion, the threshold at which the incident is raised is too low. The threshold is 3, which is easily reached when some "expensive" tasks are running. Usually this is nothing to be concerned about; logged values of, for example, 4.5 and 6 are actually not too bad. Load becomes an issue when it stays at a high level permanently, or keeps ramping up without coming back down.


Basically, if you are not receiving complaints about performance, I am pretty sure there is no need for concern.


 
Was my reply helpful? If you find this post useful, Please give it a Kudos! Also, Please don't forget to select "Accept as a solution" if this reply resolves your query!
 
 
Regards
Alok Sarda
 