In statistics, various types of averages are used, which are divided into two large classes:

Power averages (harmonic mean, geometric mean, arithmetic mean, root mean square, cubic mean);

Structural averages (mode, median).

To calculate a power average, all available values of the characteristic must be used. The mode and the median are determined only by the structure of the distribution, which is why they are called structural, or positional, averages. They are often used as the average characteristic in populations where calculating a power average is impossible or impractical.

The most common type of average is the arithmetic mean. The arithmetic mean is the value of a characteristic that each unit of the population would have if the total of all values were distributed evenly among all units. To calculate it, all values of the varying characteristic are summed and the sum is divided by the number of units in the population. For example, five workers filled an order for parts: the first made 5 parts, the second 7, the third 4, the fourth 10, and the fifth 12. Since each value occurs only once in the initial data, the average output of one worker is found with the simple arithmetic mean formula:

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$

that is, in our example, the average output of one worker is equal to

$$\bar{x} = \frac{5 + 7 + 4 + 10 + 12}{5} = \frac{38}{5} = 7.6 \text{ parts.}$$
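The five-worker example can be checked with a few lines of Python. This is only a minimal sketch: the variable names `parts` and `average_output` are illustrative and do not come from the text.

```python
# Parts made by each of the five workers in the example.
parts = [5, 7, 4, 10, 12]

# Simple arithmetic mean: the sum of the values divided by their number.
average_output = sum(parts) / len(parts)

print(average_output)  # 7.6
```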

Along with the simple arithmetic mean, the weighted arithmetic mean is used. For example, let us calculate the average age of students in a group of 20 whose ages range from 18 to 22, where $x_i$ are the variants of the averaged characteristic and $f_i$ is the frequency, which shows how many times the i-th value occurs in the population (Table 5.1).

Table 5.1

Average age of students

The weighted arithmetic mean is calculated by the formula

$$\bar{x} = \frac{\sum x_i f_i}{\sum f_i}$$

There is a rule for choosing the weighted arithmetic mean: if there is a data series on two indicators, for one of which the average must be calculated, and the numerical values of the denominator of its logical formula are known while the values of the numerator are unknown but can be found as the product of these indicators, then the average should be calculated with the weighted arithmetic mean formula.

In some cases the nature of the initial statistical data is such that calculating the arithmetic mean loses its meaning, and the only suitable generalizing indicator is another type of average, the harmonic mean. Now that electronic computing is widespread, the computational properties of the arithmetic mean have lost their relevance for calculating generalizing statistical indicators. The harmonic mean, which can also be simple or weighted, has acquired great practical importance. If the numerical values of the numerator of the logical formula are known and the values of the denominator are unknown but can be found as the quotient of one indicator divided by another, then the average is calculated with the weighted harmonic mean formula.

For example, suppose a car covered the first 210 km at 70 km/h and the remaining 150 km at 75 km/h. The average speed over the entire 360 km cannot be determined with the arithmetic mean formula. The variants are the speeds on the individual sections, $x_1$ = 70 km/h and $x_2$ = 75 km/h, and the weights ($f_i$) are the corresponding sections of the route, so the products of the variants and the weights have neither physical nor economic meaning. What does have meaning here is the quotient of each section of the route divided by the corresponding speed (variant $x_i$), that is, the time spent on each section ($f_i / x_i$). If the sections of the route are denoted by $f_i$, the entire route is $\sum f_i$ and the time spent on the entire route is $\sum f_i / x_i$. The average speed is then the entire route divided by the total time:

$$\bar{x} = \frac{\sum f_i}{\sum \frac{f_i}{x_i}}$$

In our example, we get:

$$\bar{x} = \frac{210 + 150}{\frac{210}{70} + \frac{150}{75}} = \frac{360}{3 + 2} = 72 \text{ km/h.}$$
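The car example can be reproduced with a short sketch of the weighted harmonic mean. The function name `weighted_harmonic_mean` and the variable names are illustrative assumptions, not notation from the text.

```python
def weighted_harmonic_mean(values, weights):
    """Return sum(w) / sum(w / x) for paired values x and weights w."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(weights) / sum(w / x for x, w in zip(values, weights))


speeds = [70, 75]        # km/h on each section (the variants x_i)
distances = [210, 150]   # km covered at each speed (the weights f_i)

# Total distance divided by total time (3 h + 2 h).
average_speed = weighted_harmonic_mean(speeds, distances)
print(average_speed)  # 72.0
```

For comparison, the distance-weighted arithmetic mean of the same speeds is slightly above 72 km/h, which is why it does not reproduce the actual travel time.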

If all variants have equal weights (f) when the harmonic mean is used, the simple (unweighted) harmonic mean can be used instead of the weighted one:

$$\bar{x} = \frac{n}{\sum \frac{1}{x_i}}$$

where $x_i$ are the individual variants and n is the number of variants of the averaged characteristic. In the speed example, the simple harmonic mean could be applied if the sections traveled at different speeds were of equal length.

Any average should be calculated so that, when it replaces each variant of the averaged characteristic, the value of some final, generalizing indicator associated with the averaged indicator does not change. So, when the actual speeds on individual sections of the route are replaced with their average value (the average speed), the total distance covered in the same total time should not change.

The form (formula) of the average is determined by the nature (mechanism) of the relationship between this final indicator and the averaged one. The final indicator, whose value must not change when the variants are replaced by their average, is therefore called the defining indicator. To derive the formula for the average, one composes and solves an equation based on the relationship between the averaged indicator and the defining indicator. This equation is constructed by replacing the variants of the averaged characteristic (indicator) with their average value.

In addition to the arithmetic and harmonic means, other types (forms) of the mean are used in statistics. They are all special cases of the power mean. If all kinds of power means are calculated for the same data, their values generally turn out to be different; the rule of majorance of means applies here: as the exponent of the mean increases, the mean itself increases. The formulas most often used in practical research to calculate the various power means are presented in Table 5.2.

Table 5.2

Types of power averages


The geometric mean is applied when there are n growth factors; the individual values of the characteristic are, as a rule, relative values of dynamics built as chain quantities, that is, as the ratio of each level in the time series to the previous level. The mean thus characterizes the average growth factor. The simple geometric mean is calculated by the formula

$$\bar{x} = \sqrt[n]{x_1 x_2 \cdots x_n}$$

The formula for the weighted geometric mean looks like this:

$$\bar{x} = \sqrt[\sum f_i]{x_1^{f_1} x_2^{f_2} \cdots x_n^{f_n}}$$

The formulas given are equivalent, but one is applied to chain growth factors or growth rates, and the other to the absolute values of the levels of the series.

The root mean square (quadratic mean) is used in calculations with values of quadratic functions and to measure the variability of individual values of a characteristic around the arithmetic mean in distribution series. It is calculated by the formula

$$\bar{x} = \sqrt{\frac{\sum x_i^2}{n}}$$

The weighted root mean square is calculated by the formula

$$\bar{x} = \sqrt{\frac{\sum x_i^2 f_i}{\sum f_i}}$$

The cubic mean is used in calculations with values of cubic functions and is calculated by the formula

$$\bar{x} = \sqrt[3]{\frac{\sum x_i^3}{n}}$$

The weighted cubic mean:

$$\bar{x} = \sqrt[3]{\frac{\sum x_i^3 f_i}{\sum f_i}}$$

All the averages discussed above can be presented in the form of a general formula:

$$\bar{x} = \sqrt[k]{\frac{\sum x_i^k}{n}}$$

where $\bar{x}$ is the average value; $x_i$ is an individual value; n is the number of units in the studied population; k is the exponent that determines the type of average.

For the same initial data, the larger k is in the general power mean formula, the larger the average value. It follows that there is a regular relationship between the values of the power means:

$$\bar{x}_{harm} \le \bar{x}_{geom} \le \bar{x}_{arith} \le \bar{x}_{quad} \le \bar{x}_{cub}$$
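The ordering of the power means can be illustrated numerically. The sketch below is not taken from the text: the function `power_mean` is a hypothetical helper, it assumes strictly positive values, and it treats k = 0 as the geometric mean (the limiting case of the general formula). The data reuse the five-worker example.

```python
import math


def power_mean(values, k):
    """Power mean of positive values with exponent k (k = 0: geometric mean)."""
    n = len(values)
    if k == 0:
        # Limit of the power mean as k -> 0 is the geometric mean.
        return math.exp(sum(math.log(x) for x in values) / n)
    return (sum(x ** k for x in values) / n) ** (1 / k)


parts = [5, 7, 4, 10, 12]
exponents = {"harmonic": -1, "geometric": 0, "arithmetic": 1,
             "quadratic": 2, "cubic": 3}

means = {name: power_mean(parts, k) for name, k in exponents.items()}
for name, value in means.items():
    print(f"{name:>10}: {value:.3f}")
```

The printed values increase from the harmonic mean to the cubic mean, in line with the rule that a larger exponent gives a larger mean.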

The averages described above give a generalized picture of the studied population, and from this point of view their theoretical, applied and cognitive value is indisputable. However, the value of the mean may not coincide with any of the actually existing variants. Therefore, in addition to the averages considered, statistical analysis also uses the values of specific variants that occupy a well-defined position in the ordered (ranked) series of values of a characteristic. The most common of these are the structural, or descriptive, averages: the mode (Mo) and the median (Me).

The mode is the value of a characteristic that occurs most often in a given population. In a variation series, the mode is the most frequently occurring value of the ranked series, that is, the variant with the highest frequency. The mode can be used to determine which stores are visited most often or the most common price for a product. It shows the size of a characteristic typical of a significant part of the population and, for an interval series, is determined by the formula

$$Mo = x_0 + h \cdot \frac{f_m - f_{m-1}}{(f_m - f_{m-1}) + (f_m - f_{m+1})}$$

where $x_0$ is the lower boundary of the modal interval; h is the width of the interval; $f_m$ is the frequency of the modal interval; $f_{m-1}$ is the frequency of the previous interval; $f_{m+1}$ is the frequency of the next interval.
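A minimal sketch of the standard mode formula for an interval series with equal intervals follows. The function name `interval_mode` and the sample data are hypothetical and serve only as an illustration; a missing neighbouring interval is treated as having frequency 0, which is an assumption of this sketch.

```python
def interval_mode(lower_bounds, width, freqs):
    """Mode of an interval series with equal interval width."""
    m = max(range(len(freqs)), key=freqs.__getitem__)  # index of the modal interval
    f_m = freqs[m]
    f_prev = freqs[m - 1] if m > 0 else 0               # frequency of the previous interval
    f_next = freqs[m + 1] if m + 1 < len(freqs) else 0  # frequency of the next interval
    denominator = (f_m - f_prev) + (f_m - f_next)
    if denominator == 0:
        raise ValueError("mode is not defined: neighbouring frequencies are equal")
    return lower_bounds[m] + width * (f_m - f_prev) / denominator


# Illustrative data: intervals 10-20, 20-30, 30-40, 40-50.
lower_bounds = [10, 20, 30, 40]
freqs = [4, 12, 8, 6]

mode_value = interval_mode(lower_bounds, 10, freqs)
print(round(mode_value, 2))  # 26.67
```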

The median is the variant located in the center of the ranked series. It divides the series into two equal parts so that the same number of population units lie on either side of it: for one half of the units the value of the varying characteristic is less than the median, and for the other half it is greater. In other words, the median is the element whose value is greater than or equal to one half of the elements of the distribution series and, at the same time, less than or equal to the other half. The median gives a general idea of where the values of the characteristic are concentrated, that is, where their center lies.

The descriptive nature of the median shows in the fact that it marks the quantitative boundary of the values of the varying characteristic that half of the population units have. Finding the median of a discrete variation series is easy. If ordinal numbers are assigned to all units of the series, then for an odd number of members n the ordinal number of the median variant is (n + 1) / 2. If the number of members is even, the median is the average of the two variants with ordinal numbers n / 2 and n / 2 + 1.
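The rule for a discrete series can be sketched as follows. The function name `discrete_median` is illustrative, and the data reuse the five-worker output figures from the arithmetic mean example.

```python
def discrete_median(values):
    """Median of a discrete series using the ordinal-number rule."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty series is undefined")
    middle = n // 2
    if n % 2 == 1:
        # Odd n: the variant with ordinal number (n + 1) / 2 (counting from 1).
        return ordered[middle]
    # Even n: average of the variants with ordinal numbers n/2 and n/2 + 1.
    return (ordered[middle - 1] + ordered[middle]) / 2


odd_series = [5, 7, 4, 10, 12]
even_series = [5, 7, 4, 10]

print(discrete_median(odd_series))   # 7
print(discrete_median(even_series))  # 6.0
```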

When the median is determined for an interval variation series, the interval that contains it (the median interval) is found first. This is the first interval whose cumulative frequency equals or exceeds half the sum of all frequencies in the series. The median of an interval variation series is calculated by the formula

$$Me = x_0 + h \cdot \frac{\frac{1}{2}\sum f - S_{m-1}}{f_m}$$

where $x_0$ is the lower boundary of the median interval; h is the width of the interval; $f_m$ is the frequency of the median interval; $\sum f$ is the number of members of the series; $S_{m-1}$ is the cumulative frequency of the intervals preceding the median interval.
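The two steps (locating the median interval, then interpolating within it) can be sketched as below. The function name `interval_median` and the sample data are hypothetical; the sketch assumes the standard interpolation formula and equal interval widths.

```python
from itertools import accumulate


def interval_median(lower_bounds, width, freqs):
    """Median of an interval series with equal interval width."""
    half = sum(freqs) / 2
    cumulative = list(accumulate(freqs))
    # Median interval: the first one whose cumulative frequency reaches half the total.
    m = next(i for i, c in enumerate(cumulative) if c >= half)
    s_prev = cumulative[m - 1] if m > 0 else 0  # cumulative frequency before it
    return lower_bounds[m] + width * (half - s_prev) / freqs[m]


# Illustrative data: intervals 10-20, 20-30, 30-40, 40-50.
lower_bounds = [10, 20, 30, 40]
freqs = [4, 12, 8, 6]

median_value = interval_median(lower_bounds, 10, freqs)
print(round(median_value, 2))  # 29.17
```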

Along with the median, other variant values that occupy a well-defined position in the ranked series are used to characterize the structure of the studied population more fully. These include quartiles and deciles. Quartiles divide the series by the sum of frequencies into 4 equal parts, and deciles into 10 equal parts. There are three quartiles and nine deciles.

Unlike the arithmetic mean, the median and the mode do not smooth out individual differences in the values of a varying characteristic, and are therefore additional and very important characteristics of a statistical population. In practice they are often used instead of or alongside the mean. Calculating the median and the mode is especially advisable when the studied population contains some units with very large or very small values of the characteristic. Such values, which are not typical of the population, affect the arithmetic mean but do not affect the median and the mode, which makes the latter very valuable indicators for economic and statistical analysis.

Average values are widespread in statistics. They characterize the qualitative indicators of commercial activity: distribution costs, profit, profitability, etc.

The average is one of the most common generalizing devices. A correct understanding of its essence determines its special significance in a market economy, where the average makes it possible to identify the general and the necessary through the individual and the random, and to reveal the tendency of the laws of economic development.

Average values are generalizing indicators that express the effect of the general conditions and regularities of the phenomenon under study.

Statistical averages are calculated from the mass data of a properly organized statistical observation (complete or sample). However, a statistical average is objective and typical only if it is calculated from mass data for a qualitatively homogeneous population (mass phenomena). For example, if the average wage is calculated across cooperatives and state-owned enterprises together and the result is extended to the entire population, the average is fictitious, since it is calculated over a heterogeneous population, and such an average loses all meaning.

The average smooths out, as it were, the differences in the value of the attribute that arise for one reason or another in individual units of observation.

For example, the average output of a salesperson depends on many factors: qualifications, length of service, age, form of service, health, etc.

Average output reflects the general property of the entire population.

The average value reflects the values of the attribute under study and is therefore measured in the same units as that attribute.

Each average value characterizes the studied population with respect to one attribute only. To obtain a complete and comprehensive picture of the population across a number of essential attributes, a system of average values is needed that can describe the phenomenon from different angles.

There are various averages:

    arithmetic mean;

    geometric mean;

    harmonic mean;

    root mean square;

    chronological mean.

Let's consider some types of averages that are most often used in statistics.

Arithmetic mean

The simple arithmetic mean (unweighted) is equal to the sum of the individual values ​​of the attribute, divided by the number of these values.

The individual values of the attribute are called variants and are denoted by $x_i$ (i = 1, 2, ..., n); the number of units in the population is denoted by n, and the average value of the attribute by $\bar{x}$. Therefore, the simple arithmetic mean is:

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$

In a discrete distribution series, the same values of the attribute (variants) may be repeated several times. For example, one variant may occur 2 times in the population and another 16 times, and so on.

The number of identical values ​​of a feature in the distribution series is called the frequency or weight and is denoted by the symbol n.

Let's calculate the average wage of one worker in rubles:

The wage bill for each group of workers is equal to the product of the options by the frequency, and the sum of these products gives the total wage bill of all workers.

In accordance with this, the calculations can be presented in general form:

The resulting formula is called the weighted arithmetic mean.

The statistical material as a result of processing can be presented not only in the form of discrete distribution series, but also in the form of interval variation series with closed or open intervals.

The calculation of the average for the grouped data is made according to the formula of the arithmetic weighted average:

In the practice of economic statistics, it is sometimes necessary to calculate the average from group means or from the means of individual parts of the population (partial means). In such cases the group or partial means are taken as the variants (x), and the overall mean is calculated from them as an ordinary weighted arithmetic mean.

Basic properties of the arithmetic mean.

The arithmetic mean has a number of properties:

1. If the frequency of every value of the attribute x is decreased or increased by the same factor, the value of the arithmetic mean does not change.

If all frequencies are divided or multiplied by the same nonzero number, the value of the average does not change.

2. A common factor of the individual values of the attribute can be taken outside the mean sign: $\overline{c x} = c \bar{x}$

3. The average of the sum (difference) of two or more quantities is equal to the sum (difference) of their averages: $\overline{x \pm y} = \bar{x} \pm \bar{y}$

4. If x = c, where c is a constant, then $\bar{x} = c$.

5. The sum of the deviations of the values of the attribute x from the arithmetic mean $\bar{x}$ is equal to zero: $\sum (x_i - \bar{x}) = 0$

Harmonic mean.

Along with the arithmetic mean, statistics uses the harmonic mean, which is the reciprocal of the arithmetic mean of the reciprocal values of the attribute. Like the arithmetic mean, it can be simple or weighted.

The characteristics of the variation series, along with the mean, are the mode and the median.

The mode is the value of the attribute (variant) that occurs most often in the studied population. For discrete distribution series, the mode is the variant with the highest frequency.

For interval distribution series with equal intervals, the mode is determined by the formula:

$$Mo = x_{Mo} + i_{Mo} \cdot \frac{f_{Mo} - f_{Mo-1}}{(f_{Mo} - f_{Mo-1}) + (f_{Mo} - f_{Mo+1})}$$

where
$x_{Mo}$ is the lower boundary of the interval containing the mode;

$i_{Mo}$ is the width of the modal interval;

$f_{Mo}$ is the frequency of the modal interval;

$f_{Mo-1}$ is the frequency of the interval preceding the modal one;

$f_{Mo+1}$ is the frequency of the interval following the modal one.

The median is the variant located in the middle of the variation series. If the distribution series is discrete and has an odd number of members, the median is the variant located in the middle of the ordered series (an ordered series is the arrangement of the population units in ascending or descending order).

At the stage of statistical processing, a variety of research tasks can be set, and an appropriate average must be chosen for each of them. The following rule applies: the quantities that form the numerator and the denominator of the average must be logically related.

Averages are divided into two classes:

  • power averages;
  • structural averages.

Let's introduce the following notation:

$x_i$: the values for which the average is calculated;

$\bar{x}$: the average, where the bar above indicates that individual values are averaged;

f: the frequency (the number of times an individual value of the attribute is repeated).

Various averages are derived from the general power mean formula:

$$\bar{x} = \sqrt[k]{\frac{\sum x_i^k}{n}}$$

for k = 1, the arithmetic mean; k = -1, the harmonic mean; k = 0, the geometric mean (as the limiting case); k = 2, the root mean square.

Averages can be simple or weighted.

Weighted averages are averages that take into account that some variants of the attribute occur a different number of times, so that each variant has to be multiplied by that number. In other words, the "weights" are the numbers of population units in the different groups, that is, each variant is "weighted" by its frequency. The frequency f is called the statistical weight, or the weight of the average.

Suppose that five transactions were carried out over 5 days, with the following numbers of shares sold and sale prices:

1 - 800 shares - 1010 rubles.

2 - 650 shares - 990 rubles.

3 - 700 shares - 1015 rubles.

4 - 550 shares - 900 rubles.

5 - 850 shares - 1150 rubles.

The original ratio for determining the average share price is the total value of the transactions (OSS) divided by the number of shares sold (KPA):

OSS = 1010 · 800 + 990 · 650 + 1015 · 700 + 900 · 550 + 1150 · 850 = 3 634 500;

KPA = 800 + 650 + 700 + 550 + 850 = 3550.

In this case, the average share price was:

$$\bar{x} = \frac{3\,634\,500}{3550} \approx 1023.80 \text{ rubles.}$$
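The share price example is a weighted arithmetic mean with the numbers of shares as weights. A minimal sketch, with illustrative variable names:

```python
shares = [800, 650, 700, 550, 850]      # shares sold in each transaction (weights)
prices = [1010, 990, 1015, 900, 1150]   # sale price per share, rubles (variants)

total_value = sum(p * q for p, q in zip(prices, shares))  # OSS
total_shares = sum(shares)                                # KPA
average_price = total_value / total_shares

print(total_value, total_shares)   # 3634500 3550
print(round(average_price, 2))     # 1023.8

# For comparison: the unweighted mean of the five prices ignores the volumes.
simple_mean_price = sum(prices) / len(prices)
print(simple_mean_price)           # 1013.0
```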

The properties of the arithmetic mean are important both for its use and for its calculation. Three main properties account above all for its widespread use in statistical and economic calculations.

Property one (zero property): the sum of the positive deviations of the individual values of a characteristic from its mean is equal in absolute value to the sum of the negative deviations. This is a very important property, since it shows that deviations caused by random factors (both positive and negative) cancel each other out.

Proof:

$$\sum (x_i - \bar{x}) = \sum x_i - n\bar{x} = \sum x_i - \sum x_i = 0$$

Property two (minimum property): the sum of the squared deviations of the individual values of the characteristic from the arithmetic mean is less than the sum of squared deviations from any other number a, that is, it is a minimum.

Proof.

Let us form the sum of the squared deviations from the variable a:

$$S(a) = \sum (x_i - a)^2$$

To find the extremum of this function, set its derivative with respect to a equal to zero:

$$\frac{dS}{da} = -2 \sum (x_i - a) = 0$$

From here we get:

$$\sum x_i - na = 0, \qquad a = \frac{\sum x_i}{n} = \bar{x}$$

Consequently, the extremum of the sum of squared deviations is reached at $a = \bar{x}$. This extremum is a minimum, since the function cannot have a maximum.
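The zero property and the minimum property can be checked numerically. This is only an illustration on the five-worker data from the arithmetic mean example, not a proof; the helper `sum_of_squares` is a hypothetical name.

```python
parts = [5, 7, 4, 10, 12]
mean = sum(parts) / len(parts)  # 7.6


def sum_of_squares(a):
    """Sum of squared deviations of the data from the number a."""
    return sum((x - a) ** 2 for x in parts)


# Zero property: deviations from the mean cancel out (up to rounding error).
deviation_sum = sum(x - mean for x in parts)
print(abs(deviation_sum) < 1e-9)  # True

# Minimum property: moving a away from the mean only increases the sum.
shifts = [-1.0, -0.5, 0.5, 1.0]
print(all(sum_of_squares(mean) < sum_of_squares(mean + d) for d in shifts))  # True
```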

Property three: the arithmetic mean of a constant is equal to that constant: $\bar{a} = a$ for a = const.

In addition to these three most important properties of the arithmetic mean, there are the so-called computational properties, which are gradually losing their importance as electronic computing spreads:

  • if the individual value of the attribute of each unit is multiplied or divided by a constant, the arithmetic mean is multiplied or divided by the same constant;
  • the arithmetic mean does not change if the weight (frequency) of each attribute value is divided by the same constant;
  • if the individual values of the attribute of each unit are reduced or increased by the same amount, the arithmetic mean decreases or increases by the same amount.

Harmonic mean. This average is called the inverse of the arithmetic mean, since it corresponds to k = -1.

The simple harmonic mean is used when the weights of the attribute values are the same. Its formula can be derived from the general formula by substituting k = -1:

$$\bar{x} = \frac{n}{\sum \frac{1}{x_i}}$$

For example, we need to calculate the average speed of two cars that traveled the same route at different speeds: the first at 100 km/h, the second at 90 km/h.

Using the harmonic mean, we calculate the average speed:

$$\bar{x} = \frac{2}{\frac{1}{100} + \frac{1}{90}} = \frac{1800}{19} \approx 94.7 \text{ km/h.}$$
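The two-car example can be verified with the `harmonic_mean` function from Python's standard `statistics` module. The variable names are illustrative.

```python
from statistics import harmonic_mean

speeds = [100, 90]  # km/h, both cars cover the same route

# Simple harmonic mean: n divided by the sum of the reciprocals.
manual = len(speeds) / sum(1 / v for v in speeds)
library = harmonic_mean(speeds)

print(round(manual, 2))   # 94.74
print(round(library, 2))  # 94.74
```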

In statistical practice the weighted harmonic mean is used more often. Its formula has the form:

$$\bar{x} = \frac{\sum w_i}{\sum \frac{w_i}{x_i}}$$

where $w_i$ is the weight (volume) corresponding to the value $x_i$.

This formula is used when the weights (or volumes of the phenomena) are not equal for each value of the attribute: in the original ratio for calculating the mean, the numerator is known but the denominator is not.

For example, when calculating an average price we should use the ratio of sales revenue to the number of units sold. We do not know the number of units sold (different goods are involved), but we do know the sales revenue for each of these goods.

Let's say you want to know the average price of goods sold:

We get

If you use the arithmetic mean formula here, you can get the average price, which will be unrealistic:

Geometric mean. The geometric mean is most often used to determine average growth rates, when the individual values of the attribute are given as relative values. It is also used to find the average between the minimum and maximum values of a characteristic (for example, between 100 and 1,000,000). There are formulas for the simple and the weighted geometric mean.

For the simple geometric mean:

$$\bar{x} = \sqrt[n]{x_1 x_2 \cdots x_n}$$

For the weighted geometric mean:

$$\bar{x} = \sqrt[\sum f_i]{x_1^{f_1} x_2^{f_2} \cdots x_n^{f_n}}$$
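Both uses of the geometric mean mentioned here can be sketched briefly. The function name `geometric_mean` is a local helper defined in the sketch, and the growth factors are hypothetical illustrative data, not figures from the text.

```python
import math


def geometric_mean(values):
    """Simple geometric mean: the n-th root of the product of n positive values."""
    return math.prod(values) ** (1 / len(values))


# Average between a minimum of 100 and a maximum of 1,000,000: about 10,000.
middle = geometric_mean([100, 1_000_000])
print(round(middle))

# Hypothetical chain growth factors for three periods (illustrative only).
growth_factors = [1.10, 1.20, 0.95]
average_growth = geometric_mean(growth_factors)

# Applying the average factor three times reproduces the overall growth.
print(math.isclose(average_growth ** 3, math.prod(growth_factors)))  # True
```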

Root mean square. Its main application is measuring the variation of an attribute in the population (calculating the standard deviation).

Simple root mean square formula:

$$\bar{x} = \sqrt{\frac{\sum x_i^2}{n}}$$

Weighted root mean square formula:

$$\bar{x} = \sqrt{\frac{\sum x_i^2 f_i}{\sum f_i}}$$

As a result, we can say that the successful solution of the problems of statistical research depends on the correct choice of the type of average value in each specific case.

The choice of the average assumes the following sequence:

a) the establishment of a generalizing indicator of the population;

b) determination of the mathematical ratio of values ​​for a given generalizing indicator;

c) replacement of individual values ​​with average values;

d) calculation of the average using the appropriate equation.

Example. Using the data in Table 2.1, calculate the average wage for the three enterprises as a whole.

Table 2.1

Wages at the enterprises of the JSC

Company | Number of industrial production personnel (PPP), persons | Monthly wage fund, rub. | Average wage, rub.
 |  | 564840 | 2092
 |  | 332750 | 2750
 |  | 517540 | 2260
Total |  | 1415130 |

The specific calculation formula depends on which data in Table 2.1 are given. Accordingly, the following cases are possible: data in columns 1 (number of PPP) and 2 (monthly wage fund); or in columns 1 (number of PPP) and 3 (average wage); or in columns 2 (monthly wage fund) and 3 (average wage).

If only the data in columns 1 and 2 are available, the totals of these columns contain the values needed to calculate the desired average. The aggregate mean formula is used:

$$\bar{x} = \frac{\sum w_i}{\sum f_i}$$

where $w_i = x_i f_i$ is the wage fund and $f_i$ is the number of PPP of the i-th enterprise.

If only the data in columns 1 and 3 are available, the denominator of the original ratio is known but its numerator is not. However, the wage fund can be obtained by multiplying the average wage by the number of PPP. The overall average can therefore be calculated with the weighted arithmetic mean formula:

$$\bar{x} = \frac{\sum x_i f_i}{\sum f_i}$$

Keep in mind that the weight ($f_i$) can in some cases be the product of two or even three values.

In statistical practice the unweighted arithmetic mean is also used:

$$\bar{x} = \frac{\sum x_i}{n}$$

where n is the size of the population. This average is used when the weights ($f_i$) are absent (each variant of the attribute occurs only once) or are equal to each other.

If only the data in columns 2 and 3 are available, the numerator of the original ratio is known but its denominator is not. The number of PPP at each enterprise can be obtained by dividing the wage fund by the average wage. The average wage for the three enterprises as a whole is then calculated with the weighted harmonic mean formula:

$$\bar{x} = \frac{\sum w_i}{\sum \frac{w_i}{x_i}}$$

If the weights are equal, the average can be calculated with the unweighted harmonic mean:

$$\bar{x} = \frac{n}{\sum \frac{1}{x_i}}$$

In our example we used different forms of the mean but got the same answer. This is because, for the specific data, the same original ratio of the average was realized each time.

Averages can be calculated from discrete and interval variation series. In this case the calculation uses the weighted arithmetic mean.
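The three ways of averaging the wage data for the three enterprises can be compared in a short sketch. The wage fund and average wage figures are those listed in Table 2.1; the headcounts are not listed there, so the sketch derives them as wage fund divided by average wage, which is an assumption of this illustration. Variable names are illustrative.

```python
payroll = [564840, 332750, 517540]  # monthly wage fund, rub. (column 2)
avg_wage = [2092, 2750, 2260]       # average wage, rub. (column 3)

# Assumption: headcounts (column 1) derived as wage fund / average wage.
headcount = [w / x for w, x in zip(payroll, avg_wage)]

# Columns 1 and 2 known: aggregate mean.
aggregate = sum(payroll) / sum(headcount)

# Columns 1 and 3 known: weighted arithmetic mean.
weighted_arithmetic = (
    sum(x * f for x, f in zip(avg_wage, headcount)) / sum(headcount)
)

# Columns 2 and 3 known: weighted harmonic mean.
weighted_harmonic = sum(payroll) / sum(w / x for w, x in zip(payroll, avg_wage))

for value in (aggregate, weighted_arithmetic, weighted_harmonic):
    print(round(value, 2))  # 2282.47 each time
```

All three forms give the same result because each realizes the same original ratio: total wage fund divided by total headcount.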
For a discrete series, the weighted arithmetic mean formula is applied in the same way as in the example above. For an interval series, the midpoints of the intervals are determined first and used in the calculation.

Example. Using the data in Table 2.2, determine the average per capita monetary income per month in a hypothetical region.

Table 2.2

Initial data (variation series)

Average per capita monetary income per month, x, rub. | Population, % of total, f
Up to 400 | 30.2
400 - 600 | 24.4
600 - 800 | 16.7
800 - 1000 | 10.5
1000 - 1200 | 6.5
1200 - 1600 | 6.7
1600 - 2000 | 2.7
2000 and up | 2.3
Total | 100
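The calculation for Table 2.2 can be sketched as follows. One assumption is labeled here because the text does not state it: each open-ended interval is given the same width as its neighbour, so "Up to 400" is treated as 200 to 400 and "2000 and up" as 2000 to 2400. Variable names are illustrative.

```python
# (lower bound, upper bound, share of population in %) for each income interval.
# Assumption: open-ended intervals take the width of the adjacent interval.
intervals = [
    (200, 400, 30.2),
    (400, 600, 24.4),
    (600, 800, 16.7),
    (800, 1000, 10.5),
    (1000, 1200, 6.5),
    (1200, 1600, 6.7),
    (1600, 2000, 2.7),
    (2000, 2400, 2.3),
]

midpoints = [(low + high) / 2 for low, high, _ in intervals]
weights = [share for _, _, share in intervals]

# Weighted arithmetic mean of the interval midpoints.
average_income = sum(x * f for x, f in zip(midpoints, weights)) / sum(weights)
print(round(average_income, 1))  # 688.5
```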
The average per capita monetary income is 688.5 rubles.

The harmonic mean is calculated when the arithmetic mean cannot be calculated from the available data, or when calculating the harmonic mean is more convenient:

$$\bar{x} = \frac{n}{\sum \frac{1}{x_i}}$$

where $x_i$ are the variants of the averaged characteristic.

Example. Calculate the average labor productivity if the first worker needs 0.25 hours to make a unit of output, the second 1/3 hour, and the third 1/2 hour. We get:

$$\bar{x} = \frac{3}{\frac{1}{0.25} + \frac{1}{1/3} + \frac{1}{1/2}} = \frac{3}{4 + 3 + 2} = \frac{1}{3} \text{ hour per unit,}$$

that is, on average 3 units per hour.
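The labor productivity example can be checked exactly with rational arithmetic. This is a minimal sketch that assumes each worker works the same amount of time; the variable names are illustrative.

```python
from fractions import Fraction

# Hours each worker needs to make one unit of output.
hours_per_unit = [Fraction(1, 4), Fraction(1, 3), Fraction(1, 2)]

# Simple harmonic mean: n divided by the sum of the reciprocals.
units_per_hour = [1 / t for t in hours_per_unit]        # 4, 3 and 2 units per hour
average_hours = len(hours_per_unit) / sum(units_per_hour)

print(average_hours)      # 1/3
print(1 / average_hours)  # 3
```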

Every person in the modern world, planning to take out a loan or stocking vegetables for the winter, is periodically confronted with such a concept as "average value". Let's find out: what it is, what types and classes of it exist, and why it is used in statistics and other disciplines.

Average - what is it?

The average value is a generalized characteristic of a set of homogeneous phenomena with respect to one quantitatively varying attribute.

People who are far from such abstruse definitions understand the concept simply as a middling amount of something. For example, before granting a loan, a bank employee will ask a potential client for data on average monthly income over the year. It is calculated by adding up the earnings for the whole year and dividing by the number of months. This lets the bank judge whether the client will be able to repay the debt on time.

Why is it used?

As a rule, averages are used to give a summary description of social phenomena of a mass nature. They can also be used for smaller-scale calculations, as in the loan example above.

Most often, however, averages are used on a large scale. One example is calculating the amount of electricity consumed by citizens during one calendar month. Based on these data, maximum consumption norms are later set for categories of the population that receive state benefits.

Averages are also used to determine the warranty period of household appliances, cars, buildings and so on. Modern standards of work and rest were once developed from data collected in this way.

In fact, any mass phenomenon of modern life is in one way or another connected with the concept under consideration.

Applications

This concept is widely used in almost all exact sciences, especially experimental ones.

Finding an average is essential in medicine, engineering, cooking, economics, politics, and more.

Data obtained from such generalizations are used to develop medications and educational programs, set minimum subsistence levels and wages, build class schedules, and produce furniture, clothing and footwear, hygiene items and much more.

In mathematics, this term is called "mean value" and is used to implement solutions to various examples and problems. The simplest of these are addition and subtraction with regular fractions. After all, as you know, to solve such examples, it is necessary to bring both fractions to a common denominator.

Mathematics, the queen of the exact sciences, also often uses the related term "mean value of a random variable". Most people know it better as the "mathematical expectation", which is usually studied in probability theory. A similar concept is also applied in statistical calculations.

Average value in statistics

The concept is used most often, however, in statistics. This science specializes in calculating and analyzing the quantitative characteristics of mass social phenomena, so the average serves in statistics as a specialized method for achieving its main tasks, the collection and analysis of information.

The essence of this statistical method is to replace the individual, unique values of the attribute under consideration with a single balanced average.

A well-known joke about food illustrates this. At a certain factory, the bosses usually have meat casserole for lunch on Tuesdays, while ordinary workers have stewed cabbage. From these data one can conclude that on average the factory's staff dines on cabbage rolls on Tuesdays.

Although this example is somewhat exaggerated, it illustrates the main drawback of averaging: it levels out the individual characteristics of objects or persons.

Averages are used not only to analyze collected information but also to plan and forecast further actions.

They are also used to evaluate the results achieved (for example, fulfillment of the plan for growing and harvesting wheat in the spring-summer season).

How to calculate correctly

Although there are different formulas for different types of averages, in the general theory of statistics, as a rule, one basic method of calculating the average value of an attribute is used: add together the values of all the phenomena and then divide the resulting sum by their number.

When making such calculations, remember that the average always has the same dimension (units) as an individual unit of the population.

Conditions for correct calculation

The formula considered above is very simple and universal, so it is almost impossible to make a mistake in it. However, two aspects should always be considered; otherwise the result will not reflect the real situation.


Classes of averages

Having answered the basic questions ("What is the average value?", "Where is it used?" and "How is it calculated?"), it is worth finding out what classes and types of averages exist.

First of all, averages are divided into two classes: structural averages and power averages.

Types of power averages

Each of these classes is in turn divided into types. The power class has four.

  • The arithmetic mean is the most common type of average. It is the average for which the total volume of the attribute in the data set is distributed equally among all units of the set.

    This type is divided into subtypes: the simple and the weighted arithmetic mean.

  • The harmonic mean is the reciprocal of the arithmetic mean calculated from the reciprocals of the values of the attribute.

    It is used when the individual values of the attribute and their products with the frequencies are known, but the frequencies themselves are not.

  • The geometric mean is most often used in analyzing the growth rates of economic phenomena. It keeps the product of the individual values of a quantity unchanged, rather than their sum.

    It can also be simple or weighted.

  • The root mean square is used in calculating certain indicators, such as the coefficient of variation, which characterizes the rhythm of production, and so on.

    It is also used to calculate the average diameters of pipes and wheels, the average sides of squares and similar figures.

    Like all other power averages, the root mean square can be simple or weighted.

Types of structural quantities

In addition to power averages, structural averages are often used in statistics. They are better suited for calculating the relative characteristics of the values of the varying attribute and the internal structure of the distribution series.

There are two such types.