LSM for a function of two variables. The method of least squares in Excel. Regression analysis. Adding Trendlines to a Chart

The problem is to find the coefficients a and b of the linear dependence y = ax + b for which the function of two variables $F(a, b) = \sum_{i=1}^{n} (y_i - (a x_i + b))^2$ takes its smallest value. That is, for these a and b the sum of the squared deviations of the experimental data from the fitted straight line is the smallest. This is the essence of the method of least squares.

Thus, the solution of the example is reduced to finding the extremum of a function of two variables.

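Derivation of the formulas for the coefficients. A system of two equations with two unknowns is compiled and solved: we find the partial derivatives of the function F(a, b) (the sum of squared deviations) with respect to the variables a and b and equate these derivatives to zero:

$$\frac{\partial F}{\partial a} = -2\sum_{i=1}^{n} (y_i - (a x_i + b))\,x_i = 0, \qquad \frac{\partial F}{\partial b} = -2\sum_{i=1}^{n} (y_i - (a x_i + b)) = 0.$$

We solve the resulting system of equations by any method (for example, substitution or Cramer's rule) and obtain the formulas for the coefficients of the least squares method (LSM):

$$a = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{n\sum x_i^2 - \left(\sum x_i\right)^2}, \qquad b = \frac{\sum y_i - a\sum x_i}{n}.$$

For these a and b the function takes its smallest value.

That is the whole method of least squares. The formula for the parameter a contains the sums $\sum x_i$, $\sum y_i$, $\sum x_i y_i$, $\sum x_i^2$ and the parameter n, the number of experimental data points. It is recommended to calculate these sums separately. The coefficient b is found after a has been calculated.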
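The calculation of a and b from the four sums can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `fit_line` and the sample data are made up for the example.

```python
def fit_line(xs, ys):
    """Return (a, b) of the least squares line y = a*x + b."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    # Denominator of the formula for a; it is zero when all x_i coincide.
    det = n * sum_xx - sum_x * sum_x
    if det == 0:
        raise ValueError("all x values coincide; the line is not determined")
    a = (n * sum_xy - sum_x * sum_y) / det
    b = (sum_y - a * sum_x) / n  # b is found after a
    return a, b


# Hypothetical measurements, used only to exercise the formulas.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]
a, b = fit_line(xs, ys)
print(f"y = {a:.2f} x + {b:.2f}")
```

Computing the sums separately, as recommended above, also makes it easy to check each intermediate value by hand.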

The main area of application of such polynomials is the processing of experimental data (the construction of empirical formulas). An interpolation polynomial built from experimentally obtained function values is strongly affected by "experimental noise"; moreover, interpolation nodes cannot be repeated, i.e. the results of repeated experiments under the same conditions cannot be used. The root-mean-square (least squares) polynomial smooths the noise and makes it possible to use the results of repeated experiments.

Numerical integration and differentiation. Example.

Numerical integration: calculation of the value of a definite integral (as a rule, approximate). Numerical integration is understood as a set of numerical methods for finding the value of a definite integral.

Numerical differentiation: a set of methods for calculating the value of the derivative of a function given at discrete points.

Integration

Formulation of the problem. Mathematical statement of the problem: it is necessary to find the value of the definite integral

$$I = \int_a^b f(x)\,dx,$$

where a, b are finite and f(x) is continuous on [a, b].

When solving practical problems, it often happens that the integral is inconvenient or impossible to take analytically: it may not be expressible in elementary functions, the integrand may be given as a table, etc. In such cases numerical integration methods are used. They replace the area of a curvilinear trapezoid with a finite sum of areas of simpler geometric shapes that can be calculated exactly. In this sense one speaks of the use of quadrature formulas.

Most methods represent the integral as a finite sum (quadrature formula):

$$\int_a^b f(x)\,dx \approx \sum_{i} A_i\, f(x_i)$$

Quadrature formulas are based on the idea of replacing the graph of the integrand on the integration interval with functions of a simpler form, which can easily be integrated analytically and thus easily computed. Quadrature formulas are simplest to construct for polynomial mathematical models.

Three groups of methods can be distinguished:

1. Methods with division of the segment of integration into equal intervals. The division is done in advance, and the intervals are usually chosen equal (so that the function values at the ends of the intervals are easier to compute). The areas are calculated and summed (the rectangle, trapezoid and Simpson methods).

2. Methods with partitioning of the segment of integration using special points (Gauss method).

3. Calculation of integrals using random numbers (Monte Carlo method).

Rectangle method. Let the function f(x) (see the drawing) be integrated numerically on the segment [a, b]. We divide the segment into N equal intervals. The area of each of the N curvilinear trapezoids can be replaced by the area of a rectangle.

The width of all rectangles is the same and equal to:

$$h = \frac{b - a}{N}$$

For the height of the rectangles, one can take the value of the function at the left border. In this case the height of the first rectangle is f(a), of the second f(x_1), ..., of the N-th f(x_{N-1}):

$$I \approx h\sum_{i=0}^{N-1} f(x_i), \qquad x_i = a + ih$$

If the value of the function at the right border is taken as the height of the rectangle, then the height of the first rectangle is f(x_1), of the second f(x_2), ..., of the N-th f(x_N):

$$I \approx h\sum_{i=1}^{N} f(x_i)$$

As can be seen, for a monotonic function one of these formulas approximates the integral with an excess and the other with a deficiency. There is another way: to use the value of the function in the middle of each interval for the approximation:

$$I \approx h\sum_{i=0}^{N-1} f\!\left(x_i + \frac{h}{2}\right)$$

Estimation of the absolute error of the method of (midpoint) rectangles:

$$|R| \le \frac{(b - a)\,h^2}{24}\max_{[a,b]}|f''(x)|$$

Estimation of the absolute error of the methods of left and right rectangles:

$$|R| \le \frac{(b - a)\,h}{2}\max_{[a,b]}|f'(x)|$$

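Example. Calculate $\int_0^1 \frac{dx}{1 + x^2}$ over the entire interval and with the interval divided into four sections.

Solution. Analytical calculation of this integral gives I = arctg(1) - arctg(0) = 0.7853981634. In our case:

1) h = 1; x0 = 0; x1 = 1;

2) h = 0.25 (1/4); x0 = 0; x1 = 0.25; x2 = 0.5; x3 = 0.75; x4 = 1;

We calculate by the method of left rectangles: 1) I ≈ 1·f(0) = 1; 2) I ≈ 0.25·(f(0) + f(0.25) + f(0.5) + f(0.75)) ≈ 0.8453.

We calculate by the method of right rectangles: 1) I ≈ 1·f(1) = 0.5; 2) I ≈ 0.25·(f(0.25) + f(0.5) + f(0.75) + f(1)) ≈ 0.7203.

We calculate by the method of midpoint rectangles: 1) I ≈ 1·f(0.5) = 0.8; 2) I ≈ 0.25·(f(0.125) + f(0.375) + f(0.625) + f(0.875)) ≈ 0.7867.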
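The three rectangle variants differ only in where the function is sampled inside each interval, so they can be sketched with a single helper. This is an illustrative Python sketch; the function name `rectangle_rule` and its `kind` parameter are made up for the example, and the integrand is the one implied by the analytical answer arctg(1) - arctg(0).

```python
def rectangle_rule(f, a, b, n, kind="mid"):
    """Approximate the integral of f over [a, b] using n equal rectangles.

    kind selects the sampling point in each interval:
    "left", "right" or "mid" (midpoint).
    """
    h = (b - a) / n
    offset = {"left": 0.0, "right": 1.0, "mid": 0.5}[kind]
    return h * sum(f(a + (i + offset) * h) for i in range(n))


def integrand(x):
    return 1.0 / (1.0 + x * x)


for n in (1, 4):
    for kind in ("left", "right", "mid"):
        value = rectangle_rule(integrand, 0.0, 1.0, n, kind)
        print(f"N={n} {kind:>5}: {value:.10f}")
```

With four intervals the midpoint variant is already much closer to 0.7853981634 than the left and right variants, in line with the error estimates, which are proportional to h² for midpoints and to h for the left and right rules.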

Trapezoidal method. Using a polynomial of the first degree for interpolation (a straight line drawn through two points) leads to the trapezoid formula. The ends of the integration segment are taken as interpolation nodes. Thus, the curvilinear trapezoid is replaced by an ordinary trapezoid, whose area is the product of half the sum of the bases and the height:

$$\int_a^b f(x)\,dx \approx \frac{f(a) + f(b)}{2}\,(b - a)$$

In the case of N segments of integration, the value of the function at every node except the end points of the segment enters the total sum twice (since neighboring trapezoids share a side):

$$I \approx h\left(\frac{f(x_0) + f(x_N)}{2} + \sum_{i=1}^{N-1} f(x_i)\right)$$

The trapezoid formula can also be obtained as half the sum of the left and right rectangle formulas:

$$I_{trap} = \frac{I_{left} + I_{right}}{2}$$

Checking the stability of the solution. As a rule, the shorter each interval, i.e. the greater the number of intervals, the smaller the difference between the approximate and exact values of the integral. This is true for most functions. In the trapezoid method, the error δ in calculating the integral is approximately proportional to the square of the integration step (δ ~ h²). Thus, to calculate the integral of a function within the limits a, b, one divides the segment into N_0 intervals and finds the sum of the areas of the trapezoids. Then one increases the number of intervals to N_1, again calculates the sum of the trapezoids and compares the resulting value with the previous result. This is repeated (for N_i) until the specified accuracy of the result (the convergence criterion) is reached.

For the rectangle and trapezoid methods, the number of intervals is usually doubled at each iteration step (N_{i+1} = 2N_i).

Convergence criterion:

$$\left|I_{N_{i+1}} - I_{N_i}\right| < \varepsilon$$
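The doubling procedure with this stopping rule can be sketched as follows. This is a minimal Python illustration under stated assumptions: the names `trapezoid` and `integrate_to_tolerance`, the starting number of intervals and the cap on the number of doublings are choices made for the example, and the test integrand is the one from the rectangle example.

```python
import math


def trapezoid(f, a, b, n):
    """Composite trapezoid formula with n equal intervals."""
    h = (b - a) / n
    inner = sum(f(a + i * h) for i in range(1, n))
    return h * (0.5 * (f(a) + f(b)) + inner)


def integrate_to_tolerance(f, a, b, eps=1e-6, n0=4, max_doublings=20):
    """Double the number of intervals until two successive results differ by less than eps."""
    n = n0
    previous = trapezoid(f, a, b, n)
    for _ in range(max_doublings):
        n *= 2
        current = trapezoid(f, a, b, n)
        if abs(current - previous) < eps:
            return current, n
        previous = current
    raise RuntimeError("convergence criterion not met")


value, n_used = integrate_to_tolerance(lambda x: 1.0 / (1.0 + x * x), 0.0, 1.0)
print(value, n_used, abs(value - math.pi / 4))
```

The criterion compares two successive approximations, not the approximation with the exact value, so it is only an indicator of the achieved accuracy.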

The main advantage of the trapezoid rule is its simplicity. However, if high precision is required in the calculation of the integral, this method may require too many iterations.

The absolute error of the trapezoidal method is estimated as
$$|R| \le \frac{(b - a)\,h^2}{12}\max_{[a,b]}|f''(x)|.$$

Example. Approximately calculate a definite integral using the trapezoid formula:

a) dividing the integration segment into 3 parts;
b) dividing the integration segment into 5 parts.

Solution:
a) By the condition, the integration segment must be divided into 3 parts, that is, n = 3.
Calculate the length of each segment of the partition: h = (b - a)/3.

Thus, the general trapezoid formula reduces to a compact form:

$$\int_a^b f(x)\,dx \approx h\left(\frac{f(x_0) + f(x_3)}{2} + f(x_1) + f(x_2)\right)$$

Finally:

Recall that the resulting value is an approximate value of the area.

b) We divide the integration segment into 5 equal parts, that is, n = 5. By increasing the number of segments, we increase the accuracy of the calculation.

If n = 5, then the trapezoid formula takes the following form:

$$\int_a^b f(x)\,dx \approx h\left(\frac{f(x_0) + f(x_5)}{2} + f(x_1) + f(x_2) + f(x_3) + f(x_4)\right)$$

Let's find the partitioning step:
h = (b - a)/5 = 0.6, that is, the length of each intermediate segment is 0.6.

To finish the task, it is convenient to arrange all the calculations in a calculation table:

In the first line we write the "counter" i.

As a result:

The refinement is real, and a substantial one: the result for 5 segments of the partition differs noticeably from the result for 3 segments. Taking even more segments gives an even more accurate value.

Simpson formula. The trapezoid formula gives a result that depends strongly on the step size h, which affects the accuracy of the computed definite integral, especially when the function is nonmonotonic. Accuracy can be expected to improve if, instead of the straight-line segments that replace the curvilinear fragments of the graph of f(x), we use, for example, fragments of parabolas drawn through three neighboring points of the graph. This geometric interpretation underlies Simpson's method for calculating the definite integral. The entire integration interval [a, b] is divided into N segments, and the length of each segment is again h = (b-a)/N.

Simpson's formula is:

$$\int_a^b f(x)\,dx \approx \frac{b - a}{6}\left[f(a) + 4 f\!\left(\frac{a + b}{2}\right) + f(b)\right]$$

with the remainder term

$$R = -\frac{(b - a)^5}{2880}\, f^{(4)}(\xi), \qquad \xi \in [a, b]$$

As the length of the segment increases, the accuracy of the formula decreases, so the composite Simpson formula is used to increase accuracy. The entire integration interval is divided into an even number N of identical segments, and the length of each segment is again h = (b-a)/N. The composite Simpson formula is:

$$\int_a^b f(x)\,dx \approx \frac{h}{3}\left[f(x_0) + 4\left(f(x_1) + f(x_3) + \ldots + f(x_{N-1})\right) + 2\left(f(x_2) + f(x_4) + \ldots + f(x_{N-2})\right) + f(x_N)\right]$$

In the formula, the expressions in brackets are the sums of the values of the integrand at the odd-numbered and even-numbered interior nodes, respectively.

The remainder term of the composite Simpson formula is proportional to the fourth power of the step:

$$|R| \le \frac{(b - a)\,h^4}{180}\max_{[a,b]}\left|f^{(4)}(x)\right|$$

Example: Calculate the integral using Simpson's rule. (Exact solution - 0.2)
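The integrand of this example did not survive in the text, so the sketch below uses a stand-in: the integral of x⁴ over [0, 1], chosen only because its exact value is also 0.2. Both the stand-in integrand and the function name `simpson` are assumptions made for the illustration.

```python
def simpson(f, a, b, n):
    """Composite Simpson formula with an even number n of equal segments."""
    if n <= 0 or n % 2 != 0:
        raise ValueError("n must be a positive even number")
    h = (b - a) / n
    odd_sum = sum(f(a + i * h) for i in range(1, n, 2))   # nodes x_1, x_3, ...
    even_sum = sum(f(a + i * h) for i in range(2, n, 2))  # nodes x_2, x_4, ...
    return h / 3.0 * (f(a) + 4.0 * odd_sum + 2.0 * even_sum + f(b))


# Stand-in integrand (assumption): x**4 on [0, 1], exact value 0.2.
for n in (2, 4, 8, 16):
    approx = simpson(lambda x: x ** 4, 0.0, 1.0, n)
    print(n, approx, abs(approx - 0.2))
```

Each doubling of N should reduce the error by a factor of about 16, which reflects the fourth-power dependence on the step.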

Gauss method

Quadrature formula of Gauss. The basic principle of quadrature formulas of the second kind can be seen in Figure 1.12: the points x_0 and x_1 must be placed inside the segment [a; b] so that the total area of the "triangles" equals the area of the "segment". When the Gauss formula is used, the initial segment [a; b] is reduced to the interval [-1; 1] by the change of variable

x = 0.5∙(b - a)∙t + 0.5∙(b + a).

Then $\int_a^b f(x)\,dx = \frac{b - a}{2}\int_{-1}^{1} \varphi(t)\,dt$, where $\varphi(t) = f(0.5(b - a)t + 0.5(b + a))$.

This substitution is possible if a and b are finite and the function f(x) is continuous on [a; b]. The Gauss formula for n points x_i, i = 0, 1, ..., n-1, inside the segment [a; b]:

$$\int_a^b f(x)\,dx \approx \frac{b - a}{2}\sum_{i=0}^{n-1} A_i\, f(x_i), \qquad x_i = 0.5(b - a)\,t_i + 0.5(b + a), \qquad (1.27)$$

where t_i and A_i for various n are given in reference books. For example, for n = 2: A_0 = A_1 = 1; for n = 3: t_2 = -t_0 ≈ 0.775, t_1 = 0, A_0 = A_2 ≈ 0.556, A_1 ≈ 0.889.

The Gauss quadrature formula

$$\int_{-1}^{1} f(x)\,dx \approx \sum_{i=0}^{n} A_i\, f(x_i)$$

is obtained with a weight function equal to one, p(x) = 1, and nodes x_i that are the roots of the Legendre polynomial $P_{n+1}(x)$. Here a formula of order n uses n + 1 nodes.

The coefficients A_i are easily calculated by the formulas

$$A_i = \frac{2}{(1 - x_i^2)\left[P'_{n+1}(x_i)\right]^2}, \qquad i = 0, 1, 2, \ldots, n.$$

The values of the nodes and coefficients for n = 2, 3, 4, 5 are given in the table

Order	Nodes	Coefficients
n=2	x1 = 0; x2 = -x0 = 0.7745966692	A1 = 8/9; A0 = A2 = 5/9
n=3	x2 = -x1 = 0.3399810436; x3 = -x0 = 0.8611363116	A1 = A2 = 0.6521451549; A0 = A3 = 0.3478548451
n=4	x2 = 0; x3 = -x1 = 0.5384693101; x4 = -x0 = 0.9061798459	A2 = 0.5688888889; A3 = A1 = 0.4786286705; A0 = A4 = 0.2369268851
n=5	x5 = -x0 = 0.9324695142; x4 = -x1 = 0.6612093865; x3 = -x2 = 0.2386191861	A5 = A0 = 0.1713244924; A4 = A1 = 0.3607615730; A3 = A2 = 0.4679139346
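As an illustration of how the tabulated nodes and coefficients are used, the sketch below applies the three-node row (n=2 in the table) on an arbitrary segment [a, b] through the change of variable to [-1, 1]. The function name `gauss3` is made up for the example, and the integrand is borrowed from the rectangle example as a test case.

```python
import math

# Three-node row of the table (order n=2): nodes on [-1, 1] and coefficients.
NODES = (-0.7745966692, 0.0, 0.7745966692)
COEFFS = (5.0 / 9.0, 8.0 / 9.0, 5.0 / 9.0)


def gauss3(f, a, b):
    """Three-node Gauss quadrature of f over [a, b]."""
    half_length = 0.5 * (b - a)
    midpoint = 0.5 * (b + a)
    # Map each node t from [-1, 1] to x = 0.5*(b - a)*t + 0.5*(b + a).
    total = sum(c * f(half_length * t + midpoint) for t, c in zip(NODES, COEFFS))
    return half_length * total


approx = gauss3(lambda x: 1.0 / (1.0 + x * x), 0.0, 1.0)
print(approx, abs(approx - math.pi / 4))
```

Only three evaluations of the integrand are needed here, compared with the four or five used by the rectangle and trapezoid formulas on the same integral.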

Example. Calculate the value using the Gauss formula for n=2:

Exact value: .

The algorithm for calculating the integral by the Gauss formula does not double the number of subintervals; instead it increases the number of ordinates by 1 and compares the obtained values of the integral. The advantage of the Gauss formula is high accuracy with a relatively small number of ordinates. Disadvantages: it is inconvenient for manual calculations, and the values t_i, A_i for various n must be stored in computer memory.

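The error of the Gauss quadrature formula with N nodes on the segment [a, b] is given by the remainder term $R_N = \alpha_N\,(b - a)^{2N+1} f^{(2N)}(\xi)$, where $\xi \in [a, b]$ and the coefficient $\alpha_N$ decreases rapidly as N grows. Here $\alpha_N = \frac{(N!)^4}{(2N + 1)\,[(2N)!]^3}$.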

Gauss formulas provide high accuracy already with a small number of nodes (from 4 to 10). At the same time, in practical calculations the number of nodes ranges from several hundred to several thousand. We also note that the weights of Gaussian quadratures are always positive, which ensures the stability of the algorithm for calculating the sums $\sum_i A_i f(x_i)$.

3.5. Least squares method

The first work that laid the foundations of the method of least squares was published by Legendre in 1805. In the article "New methods for determining the orbits of comets" he wrote: "After all the conditions of the problem have been fully used, it is necessary to determine the coefficients so that the magnitudes of their errors are the least possible. The simplest way to achieve this is the method that consists in finding the minimum of the sum of squared errors." At present the method is used very widely to approximate unknown functional dependencies given by many experimental readings, in order to obtain the analytical expression that best fits the natural experiment.

Suppose that, on the basis of an experiment, it is required to establish the functional dependence of the quantity y on x: y = f(x). Suppose the experiment yielded n values of y with the corresponding values of the argument x. If the experimental points are located on the coordinate plane as in the figure, then, knowing that the experiment contains errors, we can assume that the dependence is linear, i.e. y = ax + b. Note that the method imposes no restrictions on the form of the function, i.e. it can be applied to any functional dependence.

From the experimenter's point of view, it is often more natural to regard the sequence of sampling points x_i as fixed in advance, i.e. as the independent variable, and the readings y_i as the dependent variable. This is especially clear if the x_i are understood as instants of time, which is the most common situation in technical applications. But this is only a very common special case. For example, it may be necessary to classify some samples by size. Then the independent variable is the number of the sample, and the dependent variable is its individual size.

The least squares method is described in detail in many educational and scientific publications, especially in terms of the approximation of functions in electrical and radio engineering, as well as in books on probability theory and mathematical statistics.

Let's go back to the drawing. The dotted lines show that errors can arise not only from imperfect measurement procedures but also from inaccuracy in setting the independent variable. Once the form of the function y = f(x) has been chosen, it remains to choose the parameters a and b that enter it. The number of parameters can clearly be more than two; two is typical only of linear functions. In general form we assume

$$y = f(x;\, a, b, c, \ldots). \qquad (1)$$

It is required to choose the coefficients a, b, c, ... so that the condition

$$\sum_{i=1}^{n}\left[y_i - f(x_i;\, a, b, c, \ldots)\right]^2 \to \min \qquad (2)$$

is met.

Let's find the values a, b, c, ... that minimize the left side of (2). To do this, we find the stationary points (points at which the first derivatives vanish) by differentiating the left side of (2) with respect to a, b, c:

$$\sum_{i=1}^{n}\left[y_i - f(x_i;\, a, b, c, \ldots)\right]\frac{\partial f(x_i;\, a, b, c, \ldots)}{\partial a} = 0,$$
$$\sum_{i=1}^{n}\left[y_i - f(x_i;\, a, b, c, \ldots)\right]\frac{\partial f(x_i;\, a, b, c, \ldots)}{\partial b} = 0,$$
$$\sum_{i=1}^{n}\left[y_i - f(x_i;\, a, b, c, \ldots)\right]\frac{\partial f(x_i;\, a, b, c, \ldots)}{\partial c} = 0, \qquad (3)$$

etc. The resulting system contains as many equations as there are unknowns a, b, c, .... It cannot be solved in general form, so a specific type of function f must be set, at least approximately. Next we consider two cases: linear and quadratic functions.

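Linear function y = ax + b.

Consider the sum of the squared differences between the experimental values y_i and the function values at the corresponding points:

$$S(a, b) = \sum_{i=1}^{n}\left[y_i - (a x_i + b)\right]^2. \qquad (4)$$

Let's select the parameters a and b so that this sum has the smallest value. Thus the problem reduces to finding the values a and b at which the function S(a, b) has a minimum, i.e. to studying a function of two independent variables a and b for a minimum. To do this, we differentiate with respect to a and b:

$$\frac{\partial S}{\partial a} = -2\sum_{i=1}^{n}\left[y_i - (a x_i + b)\right]x_i = 0;$$

$$\frac{\partial S}{\partial b} = -2\sum_{i=1}^{n}\left[y_i - (a x_i + b)\right] = 0. \qquad (5)$$

Substituting the experimental data x_i and y_i, we obtain a system of two linear equations with two unknowns a and b. Having solved this system, we can write the function y = ax + b.

We make sure that S(a, b) has a minimum for the found values a and b. To do this, we find the second derivatives:

$$\frac{\partial^2 S}{\partial a^2} = 2\sum_{i=1}^{n} x_i^2, \qquad \frac{\partial^2 S}{\partial a\,\partial b} = 2\sum_{i=1}^{n} x_i, \qquad \frac{\partial^2 S}{\partial b^2} = 2n.$$

Hence,

$$\frac{\partial^2 S}{\partial a^2}\,\frac{\partial^2 S}{\partial b^2} - \left(\frac{\partial^2 S}{\partial a\,\partial b}\right)^2 = 4\left[n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2\right] > 0 \quad \text{(provided not all } x_i \text{ coincide)},$$

$$\frac{\partial^2 S}{\partial a^2} > 0,$$

i.e. the sufficient condition for a minimum of a function of two variables is satisfied.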

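Quadratic function y = ax² + bx + c.

Let the values y_1, y_2, ..., y_n of the function at the points x_1, x_2, ..., x_n be obtained in the experiment. Suppose also that, on the basis of a priori information, the function is assumed to be quadratic:

$$y = a x^2 + b x + c.$$

It is required to find the coefficients a, b and c. We have

$$S(a, b, c) = \sum_{i=1}^{n}\left[y_i - (a x_i^2 + b x_i + c)\right]^2,$$

which is a function of three variables a, b, c.

In this case, system (3) takes the form:

$$\sum_{i=1}^{n}\left[y_i - (a x_i^2 + b x_i + c)\right]x_i^2 = 0,$$
$$\sum_{i=1}^{n}\left[y_i - (a x_i^2 + b x_i + c)\right]x_i = 0,$$
$$\sum_{i=1}^{n}\left[y_i - (a x_i^2 + b x_i + c)\right] = 0.$$

Or:

$$a\sum x_i^4 + b\sum x_i^3 + c\sum x_i^2 = \sum x_i^2 y_i,$$
$$a\sum x_i^3 + b\sum x_i^2 + c\sum x_i = \sum x_i y_i,$$
$$a\sum x_i^2 + b\sum x_i + c\,n = \sum y_i.$$

Solving this system of linear equations, we determine the unknowns a, b, c.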
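The three normal equations can be assembled from the power sums and solved, for example, by Cramer's rule. The Python sketch below is only an illustration: the names `det3` and `fit_parabola` and the sample points are assumptions, and for larger or ill-conditioned problems a library solver would normally be preferred.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def fit_parabola(xs, ys):
    """Return (a, b, c) of the least squares parabola y = a*x^2 + b*x + c."""
    n = len(xs)
    s1 = sum(xs)
    s2 = sum(x ** 2 for x in xs)
    s3 = sum(x ** 3 for x in xs)
    s4 = sum(x ** 4 for x in xs)
    t0 = sum(ys)
    t1 = sum(x * y for x, y in zip(xs, ys))
    t2 = sum(x * x * y for x, y in zip(xs, ys))
    matrix = [[s4, s3, s2], [s3, s2, s1], [s2, s1, n]]
    rhs = [t2, t1, t0]
    d = det3(matrix)
    if d == 0:
        raise ValueError("normal equations are singular; need 3 distinct x values")
    coeffs = []
    for col in range(3):
        # Cramer's rule: replace one column with the right-hand side.
        replaced = [row[:] for row in matrix]
        for r in range(3):
            replaced[r][col] = rhs[r]
        coeffs.append(det3(replaced) / d)
    return tuple(coeffs)


# Hypothetical points lying exactly on y = 2x^2 - 3x + 1.
print(fit_parabola([-2, -1, 0, 1, 2], [15, 6, 1, 0, 3]))
```

When the points lie exactly on a parabola, the fit reproduces its coefficients; with noisy data it returns the parabola that minimizes S(a, b, c).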

Example. Let four values of the desired function y = f(x) be obtained in the experiment for four values of the argument, which are given in the table:

Least squares method

The least squares method (OLS, Ordinary Least Squares) is one of the basic methods of regression analysis for estimating unknown parameters of regression models from sample data. The method is based on minimizing the sum of squares of regression residuals.

It should be noted that the least squares method itself can be called a method for solving a problem in any area, if the solution consists of or satisfies a certain criterion for minimizing the sum of squares of some functions of the unknown variables. Therefore, the least squares method can also be used for an approximate representation (approximation) of a given function by other (simpler) functions, when finding a set of quantities that satisfy equations or restrictions, the number of which exceeds the number of these quantities, etc.

The essence of LSM

Let there be some (parametric) model of the probabilistic (regression) dependence between the (explained) variable y and a set of factors (explanatory variables) x:

$$y = f(x, b) + \varepsilon,$$

where $b = (b_1, \ldots, b_k)$ is the vector of unknown model parameters and

$\varepsilon$ is the random model error.

Let there also be sample observations of the values of these variables. Let t be the observation number (t = 1, ..., n). Then y_t, x_t are the values of the variables in the t-th observation. For given values of the parameters b, the theoretical (model) values of the explained variable y can then be calculated:

$$\hat{y}_t = f(x_t, b).$$

The values of the residuals $e_t = y_t - \hat{y}_t = y_t - f(x_t, b)$ depend on the values of the parameters b.

The essence of (ordinary, classical) LSM is to find the parameters b for which the sum of the squares of the residuals (Residual Sum of Squares, RSS) is minimal:

$$\hat{b}_{OLS} = \arg\min_b RSS(b), \qquad RSS(b) = e^T e = \sum_{t=1}^{n} e_t^2 = \sum_{t=1}^{n}\left(y_t - f(x_t, b)\right)^2.$$

In the general case this problem can be solved by numerical optimization (minimization) methods. In this case one speaks of nonlinear least squares (NLS or NLLS, Non-Linear Least Squares). In many cases an analytical solution can be obtained. To solve the minimization problem, one finds the stationary points of the function RSS(b) by differentiating it with respect to the unknown parameters b, equating the derivatives to zero, and solving the resulting system of equations:

$$\sum_{t=1}^{n}\left(y_t - f(x_t, b)\right)\frac{\partial f(x_t, b)}{\partial b} = 0.$$

If the random errors of the model are normally distributed, have the same variance, and are not correlated with each other, the least squares parameter estimates are the same as the maximum likelihood method (MLM) estimates.

LSM in the case of a linear model

Let the regression dependence be linear:

$$y_t = \sum_{j=1}^{k} b_j x_{tj} + \varepsilon_t = x_t^T b + \varepsilon_t.$$

Let y be the column vector of observations of the explained variable, and X the matrix of observations of the factors (the rows of the matrix are the vectors of factor values in a given observation, the columns are the vectors of values of a given factor over all observations). The matrix representation of the linear model has the form:

$$y = Xb + \varepsilon.$$

Then the vector of estimates of the explained variable and the vector of regression residuals are

$$\hat{y} = Xb, \qquad e = y - \hat{y} = y - Xb;$$

accordingly, the sum of the squares of the regression residuals is

$$RSS = e^T e = (y - Xb)^T (y - Xb).$$

Differentiating this function with respect to the parameter vector and equating the derivatives to zero, we obtain a system of equations (in matrix form):

$$(X^T X)\,b = X^T y.$$

The solution of this system of equations gives the general formula of OLS estimates for a linear model:

$$\hat{b}_{OLS} = (X^T X)^{-1} X^T y = \left(\tfrac{1}{n} X^T X\right)^{-1} \tfrac{1}{n} X^T y.$$

For analytical purposes, the last representation of this formula turns out to be useful. If the data in the regression model are centered, then in this representation the first matrix is the sample covariance matrix of the factors, and the second is the vector of covariances of the factors with the dependent variable. If, in addition, the data are also normalized by the standard deviation (that is, ultimately standardized), then the first matrix is the sample correlation matrix of the factors, and the second vector is the vector of sample correlations of the factors with the dependent variable.

An important property of OLS estimates for models with a constant: the constructed regression line passes through the center of gravity of the sample data, that is, the following equality holds:

$$\bar{y} = \hat{b}_1 + \sum_{j=2}^{k} \hat{b}_j\,\bar{x}_j.$$

In particular, in the extreme case, when the only regressor is a constant, we find that the OLS estimate of the only parameter (the constant itself) is equal to the mean value of the variable being explained. That is, the arithmetic mean, known for its good properties from the laws of large numbers, is also a least squares estimate: it satisfies the criterion of the minimum sum of squared deviations from it.

Example: simple (pairwise) regression

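In the case of paired linear regression $y_t = a + b x_t + \varepsilon_t$, the calculation formulas simplify (matrix algebra is not needed):

$$\hat{b} = \frac{\overline{xy} - \bar{x}\,\bar{y}}{\overline{x^2} - \bar{x}^2} = \frac{\mathrm{Cov}(x, y)}{\mathrm{Var}(x)}, \qquad \hat{a} = \bar{y} - \hat{b}\,\bar{x}.$$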
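A short sketch of the paired regression estimates through the sample covariance and variance is given below. The function name `simple_regression` and the sample data are made up for the illustration; the last lines check numerically that the fitted line passes through the center of gravity of the data.

```python
def simple_regression(xs, ys):
    """Return (intercept, slope) of the paired regression y = intercept + slope*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    if var_x == 0:
        raise ValueError("x has zero variance; the slope is not determined")
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical sample, used only for illustration.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]
intercept, slope = simple_regression(xs, ys)

# The regression line passes through the point (mean of x, mean of y).
mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)
print(intercept, slope, abs(intercept + slope * mean_x - mean_y))
```

Working with deviations from the means, as here, is usually numerically safer than subtracting large raw sums from each other.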

Properties of OLS estimates

First of all, we note that for linear models the least squares estimates are linear estimates, as follows from the above formula. For the least squares estimates to be unbiased, it is necessary and sufficient that the essential condition of regression analysis holds: conditional on the factors, the mathematical expectation of the random error must be zero. This condition is satisfied, in particular, if

the mathematical expectation of random errors is zero, and
factors and random errors are independent random variables.

The second condition, the exogeneity of the factors, is fundamental. If it is not satisfied, then almost any estimates can be expected to be extremely unsatisfactory: they will not even be consistent (that is, even a very large amount of data does not yield good estimates in this case). In the classical case, a stronger assumption is made that the factors are deterministic, in contrast to the random error, which automatically means that the exogeneity condition is satisfied. In the general case, for the estimates to be consistent it is sufficient that the exogeneity condition holds together with the convergence of the matrix $V_x = \frac{1}{n} X^T X$ to some non-singular matrix as the sample size grows to infinity.

In order for the (ordinary) LSM estimates to be, in addition to consistent and unbiased, also efficient (the best in the class of linear unbiased estimates), the random error must have additional properties:

constant (identical) variance of the random errors in all observations (no heteroscedasticity);
absence of correlation (autocorrelation) of the random errors in different observations.

These assumptions can be formulated for the covariance matrix of the random error vector: $V(\varepsilon) = \sigma^2 I$.

A linear model that satisfies these conditions is called classical. OLS estimates for classical linear regression are unbiased, consistent and the most efficient estimates in the class of all linear unbiased estimates (in English-language literature the abbreviation BLUE, Best Linear Unbiased Estimator, is sometimes used; in domestic literature the Gauss-Markov theorem is more often cited). As is easy to show, the covariance matrix of the vector of coefficient estimates is equal to:

$$V(\hat{b}_{OLS}) = \sigma^2 (X^T X)^{-1}.$$

Generalized least squares

The method of least squares allows a wide generalization. Instead of minimizing the sum of squares of the residuals, one can minimize some positive definite quadratic form of the residual vector, $e^T W e$, where W is some symmetric positive definite weight matrix. Ordinary least squares is a special case of this approach, when the weight matrix is proportional to the identity matrix. As is known from the theory of symmetric matrices (or operators), such matrices admit a decomposition $W = P^T P$. Therefore the functional can be represented as $e^T P^T P e = (Pe)^T (Pe) = e_*^T e_*$, that is, as the sum of the squares of some transformed "residuals". Thus, we can distinguish a class of least squares methods: LS-methods (Least Squares).

It is proved (Aitken's theorem) that for a generalized linear regression model (in which no restrictions are imposed on the covariance matrix of random errors), the most efficient estimates (in the class of linear unbiased estimates) are those of the so-called generalized least squares (GLS, Generalized Least Squares): the LS-method with a weight matrix equal to the inverse covariance matrix of random errors, $W = V_\varepsilon^{-1}$.

It can be shown that the formula for the GLS estimates of the parameters of the linear model has the form

$$\hat{b}_{GLS} = (X^T V_\varepsilon^{-1} X)^{-1} X^T V_\varepsilon^{-1} y.$$

The covariance matrix of these estimates, respectively, is equal to

$$V(\hat{b}_{GLS}) = (X^T V_\varepsilon^{-1} X)^{-1}.$$

In fact, the essence of GLS lies in a certain (linear) transformation (P) of the original data and the application of ordinary least squares to the transformed data. The purpose of this transformation is that for the transformed data the random errors already satisfy the classical assumptions.

Weighted least squares

In the case of a diagonal weight matrix (and hence a diagonal covariance matrix of random errors), we have the so-called weighted least squares (WLS, Weighted Least Squares). In this case the weighted sum of squares of the model residuals is minimized, that is, each observation receives a "weight" that is inversely proportional to the variance of the random error in this observation: $e^T W e = \sum_{t=1}^{n} e_t^2 / \sigma_t^2$. In fact, the data are transformed by weighting the observations (dividing by an amount proportional to the assumed standard deviation of the random errors), and ordinary least squares is applied to the weighted data.
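For a straight line, weighting the observations only changes the sums that enter the normal equations. The sketch below is a minimal Python illustration, assuming that the standard deviation of the error in each observation is known; the function name `fit_line_weighted` and the sample data are made up for the example.

```python
def fit_line_weighted(xs, ys, sigmas):
    """Return (intercept, slope) minimizing sum(((y - intercept - slope*x) / sigma)**2)."""
    weights = [1.0 / (s * s) for s in sigmas]  # weight = 1 / variance
    sw = sum(weights)
    swx = sum(w * x for w, x in zip(weights, xs))
    swy = sum(w * y for w, y in zip(weights, ys))
    swxx = sum(w * x * x for w, x in zip(weights, xs))
    swxy = sum(w * x * y for w, x, y in zip(weights, xs, ys))
    det = sw * swxx - swx * swx
    if det == 0:
        raise ValueError("weighted normal equations are singular")
    slope = (sw * swxy - swx * swy) / det
    intercept = (swy - slope * swx) / sw
    return intercept, slope


# Hypothetical data: the last observation is assumed to be much noisier.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 9.5]
print(fit_line_weighted(xs, ys, [0.1, 0.1, 0.1, 0.1]))  # equal weights: ordinary least squares
print(fit_line_weighted(xs, ys, [0.1, 0.1, 0.1, 2.0]))  # noisy point is down-weighted
```

With equal standard deviations the result coincides with ordinary least squares, which matches the statement that OLS is the special case of a weight matrix proportional to the identity matrix.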

Some special cases of application of LSM in practice

Linear Approximation

Consider the case when, as a result of studying the dependence of a certain scalar quantity y on a certain scalar quantity x (this can be, for example, the dependence of voltage U on current I: U = R·I, where R is a constant, the resistance of the conductor), these quantities were measured, yielding the values x_i and the corresponding values y_i. The measurement data should be recorded in a table.

Table. Measurement results.

Measurement No.	1	2	3	4	5	6

The question is: what value of the coefficient k best describes the dependence y = kx? According to least squares, this value should be such that the sum of the squared deviations of the values y_i from the values k x_i,

$$\sum_{i}\left(y_i - k x_i\right)^2,$$

is minimal.

The sum of squared deviations has one extremum, a minimum, so it is enough to equate its derivative with respect to k to zero:

$$\frac{d}{dk}\sum_{i}\left(y_i - k x_i\right)^2 = -2\sum_{i} x_i\left(y_i - k x_i\right) = 0.$$

Let's find the value of the coefficient k from this formula. To do this, we transform its left side as follows:

$$\sum_{i} x_i y_i - k\sum_{i} x_i^2 = 0, \qquad \text{whence} \qquad k = \frac{\sum_i x_i y_i}{\sum_i x_i^2}.$$

The last formula gives the value of the coefficient k that was required in the problem.
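For the voltage and current example, the coefficient is the ratio of two sums. The Python sketch below uses six invented measurements (the actual table values are not given in the text); the function name `fit_proportional` is also an assumption made for the illustration.

```python
def fit_proportional(xs, ys):
    """Return k of the least squares fit y = k*x (a line through the origin)."""
    sum_xx = sum(x * x for x in xs)
    if sum_xx == 0:
        raise ValueError("all x values are zero; k is not determined")
    return sum(x * y for x, y in zip(xs, ys)) / sum_xx


# Invented measurements: current in amperes, voltage in volts.
currents = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
voltages = [1.0, 2.1, 2.9, 4.0, 5.1, 5.9]
resistance = fit_proportional(currents, voltages)
print(f"R is approximately {resistance:.2f} ohm")
```

Note that this estimate is not the same as averaging the ratios y_i/x_i: measurements taken at larger x contribute more to the sums.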

History

Until the early 19th century, scientists had no definite rules for solving a system of equations in which the number of unknowns is less than the number of equations; until then, ad hoc methods were used, depending on the type of equations and on the ingenuity of the calculators, and so different calculators, starting from the same observational data, came to different conclusions. Gauss (1795) is credited with the first application of the method, and Legendre (1805) independently discovered it and published it under its modern name (Fr. Méthode des moindres quarrés). Laplace related the method to probability theory, and the American mathematician Adrain (1808) considered its probabilistic applications. The method was spread and improved by the further research of Encke, Bessel, Hansen and others.

Alternative uses of LSM

The idea of the least squares method can also be used in other cases not directly related to regression analysis. The fact is that the sum of squares is one of the most common proximity measures for vectors (the Euclidean metric in finite-dimensional spaces).

One application is "solving" systems of linear equations in which the number of equations is greater than the number of variables,

$$Ax = b,$$

where the matrix A is not square but rectangular.

Such a system of equations, in the general case, has no solution (if the rank of the augmented matrix is greater than the number of variables). Therefore the system can be "solved" only in the sense of choosing a vector x that minimizes the "distance" between the vectors Ax and b. To do this, one can apply the criterion of minimizing the sum of squared differences between the left and right sides of the equations of the system, that is, $(Ax - b)^T (Ax - b) \to \min$. It is easy to show that solving this minimization problem leads to the following system of equations: $A^T A x = A^T b$, whence $x = (A^T A)^{-1} A^T b$.
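A small sketch of this "solution" of an overdetermined system is shown below: it forms AᵀA and Aᵀb and solves the square system by Gaussian elimination with partial pivoting. The names `solve_linear` and `least_squares_solve`, the pivot threshold and the sample system are assumptions made for the illustration; in practice a library routine based on QR or SVD is usually preferred because AᵀA can be ill-conditioned.

```python
def solve_linear(m, v):
    """Solve the square system m x = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    aug = [list(row) + [rhs] for row, rhs in zip(m, v)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular or nearly singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def least_squares_solve(a, b):
    """Minimize |A x - b|^2 by solving the normal equations A^T A x = A^T b."""
    rows, cols = len(a), len(a[0])
    ata = [[sum(a[k][i] * a[k][j] for k in range(rows)) for j in range(cols)]
           for i in range(cols)]
    atb = [sum(a[k][i] * b[k] for k in range(rows)) for i in range(cols)]
    return solve_linear(ata, atb)


# Four equations, two unknowns (hypothetical numbers).
a = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]]
b = [2.1, 3.9, 6.2, 7.8]
print(least_squares_solve(a, b))
```

With a first column of ones and the x values in the second column, this system is the matrix form of fitting a straight line, so the two unknowns are the intercept and the slope.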

The method of least squares (LSM) allows you to estimate various quantities using the results of many measurements containing random errors.

Characteristics of LSM

The main idea of this method is that the sum of squared errors is considered as a criterion for the accuracy of the solution of the problem, which is sought to be minimized. When using this method, both numerical and analytical approaches can be applied.

In particular, as a numerical implementation, the least squares method implies making as many measurements of the unknown random variable as possible. Moreover, the more measurements, the more accurate the solution will be. From this set of measurements (initial data), another set of candidate solutions is obtained, from which the best one is then selected. If the set of solutions is parametrized, the least squares method reduces to finding the optimal values of the parameters.

As an analytical approach to implementing LSM, a functional is defined on the set of initial data (measurements) and the proposed set of solutions; it can be expressed by a formula obtained as a hypothesis that needs to be confirmed. In this case the least squares method reduces to finding the minimum of this functional on the set of squared errors of the initial data.

Note that it is not the errors themselves that are summed, but the squares of the errors. Why? The deviations of measurements from the exact value are often both positive and negative. When the average is determined, simple summation can lead to an incorrect conclusion about the quality of the estimate, since the mutual cancellation of positive and negative values lowers the power of the sample of measurements and, consequently, the accuracy of the estimate.

To prevent this, the squared deviations are summed. Moreover, in order to give the final estimate the same dimension as the measured value, the square root is extracted from the sum of squared errors.

Some applications of LSM

LSM is widely used in various fields. For example, in probability theory and mathematical statistics, the method is used to determine such a characteristic of a random variable as the standard deviation, which determines the width of the range of values of the random variable.

It has many applications, as it allows an approximate representation of a given function by other simpler ones. LSM can be extremely useful in processing observations, and it is actively used to estimate some quantities from the results of measurements of others containing random errors. In this article, you will learn how to implement least squares calculations in Excel.

Statement of the problem on a specific example

Suppose there are two indicators X and Y. Moreover, Y depends on X. Since OLS is of interest to us from the point of view of regression analysis (in Excel, its methods are implemented using built-in functions), we should immediately proceed to consider a specific problem.

So, let X be the retail space of a grocery store, measured in square meters, and let Y be its annual turnover, measured in millions of rubles.

It is required to forecast the turnover (Y) a store will have for a given retail space. Obviously, the function Y = f(X) is increasing, since a hypermarket sells more goods than a stall.

A few words about the correctness of the initial data used for prediction

Let's say we have a table built with data for n stores.

According to mathematical statistics, the results will be more or less reliable if data on at least 5-6 objects are examined. In addition, "anomalous" results must not be used. In particular, an elite small boutique can have a turnover many times greater than that of large "mass-market" outlets.

The essence of the method

The table data can be displayed on the Cartesian plane as points $M_1(x_1, y_1), \dots, M_n(x_n, y_n)$. The solution of the problem now reduces to selecting an approximating function $y = f(x)$ whose graph passes as close as possible to the points $M_1, M_2, \dots, M_n$.

Of course, you can use a high-degree polynomial, but this option is not only difficult to implement, it is simply incorrect, since it will not reflect the main trend that needs to be detected. The most reasonable solution is to look for the straight line $y = ax + b$ that best approximates the experimental data or, more precisely, for its coefficients a and b.

Assessing accuracy

For any approximation, the assessment of its accuracy is of particular importance. Denote by $e_i$ the difference (deviation) between the experimental and functional values at the point $x_i$, i.e. $e_i = y_i - f(x_i)$.

Obviously, to assess the accuracy of the approximation, one could use the sum of the deviations: when choosing a straight line to represent the dependence of Y on X approximately, preference would be given to the line with the smallest sum of $e_i$ over all the points considered. However, it is not that simple, since in practice there will be negative deviations along with positive ones.

You can solve the problem by using the absolute values of the deviations or their squares. The latter method is the most widely used. It is applied in many areas, including regression analysis (in Excel, it is implemented by two built-in functions), and has long proven its effectiveness.
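The difference between the three criteria can be sketched in Python. The data points and the two candidate lines below are hypothetical, chosen only so that the effect is easy to see; `criteria` is an illustrative helper, not part of any library.

```python
# Hypothetical data points lying exactly on y = 2x (illustrative only).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]


def deviations(a, b):
    """Deviations e_i = y_i - (a*x_i + b) for the line y = a*x + b."""
    return [y - (a * x + b) for x, y in zip(xs, ys)]


def criteria(a, b):
    """Return (sum of e_i, sum of |e_i|, sum of e_i squared) for a line."""
    e = deviations(a, b)
    return sum(e), sum(abs(v) for v in e), sum(v * v for v in e)


good = criteria(2.0, 0.0)  # the line the points actually lie on
bad = criteria(0.0, 5.0)   # a horizontal line through the mean of y

print("good line:", good)
print("bad line: ", bad)
```

Both lines have a zero sum of deviations, so that criterion cannot tell them apart. The sums of absolute values and of squares are zero only for the line that really fits the data.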

Least squares method

In Excel, as you know, there is a built-in AutoSum function that calculates the sum of all values in the selected range. Thus, nothing prevents us from calculating the value of the expression $e_1^2 + e_2^2 + e_3^2 + \dots + e_n^2$.

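In mathematical notation, this looks like:

$$\sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \bigl(y_i - f(x_i)\bigr)^2$$

Since the decision was initially made to approximate using a straight line, we have:

$$\sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \bigl(y_i - (a x_i + b)\bigr)^2$$

Thus, the task of finding a straight line that best describes a specific relationship between X and Y amounts to calculating the minimum of a function of two variables:

$$F(a, b) = \sum_{i=1}^{n} \bigl(y_i - (a x_i + b)\bigr)^2 \to \min$$

This requires equating to zero the partial derivatives with respect to the variables a and b, and solving a simple system of two equations with two unknowns:

$$\begin{cases} \dfrac{\partial F}{\partial a} = -2 \sum_{i=1}^{n} \bigl(y_i - (a x_i + b)\bigr) x_i = 0 \\[2ex] \dfrac{\partial F}{\partial b} = -2 \sum_{i=1}^{n} \bigl(y_i - (a x_i + b)\bigr) = 0 \end{cases}$$

After simple transformations, including dividing by 2 and manipulating the sums, we get:

$$\begin{cases} a \sum_{i=1}^{n} x_i^2 + b \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} x_i y_i \\[1ex] a \sum_{i=1}^{n} x_i + b\,n = \sum_{i=1}^{n} y_i \end{cases}$$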

Solving it, for example, by Cramer's method, we obtain a stationary point with certain coefficients $a^*$ and $b^*$. This is the minimum, i.e. to predict what turnover a store will have for a given area, the straight line $y = a^* x + b^*$ is suitable; it is the regression model for the example in question. Of course, it will not give the exact result, but it will help you get an idea of whether buying a store of a particular area on credit will pay off.
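As a minimal sketch, the solution by Cramer's method can be written in Python. The function name `fit_line` and the store data are hypothetical; the numbers are invented for illustration and chosen to lie exactly on a straight line so that the result is easy to check by hand.

```python
def fit_line(xs, ys):
    """Least squares line y = a*x + b, solved by Cramer's rule.

    The system being solved is
        a*sum(x^2) + b*sum(x) = sum(x*y)
        a*sum(x)   + b*n      = sum(y)
    """
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))

    det = n * sxx - sx * sx  # determinant of the system
    if det == 0:
        raise ValueError("all x values are equal; the line is not determined")

    a = (n * sxy - sx * sy) / det
    b = (sy * sxx - sx * sxy) / det
    return a, b


# Hypothetical table: retail space (sq. m) and annual turnover (million rubles).
areas = [100, 200, 300, 400, 500]
turnover = [12, 22, 32, 42, 52]

a, b = fit_line(areas, turnover)
forecast = a * 350 + b  # predicted turnover for a 350 sq. m store
print(a, b, forecast)
```

Computing the four sums separately mirrors how the calculation is usually laid out in a spreadsheet, with one helper column per sum.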

How to implement the least squares method in Excel

Excel has a function that calculates values by the least squares method. It has the following form: TREND(known Y values; known X values; new X values; constant). Let's apply this LSM formula in Excel to our table.

To do this, in the cell in which the result of the calculation using the least squares method in Excel should be displayed, enter the “=” sign and select the “TREND” function. In the window that opens, fill in the appropriate fields, highlighting:

range of known values for Y (in this case data for turnover);
range x 1 , …x n , i.e. the size of retail space;
the new values of x for which you need to find out the size of the turnover (for information about their location on the worksheet, see below).

In addition, the formula has a logical argument "Const". If you enter 1 (TRUE) in the corresponding field or leave it empty, the coefficient b is calculated in the usual way; if you enter 0 (FALSE), the calculations are carried out assuming that b = 0.
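What forcing b = 0 does to the fitted line can be sketched in Python. This is not Excel's implementation: the two helper functions and the sample points are hypothetical, and the slope formula for the line through the origin follows from minimizing the sum of $(y_i - a x_i)^2$ over a alone.

```python
def fit_with_intercept(xs, ys):
    """Least squares line y = a*x + b with b estimated from the data."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, y_mean - a * x_mean


def fit_through_origin(xs, ys):
    """Least squares line y = a*x, i.e. the same model with b forced to 0."""
    a = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
    return a, 0.0


# Hypothetical points lying exactly on y = 2x + 1 (illustrative only).
xs = [1.0, 2.0, 3.0]
ys = [3.0, 5.0, 7.0]

free = fit_with_intercept(xs, ys)
forced = fit_through_origin(xs, ys)
print("b estimated:", free)
print("b forced to 0:", forced)
```

With b estimated, the fit recovers the line the points lie on. With b forced to zero, the slope has to grow to compensate, so the line no longer passes through the points.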

If you need the forecast for more than one x value, then after entering the formula do not press "Enter"; instead, press the combination "Ctrl" + "Shift" + "Enter" on the keyboard.

Some Features

Regression analysis is accessible even to beginners. The Excel formula for predicting an array of unknown values, "TREND", can be used even by those who have never heard of the least squares method. It is enough to know a few features of how it works. In particular:

If the range of known y values is placed in a single column, each column of known x values is treated by the program as a separate variable; if it is placed in a single row, each row of known x values is treated as a separate variable.
If the range of known x values is not specified in the TREND window, the program treats it as the array of integers 1; 2; 3; ..., with as many elements as the range of known y values.
To output an array of "predicted" values, the TREND expression must be entered as an array formula.
If no new x values are specified, TREND takes them to be equal to the known x values. If the known x values are not specified either, the array 1; 2; 3; 4; ... is used as the argument, matching the size of the range of known y values.
The range with new x values must contain a column (or row) for each independent variable, just like the range with known x values. In other words, if the known y values are in one column, the ranges of known and new x values must have the same number of columns; if they are in one row, the same number of rows.
An array of known x values can contain several variables. If there is only one, the ranges of known x and y values must have the same dimensions. With several variables, the range of known y values must fit in one column or one row.
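The default-argument rules listed above can be mimicked in a short Python sketch. The function `trend_like` is a hypothetical single-variable analogue written for illustration; it is not Excel's TREND and does not cover several independent variables or the "Const" argument.

```python
def trend_like(known_y, known_x=None, new_x=None):
    """Single-variable analogue of the defaults described above.

    - known_x omitted -> 1, 2, 3, ..., one value per known y
    - new_x omitted   -> the known x values are reused
    Returns the fitted line's values at new_x.
    """
    n = len(known_y)
    if known_x is None:
        known_x = list(range(1, n + 1))
    if len(known_x) != n:
        raise ValueError("known_x and known_y must have the same size")
    if new_x is None:
        new_x = known_x

    # Ordinary least squares line y = a*x + b.
    x_mean = sum(known_x) / n
    y_mean = sum(known_y) / n
    sxx = sum((x - x_mean) ** 2 for x in known_x)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(known_x, known_y))
    a = sxy / sxx
    b = y_mean - a * x_mean
    return [a * x + b for x in new_x]


# Hypothetical y values; with x omitted they are paired with 1, 2, 3.
fitted = trend_like([3.0, 5.0, 7.0])
ahead = trend_like([3.0, 5.0, 7.0], new_x=[4, 5])
print(fitted, ahead)
```

Returning a list for every call parallels the need to enter the spreadsheet formula as an array formula when several predictions are wanted at once.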

FORECAST function

Forecasting in Excel is implemented by several functions. One of them is called "FORECAST". It is similar to TREND, i.e. it gives the result of calculations by the least squares method, but only for a single X for which the value of Y is unknown.

Now you know the Excel formulas that let even a beginner predict the future value of an indicator from a linear trend.