Approved

by order of the Chairman of the Committee on Statistics of the Ministry of National Economy of the Republic of Kazakhstan

Methodology for conducting a sample survey of households

Chapter 1. General Provisions

1. The methodology for conducting a sample survey of households (hereinafter - the Methodology) refers to the statistical methodology, formed in accordance with international standards and approved in accordance with the Law of the Republic of Kazakhstan dated March 19, 2010 "On State Statistics" (hereinafter - the Law).

2. The methodology establishes the main aspects and methods for analyzing the sample and general population of households and is intended for use by the structural divisions of the Committee on Statistics of the Ministry of National Economy of the Republic of Kazakhstan (hereinafter referred to as the Committee).

3. A subset of households is selected for sample surveys, and observations are made or data are collected within this subset. The results obtained are extrapolated (distributed) to the entire population as a whole.

4. The main advantages of using the sampling method in modern statistics are:

1) reducing the time for statistical observations (surveys);

2) reducing the information load on respondents;

3) significant savings in labor costs, material and financial resources during the survey;

4) significantly accelerated receipt of the results of the study compared to a complete survey.

5. This Methodology uses concepts in the meanings defined in the Law, as well as the following definitions:

1) panel observation method - a method of collecting information in which a fixed group of units of analysis is surveyed periodically over a relatively long time, while the subject of the study remains constant;

2) the general population - a complete group of all units of analysis, whose characteristics are subject to evaluation;

3) representativeness - the correspondence of the characteristics of the sample to the characteristics of the population or the general population;

4) mathematical expectation - the average value of a particular characteristic over all possible samples, that is, the average of all possible results weighted by the probability of each result;

5) parameter - a value calculated from all values in the general population, that is, a descriptive measure of the general population;

6) stratum - a layer (subgroup) of the population made up of units (respondents) with the same or similar indicators;

7) sampling plan - a set of specifications that define the general population and sample units, as well as the degree of probability of possible samples;

8) sample population (sample) - a set of cases (subjects, objects, events, samples), using a certain procedure, selected from the general population for participation in the study;

9) sample size - the total number of observation units in the sample.

Chapter 2. Planning and sampling process

6. When planning a survey, it is necessary to determine the geographic areas to be covered and the population to be surveyed.

7. When determining the statistical population, it is necessary to identify the population group from which the sample is formed. Outlying areas with few households or residents are removed from the sampling frame because covering them is too expensive. Since they represent only a small fraction of the population, their impact on population figures is very small. The report on the results of such a survey must clearly indicate the exclusion of these areas.

8. The process of forming a sample for conducting a survey consists of several stages:

definition of the general population;

establishing a sampling frame;

choice between probability and non-probability methods of selection;

definition of sampling plan;

determination of the sample size;

direct sampling according to the plan.

Chapter 3. Definition of population and sampling frame

9. Population census data or the information system "Statistical Register of Housing Stock" (hereinafter - IS SRZHF) is the main source for forming a sampling frame for household surveys in the Republic of Kazakhstan. Census data provide information on the size, composition and geographical distribution of the population, as well as its socioeconomic and demographic characteristics. The population census collects information on each individual in the household and on each dwelling throughout the territory. In order to avoid cases of non-receipt of data from respondents, IS SRZHF is used to form the sample. IS SRZHF was created to generate and accumulate data on residential buildings and residential premises for housing stock statistics and to form samples for household surveys.

10. Accounting units in IS SRZHF are all residential buildings and residential premises (apartments) located on the territory of the Republic of Kazakhstan.

These include:

residential premises (apartment);

single-family (individual) house;

semi-detached house;

building with three or more apartments.

Each house and apartment has an identification number (hereinafter - ID).

In addition, IS SRZHF contains the following data: apartment ID, code from the classifier of administrative territorial objects (KATO), street, house number, apartment number, total area, living area. The data contained in IS SRZHF are updated daily.

Chapter 4. Sampling strategy and methods

11. To select elements from the general population, the following methods of probabilistic selection are used:

systematic random sampling (step sampling);

a sample with probability proportional to size (hereinafter referred to as PPS).

12. Simple random selection (sample) provides an equal probability of being selected for each element of the general population. There are the following varieties of this method:

repeated random selection (with replacement);

non-repeated random selection (without replacement).

13. Non-repeated random selection gives more accurate results of sample observation than repeated selection, since with the same sample size the observation covers more distinct units of the general population. In cases where non-repeated selection is not possible, repeated selection is used.
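The difference between the two varieties of simple random selection can be illustrated with a minimal Python sketch. The frame of 1,000 household IDs and the function names are hypothetical and serve only as an illustration.

```python
import random

def srs_without_replacement(frame, n, seed=None):
    """Non-repeated selection: each unit can enter the sample at most once."""
    rng = random.Random(seed)
    return rng.sample(frame, n)

def srs_with_replacement(frame, n, seed=None):
    """Repeated selection: a drawn unit is returned and may be drawn again."""
    rng = random.Random(seed)
    return rng.choices(frame, k=n)

frame = list(range(1, 1001))  # hypothetical frame of 1,000 household IDs
wor = srs_without_replacement(frame, 100, seed=1)
wr = srs_with_replacement(frame, 100, seed=1)

# Without replacement the 100 draws always cover 100 distinct households;
# with replacement the number of distinct households can be smaller.
print(len(set(wor)), len(set(wr)))
```

With the same sample size, the non-repeated sample never covers fewer distinct units than the repeated one, which is the reason given above for its better accuracy.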

14. The essence of systematic random sampling is the selection of elements from the sampling frame at a fixed interval (the selection step), starting with a first element that is selected at random.

For example, consider forming a systematic sample of 500 elements from a general population of 15,000 employees of an organization.

First, a random start is determined within the first selection interval; then every element at the selection step after it is selected (15,000 / 500 = 30, so the selection step is 30).
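The example above can be sketched in Python as follows. This is a minimal illustration that assumes the population size is an exact multiple of the sample size; the function name and the numbered list of employees are hypothetical.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Pick a random start inside the first step, then take every step-th unit."""
    N = len(frame)
    if n <= 0 or N % n != 0:
        raise ValueError("this sketch assumes N is an exact multiple of n")
    step = N // n                                  # 15,000 / 500 = 30
    start = random.Random(seed).randrange(step)    # position 0 .. step - 1
    return frame[start::step]

employees = list(range(1, 15001))  # hypothetical numbered list of employees
sample = systematic_sample(employees, 500, seed=42)
print(len(sample), sample[1] - sample[0])
```

When the population size is not a multiple of the sample size, a fractional step or circular systematic selection is normally used; that case is left out of the sketch.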

15. The probability-proportional-to-size (PPS) selection method improves the estimation accuracy if the auxiliary size variable used to determine the probabilities is approximately proportional to the characteristics being studied. With the PPS method, units with large values of the size variable are more likely to fall into the sample. This method is often used in household surveys to select areas, with the probability of including an area in the sample proportional to the size of the resident population in that area.
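One common way to carry out selection with probability proportional to size is systematic selection along the cumulated size measures. The sketch below illustrates that variant; it is an assumption, not a procedure prescribed by the Methodology, and the area populations are invented for the example.

```python
import random
from bisect import bisect_right
from itertools import accumulate

def inclusion_probabilities(sizes, n):
    """Probability of each area entering a sample of n areas: n * size / total."""
    total = sum(sizes)
    return [n * s / total for s in sizes]

def pps_systematic(sizes, n, seed=None):
    """Return the indices of n areas selected with probability proportional to size."""
    total = sum(sizes)
    interval = total / n
    if max(sizes) > interval:
        raise ValueError("an area larger than the interval must be handled separately")
    start = random.Random(seed).random() * interval   # random start in [0, interval)
    cumulative = list(accumulate(sizes))
    picks = []
    for k in range(n):
        point = start + k * interval
        # the selected area is the one whose cumulative range contains the point
        picks.append(min(bisect_right(cumulative, point), len(sizes) - 1))
    return picks

# hypothetical resident populations of ten areas
sizes = [1200, 300, 450, 800, 2500, 950, 600, 1700, 400, 1100]
selected = pps_systematic(sizes, 3, seed=7)
probabilities = inclusion_probabilities(sizes, 3)
print(selected, [round(p, 3) for p in probabilities])
```

Areas with a larger resident population occupy a longer stretch of the cumulated scale and are therefore hit by a selection point more often.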

Section 1. Stratified sampling

16. Stratification of the population before sampling is a widely used technique in planning a household survey. It classifies the population into subpopulations (strata) based on additional information that is known about the population, for example territorial characteristics, gender and age categories, type of area, number of residents, or type of structure or building. The main principle of stratification is heterogeneity between strata and homogeneity within strata. For example, urban and rural areas are formed as two separate strata for a household survey: urban and rural populations differ from each other in many respects (type of employment, source and size of income, average household size, birth rate), while persons belonging to one of these subgroups have similar characteristics. The probability of selection in a stratified sample using non-repeated random selection is calculated by the following formula:

$p_i = n_i / N_i$

where $n_i$ is the sample size in the i-th stratum and $N_i$ is the size of the general population in the i-th stratum.

17. The advantages of stratified sampling are:

Δ is the marginal sampling error.

23. To determine the sample size, the following parameters of the population are estimated:

1) The arithmetic mean (for example, of household income and expenses, or of the number of people living in households) is calculated over all units of the general population, is called the general average ($\bar{X}$) and is calculated using the following formula:

$\bar{X} = \frac{1}{N} \sum_{i=1}^{h} X_i$

where $X_i$ is the sum of the indicator in the i-th stratum, $h$ is the number of strata and $N$ is the number of units in the general population.

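2) Population variance is defined as the mean of the squared deviations of all individual observations from their mean.

Population variance is calculated using the following formula:

$\sigma^2 = \frac{1}{N} \sum_{j=1}^{N} (x_j - \bar{X})^2$

where $x_j$ is the value of the indicator for the j-th unit, $\bar{X}$ is the general average and $N$ is the number of units in the general population.

The square root of the variance is called the standard deviation and is calculated using the formula:

$\sigma = \sqrt{\sigma^2}$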

3) If the error is expressed as the standard error (m), then the following formula is used to determine the sample size:

RSE is the relative standard error of the sample.

If the finite population correction is not taken into account, the formula for determining the sample size will be as follows:
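The population parameters and the sample size calculation described above can be sketched as follows. The sketch assumes simple random sampling and the textbook relation $m^2 = \frac{\sigma^2}{n}(1 - \frac{n}{N})$, which gives $n = n_0 / (1 + n_0 / N)$ with $n_0 = \sigma^2 / m^2$; the function name and the income values are hypothetical, and the exact form used by the Committee may differ.

```python
import math
import statistics

def sample_size(variance, m, N=None):
    """Sample size for a target standard error m.

    Without N the finite population correction is ignored: n0 = variance / m**2.
    With N the correction (1 - n/N) is applied: n = n0 / (1 + n0 / N).
    """
    n0 = variance / m ** 2
    if N is None:
        return math.ceil(n0)
    return math.ceil(n0 / (1 + n0 / N))

# hypothetical household incomes for a tiny population
incomes = [100, 200, 300, 400, 500]
general_mean = statistics.fmean(incomes)          # general average
general_variance = statistics.pvariance(incomes)  # population variance
general_sd = statistics.pstdev(incomes)           # standard deviation

n_infinite = sample_size(variance=2500, m=2)          # correction ignored
n_finite = sample_size(variance=2500, m=2, N=10000)   # correction applied
print(general_mean, general_variance, round(general_sd, 2), n_infinite, n_finite)
```

If the target is given as a relative standard error, the standard error can be obtained first as the relative error multiplied by the expected mean and then passed to the same function.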

24. Once the sample size has been determined, the sample should be allocated to strata if it is a stratified sample, or to clusters if it is a cluster sample. The sample can be allocated equally across strata (uniform allocation) or distributed in other ways. Two important criteria affect how the sample size in each stratum is determined.

The first criterion is convenience: the method of proportional allocation is chosen, in which the sample size in the i-th stratum is calculated by the formula:

$n_i = n \cdot N_i / N$

where:

$n_i$ - sample size in the i-th stratum, i = 1, 2, …, h;

$N_i$ - number of households in the i-th stratum, i = 1, 2, …, h;

$n$ - total sample size;

$N$ - total number of households in the general population.

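The second criterion is accuracy: the method of optimal allocation is chosen, which gives the smallest mean square error (standard error) of the sample.

25. Where the costs of sampling from different strata are the same, the optimal allocation is called the Neyman allocation. In this case, the sample size in the i-th stratum is determined by the formula:

$n_i = n \cdot \frac{N_i \sigma_i}{\sum_{k=1}^{h} N_k \sigma_k}$

where $\sigma_i$ is the standard deviation of the indicator in the i-th stratum, $N_i$ is the number of households in the i-th stratum and $n$ is the total sample size.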
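The two allocation methods can be compared with a short Python sketch. The stratum sizes and standard deviations below are invented for illustration, and the functions return unrounded values; rounding to whole households is left to the user.

```python
def proportional_allocation(n, stratum_sizes):
    """n_i = n * N_i / N: each stratum receives its population share of the sample."""
    N = sum(stratum_sizes)
    return [n * N_i / N for N_i in stratum_sizes]

def neyman_allocation(n, stratum_sizes, stratum_sds):
    """n_i = n * N_i * sd_i / sum(N_k * sd_k): more sample where variability is higher."""
    products = [N_i * sd_i for N_i, sd_i in zip(stratum_sizes, stratum_sds)]
    total = sum(products)
    return [n * p / total for p in products]

# hypothetical urban and rural strata
sizes = [6000, 4000]   # households per stratum
sds = [30.0, 10.0]     # standard deviation of the indicator per stratum

proportional = proportional_allocation(500, sizes)
neyman = neyman_allocation(500, sizes, sds)
print(proportional, [round(x, 1) for x in neyman])
```

In this example the Neyman allocation shifts sample from the more homogeneous rural stratum to the more variable urban stratum, while the total sample size stays at 500.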

If the probability of selecting a sample unit is $p_i$, then its base weight, denoted $w_i$ (expansion factor), is calculated by the formula:

$w_i = 1 / p_i$

32. Non-response by sample units for subjective reasons in household surveys is handled by adjusting the sample weights. The non-response adjustment for the i-th sample unit is calculated using the following equation:

$w_i^{nr} = n_s / n_r$

where $n_s$ is the number of units selected and $n_r$ is the number of units that actually responded.

The final weight of the i-th sample unit, adjusted for non-response, is calculated using the following equation:

$w_i^{f} = w_i \cdot w_i^{nr}$

where $w_i$ is the initial base weight and $w_i^{nr}$ is the non-response adjustment.
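The weighting steps can be sketched in Python under two common assumptions: the base weight is the inverse of the selection probability, and the non-response adjustment is the ratio of selected to responding units within an adjustment class. The function names and numbers are hypothetical.

```python
def base_weight(selection_probability):
    """Base weight (expansion factor): inverse of the selection probability."""
    return 1.0 / selection_probability

def nonresponse_adjustment(n_selected, n_responded):
    """Ratio of selected units to units that actually responded."""
    if n_responded == 0:
        raise ValueError("no respondents: the class must be merged with another one")
    return n_selected / n_responded

def final_weight(selection_probability, n_selected, n_responded):
    """Base weight multiplied by the non-response adjustment."""
    return base_weight(selection_probability) * nonresponse_adjustment(n_selected, n_responded)

# a household selected with probability 0.01 in a class where
# 40 of the 50 selected households responded
w_base = base_weight(0.01)
w_adj = nonresponse_adjustment(50, 40)
w_final = final_weight(0.01, 50, 40)
print(w_base, w_adj, w_final)
```

Each responding household then represents 125 households of the general population instead of 100, which compensates for the 10 households that did not respond.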


Estimate of the standard error of the sample. Possible discrepancies between the characteristics of the sample and the general population are measured by the standard error (mean error) of the sample. The sample standard error is determined by the following formula:

$m = \sqrt{\sigma^2 / n}$

where:

$m$ - standard error;

$\sigma^2$ - variance of the general population;

$n$ - sample size.

36. The sample standard error shows the absolute value of the error. To express the error in relative terms, the relative standard error (coefficient of variation) is used. This coefficient is expressed as a percentage and is calculated by the formula:

$RSE = \frac{m}{\bar{x}} \cdot 100\%$

where $m$ is the standard error and $\bar{x}$ is the sample mean.

$|\bar{x} - \bar{X}| \le \Delta$, from which it follows that $\bar{x} - \Delta \le \bar{X} \le \bar{x} + \Delta$, where $\bar{x}$ is the sample mean, $\bar{X}$ is the general average and $\Delta$ is the marginal sampling error.

The sum of the indicator in the i-th stratum of the sample;

The average value of the indicator in the i-th stratum of the sample.
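The standard error, the relative standard error and the confidence limits discussed above can be computed together, as in the following sketch. It assumes the usual relation between the marginal error and the standard error, $\Delta = t \cdot m$, with a confidence coefficient t = 1.96 for a 95% confidence level; the function names and input values are hypothetical.

```python
import math

def standard_error(variance, n):
    """m = sqrt(variance / n), ignoring the finite population correction."""
    return math.sqrt(variance / n)

def relative_standard_error(m, mean):
    """Standard error as a percentage of the estimated mean."""
    return m / mean * 100

def confidence_limits(mean, m, t=1.96):
    """Lower and upper limits: mean - delta and mean + delta, with delta = t * m."""
    delta = t * m
    return mean - delta, mean + delta

m = standard_error(variance=2500, n=625)
rse = relative_standard_error(m, mean=100)
lower, upper = confidence_limits(mean=100, m=m)
print(m, rse, round(lower, 2), round(upper, 2))
```

A different confidence level only changes the coefficient t; the other steps stay the same.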

Population censuses are labor-intensive and costly operations. For this reason, they are held relatively infrequently and their programs are limited to only the most necessary information. Sample surveys make it possible, with less time, effort and money, to study the problem of interest to the researcher on a small group of the population, selected according to special rules, so that the results obtained can then be extended to the entire population (or to that part of society which is the object of study).

The application of the sampling method in demography is no different from its analogous application in other sciences. It has found wide application in population censuses. On its basis, two micro-censuses of the population were conducted in 1985 and 1994.

In our demography, sample studies have been used most widely in the study of fertility factors. In 1966, 1967, 1969, 1972, 1975, 1978, 1981 and 1984, the Department of Demography of the Scientific Research Institute of the Central Statistical Bureau of the USSR carried out in-depth studies of fertility factors in the families of workers and employees who keep budget records. In the United States, sample surveys of fertility factors that are representative of the country as a whole are conducted annually; a little more than 50,000 women are interviewed (about 30,000 of them married), and the findings are extended to the entire population of the country (260 million people). In 1974-1982, the International Statistical Institute, together with the International Union for the Study of Population Problems, conducted the World Fertility Survey. It consisted of a series of sample surveys of fertility factors conducted in 21 economically developed countries and 41 developing countries. A large amount of scientifically important information was collected, for the first time comparable and representative for 62 countries of the world, which in the late 1970s were home to about 1.8 billion people, or 42% of the world's population. Our country, unfortunately, did not participate in this survey.

However, sample surveys also require a lot of staff and resources. For this reason, not everyone can organize them, and not always. Then one has to deviate from the theory and conduct a survey that does not claim to be representative and is organized without observing the rules of the sampling method. Such a survey, in contrast to a sample survey, is usually called special (i.e., dedicated to an in-depth study of some narrow task).

Unlike a sample survey, a special survey does not have the same evidentiary power. But, as they say, something is better than nothing (sometimes, however, the opposite is true, if generalizing conclusions are drawn from an unrepresentative special survey, and this happens quite often). Perhaps the majority of domestic sociological studies are not representative, although it also happens that their authors, without much embarrassment, call their studies sample studies. Alas, it has always been difficult to conduct a real sample study in our country, since such studies have always been carried out mainly on the enthusiasm of the researchers and did not enjoy the support of the authorities. For this reason, special surveys cannot be completely abandoned. It is only necessary to show scientific conscientiousness in conducting them and caution in interpreting the results.

4. METHODOLOGY OF STATISTICAL POPULATION SURVEY
4.1. HOUSEHOLD SURVEYS
4.1.1. HOUSEHOLD BUDGET SURVEY METHODOLOGY
Goals and objectives of statistical observation

Household budget surveys are multipurpose in nature. Their main tasks are to obtain weights for calculating the consumer price index and data for compiling the accounts of the household sector in the system of national accounts. Traditionally, the budget survey is also a source of statistical data on the distribution of the population by level of material well-being, on the level of poverty and on food consumption.

The household budget survey is conducted in all republics within the Russian Federation, territories and regions using a sample method and covers 49,175 households.

The survey is based on the principles of voluntary participation of selected households.

Study population definition (survey scope)

The general population for the selection is made up of all types of households, with the exception of collective households (the part of the population consisting of persons who are long-term residents of hospitals, nursing homes, boarding schools and other institutions, monasteries, religious communities and other collective residential premises).

Definition and procedure for the formation of the statistical basis of observation,
definition of sampling unit, observation unit

The information array of the 1994 population microcensus served as the basis for constructing a sample of households.

the presence (absence) of a backyard, garden, suburban area, vegetable garden

Distribution of examined persons:

by household size: 1 person, 2 people, 3 people, 4 people, 5 people, 6 people, 7 or more people

by age: 0-6 years, 7-12 years, 13-16 years, 17-29 years, 30-39 years, 40-49 years, 50-54 years, 55-59 years, 60-64 years, 65 years and older

by sex: men; women

by source of livelihood

 | Form No. 1 (section 1, question 2, column 2) | The sum of the values of indicators R1V212 - R1V252
Expenses for the purchase of jewelry | Form (section 5c, column 4); form (section 1c, column 4) | Amount by code 941
Expenses for the purchase of building materials for construction and overhaul | Form (section 5b, column 5); form (section 1b, column 5) | Amount by codes
Payment for construction and overhaul services | Form (section 6, column 4); form (section 2, column 4) | Amount by code 502
Intermediate consumption expenditure and gross fixed capital formation | Summary indicator | e + f + h + i + k
Taxes, fees and other obligatory payments | Form No. 1 (section 1, question 1, column 2) | The sum of the values of the R1V112 indicator
Other expenses | Form No. 1 (section 1, question 1, column 2; question 3, column 2; question 5); form (section 6, column 4); form (section 2, column 4) | The sum of the values of indicators R1V122-R1V152 (repayment of loans and credits, payment of alimony, insurance and membership fees, gifts to relatives and friends), R1V362 (purchase of other real estate), R1V511 (other expenses) according to form No. 1, minus the amount by codes 961-965 (insurance services) according to forms No. 1-b
Cash expenses | Summary indicator | 1 + 2 + 3 + 4
Amount of savings made | Form No. 1 (section 1, question 9) | The sum of the values of the R1V911 indicator
Amount of loans and spent savings | Form No. 1 (section 1, question 7) | The sum of the values of the R1V711 indicator
Growth of financial assets | Summary indicator | l - m
Cash income | Summary indicator | I + II
Value of in-kind food receipts | Form (section 2, column 4) | The sum of in-kind food receipts valued at average purchase prices [(codes 101-108, 121-134, 141-145, 151-163, 171, 172, 181-189, 201-210, 221-227, 241, 242, 244, 261-263, 271, 272) x kz]
Value of subsidies and benefits provided in kind | Form No. 1 (section 3, question 17) | The sum of the values of the R3V172 indicator
Value of in-kind receipts | Summary indicator |
Gross income | Summary indicator | III + IV
Value of food donated by the household | Form No. 1-a (section 3, column 3) | The sum of in-kind food transfers valued at average purchase prices [(codes 101-108, 121-134, 141-145, 151-163, 171, 172, 181-189, 261-263, 271, 272) x kz]
Final consumption expenditure | Summary indicator | 1 + IV - R
Available resources | Summary indicator | I + l + IV or V + m

Tabulation sections and the procedure for forming grouping features
based on survey results

The results of the survey are tabulated for the Russian Federation as a whole and for the regions included in the survey plan, in the following sections:

I. Geographically, determined on the basis of the address part of the survey forms:

households in urban areas;

households in rural areas;

II. According to the form of quantitative expression of the developed indicators:

absolute data are summary results of the survey, obtained by summing up the weighted data of individual household budgets covering the entire population of respondents or its part belonging to one or another group;

data on average per 100 household members are calculated by dividing the absolute data by the weighted number of current household members (and multiplying by 100); the number of current members is determined on the basis of monthly registration of the persons living in the household. Current members include all members of the household except those who are absent for a long time (on a business trip, drafted into the Russian Army, studying at boarding schools, etc.);

relative data are given as percentages and calculated from absolute data;

III. In groupings according to a number of socio-economic characteristics. Grouping features in the development of survey results are:

1) composition of households:

2) socio-demographic typology of households:

A. Family households (families) - types 1 - 9.

Type 1 - "Married couple without children" (complete simple family without children). These are either young spouses who do not yet have children, or spouses whose children have already left the household (to work, study, serve in the army, etc., have started their own family, or have died).

Type 2 - "Married couple without children with relatives" (complete complex family without children).

This type includes families in which, in addition to spouses, parents (or one of them) live, as well as other relatives without children.

Type 3 - "Married couple with children under 18" (a complete family with minor children).

This type includes families with children under 18 years of age. A family remains in this type even if it has two children and only one of them has not yet reached the age of 18.

Type 4 - "Married couple with children under 18 with relatives" (complete complex family with minor children).

In addition to a married couple and their children, families of this type may include the parents (or one of them) of one of the spouses, other relatives (brother, sister, grandmother, grandfather, aunt, nephew, etc.).

Type 5 - "Married couple with adult children and relatives" (complete simple/complex family with adult children).

This group of families includes those families that include spouses and their adult children. In addition, this also includes complex families, where, in addition to them, the parents of one of the spouses live, as well as other relatives without children. Their combination in one group is explained by their relatively low share among family households, as well as the main feature - the absence of minor children, which is a determining factor in family well-being.

Type 6 - "Mother (father) with children under 18 years old" (incomplete simple family with minor children).

Just as in the case of complete families, a mother or father who has several children of different ages, including, along with minors, children over 18, remains in the same type.

Type 7 - "Mother (father) with children under 18 with relatives" (incomplete complex family with minor children).

The composition of families of this type, in addition to one of the spouses and children, may include parents (or one of them), other relatives.

Type 8 - "Mother (father) with adult children and relatives" (incomplete simple/complex family with adult children).

In this case (as well as in type 5), simple and complex incomplete families are combined, the main feature that unites them is the absence of children under the age of 18.

Type 9 - "Other family households".

These are families consisting of relatives, but not including spouses or one of the parents with children. Most often, these include families in which grandparents live with their grandchildren, aunt and nephews, brother and sister or two sisters (without parents), etc.;

B. Non-family households - types 10, 11.

Type 10 - Singles.

Persons who do not have a family, as well as those who have a family, but live permanently separately from it and do not have a common budget with it.

Type 11 - "Other non-family households".

These are households that include persons who are not related by kinship or marriage but pool their budgets. For example, two students who rent an apartment together and run a common household.

The source of information for forming these types of households is the Household Register, which contains information on all current members of the household as of the end of the quarter (section 3 of the Household Budget Survey Questionnaire, lines 1, 4);

3) indicators characterizing the level of welfare of households.

The system of indicators characterizing the welfare of households based on the results of the survey includes:

cash income;

cash expenses;

gross income;

available resources.

Groupings are constructed using the method of ranking individual household budgets in ascending order of the average per capita value of the attribute used as the basis for assessing well-being. Using this method, the surveyed households and the population in them are grouped:

by decile (10 percent) groups of the surveyed population;

according to the interval series of the distribution of households (population) depending on the size of the welfare indicator;

When constructing groupings by decile groups, the weighted data on the number of persons in households are ranked in ascending order of the average per capita welfare indicator and cumulated to obtain the total number of the surveyed population. This number is taken as 100%. The households that together contain each successive 10% of the total surveyed population form the corresponding decile group, with the groups ordered by increasing welfare.

The key characteristic for assessing the uneven distribution of household welfare indicators is the coefficient indicating the ratio between the total values for the most and least well-to-do population groups. In developing household budget survey data, the funds ratio (coefficient of funds) is used as such a characteristic; it is calculated according to the following formula:

$K_f = S_{10} / S_1$

where $S_{10}$ and $S_1$ are the total values of the welfare indicator for the tenth (most well-to-do) and first (least well-to-do) decile groups of the population, respectively.
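The construction of decile groups and of the funds ratio can be sketched in Python as follows. The sketch makes simplifying assumptions that are not taken from the source: each household is assigned whole to the decile containing the midpoint of its cumulated weighted population, and the funds ratio is taken as the total of the welfare indicator in the tenth decile divided by the total in the first. The data are invented.

```python
def decile_groups(households):
    """households: list of (per_capita_value, weighted_persons).

    Returns the decile number (1-10) of each household, in input order.
    """
    order = sorted(range(len(households)), key=lambda i: households[i][0])
    total = sum(w for _, w in households)
    deciles = [0] * len(households)
    cumulated = 0.0
    for i in order:
        _, w = households[i]
        midpoint_share = (cumulated + w / 2) / total
        deciles[i] = min(10, int(midpoint_share * 10) + 1)
        cumulated += w
    return deciles

def funds_ratio(households, deciles):
    """Total welfare of the richest 10% divided by that of the poorest 10%."""
    top = sum(v * w for (v, w), d in zip(households, deciles) if d == 10)
    bottom = sum(v * w for (v, w), d in zip(households, deciles) if d == 1)
    return top / bottom

# hypothetical households: (per capita income, weighted number of persons)
values = [50, 10, 100, 30, 20, 70, 90, 40, 60, 80]
households = [(v, 1.0) for v in values]
deciles = decile_groups(households)
ratio = funds_ratio(households, deciles)
print(deciles, ratio)
```

In production tabulation a household that straddles a decile boundary may be split between two groups; the midpoint rule used here avoids that step for brevity.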

When constructing groupings according to the interval series, the order is used in accordance with which the weighted data on the number of persons are ranked as indicators of the level of well-being increase and are summarized within the boundaries of the interval series. The boundaries of the interval series are reviewed annually.

When constructing a grouping by category with a welfare level below the subsistence minimum (poor households), an order is applied in accordance with which the weighted data on the number of persons are ranked as the indicators of the welfare level increase and summed up within the boundaries to the subsistence minimum.

Among the characteristics that explain the position of the category of households with a welfare level below the subsistence level, the following indicators are used in developing the results of the household budget survey:

the percentage ratio of this category of households (the population in them) to the total number of households (the population);

shortage of funds for this category of households, necessary to bring their level of well-being to the subsistence level.

The first group of indicators includes poverty and extreme poverty rates.

Poverty ratio:

the number of all surveyed households in the j-th section;

the number of households in the j-th section with average per capita indicators of well-being below the subsistence level;

number of current members in

the average per capita subsistence level in the t-th region of the Russian Federation;

the number of current members in the i-th household;

the individual normative coefficient assigned depending on the age of a family member: for members of active working age - 1.125; of retirement age - 0.705; for children under 16 - 1.018.

The second group of indicators includes indices of the depth and severity of poverty. Both indices characterize the ratio of the deficit in the welfare level of households (ie, the difference between one of the welfare indicators and the subsistence minimum) to the subsistence minimum per person. The difference between the two indicators is that when calculating the poverty depth index, a simple ratio of the deficit to the subsistence minimum is taken, and when calculating the severity of poverty, it is raised to the second power, which ultimately makes this characteristic more sensitive to the level of well-being of the poorest.
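The poverty ratio and the indices of depth and severity of poverty can be sketched in Python. The sketch follows the widely used Foster-Greer-Thorbecke convention, in which the deficits of the poor are averaged over the whole weighted population; this normalization is an assumption, since the original formulas are not reproduced here, and the records are invented.

```python
def poverty_indices(records):
    """records: list of (per_capita_value, subsistence_minimum, weighted_persons).

    Returns (poverty ratio, depth index, severity index) as shares of the
    whole weighted population.
    """
    total = sum(w for _, _, w in records)
    headcount = depth = severity = 0.0
    for value, minimum, w in records:
        if value < minimum:
            gap = (minimum - value) / minimum   # deficit relative to the minimum
            headcount += w
            depth += w * gap                    # simple ratio
            severity += w * gap ** 2            # ratio raised to the second power
    return headcount / total, depth / total, severity / total

# hypothetical records with a subsistence minimum of 100 per person
records = [(50, 100, 1.0), (80, 100, 1.0), (120, 100, 1.0), (200, 100, 1.0)]
ratio, depth, severity = poverty_indices(records)
print(ratio, depth, severity)
```

Squaring the relative deficit makes the household at 50 contribute about six times as much to the severity index as the household at 80, which is why that index is more sensitive to the position of the poorest.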

f1, f2 - the selection fractions at stages I and II of sampling, respectively;

n - the number of enumeration areas included in the sample at stage I;

the average number of households per enumeration area in the general population;

Mi - the number of households in the i-th enumeration area (general population);

the average value of the feature in the i-th primary unit (enumeration area);

the average value of the feature for the entire sample;

the variance of the values of the feature among secondary units (households) within the i-th primary unit (enumeration area), estimated from the sample;

mi - the number of households selected from the i-th enumeration area;

the proportion of the feature in the sample for the i-th enumeration area;

the proportion of the feature for the entire sample;

the average number of households per enumeration area in the sample;

N - the number of enumeration areas in the general population;

the value of the feature obtained for the j-th household in the i-th enumeration area;

the number of persons who have this feature in the i-th enumeration area.

The boundaries of the confidence interval for the mean value of the feature.