Monitoring Glossary

From energypedia


This glossary article includes terms which are typically used in the area of monitoring and evaluation (M&E). It is not exhaustive.

►Please feel free to add Terms & Definitions.



Absolute poverty line

The absolute poverty line is set at a level below which consumption is considered too low to meet the minimum acceptable level of welfare. The Direct Calorie Intake Method, the Food-Energy Method, or the Cost of Basic Needs Method can be used to set an absolute poverty line.

Categorical Data

A set of data is said to be categorical if the values or observations belonging to it can be sorted according to category. Each value is chosen from a set of non-overlapping categories.

Cluster Sampling

Cluster sampling is a sampling technique in which the entire population is divided into groups, or clusters, and a random sample of these clusters is selected. Cluster sampling is typically used when the researcher cannot get a complete list of the members of the population they wish to study but can get a complete list of groups or 'clusters' of that population.
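As an illustration, the procedure above can be sketched in Python (the village names and membership lists are invented purely for this example):

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Select n_clusters whole clusters at random; every member
    of a selected cluster enters the sample."""
    rng = random.Random(seed)
    chosen = rng.sample(list(clusters), n_clusters)
    sample = [member for cluster in chosen for member in clusters[cluster]]
    return chosen, sample

# Hypothetical village clusters (names invented for illustration)
villages = {
    "A": ["a1", "a2", "a3"],
    "B": ["b1", "b2"],
    "C": ["c1", "c2", "c3", "c4"],
}
chosen, sample = cluster_sample(villages, 2, seed=1)
```

Note that only the clusters are sampled at random; once a cluster is chosen, all of its members are included.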

Confidence Limits

Confidence limits are the lower and upper boundaries / values of a confidence interval, that is, the values which define the range of a confidence interval.
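A minimal Python sketch of computing confidence limits for a sample mean, assuming a large-sample normal approximation (z = 1.96 for a 95% interval; the data values are invented for illustration):

```python
import statistics

def confidence_limits(data, z=1.96):
    """Approximate confidence limits for the mean: mean +/- z standard
    errors (z = 1.96 corresponds to a 95% confidence interval under
    a normal approximation)."""
    mean = statistics.mean(data)
    se = statistics.stdev(data) / len(data) ** 0.5  # standard error of the mean
    return mean - z * se, mean + z * se

lower, upper = confidence_limits([4.1, 4.8, 5.0, 5.2, 4.6, 4.9, 5.3, 4.7])
```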


Estimate

An estimate is an indication of the value of an unknown quantity based on observed data.


Experimental Design

The plan of an experiment, including the selection of subjects, the order of administration of the experimental treatment, the kind of treatment, the procedures by which it is administered, and the recording of the data (with special reference to the particular statistical analyses to be performed).


Extrapolation

Extrapolation is the estimation of the value of a variable at times that have not yet been observed.

Focus Group Discussion

A qualitative method to obtain in-depth information on concepts and perceptions about a certain topic through spontaneous group discussion of approximately 6–12 persons, guided by a facilitator.

Independent Sampling

Independent samples are those samples selected from the same population, or different populations, which have no effect on one another. That is, no correlation exists between the samples.

Interval Scale

An interval scale is a scale of measurement where the distance between any two adjacent units of measurement (or 'intervals') is the same but the zero point is arbitrary.

Logical Framework (Logframe)

Management tool used to improve the design of interventions, most often at the project level. It involves identifying strategic elements (inputs, outputs, outcomes, impact) and their causal relationships, indicators, and the assumptions or risks that may influence success and failure. It thus facilitates planning, execution and evaluation of a development intervention.


Median

The median is the value halfway through the ordered data set, below and above which there lies an equal number of data values.


Mode

The mode is the most frequently occurring value in a set of discrete data.
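Both measures can be computed directly with Python's statistics module (the data values are invented for illustration):

```python
import statistics

data = [2, 3, 3, 5, 7, 8, 3, 9, 5]

middle = statistics.median(data)     # value halfway through the ordered data
most_common = statistics.mode(data)  # most frequently occurring value
```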

(Multiple) Regression Analysis

In statistics, regression analysis includes any techniques for modeling and analyzing several variables, when the focus is on the relationship between a dependent variable and one or more independent variables.
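A minimal sketch of the simplest case, an ordinary least-squares fit with one independent variable (the toy data are invented and perfectly linear, y = 1 + 2x, so the fitted coefficients are exact):

```python
def simple_regression(x, y):
    """Ordinary least-squares fit of y = a + b*x for one predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    # slope: covariance of x and y divided by variance of x
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
        sum((xi - mx) ** 2 for xi in x)
    a = my - b * mx   # intercept
    return a, b

a, b = simple_regression([0, 1, 2, 3], [1, 3, 5, 7])
```

Multiple regression extends this idea to several independent variables at once.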


Percentiles

Percentiles are values that divide a sample of data into one hundred groups containing (as far as possible) equal numbers of observations.
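A minimal Python sketch of one common percentile convention (linear interpolation between order statistics; several conventions exist and give slightly different values):

```python
def percentile(data, p):
    """p-th percentile by linear interpolation between order statistics."""
    s = sorted(data)
    k = (len(s) - 1) * p / 100   # fractional rank
    f = int(k)                   # index below
    c = min(f + 1, len(s) - 1)   # index above
    return s[f] + (s[c] - s[f]) * (k - f)

data = list(range(1, 101))   # the values 1..100
p50 = percentile(data, 50)
p25 = percentile(data, 25)
```

The quartiles defined elsewhere in this glossary are simply the 25th, 50th, and 75th percentiles.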


Population

A population is any entire collection of people, animals, plants or things from which we may collect data. It is the entire group we are interested in, which we wish to describe or draw conclusions about.

Purchasing Power Parity (PPP)

A method of measuring the relative purchasing power of different countries' currencies over the same types of goods and services. Because goods and services may cost more in one country than in another, PPP allows us to make more accurate comparisons of standards of living across countries. PPP estimates use price comparisons of comparable items but since not all items can be matched exactly across countries and time, the estimates are not always "robust".
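The basic arithmetic behind a PPP conversion can be sketched as follows (all prices and incomes here are invented purely for illustration, not real PPP data):

```python
# Hypothetical prices of an identical basket of goods and services
basket_local = 750.0   # price in local currency units
basket_base = 150.0    # price of the same basket in base-currency units

# PPP exchange rate: local currency units per base-currency unit
ppp_rate = basket_local / basket_base

# Converting a local income of 15,000 into base-currency terms at PPP
income_ppp = 15000 / ppp_rate
```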


Quartiles

Quartiles are values that divide a sample of data into four groups containing (as far as possible) equal numbers of observations.


Non-Experimental Designs

  • Matching methods or constructed controls, in which one tries to pick an ideal comparison group that matches the treatment group from a larger survey. The most widely used type of matching is propensity score matching, in which the comparison group is matched to the treatment group on the basis of a set of observed characteristics or by using the “propensity score” (the predicted probability of participation given observed characteristics); the closer the propensity score, the better the match. A good comparison group comes from the same economic environment and was administered the same questionnaire by similarly trained interviewers as the treatment group.
  • Double difference or difference-in-differences methods, in which one compares a treatment and a comparison group (first difference) before and after a program (second difference). When propensity scores are used, comparators with scores outside the range observed for the treatment group should be dropped.
  • Instrumental variables or statistical control methods, in which one uses one or more variables that matter to participation but not to outcomes given participation. This identifies the exogenous variation in outcomes attributable to the program, recognizing that its placement is not random but purposive. The “instrumental variables” are first used to predict program participation; then one sees how the outcome indicator varies with the predicted values.
  • Reflexive comparisons, in which a baseline survey of participants is done before the intervention and a follow-up survey is done after. The baseline provides the comparison group, and impact is measured by the change in outcome indicators before and after the intervention.
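The double-difference estimate described above reduces to simple arithmetic on group means; a minimal sketch (the outcome values are invented for illustration):

```python
def double_difference(t_before, t_after, c_before, c_after):
    """Difference-in-differences impact estimate: the change in the
    treatment group minus the change in the comparison group."""
    return (t_after - t_before) - (c_after - c_before)

# Hypothetical mean outcome values before and after a program
impact = double_difference(t_before=40.0, t_after=55.0,
                           c_before=42.0, c_after=47.0)
```

Here the treatment group improved by 15 and the comparison group by 5, so the estimated impact is 10.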

Random Sampling

Randomization, in which the selection into the treatment and control groups is random within some well-defined set of people. In this case there should be no difference (in expectation) between the two groups besides the fact that the treatment group had access to the program. (There can still be differences due to sampling error; the larger the treatment and control samples, the smaller the error.)
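A minimal sketch of random assignment in Python (the household identifiers are invented for illustration):

```python
import random

def random_assignment(units, n_treatment, seed=None):
    """Randomly assign n_treatment units to treatment; the rest
    form the control group."""
    rng = random.Random(seed)
    treatment = set(rng.sample(list(units), n_treatment))
    control = [u for u in units if u not in treatment]
    return sorted(treatment), control

households = [f"hh{i}" for i in range(10)]
treatment, control = random_assignment(households, 5, seed=42)
```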


Range

The range of a sample (or a data set) is a measure of the spread or the dispersion of the observations. It is the difference between the largest and the smallest observed value of some quantitative characteristic and is very easy to calculate.

Representative sampling

The population is divided into subpopulations (strata) and a random sample is taken from each stratum.
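This stratified approach can be sketched in Python (the strata and sampling fraction are invented for illustration):

```python
import random

def stratified_sample(strata, fraction, seed=None):
    """Draw a simple random sample of the given fraction from each stratum."""
    rng = random.Random(seed)
    sample = {}
    for name, members in strata.items():
        n = max(1, round(len(members) * fraction))
        sample[name] = rng.sample(members, n)
    return sample

# Hypothetical strata: 40 urban and 60 rural units, sampled at 25%
strata = {"urban": list(range(40)), "rural": list(range(60))}
picked = stratified_sample(strata, 0.25, seed=7)
```

Because every stratum is sampled in proportion, the sample reflects the population's composition across the strata.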


Sample

A sample is generally selected for study because the population is too large to study in its entirety. The sample should be representative of the general population. This is often best achieved by random sampling. Also, before collecting the sample, it is important that the researcher carefully and completely defines the population, including a description of the members to be included.

Sample mean

The sample mean is an estimator available for estimating the population mean. It is a measure of location, commonly called the average.

Selection Bias

  • Selection bias relates to unobservables that may bias outcomes (for example, individual ability, preexisting conditions). Randomized experiments solve the problem of selection bias by generating an experimental control group of people who would have participated in a program but who were randomly denied access to the program or treatment. The random assignment does not remove selection bias but instead balances the bias between the participant and nonparticipant samples.
  • In quasi-experimental designs, statistical models (for example, matching, double differences, instrumental variables) approach this by modeling the selection processes to arrive at an unbiased estimate using nonexperimental data. The general idea is to compare program participants and nonparticipants holding selection processes constant. The validity of this model depends on how well the model is specified.

Semi-Structured Interview

  • Semi-structured interviews are conducted with a fairly open framework which allow for focused, conversational, two-way communication. They can be used both to give and receive information.
  • Unlike the questionnaire framework, where detailed questions are formulated ahead of time, semi-structured interviewing starts with more general questions or topics. Relevant topics (such as cookstoves) are initially identified, and the possible relationships between these topics and issues such as availability, expense, and effectiveness become the basis for more specific questions, which do not need to be prepared in advance.
  • Not all questions are designed and phrased ahead of time. The majority of questions are created during the interview, allowing both the interviewer and the person being interviewed the flexibility to probe for details or discuss issues.

Standard Deviation

Standard deviation is a measure of the spread or dispersion of a set of data.

Standard error

Standard error is the standard deviation of the values of a given statistic (a function of the data), over all possible samples of the same size.
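Both quantities can be computed with Python's statistics module; the usual estimate of the standard error of the mean is the sample standard deviation divided by the square root of the sample size (the data values below are invented for illustration):

```python
import statistics

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

sd = statistics.pstdev(data)   # population standard deviation
se = statistics.stdev(data) / len(data) ** 0.5   # standard error of the mean
```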


Statistic

A statistic is a quantity that is calculated from a sample of data. It is used to give information about unknown values in the corresponding population. For example, the average of the data in a sample is used to give information about the overall average in the population from which that sample was drawn.

Time Series

A time series is a sequence of observations which are ordered in time (or space). If observations are made on some phenomenon throughout time, it is most sensible to display the data in the order in which they arose, particularly since successive observations will probably be dependent. Time series are best displayed in a scatter plot. The series value X is plotted on the vertical axis and time t on the horizontal axis.


Variance

The (population) variance of a random variable is a non-negative number which gives an idea of how widely spread the values of the random variable are likely to be; the larger the variance, the more scattered the observations on average.
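The effect of spread on the variance can be seen with two small data sets that share the same mean (the values are invented for illustration):

```python
import statistics

# Two data sets with the same mean (5.5) but different spread
spread_narrow = statistics.pvariance([5, 5, 6, 6])
spread_wide = statistics.pvariance([1, 4, 8, 9])
```

The more scattered set has the larger variance, as the definition suggests.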

Further Information