Make sense of it

Analyse, interpret, and share your results

You collected the data. Now you have to make sense of it and tell people what you found. You do not need statistics software or a research background to do this well. This guide takes you from a tidy spreadsheet to a short, honest report, using counts, percentages, and a few well-chosen stories.

Keep it honest and simple

The most useful thing a small evaluation can do is report clearly what happened to the people it reached, without claiming more than the data can carry. Simple and accurate beats clever and overstated. Everything below is built to keep your numbers correct and your claims matched to your design.

Step 1

Before you analyse: get your data tidy

Almost all of the difficulty in making sense of evaluation data comes from messy data, not from the analysis itself. Before you summarise anything, get everything into one tidy table. Two layouts work, and either is fine. Pick the one that matches how you collected the data.

One row per participant per timepoint (long layout). Each person appears on more than one row: once for intake, once for each follow-up. You add a column that says which timepoint the row is. This suits data entered visit by visit.
One row per participant with intake and follow-up columns (wide layout). Each person appears once. You have a column for their intake score and another for their follow-up score, side by side. This is usually easier for comparing change by hand.

Whichever you choose, give every participant a stable ID (a code, not their name), and keep one column per question. Do not put two answers in one cell, and do not change how you write an answer partway down (decide that "Yes" is always "Yes", not sometimes "Y" and sometimes "yes").
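If your data is in the long layout and you want it wide for comparing by hand, the reshaping can be sketched in a few lines of Python. This is a minimal sketch, not a required step: the column names (id, timepoint, score) and the rows are hypothetical, and a spreadsheet pivot table does the same job.

```python
# Hypothetical long-layout rows: one row per participant per timepoint.
long_rows = [
    {"id": "P01", "timepoint": "intake", "score": 7},
    {"id": "P01", "timepoint": "followup", "score": 5},
    {"id": "P02", "timepoint": "intake", "score": 6},
    {"id": "P03", "timepoint": "intake", "score": 8},
    {"id": "P03", "timepoint": "followup", "score": 8},
]


def long_to_wide(rows):
    """Return one dict per participant with intake and followup columns."""
    wide = {}
    for row in rows:
        # Create the participant's row the first time their ID appears.
        person = wide.setdefault(
            row["id"], {"id": row["id"], "intake": None, "followup": None}
        )
        person[row["timepoint"]] = row["score"]
    return list(wide.values())


wide_rows = long_to_wide(long_rows)
for person in wide_rows:
    print(person)
```

A participant with no follow-up row (P02 here) keeps an empty follow-up cell rather than a zero, which keeps real blanks visible.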

Check the data before you trust it

Spend a few minutes scanning each column before you summarise. A short check catches most problems:

Are the values in a sensible range? A UCLA-3 loneliness score runs from 3 to 9, so a 0 or a 45 is an entry error.
Is anything blank that should be filled, or filled that should be blank? Note real blanks (the question was skipped) separately from a typed zero.
Are there duplicate rows for the same person and timepoint? Keep one.
Do dates make sense, with follow-up after intake rather than before?

Fix what you can trace, and leave a note for anything you change so the cleaning is transparent later.
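The four checks above can also be run automatically. The sketch below assumes a wide layout with hypothetical column names and made-up rows; the only value taken from this guide is the UCLA-3 range of 3 to 9. It lists problems for you to trace and does not change any data.

```python
from datetime import date

LOW, HIGH = 3, 9  # valid range for a UCLA-3 total score

# Hypothetical wide-layout rows. None marks a real blank (question skipped).
rows = [
    {"id": "P01", "intake": 7, "followup": 5,
     "intake_date": date(2024, 1, 10), "followup_date": date(2024, 7, 12)},
    {"id": "P02", "intake": 45, "followup": 6,
     "intake_date": date(2024, 1, 15), "followup_date": date(2024, 7, 20)},
    {"id": "P03", "intake": 8, "followup": None,
     "intake_date": date(2024, 2, 1), "followup_date": None},
    {"id": "P01", "intake": 7, "followup": 5,
     "intake_date": date(2024, 1, 10), "followup_date": date(2024, 7, 12)},
    {"id": "P04", "intake": 6, "followup": 0,
     "intake_date": date(2024, 3, 5), "followup_date": date(2024, 2, 1)},
]


def check_rows(rows):
    """Return a list of plain-language problems found in the rows."""
    problems = []
    seen_ids = set()
    for number, row in enumerate(rows, start=1):
        label = f"row {number} ({row['id']})"
        # Range check: a real blank (None) is allowed, a typed 0 is not.
        for column in ("intake", "followup"):
            value = row[column]
            if value is not None and not (LOW <= value <= HIGH):
                problems.append(
                    f"{label}: {column} score {value} is outside {LOW} to {HIGH}"
                )
        # Duplicate check: in a wide layout each ID should appear once.
        if row["id"] in seen_ids:
            problems.append(f"{label}: duplicate row for this participant")
        seen_ids.add(row["id"])
        # Date check: follow-up should not come before intake.
        start, end = row["intake_date"], row["followup_date"]
        if start is not None and end is not None and end < start:
            problems.append(f"{label}: follow-up date is before intake date")
    return problems


problems = check_rows(rows)
for problem in problems:
    print(problem)
```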

Decide your denominator: who is included

Before any percentage means anything, you have to decide who it is a percentage of. This is your denominator, and it is the single most common place small evaluations slip. Three groups are easy to confuse, and they answer different questions:

Everyone referred. All the people sent to the program. This describes demand and referral reach.
Everyone who started. The people who actually began the program. This is usually the right base for describing who you served.
Everyone with both an intake and a follow-up measure (paired data). Only the people you measured twice. This is the only group you can use to talk about change on a scale, because change needs a before and an after for the same person.

Carry the denominator with the number. "55% improved" is not interpretable on its own. "55% of the 40 participants with paired scores improved" is. Whenever you write a percentage in this guide, the denominator travels with it.
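To make the three groups concrete, here is a small sketch that builds each one from the same list of people. The records and field names are hypothetical; the point is only that the three counts differ and each one is the base for a different kind of statement.

```python
# Hypothetical records: everyone referred, with what is known about each.
people = [
    {"id": "P01", "started": True, "intake": 7, "followup": 5},
    {"id": "P02", "started": True, "intake": 6, "followup": None},
    {"id": "P03", "started": False, "intake": None, "followup": None},
    {"id": "P04", "started": True, "intake": 8, "followup": 8},
    {"id": "P05", "started": True, "intake": 5, "followup": 4},
]

# Everyone referred: describes demand and referral reach.
referred = people

# Everyone who started: the usual base for describing who you served.
started = [p for p in people if p["started"]]

# Paired data: the only base for talking about change on a scale.
paired = [
    p for p in started
    if p["intake"] is not None and p["followup"] is not None
]

print(f"Referred: {len(referred)}")
print(f"Started: {len(started)}")
print(f"Paired intake and follow-up: {len(paired)}")
```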

Step 2

Simple descriptives, without statistics software

Descriptives just means describing what is in front of you. A spreadsheet does all of it. There are three jobs.

Counts and proportions

A count is how many. A proportion is that count divided by the total, usually written as a percentage. If 28 of 40 participants were women, that is 28 out of 40, which is 70%. To get the percentage, divide the count by the total and multiply by 100. That is the whole arithmetic. Use counts and percentages for things like who took part, which referral sources sent people, and how many attended.
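The same arithmetic, written so the count and the denominator always travel with the percentage. This is a sketch with hypothetical function names. It rounds halves up, the way most spreadsheets do; Python's built-in round() sends exact halves to the nearest even number instead, so 12.5 would become 12.

```python
def whole_percent(count, total):
    """Percentage as a whole number, with exact halves rounded up."""
    if total <= 0:
        raise ValueError("the denominator must be greater than zero")
    # Integer arithmetic avoids floating-point surprises at exact halves.
    return (200 * count + total) // (2 * total)


def describe(count, total, group):
    """Write a share with its count and denominator attached."""
    return f"{count} of {total} {group} ({whole_percent(count, total)}%)"


print(describe(28, 40, "participants"))
```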

A cascade: totals and rates at each stage

Programs often lose people at each step from referral to completion, and showing that honestly is useful in itself. A cascade lists the count at each stage and the rate relative to a base you choose. Here is an illustrative cascade (the numbers are made up to show the shape):

Illustrative figures. Rates here are out of 120 referred. You could also show each step as a rate out of the step before it, which tells you where people drop off most. Say which base you used.
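A cascade with both kinds of rate can be sketched as follows. Only the base of 120 referred comes from this guide. The stage names and every other count are hypothetical, chosen to show the shape.

```python
# Hypothetical cascade. Only the 120 referred is taken from the guide.
cascade = [
    ("Referred", 120),
    ("Started", 96),
    ("Completed", 72),
    ("Followed up", 60),
]


def cascade_rows(stages):
    """Return (stage, count, % of first stage, % of previous stage)."""
    base = stages[0][1]
    previous = base
    rows = []
    for name, count in stages:
        of_base = round(count / base * 100)
        of_previous = round(count / previous * 100)
        rows.append((name, count, of_base, of_previous))
        previous = count
    return rows


table = cascade_rows(cascade)
for name, count, of_base, of_previous in table:
    print(f"{name:<12} {count:>4}  {of_base:>3}% of referred  "
          f"{of_previous:>3}% of previous step")
```

The "of previous step" column is the one that shows where people drop off most. Whichever column you report, label it with its base.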

Scales and single items at two timepoints

For a measure taken at intake and again at follow-up, the clearest headline is not an average. It is the share of people who moved in each direction. For each person with paired data, compare their follow-up score to their intake score and sort them into three buckets: improved, stayed about the same (stable), or declined. Then report each bucket as a percentage of the paired group.

Decide in advance what counts as "about the same" so you are not treating a one-point wobble as real change. For a short scale you might call a change of one point in either direction stable and count only larger moves as improvement or decline. State the rule you used.

Worked example (illustrative)

Suppose 40 participants had paired UCLA-3 loneliness scores at intake and six months, where a lower score means less lonely. Sorting each person by the direction they moved gives:

Direction	Share of the 40 participants with paired scores
Improved	55%
Stable	33%
Declined	13%

Illustrative numbers, not real results. Read as: of the 40 participants with paired UCLA-3 scores, 22 (55%) improved, 13 (33%) stayed about the same, and 5 (13%) declined. Percentages are rounded to the nearest whole number, so here they sum to 101 rather than exactly 100.
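Sorting people into the three buckets can be sketched like this. The stable rule used here (a move of one point or less) is an assumption for the example, and the ten pairs of scores are made up. Swap in your own rule and state it in your report.

```python
from collections import Counter

STABLE_BAND = 1  # assumed rule: a move of 1 point or less counts as stable


def direction(intake, followup, lower_is_better=True):
    """Sort one person's paired scores into improved, stable, or declined."""
    change = followup - intake
    if abs(change) <= STABLE_BAND:
        return "stable"
    moved_down = change < 0
    # On a loneliness scale a lower score is better, so down is improvement.
    return "improved" if moved_down == lower_is_better else "declined"


# Hypothetical (intake, follow-up) pairs for ten people with paired data.
paired = [(7, 4), (6, 6), (8, 5), (5, 7), (9, 6),
          (4, 5), (7, 7), (6, 3), (8, 8), (5, 3)]

buckets = Counter(direction(before, after) for before, after in paired)
total = len(paired)
for name in ("improved", "stable", "declined"):
    count = buckets[name]
    print(f"{name}: {count} of {total} paired participants "
          f"({round(count / total * 100)}%)")
```

For a scale where a higher score is better, pass lower_is_better=False and the same sorting applies in the other direction.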

Mean and median, briefly

If you do want a single typical value, you have two options. The mean is the average: add every score and divide by how many there are. The median is the middle value when you line the scores up from lowest to highest. They usually sit close together, but the mean gets pulled toward a few unusually high or low values, while the median does not. The median is the safer summary when the data are skewed (a few people very different from the rest) or when you have small numbers, because in those cases one extreme person can drag the mean somewhere unrepresentative. When in doubt, report the median, or report both and let the reader see they agree.

A quick feel for it: scores of 3, 4, 4, 5, and 20 have a mean of 7.2 but a median of 4. The median (4) describes most of this group better, because the single 20 distorts the average.
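The same five scores, checked with Python's standard statistics module. In a spreadsheet, the AVERAGE and MEDIAN functions give the same answers.

```python
from statistics import mean, median

scores = [3, 4, 4, 5, 20]

score_mean = mean(scores)      # add every score, divide by how many
score_median = median(scores)  # middle value once the scores are sorted

print(f"mean: {score_mean}")
print(f"median: {score_median}")
```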

Step 3

Interpreting change: what it does and does not mean

A number that moved is the start of a question, not the end of one. Before you describe a change as an effect of your program, sit with a few honest alternatives.

Stability can be a good result

For higher-needs participants, especially older adults or people with progressive conditions, holding steady can be a genuine success. If the expected path was decline and someone stayed level, "no change" understates what happened. Report the stable group on purpose rather than treating it as a non-result, and say who was in it.

Reasons a score can move on its own

Several ordinary processes can shift a before-and-after score even with no program at all. A single group measured before and after cannot separate these from a real effect, which is the central limitation of that design.

Regression to the mean. People often enrol when things are at their worst. On any later measure, extreme starting scores tend to drift back toward the middle on their own, which can look like improvement.
Maturation. People change over time anyway. Some recover, adjust, or settle regardless of the program, particularly over a six-month gap.
Seasonal or secular trends. Loneliness and mood shift with the seasons, and the wider world changes around your program. A winter-to-summer follow-up sits inside that movement.

This is a design point, not an analysis trick. No amount of careful arithmetic removes these alternatives from a single-group pre-post study. What helps is the design: a comparison group or comparison period. The methods and study design guide walks through the options, and the study design builder will write the matching limitations into a one-page summary for you.
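Regression to the mean is easier to believe once you watch it happen in made-up data. The simulation below is a sketch under loud assumptions: every number in it is invented, scores are treated as continuous rather than 3 to 9, and nobody receives any program. Each simulated person has a usual level that never changes, and each measurement adds some day-to-day noise.

```python
import random

random.seed(1)  # fixed seed so the run is repeatable


def noisy_score(usual_level):
    """One measurement: the person's usual level plus day-to-day noise."""
    return usual_level + random.gauss(0, 1.5)


# 2,000 simulated people whose usual level never changes. No program exists.
usual_levels = [random.gauss(6, 1) for _ in range(2000)]
intake = [noisy_score(level) for level in usual_levels]
followup = [noisy_score(level) for level in usual_levels]

# "Enrol" only the people who looked worst at intake (highest scores).
enrolled = [i for i, score in enumerate(intake) if score >= 8]

mean_intake = sum(intake[i] for i in enrolled) / len(enrolled)
mean_followup = sum(followup[i] for i in enrolled) / len(enrolled)

print(f"enrolled: {len(enrolled)} people")
print(f"mean score at intake: {mean_intake:.1f}")
print(f"mean score at follow-up: {mean_followup:.1f}")
```

The enrolled group's average falls at follow-up even though nothing was done for them, because they were selected on a day when noise pushed their score up. A comparison group selected the same way would show the same drift, which is why the design, not the arithmetic, is what separates it from a real effect.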

Beware tiny denominators

Percentages on small groups are unstable and easy to misread. In a group of two, one person is 50% and two people are 100%. With six paired participants, a single person moving from one bucket to another swings the result by about 17 percentage points. When the group is small, lead with the counts and treat the percentage as secondary, or skip the percentage entirely.
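A quick way to see how fragile a percentage is: work out how far it moves when one person changes bucket. The group sizes below are arbitrary examples.

```python
def one_person_swing(group_size):
    """Percentage points the result shifts when one person changes bucket."""
    return 100 / group_size


for size in (2, 6, 20, 40, 200):
    print(f"group of {size}: one person is "
          f"{one_person_swing(size):.1f} percentage points")
```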

Statistical change versus change that matters

Two different questions hide inside "did it change". One is whether a change is larger than chance noise, which is what significance testing is about and which needs a fair sample size. The other is whether the change is big enough to matter in someone's life. A score can move a little in a way that is unlikely to be chance yet too small to feel, and it can move a lot for one person in a way that matters enormously to them but cannot be proven across a small group. For a small evaluation, describe the size and direction of change plainly and pair it with what people said it meant. That is more honest than reaching for a significance test the data cannot support.

Step 4

Missing data and small numbers

Who you could not measure shapes what your numbers mean, often more than the people you could.

Loss to follow-up is rarely random

The people you lose between intake and follow-up are usually not a random slice. People who feel worse, who became more unwell, or who disengaged are often harder to reach again. If only the people who did well stayed in to be measured, your follow-up group looks better than the full group ever did. This does not make your evaluation worthless, but it does mean you should not present results from the paired group as if they describe everyone who started.

The honest move is short and specific:

Report how many started and how many had a follow-up measure, as plain counts.
Say what share was lost to follow-up, and anything you know about how the lost group differed.
Keep your change claims attached to the paired group, and say so in the sentence.
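One simple thing you usually do know about the lost group is their intake score. The sketch below, with hypothetical records, reports the plain counts and then compares the median intake score of people who were followed up with those who were not.

```python
from statistics import median

# Hypothetical records for everyone who started. None means no follow-up.
people = [
    {"id": "P01", "intake": 5, "followup": 4},
    {"id": "P02", "intake": 8, "followup": None},
    {"id": "P03", "intake": 6, "followup": 6},
    {"id": "P04", "intake": 9, "followup": None},
    {"id": "P05", "intake": 6, "followup": 5},
    {"id": "P06", "intake": 7, "followup": 6},
    {"id": "P07", "intake": 7, "followup": None},
    {"id": "P08", "intake": 5, "followup": 5},
]

kept = [p for p in people if p["followup"] is not None]
lost = [p for p in people if p["followup"] is None]

kept_median = median(p["intake"] for p in kept)
lost_median = median(p["intake"] for p in lost)

print(f"{len(people)} started, {len(kept)} had a follow-up measure, "
      f"{len(lost)} were lost to follow-up")
print(f"median intake score, followed up: {kept_median}")
print(f"median intake score, lost: {lost_median}")
```

In this made-up data the people lost to follow-up started out lonelier, which is exactly the pattern to mention in the report. With groups this small, the comparison is a description, not a test.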

Be cautious with very small cells

Small cells cause two problems at once. They are unstable, so they jump around for trivial reasons, and they can identify people, which is a privacy risk in a small community. When a breakdown leaves you with a handful of people in a category, either combine categories into a larger group or suppress the cell and say you have done so. A common practice is to avoid reporting cells below a small threshold and to note the suppression rather than quietly dropping it.
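Suppression can be sketched as a single pass over a table of counts. The threshold of 5 is an assumption for illustration, not a figure from this guide, and the categories and counts are made up. Use whatever threshold your funder or ethics guidance sets.

```python
MIN_CELL = 5  # assumed threshold for illustration only


def suppress_small_cells(counts, minimum=MIN_CELL):
    """Replace any count below the minimum with a visible note."""
    return {
        category: (count if count >= minimum else f"fewer than {minimum}")
        for category, count in counts.items()
    }


# Hypothetical breakdown of 40 participants by age group.
age_counts = {"Under 50": 3, "50 to 64": 12, "65 to 79": 21, "80 and over": 4}

reported = suppress_small_cells(age_counts)
for category, value in reported.items():
    print(f"{category}: {value}")
```

If you also publish the total, a single suppressed cell can be worked out by subtraction, so combining small categories into a larger group is often the simpler choice.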

One rule covers most of this. Put the denominator next to every percentage, every time. It tells the reader how much weight the number can bear and stops a 100% that is really two out of two from doing damage.

Step 5

Qualitative material: making sense of stories and open answers

Open-text answers, short stories, and interview notes carry the part of the result numbers cannot reach: how change happened and what it meant. You do not need formal coding software to handle them well. A light, transparent approach is enough.

Read it all the way through first, before you start sorting, so you get a feel for what people are actually saying rather than what you expected.
Group answers into a few themes. Aim for a small number of plain-language themes, such as feeling less alone, getting out of the house, or a practical problem solved. Most material falls into a handful of groups.
Count how common each theme is, in plain terms. "Most participants mentioned feeling less alone; a few described practical help with housing" is honest and useful. Avoid precise percentages on a small set of quotes, because they imply more rigour than the method has.
Choose a few illustrative quotes, one or two per theme, that genuinely represent what the group said rather than the single most dramatic line.
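If you tag each answer with its themes by hand, the tally itself is a short job. In this sketch the answers, theme names, and cut-offs for "most", "some", and "a few" are all hypothetical. The reading and tagging remain your judgement; the code only counts.

```python
from collections import Counter

# Hypothetical: each answer hand-tagged with one or more themes.
tagged_answers = [
    ["less alone"],
    ["less alone", "getting out"],
    ["getting out"],
    ["less alone"],
    ["practical help"],
    ["less alone", "getting out"],
    ["less alone"],
    ["getting out"],
]


def plain_words(count, total):
    """Turn a count into a plain-language amount (assumed cut-offs)."""
    share = count / total
    if share > 0.5:
        return "most"
    if share >= 0.25:
        return "some"
    return "a few"


# Count each theme once per answer, even if it was tagged twice.
theme_counts = Counter(
    theme for answer in tagged_answers for theme in set(answer)
)
total_answers = len(tagged_answers)

for theme, count in theme_counts.most_common():
    print(f"{theme}: {plain_words(count, total_answers)} "
          f"({count} of {total_answers} answers)")
```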

Consent and identifying detail

Only use a quote if the person agreed their words could be used this way. Take out anything that could identify someone, including names, specific places, unusual circumstances, and small details that are obvious to a local reader. In a small program a story can identify a person even with the name removed, so when in doubt, generalise the detail or leave the quote out.

Step 6

Putting numbers and stories together to report

A short report that funders and partners can actually read tends to follow the same shape. Five plain parts cover almost everything.

Part	What goes here
Who you reached	Counts and percentages describing who took part: how many started, who they were, where referrals came from. Your cascade fits here.
What changed	For each measure, the share who improved, stayed stable, or declined, with the paired denominator beside it. Note the stable group on purpose.
What it means	A careful reading: what the change suggests, and the limits, including loss to follow-up and what the design can and cannot rule out.
A story	One or two short, consented quotes or a brief account that shows how change happened for a real person.
What you will do next	What you learned and what you will change. This is what tells a funder the evaluation is being used.

What funders tend to look for

Most funders are looking for reach (that you served the people you said you would), some signal of benefit reported honestly, evidence that you understand your own limitations, and a sense that the findings feed back into the program. A clearly stated limitation reads as competence, not weakness. Overclaiming is what erodes trust.

Match every claim to your design

Keep the strength of your language inside what the design supports. With a single group measured before and after, write that participants improved over the period, not that the program caused the improvement. Causal language belongs with a design that has a comparison. The study design builder states the matching claims and limits for the design you have, which you can lift straight into the report.

Tables and charts

Keep visuals plain. A simple bar chart for the share who improved, stayed stable, or declined, and a cascade chart for reach, communicate more than anything fancier. Avoid false precision: reporting 54.7% on 40 people implies an accuracy you do not have, so round to whole percentages and show the counts. Label every chart with its denominator, the same as the text.

Finally

Tools that help

You can do everything on this page with free tools already named in the toolkit. A spreadsheet (Google Sheets, LibreOffice Calc, or Excel) handles every descriptive here: counts, percentages, a cascade, sorting people into improved, stable, or declined, and the mean and median. No statistics package is needed for a small evaluation. When you are ready to write down what your numbers can support, the study design builder matches your claims to your design and produces a one-page summary with the right strengths and limits, and the methods and study design guide explains the reasoning behind it. If you have not built your measures yet, start from the Evaluation Tool Builder so your intake and follow-up line up cleanly for the paired comparison described above.