Visualising Test Scores in Neuropsychology
Interpreting psychometric test scores is a fundamental part of clinical neuropsychology; however, the process by which this is done is rarely discussed.
Typically, neuropsychologists follow a standard workflow: raw scores are converted to standard scores and then organised into a table. While this method is widely used, its limitations are seldom discussed, and many clinicians rely heavily on tabulated data without considering them.
Even well-designed tables (and designing them well is an often underestimated skill) can pose challenges, including:
Cognitive Overload: Reviewing large volumes of test scores, especially when supplied with confidence intervals, can be time-consuming and cumbersome. Even if scores are presented using a uniform metric[1] (e.g., z-scores), it can still be challenging to quickly scan and assimilate all available information.
Usability: Test tables often span multiple pages (or slides in case presentations) which can make them difficult to navigate and interpret during presentations or supervision.
Lack of a Big-Picture View: Using numeric data alone makes it challenging to identify patterns, as holding complex arrays of numbers in mind can obscure the broader narrative.
I would argue that these challenges can partially be addressed through data visualisation.
Visualising neuropsychological test scores is a powerful way to capture and communicate a patient’s cognitive performance. By plotting test scores, we can make the distribution and uncertainty of test scores more accessible at a glance. Visual elements like dots for point estimates (i.e., standard scores) and error bars for uncertainty (i.e., confidence intervals) can provide a clear and concise representation that tables can’t match. Translating data into a visual form can also help neuropsychologists to quickly profile strengths/weaknesses and identify trends, while making the information more accessible for patients and colleagues.
In this post, I will discuss some of the key types of plot that can be applied to neuropsychological test data for the individual patient, including how to represent point estimates, uncertainty, change, and proportions.
Structuring Test Data for Visualisation
Visualisations for representing individual patient data require only two variables. These are:
X-axis (Label): Usually representing the test name or label (e.g., “Block Design”).
Y-axis (Score): Representing the individual’s test score, ideally converted to a standard metric such as a z-score.
Here, the X variable is usually categorical (test labels), while the Y variable is continuous (test scores). This data structure narrows the choice of suitable plots, which makes selection more straightforward.
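As a minimal sketch of this two-variable structure, the Python snippet below stores each test as a label paired with a z-score. The `Score` class, the `to_z` helper, and the scores themselves are hypothetical and only for illustration; the means and standard deviations assume the conventional scaled-score (mean 10, SD 3) and index-score (mean 100, SD 15) metrics.

```python
from dataclasses import dataclass

@dataclass
class Score:
    label: str  # categorical X variable, e.g. "Block Design"
    z: float    # continuous Y variable on a common metric

def to_z(score: float, mean: float, sd: float) -> float:
    """Convert a standard score to a z-score given the metric's mean and SD."""
    return (score - mean) / sd

# Hypothetical patient scores, for illustration only.
profile = [
    Score("Block Design", to_z(13, mean=10, sd=3)),
    Score("Digit Span", to_z(7, mean=10, sd=3)),
    Score("Processing Speed Index", to_z(85, mean=100, sd=15)),
]

for row in profile:
    print(f"{row.label:<24} z = {row.z:+.2f}")
```

Once every score sits on the same metric, any of the plots discussed in this post can be drawn from this one list of label and score pairs.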
Point Estimates
Point estimates (an individual’s score on each test) are a good starting point for visualising neuropsychological test data. Plotting these estimates provides an overview of a patient’s performance across multiple tests, making it easy to identify clusters or outliers in the battery of scores. Assuming that scores are standardised and come from a normal distribution[2], there are a number of plots to choose from.
Bell Curve Plots
Bell curve plots are one method of visualising point estimates. In this approach, scores are plotted along the X-axis, providing a clear way to communicate a score’s standing relative to the population. However, bell curve plots assume a basic understanding of bell curves, which may make them less suitable for non-technical audiences. Additionally, when visualising multiple test scores, the plot can become cluttered and consequently difficult to interpret. As such, bell curve plots are often more effective for visualising a small number of scores, as shown in Figure 2.
Bar Plots
Bar plots are a highly intuitive choice for displaying multiple test scores (see Figure 3). Each bar represents an individual test score, allowing clinicians to easily compare performance across tests. This type of plot is particularly useful when presenting scores from a larger battery of tests, providing a comprehensive view of the cognitive profile in a single chart.
Radar Plots
Radar plots, sometimes referred to as spider plots, display multiple test scores across several axes radiating from a central point. Each axis represents a different cognitive test, with the test score plotted along it. While radar plots are visually engaging and provide a “snapshot” of a patient’s cognitive profile, they can be difficult to interpret, especially when needing to discriminate between scores that are clustered close together (such as in Figure 4)[3]. These plots are most effective for engaging a non-technical audience and when the goal is to highlight the overall profile rather than make precise comparisons between test scores.
Uncertainty
In neuropsychological assessment, it is important (yet often overlooked) to account for the inherent uncertainty within test scores. This is especially important given that many tools routinely used in neuropsychology have poor test-retest reliability. For example, a clinician might intuitively interpret a difference between scaled scores of 6 and 13 as highly discrepant. However, this perspective becomes difficult to justify when poor reliability results in overlapping confidence intervals.
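The scaled-score example above can be worked through with a short sketch. It uses the classical test theory formula for the standard error of measurement, SEM = SD × √(1 − reliability), and, as a simplification, centres a 95% interval on the observed score. The reliability values of 0.60 and 0.90 are assumptions chosen purely for illustration and do not describe any particular test.

```python
import math

def confidence_interval(score, sd, reliability, z_crit=1.96):
    """Interval around an observed score, based on the standard error of measurement."""
    sem = sd * math.sqrt(1 - reliability)
    return (score - z_crit * sem, score + z_crit * sem)

def intervals_overlap(a, b):
    """True if two (lower, upper) intervals share any values."""
    return a[0] <= b[1] and b[0] <= a[1]

# Scaled scores (mean 10, SD 3) of 6 and 13 under two assumed reliabilities.
for reliability in (0.60, 0.90):
    low = confidence_interval(6, sd=3, reliability=reliability)
    high = confidence_interval(13, sd=3, reliability=reliability)
    print(
        f"r = {reliability:.2f}: "
        f"6 -> [{low[0]:.1f}, {low[1]:.1f}], "
        f"13 -> [{high[0]:.1f}, {high[1]:.1f}], "
        f"overlap = {intervals_overlap(low, high)}"
    )
```

Under these assumptions the two intervals overlap when reliability is 0.60 but not when it is 0.90, which is the pattern described above: the same seven-point gap looks far less convincing once measurement error is taken into account.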
Given the cognitive challenge of simultaneously holding a battery of point estimates and their associated confidence intervals in mind, I strongly encourage the use of visualisations that incorporate uncertainty. Plots that illustrate both of these elements offer greater interpretive value than those that do not. One particularly well-suited plot for this purpose is the forest plot, which excels at presenting both point estimates and their confidence intervals in a clear and concise manner.
Forest Plots
Forest plots—commonly seen in meta-analysis papers—are a practical and effective way to present point estimates alongside confidence intervals, capturing both the individual score and its associated range of uncertainty (see Figure 5). In a forest plot, each test score is depicted as a point, with lines extending to indicate the confidence interval. This makes forest plots particularly useful for analysing potential discrepancies within a profile of test scores.
Despite their strengths, forest plots remain underutilised, likely because they are somewhat less familiar to practitioners (and lay audiences). When presenting forest plots to non-technical audiences, it’s important to provide a straightforward explanation of how to interpret them, ensuring clarity and accessibility.
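To make the anatomy of a forest plot concrete, here is a rough text-only sketch in Python. It is not how the figures in this post were produced; in practice a plotting library would be used. The `forest_row` helper, the plotting range of z = -3 to +3, and the scores and confidence limits are all hypothetical.

```python
def forest_row(label, estimate, lower, upper, z_min=-3.0, z_max=3.0, width=61):
    """Render one forest-plot row as text.

    '-' spans the confidence interval, 'o' marks the point estimate and
    '|' marks z = 0 (the normative mean) when the interval does not cover it.
    """
    def col(z):
        # Map a z-score to a column index, clamping to the plotted range.
        z = max(z_min, min(z_max, z))
        return round((z - z_min) / (z_max - z_min) * (width - 1))

    cells = [" "] * width
    cells[col(0.0)] = "|"
    for i in range(col(lower), col(upper) + 1):
        cells[i] = "-"
    cells[col(estimate)] = "o"
    return f"{label:<16}" + "".join(cells)

# Hypothetical z-scores with lower and upper confidence limits.
battery = [
    ("Block Design", 1.0, 0.4, 1.6),
    ("Digit Span", -1.0, -1.8, -0.2),
    ("Trail Making B", -0.3, -1.1, 0.5),
]

for label, z, lo, hi in battery:
    print(forest_row(label, z, lo, hi))
```

Each row is one test, the dot is the point estimate, and the dashes show the interval, so rows whose dashes do not line up vertically are the candidates for a genuine discrepancy.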
Change Over Time
Tracking changes in performance over time through serial assessments (for example, to evaluate intervention effects or monitor cognitive decline) is a key role for neuropsychologists. To ensure clarity, visualisations for this purpose should include both point estimates and confidence intervals for each time point, clearly highlighting discrepancies between them. The following sections discuss options for visualising test scores when there are two time points[4].
Dumbbell Plots
Dumbbell plots are an option for illustrating changes in test scores between two time points (Figure 6). In this format, each cognitive test or domain is represented by a line with dots at either end, corresponding to scores at the initial and follow-up assessments. The connecting line conveys the direction of change (supplemented by use of colour and/or arrows), offering a clear view of progress vs. decline. A drawback of dumbbell plots is that they do not traditionally include uncertainty, although this can be incorporated, such as through the adapted forest plots below.
Adapted Forest Plots
Adapted forest plots[5] offer another effective way to visualise change over time. Unlike dumbbell plots, these do not connect test scores with lines. Instead, each point estimate is paired with its own error bar, representing its confidence interval. This design makes it straightforward to assess whether confidence intervals overlap, providing an accessible way to evaluate changes while accounting for uncertainty. Where regression-based methods are used to estimate the re-test score, the plot can be adapted further by including only the re-test point estimate (with its confidence interval) along with the predicted re-test score; this is the approach taken in the Stability web application (also see Figure 7).
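The comparison that an adapted forest plot invites the eye to make can also be sketched in code. The snippet below is illustrative only: the `Assessment` class, the `describe_change` helper, and all scores and confidence limits are hypothetical, and checking whether intervals overlap is used here simply because it mirrors what the plot shows, not as a formal test of change.

```python
from dataclasses import dataclass

@dataclass
class Assessment:
    estimate: float  # point estimate as a z-score
    lower: float     # lower confidence limit
    upper: float     # upper confidence limit

def describe_change(baseline: Assessment, follow_up: Assessment):
    """Return the direction of change and whether the two intervals overlap."""
    if follow_up.estimate > baseline.estimate:
        direction = "improved"
    elif follow_up.estimate < baseline.estimate:
        direction = "declined"
    else:
        direction = "unchanged"
    overlap = baseline.lower <= follow_up.upper and follow_up.lower <= baseline.upper
    return direction, overlap

# Hypothetical baseline and follow-up results for two tests.
serial = {
    "Digit Span": (Assessment(-1.0, -1.8, -0.2), Assessment(-0.6, -1.4, 0.2)),
    "List Learning": (Assessment(0.5, -0.1, 1.1), Assessment(-1.2, -1.8, -0.6)),
}

for label, (t1, t2) in serial.items():
    direction, overlap = describe_change(t1, t2)
    print(f"{label:<14} {direction:<9} intervals overlap: {overlap}")
```

In this made-up example the first test improves slightly but its intervals overlap, while the second declines with no overlap, which is the kind of contrast the plot makes visible at a glance.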
Proportions
In some cases, the goal of visualisation may be to display proportions as opposed to a standardised score.
Waffle Plots
Waffle plots offer an intuitive way to represent proportions (see Figure 8), making them useful for illustrating percentile ranks. Each square in a waffle plot represents 1% of the distribution, enabling a clear visualisation of where a patient’s score falls within a normative sample.
For example, if a patient’s percentile rank on a cognitive test is 75, the waffle plot would visually fill 75% of the grid, with the remaining 25% left unshaded. This format not only highlights the patient’s relative standing but also provides an accessible way to communicate results to patients or colleagues unfamiliar with percentile rank terminology. One key problem with waffle plots as applied to percentile ranks is that equal intervals on the waffle plot may not correspond to equal differences in test performance.
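A small Python sketch can illustrate both the waffle grid and the caveat about unequal intervals. It assumes normally distributed scores, uses the standard library's `statistics.NormalDist`, and draws the grid as text; the `waffle` and `percentile_rank` helpers are hypothetical names introduced here.

```python
from statistics import NormalDist

def percentile_rank(z):
    """Percentile rank of a z-score under a standard normal distribution."""
    return NormalDist().cdf(z) * 100

def waffle(percentile, filled="#", empty="."):
    """Return a 10 x 10 text grid in which each cell stands for 1% of the distribution."""
    n_filled = max(0, min(100, round(percentile)))
    cells = filled * n_filled + empty * (100 - n_filled)
    return "\n".join(cells[i:i + 10] for i in range(0, 100, 10))

# A percentile rank of 75 fills 75 of the 100 cells.
print(waffle(75))

# Equal steps in z do not give equal steps in percentile rank.
for z in (0, 1, 2):
    print(f"z = {z}: percentile rank {percentile_rank(z):.0f}")
```

Moving from z = 0 to z = 1 shifts the percentile rank by roughly 34 points, whereas moving from z = 1 to z = 2 shifts it by only about 14, so the same number of squares on the grid can correspond to quite different differences in performance.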
Choosing the Right Plot
Selecting the appropriate visualisation depends on several factors:
Audience: If the visualisation is for clinicians, forest plots or dumbbell plots convey more detailed clinical information, while waffle plots and bar plots may be more accessible to patients.
Purpose: If the goal is to show performance for a snapshot in time (single assessment), bar plots and forest plots are well suited, but if tracking change is essential, dumbbell plots or adapted forest plots are indicated. For uncertainty, forest plots are always well suited, while waffle plots are a better match for proportional data.
Context: Whether the results come from a single assessment or from longitudinal data will guide the choice, as will the number of test scores being displayed and whether confidence intervals are available.
For an overview of the plots discussed in this post, including a summary of their pros and cons, see Table 1.
Plot | Pros | Cons |
---|---|---|
Bell Curve Plots | Clear representation of relative performance | Difficult for non-technical audiences, cluttered when using several scores |
Bar Plots | Easy to compare multiple scores, intuitive for most users | No representation of uncertainty |
Radar Plots | Engaging visual snapshot | Hard to interpret clustered scores, no representation of uncertainty |
Forest Plots | Captures point estimates and uncertainty, good for assessing discrepancies | Less familiar to some, needs explanation for non-technical audiences |
Dumbbell Plots | Clear depiction of change, intuitive for tracking progress vs. decline | Typically excludes uncertainty, harder to use with more than two time points |
Adapted Forest Plots | Combines change and uncertainty effectively, excellent for evaluating overlap of intervals | May require explanation to audiences less familiar with plots |
Waffle Plots | Accessible way to represent proportions, visually appealing | Percentile interpretation can mislead |
Colour and Labels
Effective use of colour and clear labelling can enhance the readability of visualisations. Profiles can be emphasised by applying a colour scheme in which colours distinguish different cognitive domains, while accessible colour palettes can accommodate certain forms of visual impairment. Labels should be precise, with test names and score metrics clearly marked to support interpretation.
Software & Tools
The availability of tools for creating visualisations is a key consideration. Microsoft Excel, for example, is familiar to most clinicians and can generate many types of plot; however, creating some of the more advanced visualisations discussed in this post may require more complex syntax.
Web apps provide an alternative solution, but they are few in number and may lack the full range of functionality. Similarly, proprietary software accompanying specific tests—developed by test publishers—may include visualisation tools. However, such tools are often limited to the publisher’s own tests, excluding data from other assessments, which restricts their utility for visualising comprehensive assessments.
There is a clear need for integrated tools that are accessible to clinicians in routine practice, intuitive to use, efficient, and (ideally) free of charge. To address this gap, I developed Score Converter, a web app that serves as an early proof of concept, offering a glimpse of what such a solution could look like. I sincerely hope that more comprehensive and effective solutions will be developed by others in the near future.
Avoiding Overinterpretation
A small reminder: while visualisations provide valuable insights, they should complement—not replace—clinical interpretation. Every test score must be considered within the broader context of the patient’s history and clinical presentation. Additionally, clinicians should avoid overlooking data that cannot be visualised, such as criterion tests, clinical observations, and errors.
Conclusion
Data visualisation is a powerful tool for supporting the interpretation of neuropsychological test data. By selecting the right visualisation (bar plots for point estimates, forest plots for uncertainty, dumbbell plots for change, or waffle plots for proportions), clinicians can make patient data more accessible, appealing, and insightful. Well-chosen visualisations, combined with careful interpretation, support clinical understanding and enhance communication with patients and colleagues alike.
If you made it this far, thanks for reading, and do let me know if there is anything else that I should have included.
Footnotes
1. For more information on how to convert scores to a standard metric, see my earlier post.
2. This post is predominantly concerned with visualising standard scores from tests which assume a normal distribution (with the brief exception of proportions, which are discussed later on). Test scores that do not come from a normal distribution (e.g., criterion/screening tests) will not be discussed here, although they may form the subject of a future blog post.
3. This can partially be overcome by including a text label for each score.
4. In situations where more intensive measurement methods are used, time-series plots may be indicated. Given how integral these forms of plots are to single-case experimental design, which I plan to make the subject of a future post, I will reserve further discussion for the time being.
5. This is not a technical term, just something I have introduced here. I haven’t actually seen forest plots used in this way referred to by a specific name. If you are aware that they have a specific name then please let me know.
Citation
@online{gaskell2024,
author = {Gaskell, Chris},
publisher = {NeuroPsyTools},
title = {Visualising {Test} {Scores} in {Neuropsychology}},
date = {2024-12-01},
url = {https://neuropsytools.com/pages/posts/2024-11-01-visualisation/},
langid = {en}
}