The effectiveness of occupational performance outcome measures within mental health practice.
Abstract: Introduction: The routine use of outcome measures is essential to the maintained delivery of quality care and the continued commissioning of mental health occupational therapy services. Occupational therapists are required to demonstrate that intervention is successful in an evidence-based, valid and reliable way. Therefore, this critical review aims to address the issue of choosing an appropriate occupational performance outcome measure for use within mental health services.

Method: Evidence was critically appraised for the effectiveness of the Assessment of Communication and Interaction Skills (ACIS), Occupational Therapy Task Observation Scale (OTTOS), Canadian Occupational Performance Measure (COPM) and Goal Attainment Scaling (GAS), all recommended for use by occupational therapists within mental health practice.

Findings and discussion: The review identifies that there are a limited number of clinically based studies evidencing the validity and reliability of occupational performance outcome measures. It also identifies a paucity of literature concerning service user experience of outcome measures, bringing into question how client centred and meaningful these tools are.

Full Text: Introduction

In the 1990s, key objectives from the outcome measures movement (Epstein 1990) were included in health care related government policy, emphasising the use of outcome measures to evidence delivery of care (Department of Health 1991, 1998). Specific to the clinical context of mental health services, the National Service Framework for Mental Health requested routine measurement of patient-related outcomes (Secretary of State for Health 1999). It also called for interventions to be evaluated from the perspective of both clinicians and service users. This was further supported by the National Institute for Mental Health in England (NIMHE 2005), which stressed the importance of outcome measurement in improving service quality and accountability. Therefore, the routine collection and use of outcome measure data are vital to meet the trend in mental health policy (Slade 2006).

In the past decade, the evolution of the concept of evidence-based practice has shaped health system expectations. A recent high profile report by Lord Darzi stressed the importance of outcome measures to increase quality of care: 'we can only be sure to improve what we can actually measure' (Department of Health 2008, p49). In response to the recommendations outlined in this report, the National Health Service Chief Executive set out his implementation plans through the quality and productivity challenge, specifically highlighting the importance of evidence-based practice to embed change in practice (Nicholson 2009). The use of outcomes to evidence the effectiveness of clinical intervention will have obvious financial implications in the future. Evidence of outcomes will be used when services like occupational therapy are commissioned.

Skelton (2010), an occupational therapist, noted: 'if we don't make savings they will be done to us and they may not be done favourably' (p28). This suggests that evidence of clinical outcomes will be key in securing future funding. However, in the field of mental health, defining outcomes to measure is a significant challenge owing to the complex and subjective experience of each individual case (Schofield 2006). Measurement is complicated further by the dynamic and multifaceted nature of occupational performance (Laver-Fawcett 2007).

With the future of the occupational therapy profession in mind, it seems essential to consider how occupational therapists will evidence the benefit of therapeutic intervention to the occupational performance ability of the service users they work with. This critical review of the literature critiques the effectiveness and use of four outcome measures of occupational performance available within mental health practice. This should assist occupational therapists in making an informed decision regarding which measure to use within their service.


In order to convert this clinical practice issue into a focus for evidence-based research, an answerable question must be formulated (Flemming 1998). The search strategy was constructed by developing search terms from the key words using the patient-intervention-comparison-outcome (PICO) framework (Table 1). This approach has been shown to save searching time (Flemming 1998).

Careful consideration was exercised when formulating search terms. The results were limited to make them specific to the question. This was also supported by the use of a controlled thesaurus (for example, Medical Subject Headings [MeSH] using the Medline database). Mapping tools within databases were used to cross-reference terms and synonyms to ensure that all terminology for each PICO component had been considered. To capture all references, 'wildcard characters' were used to search for multiple words that share the same root (for example, Therap* will return Therapy and Therapist) (Polit and Beck 2009).
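The behaviour of a wildcard (truncation) search term can be illustrated with a short sketch in Python. This is an illustration only and does not reproduce how any particular database implements truncation; the function names and the sample sentence are hypothetical.

```python
import re


def truncation_to_regex(term):
    """Convert a truncation term such as 'Therap*' into a compiled regex."""
    # Escape the term, then turn the escaped '*' into "any further word characters".
    escaped = re.escape(term).replace(r"\*", r"\w*")
    # Word boundaries ensure only whole words sharing the root are matched.
    return re.compile(rf"\b{escaped}\b", re.IGNORECASE)


def matching_words(term, text):
    """Return every word in the text matched by the truncation term."""
    return truncation_to_regex(term).findall(text)


# Hypothetical sample text, used only to show which words share the root.
sample = "The therapist reviewed therapy goals and therapeutic outcomes."
print(matching_words("Therap*", sample))
```

A single truncated term therefore retrieves several related words, which is why it widens a search without the need to list each variant separately.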

The next stage was to use these terms to search for evidence in electronic databases, as summarised in Table 2. The databases were chosen because they collate studies relevant to allied health professionals in the area of mental health (Melnyk and Fineout-Overholt 2005). Initial searches resulted in a huge amount of varied, and largely irrelevant, literature, including generic outcome measures and those designed purely for research purposes.

The great number of outcome measures available poses a significant challenge to informed decision making in clinical practice. The large search results are supported by one key text that lists over 100 outcome measures available to occupational therapists (Law et al 2005). Therefore, the decision was taken to focus the search on recommended outcome measures most appropriate to clinical practice.

Key texts within occupational therapy and mental health were consulted to identify recommended outcome measures (Table 2). These recommendations were then cross-referenced against evidence of outcome measures currently used (College of Occupational Therapists 2010) and survey data from an unpublished thesis to ensure relevance to clinical practice. The thesis describes choices made by occupational therapists practising in mental health using outcome measures (Duggan 1999). The final four outcome measures (Table 3) were included by recommendation of key texts, evidence of use in practice (Duggan 1999, Duncan et al 2003), and being standardised and individualised outcome measures appropriate to the question (Donnelly and Carswell 2002).

The search was then re-run by including the name of each individual outcome measure within the PICO framework. A range of literature relevant to clinical practice was retrieved, from experimental quantitative studies through to qualitative studies, unpublished theses and opinion pieces. However, this diversity makes comparison between studies very difficult. Therefore, to capture the clinically meaningful evidence required to answer the question, only evidence specific to the validity and reliability of the outcome measure is included in order to critique its effectiveness in clinical practice.


The literature search identified large numbers of occupational performance outcome measures. These may exist due to the challenge of measuring the many elements of occupational performance. This is supported by the work of Creek (2003), who defined and discussed the complexities of occupational therapy by analysing the individual elements of the process. To inform clinical decision making, four individualised and standardised outcome measures were identified that have clinical utility for occupational therapists within the field of mental health. Table 4 summarises key details from the articles that met the inclusion criteria and enabled review of each study based on guidelines for critiquing research reports (Polit and Beck 2009). A critical appraisal of the literature surrounding the effectiveness and use of these outcome measures in practice follows.

The Assessment of Communication and Interaction Skills (ACIS) and the Occupational Therapy Task Observation Scale (OTTOS)

The ACIS sits within the strong theoretical framework of the Model of Human Occupation (Fisher and Kielhofner 1995). The development of any new scale must be carefully considered and justified, because superfluous scales can add to confusion in the literature (Streiner and Norman 2003). However, there was a clear purpose for the development of the ACIS because it was a concept from the Model of Human Occupation yet to be measured. The ACIS development study included a range of clients from both learning disability and psychiatric clinical settings. It may have been more scientifically rigorous to sample only one clinical group because it is important to ensure that a sample is representative of a specific population (Polit and Beck 2009). However, this method enabled the analysis of the ability of the ACIS to discriminate between clients of different functional levels (Forsyth et al 1999).

Validity is the degree to which the tool (or outcome measure) measures what it is intended to measure (Polit and Beck 2009). Rasch analysis was an appropriate statistical method to use for validating the ACIS because it allows rating-scale observations to be converted into a linear measure, which can then be checked to ensure that sensitivity is in the correct order (Laver-Fawcett 2007). This analysis confirmed the validity of the ACIS by calibrating the test items on a clinically logical linear scale (Forsyth et al 1999). It also identified items that were too easy to score and, therefore, did not provide discriminative value. The authors' justification for the exclusion of these items was transparent, made clinical sense and was endorsed by occupational therapists, therefore adding to the credibility of the article.

Another approach to ensuring validity is to test the relationship between the outcome measure and an established tool (Polit and Beck 2009). This can be achieved by comparing scores through correlation coefficient analysis, whereby a mathematical formula correlates scores on the established tool with those on the new outcome measure (Polit and Beck 2009). The OTTOS task behaviour section showed high correlation with two standardised and validated scales: the Bay Area Functional Performance Evaluation cognitive subscale (r = 0.726) and the Comprehensive Occupational Therapy Evaluation Scale (r = 0.880) (Margolis et al 1996). The closer these results are to 1 the better, with results above r = 0.7 being most desirable (Polit and Beck 2009). The high r, or coefficient value, suggests that the OTTOS was succeeding in rating function during task, and the section on behaviour was rooted in fundamental cognitive skills (Margolis et al 1996). As with the ACIS, development of the OTTOS was clearly justified by being created in response to the clinical need for a simple, quantitative and rapid method of evaluating task performance (Margolis et al 1996). In contrast to the ACIS, the specific sample of psychiatric service users was entirely appropriate because the OTTOS was designed specifically for psychiatric services to answer the needs of practitioners (Margolis et al 1996).
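The correlation coefficient analysis described above can be illustrated with a short calculation. The following is a minimal sketch in Python of the Pearson correlation coefficient. The scores are hypothetical values chosen for illustration and are not data from Margolis et al (1996); the function name pearson_r is likewise an assumption of this sketch.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations from each mean.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable.
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a variable has no variance")
    return cross / sqrt(ss_x * ss_y)


# Hypothetical scores for six service users on a new tool and an established tool.
new_tool = [12, 15, 9, 20, 17, 11]
established = [30, 34, 25, 41, 39, 27]

r = pearson_r(new_tool, established)
print(f"r = {r:.3f}")
```

A value of r close to 1 indicates that service users who score highly on one tool also tend to score highly on the other, which is the pattern expected if both tools measure a related construct.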

Interrater reliability is the degree of consistency between two independent assessors in scoring the attribute being measured (Polit and Beck 2009). The ACIS has good interrater reliability, because 90% of responses fitted within the Rasch measurement model. Lack of experience was cited as the reason for the small number of clinicians that did not fit within the model (Forsyth et al 1999). This lack of experience could highlight a training need to ensure understanding of the use of this tool; it is worth noting that no training is currently available in the United Kingdom (UK). A similar issue regarding experience was found in the OTTOS.

This outcome measure also had good interrater reliability (r = 0.92 total score, where scores closest to 1 show highest reliability) but only in experienced therapists (Margolis et al 1996).

The Canadian Occupational Performance Measure (COPM) and Goal Attainment Scaling (GAS)

The issue of experience is also evident in the articles relating to the COPM and GAS. Both rely on accurate scoring from practitioners; if this is not achieved then reliability is questionable. In these individualised outcome measures, goals are designed in collaboration between patient and practitioner. They are based on sound clinical reasoning, but this is inevitably affected by experience. However, in reference to GAS this issue must be viewed with a certain amount of caution because the results themselves are uncertain. Lannin (2003) showed that GAS was capable of detecting change through an intervention in pre-test/post-test scores. However, there was no comment on whether this change was clinically significant. This was compounded by the study failing to use any statistical analysis to make inference from the data.

Howell (1986) set out to see if the process of goal setting was a useful therapeutic intervention in itself. Using the Student's t-test, a comparison was made between results of an experimental group who completed GAS and a control group who received weekly reviews. This experimental comparison of two groups on a dependent variable makes the t-test an appropriate statistical test to use in order to make inferences about the validity of the hypothesis (Polit and Beck 2009). However, there was no comment on whether the data were normally distributed; thus, it was unclear if this test was appropriate because normal distribution is required for parametric analysis such as the t-test (Polit and Beck 2009). The sample size was small (n = 24) and so statistical power could not be ensured. Further to this, the paper concluded that goal setting should be used as a separate treatment technique, although a non-significant result was identified. The credibility of this paper is questionable and, at over 20 years old, it is recommended that this article should not be used to justify the clinical application of GAS in current evidence-based practice.
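The comparison of two independent groups with the Student's t-test can be sketched as follows. This is a minimal Python illustration of the pooled-variance t statistic, which assumes normally distributed scores and similar variance in both groups. The scores are hypothetical and are not data from Howell (1986); only the group sizes (13 and 11) mirror that study, and the function name students_t is an assumption of this sketch.

```python
from math import sqrt
from statistics import mean, variance


def students_t(group_a, group_b):
    """Return (t statistic, degrees of freedom) for two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least 2 scores")
    df = n_a + n_b - 2
    # Pooled variance: sample variances weighted by their degrees of freedom.
    pooled = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    if pooled == 0:
        raise ValueError("t is undefined when neither group has any variance")
    # Standard error of the difference between the two group means.
    std_err = sqrt(pooled * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / std_err, df


# Hypothetical outcome scores (not taken from any study).
experimental = [52, 48, 55, 60, 47, 51, 58, 49, 53, 56, 50, 54, 57]
control = [50, 46, 53, 49, 51, 47, 52, 48, 55, 45, 50]

t_stat, df = students_t(experimental, control)
print(f"t = {t_stat:.2f}, df = {df}")
```

The t statistic is then compared with the critical value for the given degrees of freedom. With groups this small, only a large difference between means would reach significance, which reflects the concern about statistical power raised above.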

The COPM has been shown to demonstrate change of client self-perception regarding occupational performance in an individual recovering from a depressive episode (Waters 1995). The significant limitation of this study is the inability to be representative of the wider population due to its single case study design. Although this study is clinically meaningful and provides guidance on clinical practice, it is recognised that single case designs are regarded as one of the least powerful forms of evidence available (Polit and Beck 2009). However, studies with much larger sample sizes, and therefore more powerful forms of evidence, have also detected change. In a large-scale pilot study (n = 268), Law et al (1994) identified clinically significant change, although this conclusion was not justified and, therefore, must only be accepted tentatively. It was intended that the mixed population sampled within the pilot study would allow generalisation of the results to a wider clinical population; however, the recommendation remained for further research into specific clinical groups. Chesworth et al (2002) built on the pilot work and specifically recruited from a sample of adult psychiatric patients. The Pearson correlation coefficient was used as an appropriate statistical test to summarise the direction and relationship between two variables (Polit and Beck 2009). In this case, the two variables were task performance and satisfaction, with analysis confirming a statistically significant result (p < 0.0001). Therefore, they also concluded COPM to be sensitive to change (Chesworth et al 2002).

The criticisms of the literature appraised within this systematic review illustrate the challenge faced by occupational therapists in evidencing their choice of outcome measure in mental health practice. However, the systematic nature of this review has ensured that this is the best evidence currently available with which to make these decisions. The next step within the evidence-based process is to consider how to apply this to practice.


This critical review has sourced and explored the literature regarding four occupational performance outcome measures currently in use by occupational therapists across the UK. The effectiveness and use of these measures has been discussed to assist occupational therapists practising in mental health in making informed decisions, based on the best evidence available, for which measure, or combination, to use with the service users under their care. It is quite clear that outcome measurement is a complex process. If it were easy then all occupational therapists would be completing it routinely as recommended in national policy (Secretary of State for Health 1999, NIMHE 2005). The challenge in practice is likely to be compounded by the complexity of occupational therapy as an intervention (Creek 2003). This, coupled with the individual nature of service users' own occupational performance ability and the opportunity to use an endless variety of tasks for assessment, results in a huge array of potential measures.

The complexity of measuring occupational performance leads to the contradictory 'lack of suitable measures' and 'too many to choose from', as discussed by Duggan (1999). At a UK Occupational Therapy Research Foundation event, the need for more research into the use, validity, reliability, sensitivity and specificity of outcome measures was recognised, with a focus on working with what we have rather than creating new outcome measures (Sainty 2010). Adding to the huge range of outcome measures already in existence is discouraged (Streiner and Norman 2003) and would only lead to less clarity in which outcome measure to choose, placing further pressure on professionals already stretched for time.

Often the quickest way to access current research is by using the internet. Although it provides access to a wealth of fast expanding health care knowledge (Melnyk and Fineout-Overholt 2005), the amount of information available can be superfluous to that required for individual clinical questions. Therefore, a clear strategy for the inclusion and exclusion of evidence is required to answer the question in the most rigorous way (Polit and Beck 2009). With this in mind, the decision was taken to focus the literature review on four outcome measures currently used in practice. Further to this, the outcome measures chosen represent a range of clinical requirements: the ACIS and OTTOS use observation of task function, whereas GAS and the COPM use individualised goal setting and are well suited to higher functioning clients. It is acknowledged that there are other worthy outcome measures cited within the literature; for example, the Assessment of Motor and Process Skills (AMPS) is widely used and was recommended as an appropriate outcome measure (Pan and Fisher 1994). It has not been included in this review due to the high training costs (Chard 2000), which may be difficult to fund in the current financial climate. Further to not being able to review all outcome measures currently in use, a limitation of this review was the inaccessibility of some articles. However, the systematic nature of this review ensured that, where possible, all literature was identified and retrieved.

In order for this critical review to be clinically relevant, outcome measures had to be used in practice. Critical appraisal of the literature supporting the development, validity and reliability testing of the outcome measures was completed. In summary, although there is some evidence for the validity and reliability of these outcome measures, it is often sparse and can only be found in the initial user manuals for the tools. This has to be questioned because the authors of the tools will make money out of their success and, therefore, may be considered to have a vested interest. The lack of continued studies over time into the clinical use of the outcome measures brings into question their clinical application and highlights the need for further studies regarding clinical effectiveness. Although beyond the scope of this article, it is acknowledged that GAS and COPM do have further evidence of validity and reliability; however, caution must be exercised in applying evidence from different clinical settings. What is clear is that there is limited literature specific to mental health; thus reducing the number of studies eligible for inclusion. This is a reality with occupational therapy literature across the board, not helped by the fact that the profession is a relative newcomer to the UK research scene (College of Occupational Therapists 2007).

Busy clinicians are striving to practise in an evidence-based way, but tracking down this evidence and critically appraising it for individual service users and services is a time-consuming process. However, it is of note that the use of standardised outcome measures facilitates evidence-based practice (Unsworth 2011). Therefore, it is hoped that this review will assist occupational therapists to make informed decisions about what outcome measures may be appropriate to their area of practice. This decision-making process will include considering what the tool needs to measure, how it measures this and who it will be measuring. It must also consider the clinical utility regarding ease of use, administration, scoring, costs and training requirements. All these questions require evidence to support the validity, reliability and specificity of the tool within the specific clinical area of work.

The search revealed a paucity of literature regarding service user experience of outcome measures. With the clinical focus on quality client-centred evidence-based practice, this is clearly an area for concern. There are obvious advantages within the realms of client-centred practice to use individualised outcome measures. Clinicians should ensure that they are using these for the benefit of the patient and not just to meet service aims, asking themselves: how meaningful is the task of completing an outcome measure to each individual service user? This should assist in the choice of measure used: if service users are high functioning then setting their own goals may be an intervention in itself; for others, it may be taking part in the task that matters, enabling the therapist to observe function. Outcome measures provide different levels of information to different people, especially in the context of the complex intervention of occupational therapy.


The outcome measurement of occupational performance in mental health services is a complex process. However, it is required in order to ensure that evidence-based, quality interventions are being provided. Choosing the right outcome measure to demonstrate effective practice involves sound clinical reasoning and consideration of the resulting use of the information collected. The literature available to evidence these decisions is limited; further research should aim to support clinicians in choosing and using quality outcome measures from those already in existence.

This review has identified a limited number of clinically relevant studies concerning the validity and reliability of four recommended occupational performance outcome measures. There is a clear need to recognise the potential benefits and challenges from the perspective of everyone affected by outcome measures to justify fully those chosen in practice.

Key findings

* There are limited clinically-based studies evidencing the validity and reliability of occupational performance outcome measures.

* There is a lack of literature concerning service user experience to develop meaningful outcome measures.

What the study has added

The study provides clinically relevant critical appraisal of literature supporting the use and effectiveness of four occupational performance outcome measures, promoting evidence-based decision making and inquiry within mental health practice.


This critical review was completed in part fulfilment of a National Institute for Health Research funded Master's degree in Clinical Research (MRes). I am sincerely grateful for the support of Anita Bowser and the Occupational Therapy team at Ravenswood House for the opportunity to complete the MRes. I would also like to thank Dr Christopher Bailey of the University of Southampton for his advice regarding this review.

Conflict of interest: None declared.


Table 1. PICO framework

Framework    Definition      Key words

P            Patient         Adult mental health

I            Intervention    Occupational performance outcome measure

C            Comparison      Effectiveness of outcome measure

O            Outcome         Effectiveness and use in practice

Table 2. Evidence search strategy

Online English language             Key textbooks searched
databases searched                  (February 2010)
(February 2010)

CINAHL: Cumulative Index to         Occupational therapy and
Nursing and Allied Health           mental health (Creek and
Literature (1982-2010)              Lougher 2008)
Medline (1966-2010)
PsycINFO (1887-2010)                Measuring occupational
Cochrane Database of Systematic     performance: supporting
Reviews                             best practice in occupational
                                    therapy (Law et al 2005)
Unpublished thesis
An investigation of the use of      Advancing occupational
outcome measures by occupational    therapy in mental health
therapists practising in mental     practice (McKay et al 2008)
health (Duggan 1999)
                                    Willard and Spackman's
                                    occupational therapy
                                    (Crepeau et al 2003)

Table 3. Occupational performance outcome measure inclusion criteria

                        Outcome measure         Inclusion/exclusion
Outcome measure         inclusion criteria      criteria of evidence

Assessment of           Measures                Specific to the
Communication and       occupational            outcome measure in
Interaction Skills      performance             question
(ACIS)
                        Uses standardised       The study population
Occupational            instructions and        was adult mental
Therapy Task            methods                 health only
Observation Scale
(OTTOS)                 Individualised          The article referred
                        outcome measures:       to the development
Goal Attainment         compares the person     of the outcome
Scaling (GAS)           against himself or      measure and/or
                        herself over time       validity/reliability
Canadian                (Laver-Fawcett 2007)    data regarding the
Occupational                                    effectiveness of its
Performance Measure     Specific to adult       use in clinical
(COPM)                  mental health           practice

                        Demonstrates
                        clinical utility

                        Not designed purely
                        for research

Table 4. Summary of studies in critical review

                               Description                 Articles

Assessment of       The ACIS is a structured             (Forsyth
Communication       observational rating scale           et al 1999)
and Interaction     designed to capture in detail
Skills (ACIS)       the person's social interactional
                    ability whilst he or she is
                    participating in a meaningful
                    social context.

Occupational        A 10-item 'task behaviour'           (Margolis
Therapy Task        and 5-item 'general behaviour'       et al 1996)
Observation         observational rating scale,
Scale (OTTOS)       designed to facilitate evaluation
                    and documentation of patient
                    performance during occupational
                    therapy task groups and to
                    improve the communication
                    between occupational
                    therapists and other treatment
                    team members.

Goal Attainment     Scores if individual patient         (Howell
Scaling (GAS)       goals have been met by an            1986)
                    intervention. Developed for use
                    within community mental
                    health settings (Smith 1994).
                    Through defining the goals,
                    service user and clinician
                    produce an individualised            (Lannin
                    outcome measure.                     2003)

Canadian            Service users are directly           (Law et al
Occupational        involved in determining the          1994)
Performance         initial content of the
Measure             measure. Individual problems
(COPM)              related to self-care,
                    productivity and leisure are
                    rated by importance resulting
                    in up to five goals, which are       (Waters
                    subsequently rated by                1995)
                    satisfaction and performance.

                                                         (Chesworth
                                                         et al 2002)

  Articles            Study method                Data analysis

(Forsyth        Repeated use of ACIS by     Rasch analysis was
et al 1999)     occupational therapists     used to determine
                (n = 52) in various         reliability of item
                psychiatric and learning    statistics, construct
                disability settings.        validity and
                                            interrater reliability.

(Margolis       Pilot study of              Reliability testing
et al 1996)     occupational therapists     using correlation
                (n = 12) using the tool     coefficient analysis
                daily over a one-year       on data from 25
                period in an American       patients.
                psychiatric inpatient
                rehabilitation unit.        Criterion validity
                                            tested through
                                            correlations against
                                            accepted valid and
                                            reliable measures.

(Howell         Psychiatric day and         Student's t-test
1986)           inpatient district service  between
                centre based in the UK.     experimental
                                            (n = 13) and
                                            control group
                                            scores (n = 11).

(Lannin         Measured individual and     Pre-test/post-test
2003)           programme outcomes          design to show
                in Australian community     average difference
                psychiatric patients.       in scores before and
                                            after intervention.

(Law et al      Pilot study of 268          Pearson correlation
1994)           clients across four         coefficient to
                countries including the     detect change in
                UK. Data include adult      performance and
                psychiatric populations     satisfaction scores.
                from a range of

(Waters         Single case design.         Reports changes in
1995)           Clinical practice report    performance and
                of acute care               satisfaction scores.
                intervention for a lady     No statistical analysis
                diagnosed with major        performed.

(Chesworth      Sixty adults in contact     Pearson correlation
et al 2002)     with a rural UK mental      coefficient to
                health service. COPM        detect change in
                used to detect change       performance and
                post-intervention.          satisfaction scores.

  Articles                Clinical usability

(Forsyth        Takes around 20 minutes to
et al 1999)     complete and is said to be easy
                to incorporate into practice.
                Two-day training was implemented
                prior to data collection for research
                purposes. Currently no training
                offered in the United Kingdom (UK).

(Margolis       Created by occupational therapists
et al 1996)     practising clinically. The completion
                time is 1-2 minutes and a free copy
                is attached to the article. It aims to
                improve the occupational therapist's
                ability to communicate findings
                with other members of staff. It aims
                to identify subtle changes in client

(Howell         Individualised scale. Embraces the
1986)           client-centred approach. Detects
                meaningful change. Can be
                administered in a time-effective
                manner. Can be used for
                programme evaluation.


(Law et al      An individualised scale measure
1994)           of client's self-perception of
                occupational performance in
                self-care, leisure and productivity.
                It has been shown to take less than
                30 minutes to complete.


(Chesworth
et al 2002)
