  3. Statement of Kathleen K. Lundquist, Ph.D., Co-Founder and President, APT, Inc.

The U.S. Equal Employment Opportunity Commission

Meeting of May 16, 2007 - Employment Testing and Screening

Good morning. Thank you for the opportunity to share some of my thoughts with you about the role employment testing can play in creating a fair and effective workforce. As the last of your speakers today, I will keep my remarks relatively brief.

In a number of recent class actions, concern has been raised about the potential for “excessive subjectivity” in such selection procedures as performance evaluations and unstructured interviews. In contrast, employment tests, when well-developed, can provide a consistent and fair basis on which to make employment decisions, making them less subject to intentional or unintentional bias.

Professionally developed selection procedures serve a legitimate business purpose: they allow employers to base hiring and promotion decisions on solid, job-related information. The evidence that a selection procedure measures behavior consistently (i.e., its reliability) and is an accurate measure of job performance (i.e., its validity) is the basis on which a selection procedure is shown to be job-related. Job-related procedures ensure that employees possess the necessary skills to perform the job; such procedures can be used by employers to predict which candidates will be able to successfully perform the job. In short, good selection procedures are fair to candidates (i.e., standardized and objective in their administration and scoring) and useful to organizations (i.e., result in gains in overall productivity).

The key steps in developing a good selection procedure or test include:

  1. Determining the need and appropriateness of testing;
  2. Determining the necessary job requirements;
  3. Developing tests to measure the job’s requirements;
  4. Collecting data to evaluate test quality;
  5. Collecting data to evaluate job-relatedness;
  6. Establishing administrative procedures and controls; and
  7. Implementing the test and monitoring it on an ongoing basis.

One of the most important considerations underlying any selection procedure used for employment decision-making is the assumption that the procedure is asking the right questions, i.e., is measuring the characteristics necessary for successful job performance. This is determined by conducting a job analysis. Job analysis involves systematically analyzing what tasks individuals are expected to perform on a job, and what knowledge, skills, abilities and other characteristics (KSAOs) are required to perform the job tasks. For example, oral communication skills and the ability to organize work would be two of the KSAOs required to perform the tasks in the job of an attorney, while the knowledge of advanced mathematics might not be required for this job. The job analysis thus assists in identifying the content to be measured in the selection procedure and the criteria against which to evaluate test performance. Further evidence of job-relatedness is obtained by formally conducting a validation study to demonstrate the connection between the test and the job, typically either through linking the content of the test with the content of the job (content validity) or by showing a statistically significant relationship between test performance and job performance (criterion-related validity).
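To make the idea of criterion-related validity concrete, the following is a minimal sketch, assuming hypothetical test scores and supervisor ratings for ten employees. The data, the function names, and the choice of a permutation test to judge statistical significance are all illustrative assumptions; an actual validation study would use far larger samples and professionally accepted methods.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def permutation_p_value(x, y, n_permutations=5000, seed=0):
    """One-sided p-value: share of random re-pairings of x and y whose
    correlation is at least as large as the observed correlation."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = pearson_r(x, y)
    shuffled = list(y)
    at_least_as_large = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if pearson_r(x, shuffled) >= observed:
            at_least_as_large += 1
    # Add 1 to numerator and denominator so the estimate is never exactly zero.
    return (at_least_as_large + 1) / (n_permutations + 1)


# Hypothetical data: one test score and one job performance rating per employee.
test_scores = [52, 60, 61, 67, 70, 74, 78, 81, 85, 90]
job_ratings = [2.8, 3.0, 3.4, 3.1, 3.6, 3.9, 3.7, 4.2, 4.4, 4.5]

validity_coefficient = pearson_r(test_scores, job_ratings)
p_value = permutation_p_value(test_scores, job_ratings)
print(f"validity coefficient r = {validity_coefficient:.2f}, p = {p_value:.4f}")
```

A large positive correlation with a small p-value is the kind of evidence a criterion-related study looks for; a content validity argument, by contrast, rests on the job analysis rather than on a statistic like this one.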

One of the major challenges in employment testing has been ensuring that the selection procedure itself fairly represents what an individual will be able to do on the job. Recent applications of technology and research in the testing field have permitted employment tests to measure skills in ways that are more similar to how the individual actually would perform on the job. Research has shown that “high fidelity” selection procedures, such as work samples, video simulations and assessment center exercises, enhance candidate acceptance and often reduce adverse impact.

The use of traditional paper-and-pencil tests in personnel selection has long been debated based on the trade-off between validity and adverse impact. On the one hand, cognitive ability tests have been shown to be highly predictive of job performance. On the other hand, tests of cognitive ability have also shown higher adverse impact than other types of assessment. Traditionally, White candidates have performed better on cognitive ability tests, demonstrating a 1.0 standard deviation difference (i.e., effect size¹) between African Americans/Blacks and Whites, a 0.7 standard deviation difference between Hispanics and Whites, and virtually no difference between men and women.
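As an illustration of how a difference "in standard deviation units" is computed, here is a minimal sketch, assuming two small sets of hypothetical test scores and assuming that the pooled standard deviation of the two groups is used as the unit. The scores and the function name are invented for the example and are not drawn from the research cited in this statement.

```python
import math
import statistics


def standardized_mean_difference(reference_scores, comparison_scores):
    """Difference between two group means, divided by the pooled standard deviation."""
    n_ref = len(reference_scores)
    n_cmp = len(comparison_scores)
    var_ref = statistics.variance(reference_scores)   # sample variance (n - 1 denominator)
    var_cmp = statistics.variance(comparison_scores)
    pooled_sd = math.sqrt(
        ((n_ref - 1) * var_ref + (n_cmp - 1) * var_cmp) / (n_ref + n_cmp - 2)
    )
    mean_gap = statistics.mean(reference_scores) - statistics.mean(comparison_scores)
    return mean_gap / pooled_sd


# Hypothetical scores for two groups of test takers.
group_a = [48, 52, 55, 57, 60, 63, 65, 68]
group_b = [40, 44, 47, 49, 52, 55, 57, 60]

effect_size = standardized_mean_difference(group_a, group_b)
print(f"effect size = {effect_size:.2f} standard deviation units")
```

An effect size of 1.0 would mean that the two group averages sit one full standard deviation apart; a value near zero would mean the groups score about the same on average.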

In searching for less adverse alternatives in our recent work on the Ford Apprentice Testing program, we reviewed the research literature on alternative testing measures that demonstrate good validity with less adverse impact². Test administration format or test medium is an important factor in this equation. Research has shown video-based testing to have comparable validity to paper-based tests, with lower adverse impact. Adverse impact was found to be reduced by enhancing applicants’ job relatedness perceptions, positively impacting test-taker motivation, and reducing the reading comprehension demands of the test. Table 1 shows a comparison of the research on different test administration formats.

Table 1
Comparison of Test Administration Formats

|  | Paper-and-Pencil | Computer-Based | Video-Based |
| --- | --- | --- | --- |
| Assessment of KSAOs | (-) Not as wide a range of KSAOs can be assessed vs. computer and video-based tests | (+) Wider range of KSAOs can be assessed vs. paper-and-pencil tests | (+) Wider range of KSAOs can be assessed vs. paper-and-pencil tests; (+) Tests can look more like the job |
| Validity |  | (+) Same validity as paper-and-pencil tests | (+) Same validity as paper-and-pencil tests |
| Adverse Impact |  | (+) Same adverse impact as paper-and-pencil tests | (++) Reduces adverse impact |
| Development Costs | (+) Less development costs | (-) Increased development costs | (-) Increased development costs |
| Administration | (+) Cost effective and practical for large group administration | (-) Facilities with many secure computers with internet access required; (-) Smaller test sessions needed; (+) Test Administrator responsibilities reduced | (+) Practical for large group administration; (-) Increased costs for video presentation equipment to ensure adequate viewing for all candidates |
| Test Security | Security of test materials needed during printing, shipping, and storage | (-) Greater exposure of test content over repeated administrations threatens test security | Security of test materials needed during shipping and storage |
| Scoring | (-) Delayed scoring and reporting of results | (+) Real-time scoring and results | (-) Delayed scoring and reporting of results (answers captured in paper-and-pencil form) |
| Alternate Forms | (+) Can develop alternate forms of some tests and less costly to develop | (+) Can develop alternate forms of some tests | (-) Difficult and costly to develop alternate forms |

In addition, research has shown that, for some jobs, measuring more than just cognitive ability can both improve the prediction of overall job success and lower adverse impact. Considering both cognitive abilities and non-cognitive personal characteristics (e.g., conscientiousness or customer focus) may give a more complete picture of the qualifications of the candidate. Often, it is not just what you know, it’s how you show it.

Work sample or situational judgment tests have also been shown to be promising ways to maintain validity and decrease adverse impact. These assessments are designed to mirror or simulate the actual tasks performed on the job, for instance through a Manager In-Basket exercise or a video simulation of a production line. Such tests measure the ability to identify and understand job-related issues or problems and to select the proper course of action to resolve them. Their good validity stems from having the candidate actually perform a part of the job, and their reduced adverse impact appears to result from candidate acceptance and motivation.

In our work on the Ford Apprentice Testing program, we have combined a cognitive test, a non-cognitive assessment and a video-based simulation to measure candidates’ qualifications for the apprenticeship. These tests will undergo validation later this summer.

I would like to focus my final comments to the Commission on two critical areas: the importance of operational validity and the need to provide recognition and encouragement to good testing programs.

Beyond the content and initial development of the test, it is important to ensure that the selection procedures continue to be job-related in actual use over time (operational validity). This involves monitoring and revising the selection procedures when jobs change, ensuring that those who use and score the tests are fully trained, and regularly monitoring test data to detect and address observed patterns of unfairness. I would encourage the Commission to develop standards and review procedures to audit testing programs for their operational validity, as well as their initial job-relatedness.
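As one illustration of what regular monitoring of test data might look like, the sketch below compares pass rates across groups. This statement does not prescribe a particular statistic; the example assumes the four-fifths rule of thumb from the Uniform Guidelines on Employee Selection Procedures, under which a group's selection rate below 80 percent of the highest group's rate is generally regarded as evidence of adverse impact. The group labels, the counts, and the function names are hypothetical.

```python
def selection_rate(passed, tested):
    """Share of test takers in a group who passed."""
    return passed / tested


def impact_ratios(results):
    """Map each group to its selection rate divided by the highest group's rate.

    `results` maps a group label to a (passed, tested) pair.
    """
    rates = {group: selection_rate(p, t) for group, (p, t) in results.items()}
    highest_rate = max(rates.values())
    return {group: rate / highest_rate for group, rate in rates.items()}


# Hypothetical results from one monitoring period.
period_results = {
    "Group A": (60, 100),  # 60 of 100 candidates passed
    "Group B": (27, 60),   # 27 of 60 candidates passed
}

FOUR_FIFTHS = 0.8  # threshold assumed for this sketch

ratios = impact_ratios(period_results)
flagged = [group for group, ratio in ratios.items() if ratio < FOUR_FIFTHS]

for group, ratio in ratios.items():
    print(f"{group}: impact ratio = {ratio:.2f}")
print("Groups to review:", flagged)
```

A flag from a screen like this is a prompt for closer review of the test and its use, not a conclusion in itself; small samples in particular can make the ratio unstable.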

Finally, there are many good testing programs which could serve as models for employers wishing to implement effective selection procedures. I encourage the Commission to provide recognition to such programs.

Thank you.


¹ Effect size indicates the number of standard deviation units that separates non-protected and protected group means.

² A complete list of the literature reviewed is contained in the reference list at the end of this paper.

