released 12.02.08

Preliminary coarse-grained slice planes of prediction landscapes generated by a cognitive model of a serial subtraction task running in the ACT-R cognitive architecture. Plot 1 visualizes performance predictions for Subject 1 (the worst performer); Plot 2 shows the predictions for Subject 26 (the best performer). The colorbar represents model-to-data fit; dark red (value 0) equals perfect fit between the model's predictions and human performance. These two subjects were known to be using different strategies for performing the task.
By Trish Barker
Using an NCSA cluster, behavioral scientists demonstrate the value of a novel genetic algorithm approach to fitting their cognitive models to human data.
Think of a four-digit number. Now—without a calculator or pencil and paper (or even an abacus)—subtract 13 from that starting point. Now subtract 13 again. And again. You're being timed, so you'd better hurry up! But make sure you've got the right answer! Remember, your performance is being evaluated—you're being watched! Now subtract 13 again. And again.
Feeling stressed? This type of serial subtraction task is part of the Trier Social Stress Test that has been used in hundreds of research studies since the 1960s. Sue Kase, now a post-doctoral researcher at the Defense Threat Reduction Agency, and Frank Ritter, head of the Applied Cognitive Science Lab at Pennsylvania State University, recently used this arithmetic challenge to induce stress in subjects as part of a study of cognition.
"It's a very broad question—how do people under stress think?" explains Ritter. "If you knew how cognition changed under stress, it tells you how to help people who are under stress. If they're thinking more slowly, you try to give them more time. If they can't recall things, you try to give them memory aids. If they're making poor choices, you try to help them choose the right thing."
Kase and Ritter employed a computational model of the serial subtraction task developed by Michael Schoelles at Rensselaer Polytechnic Institute. And in what Kase terms a "research expedition," she used a novel genetic algorithm approach and computing resources at NCSA to fit the serial subtraction model to data gathered from human experiments conducted by Penn State's Biobehavioral Health Studies Laboratory, led by Laura Klein. The results were published this summer in the proceedings of the 30th Annual Meeting of the Cognitive Science Society and formed the basis for Kase's doctoral dissertation at Penn State's College of Information Sciences and Technology.
'A programmable theory'
Modeling the flow of air over a jet's wing, simulating the collision of galaxies, or computing the interaction of molecules are common tasks for high-performance computers. Capturing the workings of the human mind is not a typical application of high-performance computing resources.
"I went to the San Diego Supercomputer Center last summer for a workshop, and I was the only behavioral scientist there," Kase says with a laugh.
While cognitive psychologists aren't consuming as many cycles as astronomers or engineers, computational simulation is one of their tools. Their work in modeling how we think often involves the use of cognitive architectures such as ACT-R, which Ritter calls "a programmable theory." ACT-R is a framework for cognitive tasks (like the stressful serial subtraction exercise); researchers can build models in ACT-R, adding their own assumptions to its overall theory of cognition and then comparing the results obtained by their programs to results from human experiments.
"The most important step is when you fit the model to the human data," Kase says. Through this process, the researchers find what changes in the model are necessary to match its performance to that of real people. These changes in the model represent how cognition changes under the conditions being studied. Consider Kase and Ritter's study of cognition, caffeine, and stress, for example. If the model needs to talk faster to fit the human subject, that indicates that stress makes people talk faster. If the model has to have poorer memory to fit, then stress makes our memory poorer.
When performing this critical fitting task, behavioral scientists typically use a manual optimization technique "more reminiscent of trial-and-error than optimization methods used by other disciplines," Kase says.
"But this is a multi-dimension problem, in which we're considering both how fast and how accurate the subtractions were, and we're trying to fit that model to data from 15 subjects at an individual level of analysis—it's hard!" Ritter says. "Doing it 15 times while exploring the parameter space is virtually impossible."
Kase and Ritter's "research expedition" was designed to see if they could achieve efficient, accurate, non-biased fits by using a modified parallel genetic algorithm. Genetic algorithms winnow fields of potential solutions (called genotypes), evolving toward better and better answers.
"There are a lot of systems that won't deal well with a noisy evaluation function, which is what we have here with lots of embedded stochastic components in ACT-R and the model," he says. "Genetic algorithms are robust. They're useful for optimizing noisy processes and non-linear problems."
Getting the right fit
To implement the genetic algorithm, Kase and Ritter needed the number-crunching power of a high-performance cluster. While Kase found it easy to write a proposal and obtain a TeraGrid allocation, finding the right resource took time.
ACT-R is written in the Lisp programming language, "and that's one of the things that made this difficult, because that language is not something that typically has been used on a cluster," she says.
NCSA's staff helped move the project forward by explaining how to install Lisp in a home directory and set the path on Tungsten (which was recently retired). "I couldn't have implemented the project without having that help setting up the programming environment," Kase says. "They were very helpful with that and they answered some of my questions as a new user."
For the initial "expedition," Kase fit data from the 15 subjects in the control group of a broader study of the impact of caffeine and stress on cognition. Fifteen parallel genetic algorithms with 200 genotypes ran for 100 generations, each algorithm fitting the serial subtraction model to an individual human subject's performance data using three ACT-R parameters. The genotypes offering the best fits were then validated by running them another 200 times.
Rather than painstakingly fitting the model results to human data by hand, tweaking one parameter after another in search of the "perfect" fit, "in one run of the parallel genetic algorithm I tested 20,000 parameter combinations, and all I had to do was drop the job in the queue," Kase says.
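For readers who want a feel for the mechanics, here is a minimal sketch of a genetic algorithm sized like the runs described above: 200 genotypes, 100 generations, three parameters. It is not Kase and Ritter's modified parallel algorithm, and it does not run ACT-R. The `noisy_fitness` function, its target values, the noise level, and the selection, crossover, and mutation settings are all hypothetical stand-ins chosen only to show how a population evolves against a noisy evaluation and how the best genotype can be re-run afterward.

```python
import random

def noisy_fitness(genotype, rng):
    """Hypothetical stand-in for one stochastic model run (lower is better)."""
    target = (0.5, 0.2, 0.8)  # made-up "ideal" values for three parameters
    error = sum((g - t) ** 2 for g, t in zip(genotype, target))
    return error + rng.gauss(0, 0.01)  # noise mimics a stochastic model

def run_ga(pop_size=200, generations=100, n_params=3, seed=0):
    rng = random.Random(seed)
    population = [tuple(rng.random() for _ in range(n_params))
                  for _ in range(pop_size)]
    evaluations = 0
    for generation in range(generations):
        # Score every genotype once per generation.
        scored = sorted((noisy_fitness(g, rng), g) for g in population)
        evaluations += pop_size
        if generation == generations - 1:
            break
        # Keep the better half, then refill with mutated crossovers.
        parents = [g for _, g in scored[:pop_size // 2]]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(tuple(
                min(1.0, max(0.0, rng.choice(pair) + rng.gauss(0, 0.05)))
                for pair in zip(a, b)
            ))
        population = parents + children
    best_score, best_genotype = scored[0]
    return best_genotype, evaluations

best_genotype, evaluations = run_ga()
print(evaluations)  # 20000 = 200 genotypes x 100 generations

# Re-run the best genotype 200 times to average out the noise.
rng = random.Random(1)
validation = [noisy_fitness(best_genotype, rng) for _ in range(200)]
validation_mean = sum(validation) / len(validation)
```

In this toy version surviving parents are scored again each generation, so the 20,000 evaluations are not all distinct parameter combinations; the real algorithm may count differently.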
"Fitness" is expressed as the discrepancy between the model's predictions and the actual human performance—a smaller gap between the two means a better "fit." Running the parallel genetic algorithm, the researchers achieved what they described in their published article as "exceptional model to human data fits." The serial subtraction model predicted about the same range and distribution of performance as was produced by the human subjects. The fitness values represented differences of less than one subtraction problem out of a range of 28 to 83 problems and percentage correct differences of less than one percent.
In fact, they got more results than they expected.
"With manual optimization you get one solution, but using the parallel genetic algorithm I got sets of solutions," Kase says. For example, the genetic algorithm discovered nine good fits to Subject 16's performance. Because of the range of solutions provided, Kase and Ritter see the genetic algorithm technique as a way to further develop the ACT-R architecture and to advance the model development process in general.
"We will be able to identify the most correct default values for cognitive mechanisms in large cognitive architectures," Ritter explains. "It can have a good effect on models, because we'll find out more about our models and their architectures than we could when doing fitting by hand."
More cycles for 'soft' sciences
In addition to completing the analysis of the full cognition, stress, and caffeine study results, Kase and Ritter plan to do more work with parallel genetic algorithms using more parameters, different cognitive tasks, and more human subjects. And they hope other research groups will follow.
"I think it's up and coming," Kase says. "I think in the end it will catch on. It's hard to change a field. They've been doing their model fitting manually since the field's inception, so it just has to catch on, but I think other labs will try it if it is made easy enough for them to implement."
And Ritter sees the potential for behavioral scientists and other social scientists to consume as many computing cycles as the cosmologists and engineers, or more.
"If you start to see where we're headed, you can see there's a lot more work that can come behind this," he says. "We're looking at relatively low-level individual behavior, and as you start to look at more complex behavior you're going to need more cycles, and as you start to look at more subjects you're going to need more cycles, and as you start to look at more aspects of these data and to look at teamwork, you're going to need even more computation because there are more parameters and more complex models.
"Soft sciences are really going to need cycles, because their theories will be much more complex when we're done."
This work was sponsored by the Office of Naval Research.
Team members
Sue E. Kase
Laura Klein
Frank E. Ritter
Michael Schoelles