
Visual literacy

10.16.13 -

by Barbara Jewett

NCSA experts assist a USC researcher analyzing video and film content and exploring how frequent exposure may influence our beliefs.

As children, we were frequently told by grownups, “don’t believe everything you hear.” This was secret code, reminding us to always sift what we’d heard through our brains’ filters of knowledge and personal experience before accepting it as absolute truth. Filtering was especially important if information came from an unproven source. As we matured, we learned to also apply the secret code to what we read or saw.

Since technology put the information superhighway known as the Internet literally at our fingertips, however, many people’s information filters disengaged. Virginia Kuhn hopes to reverse this trend as it applies to video and films, giving new meaning to the phrase “don’t believe everything you see.”

“There’s video everywhere, screens all over the place. These screens and the content of these screens are impacting human beings in ways that we just aren’t sure of,” says Kuhn. “We know that with the language of images there is both the content, which represents the subject of whatever the film or video is, but there’s also the impact.

“Impacts are more subtle and don’t lie in the content of what’s happening on the screen,” she continues. “It’s a matter of large-scale public literacy. Understanding how these screens impact us is crucial. It’s just vital to an educated citizenry.”

An associate professor in the School of Cinematic Arts at the University of Southern California, Kuhn explores the ways communication and expression are impacted by our digital culture, focusing on the rhetoric of images in film and video. Just as words can persuade, so can the nuances of filmmaking.

Thanks to the NCSA experts available to Kuhn through the XSEDE project’s Extended Collaborative Support Service (ECSS), and the introduction of the Gordon supercomputer at the San Diego Supercomputer Center (SDSC), she’s been able to advance her study of the human impact of digital images. ECSS consultants from other XSEDE sites were also drawn in as needed.

Overcoming obstacles

Kuhn says she “was always kind of hamstrung” when it came to supercomputing. The large file sizes she works with, combined with batch queuing, made supercomputers “more trouble than it was worth” to learn to use. That’s a message NCSA senior research scientist Alan Craig heard repeatedly on a cross-country tour of computing centers, where he chatted with humanities scholars before becoming the humanities specialist for XSEDE. Craig is also the senior associate director for human-computer interaction at the Institute for Computing in Humanities, Arts, and Social Science (I-CHASS) at the University of Illinois at Urbana-Champaign.

“When the Gordon machine was introduced at San Diego,” says Craig, “I knew it had the technical capability to do the kinds of things I had been hearing from these groups. We explained to SDSC why interactive querying was important, and after they learned more about Virginia’s project, they made the decision to allow this on-demand mode of operation for interactive database querying as opposed to traditional batch processing. They dedicated one head node and four compute nodes on Gordon to her project.”

Video exploration

Kuhn is looking at the various human impacts of film and video by working with films in the Internet Archive. The archive is a San Francisco-based non-profit, founded in 1996, that builds a library of digital collections and offers access to researchers, scholars, and the general public. It holds more than a million digital movies, ranging from classic full-length films to news broadcasts, concerts, and cartoons.

Thanks to the assistance she’s received through XSEDE’s ECSS, Kuhn is able to use computationally intensive algorithms to run sophisticated real-time queries that determine which frames, or sequences of frames, to return from which films or videos. She is also experimenting with automatic metadata extraction from the multimedia resources, as well as with image and video analytics. In addition, the ECSS team is assisting Kuhn by developing tools that make film and video images easily available for scholars to study.

Managing content

Medici is an NCSA-developed content management system that specializes in large collections of heterogeneous datasets and makes it easy to interact with the available data on the web and extract metadata in the cloud. Luigi Marini, the lead developer of Medici, and computer vision expert Liana Diesendruck, both members of NCSA’s Image and Spatial Data Analysis Division (ISDA) and ECSS consultants, developed extensions within Medici specifically designed for this project.

Medici version 2, currently in development, was adopted due to its improved scalability and flexibility. The team expanded the tagging capabilities of the system and developed methods for extracting and managing shots of uploaded videos. Diesendruck included within the system methods to index shots based on low-level descriptors, such as color and texture. Leveraging the system, a user can select a shot and retrieve similar shots in other videos. This enables users of the system to explore the video library and find unexpected relations between videos. As new descriptors get developed, they can be easily added to the system to provide new metrics of comparison facilitating further exploration of video collections.
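The shot-indexing idea above can be sketched in a few lines. The following is an illustrative toy, not Medici’s actual implementation: it assumes shots are summarized as lists of (r, g, b) pixel tuples, and uses a coarse color histogram with histogram intersection as the similarity metric, one plausible example of the low-level descriptors Diesendruck describes.

```python
# Toy sketch of indexing shots by a low-level color descriptor and
# retrieving similar shots. Hypothetical data format: a "shot" is a
# flat list of (r, g, b) pixel tuples sampled from its frames.

def color_histogram(pixels, bins=4):
    """Build a normalized RGB histogram with `bins` levels per channel."""
    hist = [0.0] * (bins ** 3)
    step = 256 // bins
    for r, g, b in pixels:
        idx = (r // step) * bins * bins + (g // step) * bins + (b // step)
        hist[idx] += 1
    total = float(len(pixels)) if pixels else 1.0
    return [h / total for h in hist]

def similarity(h1, h2):
    """Histogram intersection: 1.0 for identical distributions, 0.0 for disjoint."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def most_similar(query_hist, index):
    """Return the id of the indexed shot whose descriptor best matches the query."""
    return max(index, key=lambda shot_id: similarity(query_hist, index[shot_id]))
```

A query histogram built from a reddish shot would then retrieve the reddish entry from an index of precomputed histograms. The design mirrors the extensibility point in the text: swapping in a texture descriptor only requires a new descriptor function, while the indexing and retrieval code stays the same.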

But possibilities don’t automatically translate into realities. It’s good to be able to ask a higher-level question such as “find all of the shots that are done like in the Hitchcock movie Psycho,” but how does that get translated into something the Gordon supercomputer can reasonably address? Michael Simeone, formerly associate director of I-CHASS, joined the project as co-principal investigator, and his knowledge of supercomputing workflows and humanities methods enhanced the collaboration significantly.

One thing leads to another

Even with the aid of a supercomputer, Kuhn’s research is daunting.

“It’s not like getting the results is the end. For us, it’s just the beginning, because it is very hard to know what you are going to find initially,” she says. “The images the supercomputer returns spark another series of questions.”

Take, for example, the first proof-of-concept for the project. An initial keyword search of “safety” returned 300 films. The reason “safety” was chosen as a keyword, says Kuhn, is that it would return films that have not been studied as much for formal qualities (shot length, camera angles, color timing, etc.). Such films are usually perceived as straightforward and are studied for content rather than form.

The team’s next analysis covers nearly 3,000 videos in the Internet Archive’s corpus subset Cultural and Academic Films. Kuhn says that corpus was chosen for two reasons. One, it contains nearly all the videos from Khan Academy, a nonprofit website library of videos specifically designed for viewing on the computer to teach lessons in math, science, and the humanities. Since the academy’s founder is an entrepreneur rather than an educator or filmmaker, Kuhn anticipates the films will yield ample material to generate additional queries.

The corpus subset also contains a project called Global Voices, which is her second reason for selecting it. With Global Voices the team can compare cross-culturally, looking at, for instance, camera angles, processing techniques, and what sorts of apparatuses are used for things that are considered educational.

“We think technology is neutral; it’s a machine,” says Kuhn with a laugh. “But the infrastructure of technology carries the ideology of those who create it. Filmed footage also appears to be objective since the camera mechanically records the material world. But to film is to frame a shot and framing a shot means excluding all else. One of the goals of the project is to demonstrate this subjectivity.”

Visualizing the visual

What new insights are revealed when you visualize image-based media? Kuhn says that was something she could not even imagine at the start of the project. Now, however, she thinks “it’s going to be huge,” noting that the visualizations are adding both productivity and depth to the project.

Helping Kuhn uncover filmmaking’s technical moves is NCSA visualization expert David Bock. He’s excited to be working on a humanities project, and notes Kuhn’s project has nearly infinite potential. “She’s started something that she’s not going to be able to stop. I really do mean that,” says Bock.

“For so long we dealt with scientific data,” he continues. “Scientists say ‘I want to see the relationship between pressure and wind in this atmospheric simulation.’ Here, if I treat the films and video like I would scientific data points, I can think of myriad ways to look at relationships between the movies. There are so many things I can do and the hardest part is deciding which one to pursue. And that’s just with video. We haven’t even dealt with sound yet. We could, for example, pull all the soundtracks from all the movies in a collection, look at a frequency analysis and say is there a relationship between high frequency and warm colors that we see in a film? Or vice versa? And that opens up all kinds of new theories.”
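The soundtrack idea Bock describes is speculative, but the first step he names, frequency analysis, is easy to sketch. The toy below uses a naive discrete Fourier transform from the standard library to find the dominant frequency of a short mono excerpt; the sample format and rate are assumptions, and real work on full soundtracks would use an FFT library rather than this O(n²) loop.

```python
# Naive dominant-frequency estimate for an assumed mono sample list.
# Illustrative only: a DFT computed term by term, scanning positive
# frequency bins for the largest magnitude.
import cmath
import math

def dominant_frequency(samples, sample_rate):
    """Return the frequency (Hz) of the largest DFT magnitude, excluding DC."""
    n = len(samples)
    best_k, best_mag = 1, 0.0
    for k in range(1, n // 2):  # positive frequencies only
        coeff = sum(s * cmath.exp(-2j * math.pi * k * t / n)
                    for t, s in enumerate(samples))
        if abs(coeff) > best_mag:
            best_k, best_mag = k, abs(coeff)
    return best_k * sample_rate / n
```

Comparing such per-excerpt summaries against color statistics from the matching frames is one way the high-frequency/warm-color question Bock poses could be made concrete.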

Bock says he began by looking at the film frames as data points like he would in a scientific visualization. He then took the 2D frames from select films and video and turned them on their sides to make 3D “slices” in a time and space configuration. This temporal and spatial simultaneity lets the researchers see motion across time in a single image. And by literally looking at a film frame from its side, Bock was able to pull information out of the image.
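A stripped-down version of this side view can be sketched by stacking one pixel column per frame, so that horizontal position becomes time. The frame format here (a list of rows of (r, g, b) tuples) is a hypothetical stand-in; Bock’s actual pipeline is not described at that level of detail.

```python
# Toy "side view" of a film: column t of the output image is the center
# pixel column of frame t, so time runs left to right across the result.
# Frames are assumed to be lists of rows of (r, g, b) tuples.

def side_view(frames):
    """Return an image whose column t is the center pixel column of frame t."""
    if not frames:
        return []
    height = len(frames[0])
    mid = len(frames[0][0]) // 2  # index of the center column in each frame
    # One output row per pixel row, one output column per frame.
    return [[frame[y][mid] for frame in frames] for y in range(height)]
```

In such an image, a static shot produces unbroken horizontal bands, while a cut or camera move shows up as an abrupt change in the column pattern, which is why shots and motion become legible at a glance.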

“Looking at those slices, we can see things we didn’t expect to see,” he explains. “You can immediately ascertain camera shots, movement, and other cinematic elements. The only way to find them otherwise is to watch the entire movie, and nobody could ever watch all the movies. It’s impossible.”

Bock equates a movie slice image to a bar code. It holds unique information and after working with them for a while you quickly recognize patterns. He points to a segment with a wave-like shape and says he can tell from the shape that a car enters the frame from the left, stops at some point, then moves again. Bock finds that point in the film and plays it for me. Sure enough, an ambulance enters the first frame of the segment from the left, drives in front of the hospital emergency room, stops, then backs up to the emergency room doors.

That’s entertainment?

Comparing fiction and nonfiction films is one of the team’s next steps. Kuhn notes that this analysis is extremely important since it focuses on the medium itself. As such, it can help show the cumulative impact of viewing so much filmic media. This type of comparison is unique: films are typically analyzed by genre so the study of documentary, for instance, is separate from the study of Hollywood blockbusters.

“Even though entertainment films are fiction, many researchers have found that viewers often believe what they are seeing, if only on a subconscious level. When you are watching film, the camera seems to be capturing something real. It’s easy to get caught up and not recognize the way the story is filmed and edited,” she explains.

Her colleagues on USC’s cinematic arts faculty are very interested in this portion of the project, she notes, because they feel she could make significant discoveries that would influence how they teach film editing. NCSA’s Craig also is interested in what this aspect of the project will reveal.

“Now at least there’s a screen to remind people they’re looking at a screen,” he says. “But pretty soon we’ll have video embedded in everything and the screen frame will be gone, and these moving images will appear to our senses just like anything else. So I think we need to understand the effects really well.”