Despite recent technological advances, data visualization still relies on techniques that date back to its origins. Digital interfaces have changed the way we interact with data, making it possible to filter, zoom in, and access details on demand. However, the basic principles of displaying data still rest on reductions to geometric shapes such as rectangles, circles, and lines.
When working with cultural data, this gap between representation and artifact can be particularly problematic. Does a bar chart of the most frequently used words in a book reveal more than the book itself? Can a pie chart of the colors in a painting tell us more about it than the actual image? Or is it possible to balance both representations? If so, can modern technologies create new visualization techniques to bridge this gap between representation and artifact?
In his paper “Visualizing Vertov” [@manovich_visualizing_2013], Lev Manovich analyzes the work of Russian filmmaker Dziga Vertov using methods from his Software Studies Initiative. The research aims to show alternative ways of visualizing media collections and focuses on the movies The Eleventh Year (1928) and Man with a Movie Camera (1929). It uses digital copies of Vertov’s movies provided by the Austrian Film Museum in Vienna. These are the media source from which movie frames are taken, visualized, and analyzed using custom software. The museum also provided metadata for the research: frame numbers, shot properties, and manual annotations. A secondary source used by Manovich is Cinemetrics, a crowdsourced database of shot lengths from more than 10,000 movies.
As for the visual representations, traditional techniques like bars and scatterplots are complemented by Manovich’s “direct visualization” approach, in which “the data is reorganized into a new visual representation that preserves its original form.” [@manovich_what_2010]
Manovich describes his own method for analysis as analogous to how Google Earth operates: from a “bird’s eye” view of a large collection of movies to a zoom into the details of a single frame. The summary of those steps is as follows:
- “Bird’s-eye” view: 20th-century movies compared by ASL (average shot length). Technique: scatterplot.
- Timeline of mean shot length of all Russian films in the Cinemetrics database. Technique: line chart.
- Movies by Vertov and Eisenstein compared to other 20th-century movies. Technique: scatterplot.
- The Eleventh Year and Man with a Movie Camera compared by shot length. Technique: bars, bubbles, and rectangles.
- Zoomed-in views of the same visualizations.
- Each of 654 shots in The Eleventh Year, represented by its second frame. Technique: direct visualization.
- Shots rearranged based on content (close-ups of faces). Technique: direct visualization.
- Shots and their length. Technique: direct visualization and bar chart.
- First and last frames from each shot compared. Technique: direct visualization.
- Average amount of visual change in each shot. Technique: direct visualization (2nd frame) and bar chart.
- Average amount of visual change in each shot. Technique: direct visualization (juxtaposition) and bar chart.
- Frames organised by visual property. Technique: direct visualization.
Manovich also uses contextual notes to draw conclusions from the visualizations, often comparing his findings to facts from the history of cinema, Vertov’s early manifestos, and previous studies, either confirming or contradicting them.
This prototype will compare two movies using some of the methods from “Visualizing Vertov.” It will combine traditional visualization techniques (charts using 2D primitives) with the “direct visualization” approach.
As for the data, it will use a specific component of film: sound. Because Manovich’s research relies largely on visual artifacts, using sound in this prototype might reveal limitations of his method or point to new ways of applying it.
Moreover, the Cinemetrics project focuses exclusively on a single aspect of movies: “In verse studies, scholars count syllables, feet and stresses; in film studies, we time shots.” [@cinemetrics] This approach seems to overlook other quantifiable data that could be used in film studies.
Although it uses sound instead of time or visuals, this prototype will keep the analytical aspect of Cinemetrics and “Visualizing Vertov.” It will then draw conclusions about this approach as compared to the supercut method.
To sum up, these are the questions raised by this prototype:
- Is it possible to apply the “direct visualization” technique to a non-visual artifact?
- Which patterns and insights can a sound analysis of a movie reveal as compared to a visual analysis?
- What are the results of an analytical method as compared to the supercut technique?
All levels of analysis in “Visualizing Vertov” (movies, shot lengths, shots, and frames) utilize comparisons. This device is widely employed in traditional data visualization, and it seems to be even more useful for cultural artifacts. In Vertov’s case, for instance, the shot-length measurement would not provide any insight if it were not compared to the Russian avant-garde or to 20th-century movies in general.
Following the same logic, this prototype takes a European movie, Wings of Desire (Der Himmel über Berlin, 1987), by German filmmaker Wim Wenders, and its American remake, City of Angels (1998), directed by Brad Silberling.
The following diagram shows the development steps for this prototype:
The digital copies of the movies used in the first step were not of high quality. Moreover, the process by which the data was gathered does not preserve a high-resolution sample. These limitations of the prototype, which focused on a rapid technique for extracting and comparing sound data, affect the visualization decisions as well.
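The rapid extraction described here could take a form like the following sketch, which reduces decoded audio samples to one loudness value per second. The function name, the sample format, and the one-second window are assumptions for illustration; the prototype’s actual pipeline is not documented in this text.

```javascript
// Sketch: reduce decoded PCM samples to one loudness value per second.
// Assumes `samples` is an array of floats in [-1, 1] and `sampleRate` is
// the number of samples per second (hypothetical helper, not the
// prototype's actual code).
function loudnessPerSecond(samples, sampleRate) {
  const out = [];
  for (let start = 0; start < samples.length; start += sampleRate) {
    const window = samples.slice(start, start + sampleRate);
    // Root-mean-square amplitude of the one-second window
    const rms = Math.sqrt(
      window.reduce((sum, s) => sum + s * s, 0) / window.length
    );
    out.push(rms);
  }
  return out;
}
```

A series like this, one value per second per film, is all the area chart below needs.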
The first iteration of this prototype is a simple area chart depicting the sound variations in the movies. Because of the low quality of the sources, it uses a relative scale for each movie: the highest value in each film is mapped to the top of its scale, and all other values are plotted relative to that.
For this reason, the peaks of each movie might differ in absolute decibel value, so this parameter should not be used for comparison.
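The relative scale can be sketched as a simple per-film normalization; `toRelativeScale` is a hypothetical name, and the prototype’s actual implementation may differ:

```javascript
// Sketch: map each film's loudness series onto its own relative scale,
// so that the loudest value of each movie becomes 1 and the rest scale
// linearly. Assumes non-negative loudness values (e.g. RMS amplitudes).
function toRelativeScale(values) {
  const max = Math.max(...values);
  // Guard against a silent (all-zero) series
  return values.map((v) => (max === 0 ? 0 : v / max));
}
```

Because each movie is scaled by its own maximum, the resulting curves are comparable in shape but not in absolute level, which is why the peaks should not be compared directly.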
Some visual disparities appear in this first iteration: long stretches of high volume in Wings of Desire versus constant medium-volume areas in City of Angels.
However, the sound volume alone does not provide many insights. Are these variations due to background music? Which patterns represent dialogue? The representation is so highly encoded that it leaves no way to answer these questions.
To add clarity, the second iteration includes visual marks representing the parts of the movies where dialogue occurs. A computational method to differentiate music from speech would be laborious and unreliable. The solution was to reuse part of the first prototype developed for this project, which parses subtitle files into different formats. The software generated JSON files with the timestamps of the subtitles. Like the sound data, they were read and visualized using D3.
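A minimal sketch of that subtitle-parsing step, assuming SRT input, could look like this. The function name and the output shape are illustrative, not the prototype’s actual code:

```javascript
// Sketch: extract start/end offsets (in seconds) from SRT timecodes of the
// form "00:01:02,500 --> 00:01:05,000". The resulting objects can be
// serialized to JSON and drawn as dialogue marks with D3.
function parseSrtTimestamps(srt) {
  const cue =
    /(\d{2}):(\d{2}):(\d{2}),(\d{3}) --> (\d{2}):(\d{2}):(\d{2}),(\d{3})/g;
  const toSeconds = (h, m, s, ms) =>
    Number(h) * 3600 + Number(m) * 60 + Number(s) + Number(ms) / 1000;
  const cues = [];
  let match;
  while ((match = cue.exec(srt)) !== null) {
    cues.push({
      start: toSeconds(match[1], match[2], match[3], match[4]),
      end: toSeconds(match[5], match[6], match[7], match[8]),
    });
  }
  return cues;
}
```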
This new iteration sheds some light on how the movies compare. While City of Angels shows constant dialogue, the subtitle marks in Wings of Desire leave long blanks in several parts of the movie.
The presence of the red marks also helps us understand the sound representation. By comparing the two, it is possible to see that the alternating medium-to-low volume represents dialogue, while the constant, higher areas might indicate background music.
Even though this iteration offers more insights about the movies, most of them are not yet verifiable. The tool does not let us access the sound itself.
The last iteration of this prototype embeds the sound extracted from the movies into the page. It is controlled by a slider that matches the horizontal dimension of the charts.
This final representation is analogous to the shot-length visualization in “Visualizing Vertov”: it combines a 2D chart with the artifact itself, trying to bridge the gap between the two.
Findings and Next Steps
Though the third iteration of the prototype includes the sound, it does not achieve the same results as the display of frames in “Visualizing Vertov.” Access to the sound is mediated through a GUI element, which creates an extra step. On the one hand, the overview of the “media collection” (the sound fragments) is only accessible through the chart. On the other hand, access to the sound through the player does not provide an overview, since it is not possible to listen to all the sound fragments at the same time. As opposed to what happens in Manovich’s visualizations, these two representations were not combined.
Nevertheless, the sound analysis does seem to provide useful insights for movie studies. Even though this rough prototype relies on low-resolution sources and the support of subtitle files, it reveals some interesting patterns. An analysis of other sound components might yield further findings.
Finally, this prototype shows a radically different approach compared to the supercut technique. The analytical method of translating time to space and reducing sound to visuals produces a far less entertaining result. In conclusion, the next steps for this project are:
- Define its communication purposes: data visualization as an analytical tool? A medium? A language?
- Continue the explorations of media collections using a “direct visualization” approach.
- Expand this approach, as much as possible, to media and representations not yet explored by the Software Studies Initiative.
“Cinemetrics – About.” 2014. Accessed October 10. http://www.cinemetrics.lv/index.php.
Manovich, Lev. 2010. “What Is Visualization?” Paj: The Journal of the Initiative for Digital Humanities, Media, and Culture 2 (1). https://journals.tdl.org/paj/index.php/paj/article/view/19.
———. 2013. “Visualizing Vertov.” Russian Journal of Communication 5 (1): 44–55. doi:10.1080/19409419.2013.775546.