Since their invention, cameras have transformed our relationship with the world and each other by providing a window into another time and place. Since the first permanent photographic image was captured by Joseph Nicéphore Niépce in 1826, imaging has evolved from long-exposure pictures of static scenes to the hyper-realistic experiences of today. While single cameras are still the primary means of content generation, as evidenced by cell-phone photos and video streaming platforms, systems using multiple cameras simultaneously are becoming increasingly mainstream. Since camera array systems image a scene with multiple perspectives and modalities, they provide a much deeper measurement than a single camera. However, understanding this information can be difficult due to the sheer amount of data acquired. This post discusses the relationship between the imaging capabilities of camera array systems and the display solutions that are needed to make the information meaningful and accessible to a person.
One of the earliest examples of the power of camera arrays came in 1878, when Eadweard Muybridge captured a sequence of images of a galloping horse to settle the question of how its legs moved while in motion. Using an array of 12 cameras with specially developed emulsions that supported reduced exposure times and a mechanical shuttering mechanism, Muybridge was able to capture the animal's full gait at roughly 1 millisecond per frame, as shown below. Because the images were taken faster than the human eye can perceive, they were able to prove that a galloping horse does have all four feet off the ground at regular points in its stride.
Muybridge's The Horse in Motion, 1878
The imaging technology was a stunning advancement, but it was still difficult to see the horse's motion from the original images. To make the movement clearer, the images were retouched so that the rider and horse stood out from the background as a clean silhouette. To further emphasize the motion, Muybridge developed the Zoöpraxiscope, an early type of projector that used a lantern and a hand crank to display the images sequentially, making it arguably the first motion picture and movie projector. When you compare the image to the right with the previous set, it's clear how much more meaningful these moving images are. Another interesting aspect of the Zoöpraxiscope is that the mechanical crank let the presenter control the projection speed, providing slow-motion, reverse, and fast-forward controls. While the movie of the horse itself is impressive, the ability to control playback speed let one analyze the animal's gait in far more detail than any live view, making this system also one of the first image-based interactive experiences. I would argue that it is this interactivity that truly allows the user to understand what they are seeing, or more to the point of this post, what they are normally unable to see.
Animated gif from frame 1 to 11 of The Horse in Motion. "Sallie Gardner", owned by Leland Stanford, running at a 1:40 pace over the Palo Alto track, 19 June 1878
The interrelationship between imaging and display is something I became intimately familiar with as the project manager for the DARPA AWARE Wide Field of View program at Duke. Tasked with designing systems that approached the theoretical limits of resolution per optical volume, the AWARE cameras ranged from 250 megapixels to over 10 gigapixels. For context, 100 megapixels is roughly the number of pixels in fifty standard HD monitors or a dozen 4K displays. While the sheer number of pixels was already overwhelming, the program also wanted a user to view acquired images at video rates. To achieve these objectives, we flipped convention on its head and developed a system that focused on the scale of the display rather than the imaging system. Without going into the technical details, we developed a solution that allowed the user to interact with the image by zooming in and out and panning across the scene. We then calculated which pixels were needed, and at what scale, to generate the requested view, giving the user the perception that they had access to all of the pixels simultaneously without all of the data being transmitted and displayed. Only by mapping the measured data to a display at a scale that could be seen and understood by the user were we able to meet the program objectives of real-time interactive viewing.
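The general idea is the same one behind tiled image pyramids in deep-zoom viewers and web maps: store the mosaic at multiple resolutions, then fetch and render only the tiles, at the appropriate level, that the current viewport needs. The sketch below is a minimal illustration of that idea, not the AWARE or Aqueti implementation; the tile size, function name, and pyramid layout are assumptions for illustration.

```python
import math

TILE_SIZE = 256  # pixels per tile edge (assumed)

def tiles_for_viewport(full_width, full_height,
                       view_width, view_height,
                       center_x, center_y, zoom):
    """Pick the pyramid level and tile range needed to fill a display.

    full_*   : dimensions of the full-resolution mosaic, in pixels
    view_*   : dimensions of the display window, in pixels
    center_* : viewport center in full-resolution coordinates
    zoom     : display pixels per full-resolution pixel (1.0 = native)
    """
    # Coarsest pyramid level that still supplies at least one source
    # pixel per displayed pixel (level 0 = full resolution; each level
    # above halves the resolution).
    level = max(0, math.floor(math.log2(1.0 / zoom))) if zoom < 1.0 else 0
    scale = 2 ** level  # full-resolution pixels per pixel at this level

    # Viewport extent in full-resolution coordinates.
    half_w = (view_width / zoom) / 2
    half_h = (view_height / zoom) / 2
    x0, x1 = center_x - half_w, center_x + half_w
    y0, y1 = center_y - half_h, center_y + half_h

    # Convert the extent to tile indices at the chosen level.
    tile_span = TILE_SIZE * scale
    first_col = max(0, int(x0 / tile_span))
    first_row = max(0, int(y0 / tile_span))
    last_col = min(math.ceil(full_width / tile_span) - 1, int(x1 / tile_span))
    last_row = min(math.ceil(full_height / tile_span) - 1, int(y1 / tile_span))

    return level, [(row, col)
                   for row in range(first_row, last_row + 1)
                   for col in range(first_col, last_col + 1)]

# Example: a 1920x1080 window viewing a 10-gigapixel mosaic, zoomed out
# so each display pixel covers 64 full-resolution pixels.
level, tiles = tiles_for_viewport(120_000, 90_000, 1920, 1080,
                                  center_x=60_000, center_y=45_000,
                                  zoom=1 / 64)
print(level, len(tiles))
```

In that example call, the zoomed-out 1080p view is satisfied by a few dozen 256-pixel tiles, only a few megapixels of data rather than the full 10-gigapixel image, which is what makes real-time interaction feasible.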
The technology developed through the AWARE program was transferred to Aqueti and became the basis of their gigapixel security cameras. At over 100 megapixels, these systems were much smaller than their predecessors, but they were still impressive. The following image is an example of the field of view and resolution of a Mantis camera.
Image showing the field of view and resolution of an Aqueti Mantis 70 Camera. ( https://www.aqueti.com/)
Camera arrays are being used in applications ranging from microscopy and agricultural management to entertainment, sports, 3D imaging, and security. Each of these applications measures the world in ways beyond the capabilities of the human visual system. Given this overwhelming amount of information, application-specific visualizations and interactive ways to view the data are necessary to make it meaningful. Camera arrays really do see more than meets the eye.
As the second installment in a series of posts on camera array systems, I highlight the relationship between imaging systems and how the information is presented to the end user. Feel free to follow the conversation in the parallel post on LinkedIn.