UniDescription
Academy

A Place for Learning About Audio Description

You want to know about it? We want to teach it here.

UniD Documents

Documents produced by the UniDescription Project for external audiences.

In 2014, The UniDescription Project was just an idea about how to make the world a more-accessible place through academic research. Now, 10 years later, our UniD team has a long list of accolades and accomplishments to its credit, including creating the world's leading production software for describing static media (such as photographs and maps). We have made that software Open Access and Open Source, so everyone can use it. And the descriptions always are free to hear.

We have collaborated with about 200 U.S. National Park Service sites and other public attractions over the past decade to make media more accessible for their visitors using Audio Description. And we have trained hundreds of employees at those places about AD and media accessibility in general. For those efforts, our work has been supported by grants from the NPS, the National Endowment for the Humanities, the National Endowment for the Arts, and Google, among others. And our team members have been recognized with national and international awards from the American Council of the Blind, Helen Keller Services, and the American Alliance of Museums, including the MUSE Research & Innovation Award. ...

With Covid-19 pandemic restrictions finally fading this year, UniD researchers were back in the field doing extensive studies at two national parks: Pearl Harbor National Memorial in Honolulu, HI, and Pullman National Historical Park in Chicago, IL. We also completed another major Descriptathon workshop, our ninth, which was our largest to date, with more than 150 people participating from around the U.S., in Canada, and in the United Kingdom. They were a part of 16 teams in the friendly competition designed to improve accessibility worldwide.

We shared much new empirical work at academic conferences and through scholarly publishing, and, in a sign of the times, we also have integrated our first AI (Artificial Intelligence) tool into our UniD software, with two other related AI tools under development as well and planned for release before Descriptathon 10, scheduled for February 2024.

We welcomed many new partners to our initiative, too, including our first state park and several other organizations that independently used UniD tools. ...

uReview makes its debut, amplifying research

The UniDescription Project always has been a research-focused initiative, intent on bringing more empiricism to the field of Audio Description. But now — following years of development of this idea — we have a robust new online uReview tool that allows us to ask research questions about descriptions, even at scale, and handle that data in efficient ways. In other words, we now can conduct Audio Description research beyond what others in the world can do. That should make for an exciting future. We also had many other important achievements in this past academic year (2021-2022), including ...   


More Partners, More-Accessible Public Places

The UniDescription Project has carried on, despite this pandemic-plagued period of world history. We worked remotely. We postponed field visits. We incorporated Zoom. We have kept trying to make the world more accessible for people who are blind or have low vision. This annual report shares some of these 2020-2021 highlights.

"With so much sad and scary discourse circulating, this month seemed an appropriate time to launch a counter-narrative in the form of our first public UniDescription Report. Positive news, like what is in this report, has been happening in 2020, too. And you are a part of it. 

Our small research-and-development team – working from a tiny speck of an island in the middle of the Pacific Ocean – has been collaborating for the past five+ years with people from around the United States to steadily improve media accessibility, especially for those who are deaf-blind, blind, or low-vision. We are sending out this report as a way to further connect with you (our partners), to share our collective successes, and to update you about what we are planning. We have many exciting ideas in motion! 

This week is Helen Keller Deaf-Blind Awareness Week, for example, and one of our Co-PIs, Dr. Megan Conway, is doing her part to make a more-accessible world as a Research and Accessibility Specialist for the Helen Keller National Center. Through her advocacy, we have expanded our UniD research scope this year to explicitly include people who cannot see or hear well, as a distinct audience for Audio Description. 

Next month is the 30th anniversary of the passage of the Americans with Disabilities Act (ADA). Maybe that would be an ideal moment for you to lead new public conversations about the accessibility of your favorite places, say U.S. National Park Service sites, and how you might be able to improve that accessibility for more people?"

OVERVIEW:

This template is constantly evolving, but when we encounter a new image that needs to be described, we typically split the transcription from the description, meaning all existing text should be copied and pasted into the UniD system, so it easily can be heard as a part of the Audio Description. That's the easy part. The next step is remediating (aka translating) the purely visual piece of media into a purely audible form (in this case, into digital text, which can be read by screen readers or heard as Mp3 files). 

One aspect of a visual image that complicates this process is its typical lack of a single narrative thread or a single meaning. Most images give everything at once (all of the possible storylines and all of the possible meanings), forcing a viewer to quickly decide on an interpretation. In other words, images can be interpreted in many ways, based on the perspective, interests, and context of the viewer. 

In the case of Audio Description, though, the describer must choose that perspective to transform the media from visual to audio for the secondary listener. This choice becomes an inherent filter, which affects the reception of the description in many significant ways. If the describer and the listener are aligned on the choice, then the process might be relatively seamless. But if the describer takes a perspective that – for whatever reason – does not align with the listener, a fog of confusion easily can be created. 

DETERMINE THE PURPOSE OF THE IMAGE:

In that respect, we suggest that describers first determine the purpose of the image. Why is it being used? What is it being used to illustrate? If you can clearly determine the purpose of the image, that can help you to decide on your describing approach.

Once you have determined the purpose, and what you think this image description needs to do for the listener, I recommend a journalistic approach to Audio Description, which is basically to decide if you are going to tell the story of the image or explain the image. Journalism has a long history of using texts to convey imagery and meaning. Journalists aspire to be fair and objective about what they see, by not taking sides or tilting the scales, and so should an audio describer. Journalists aim for the heart of the matter and always tell the truth. These are all reasonable and potentially valuable positions to take as an audio describer as well. 

DECIDE YOUR APPROACH: A CONTEXTUAL SUMMARY OR A FORM OF STORYTELLING

In practice, I think that means describers should start their descriptions either with a narrative approach that tells "the most important" story about the image, meaning the story that the describer has chosen to best reflect the image's purpose, or with a fact-focused explanatory summary, with the most-important facts first. Either way, I recommend starting with a short description of what you are describing (e.g., a horizontal color photograph), followed by a synopsis of the image (a paragraph that provides a thoughtful overview), followed by the more in-depth description. This approach, in turn, orients the listener to what is being described and quickly shares the highlights. Then, if the listener wants more, the describer can go deep into the details, and the listener can decide at any point to drop out (because the most-important descriptions happen first). If an audience member wants to keep listening, and getting richer and deeper details, that person can choose to do that. But that person also can drop out when satisfied and still get the main gist of the image. So the structure really matters.

For the Storytelling style, which I hypothesize as the style with the most potential for creating motivating and engaging Audio Description, there has been some research (and a lot of speculation) about how mental images are formed from words and how narratives engage our minds. This type of conjuring happens all of the time, for example, in novels, in music, and on radio programs. But what about in description form, when a particular image exists in reality, and someone wants to hear about it, specifically? For the Explanatory style, the facts-first approach, the inverted-pyramid technique (in which the most important facts are provided in descending order of importance) has been used for hundreds of years for utilitarian purposes. It gets the job done. There certainly are opportunities for poetic and creative forms of Audio Description, too, that follow no template. We are working on just such an experiment with the National Endowment for the Arts and The Goldsworthy Walk in San Francisco (you can listen to those experiments now; just search "Goldsworthy" in the UniD mobile app). But, as a workhorse model, I propose that describers fundamentally connect with the long-established journalistic traditions of the 5Ws + H (Who, What, When, Where, How, and Why). I think this approach will work well in this field of Audio Description, too. But we're still testing that idea.

WWWWWH – WHO IS DOING WHAT, WHEN AND WHERE, WHY AND HOW? 

To put it into practice, for example, when the describer encounters an image of a person or people doing something (which is what most images are), the description could easily convey Who is doing What, When and Where as the starting point. I hypothesize a return loop then is warranted to unspool the Who (what does the Who look like, in more detail?) and the What (what does it look like, more specifically, when the Who does that thing?). At that point the How might come into play. Or the How can come later. But the When might need some further description (how do we know, from looking, that it is When), and the Where (again, how do we know, from looking, Where this image is)? Lastly, if the How already has been described in-depth, the description should address the Why? Why is this person doing this thing in this time period in this place? And how? I think if a describer can do all of that, in this type of orderly manner, descriptions will be easier to understand (and also to write). Such a straightforward compositional strategy works well for the writers and the listeners, as a template for creating the work and for creating expectations for what to hear.

What if the image doesn't have a person? An animal might be described with the same approach (what's its motivation?). This approach, of course, can become quite complicated by a collage of, say, a National Park ecosystem shared by people, animals, and plant life. In some scenes we have encountered, there are dozens of potential starting points and mini-narratives to tell. The key, in those cases, is to create a strategy for your approach and then carry it all of the way through (such as: I'm going to start by describing all of the things the people do in this place; then, I'm going to describe all of the animals in action; then the plant life; or in some other order, depending on what's most important in that particular place). 

A type of flower, though, would not necessarily have a motivating action to attribute (unless you are focusing on describing photosynthesis or seed spreading). Neither would an image of a piece of machinery. So for an artifact or any type of visual protagonist that does not have human or animal motivations, I suggest simply clipping out the Who (agent or actor) part of the approach and focusing instead on the What, When, and Where. What is this thing, and when and where is it? Such a contextualization process will help to render meaning and to put the artifact into its place. A How and Why also probably exist in this scenario. So those can be teased out as well.

SHARE YOUR DESCRIPTION WIDELY AND CONSIDER ALL FEEDBACK:

But what if there is no person or thing? One of the toughest challenges we have faced as describers is describing a map (check out the paper we wrote about that issue on our Research page). A map, at least theoretically, has no fact that is more important than any other and no clear narrative to tell. It does, though, have a purpose, and we recommend first identifying the purpose of the map. If you can do that, then you can probably develop a strategy to communicate that purpose. For example, maybe the map is shared to show highlights of the area to a tourist, so the description would take a "highlights" approach. Or maybe the map is designed to help a person navigate a complex area, so the description would take a "navigation" approach. Or maybe the map isn't really about highlights or navigation; instead, it really just intends to show people the way it used to be, or how something was done, with no intention of the viewer of the map walking in those footsteps. If that's the case, a cultural-history approach or a natural-history approach might be the best choice. 

Once all of that has been settled, the describer still needs to determine what comes first, second, third, etc., since an audible experience is linear while a visual experience is not. But this process should not happen alone, with a single writer dropping the description onto the world and walking away. As with any type of writing, Audio Description comes to life when it is given to its intended audience. So share your drafts with a trusted circle of advisers who are blind or who have low-vision (a group of even 5 independent reviewers can make a big difference in the quality of the descriptions). Get feedback as you go. Share what you publish widely, and open your communication channels for meaningful feedback. Also, don't just wait for it. Actively seek out feedback, with focus groups, interviews with audience members, surveys, etc. 


Last updated by: Brett Oppegaard, Oct. 1, 2021

The CURRENT UniD DESCRIPTION TEMPLATE (Established in 2021)

As a way to suggest shape for your descriptions, we have created a template for describing that goes in this order, and in this style:

DESCRIBING: [Describe the type of thing you are describing here, e.g., a small, black-and-white photograph]

SYNOPSIS: [~ 1 paragraph overview, 4 to 8 chunks of information; hit the highlights]

IN-DEPTH DESCRIPTION: [The rest of the description, if needed] 

CAPTION: [Caption goes here]

CREDIT: [Credit goes here]

RELATED TEXT: [Related text goes here]
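
If you manage many of these descriptions at once (say, drafting them in a script or spreadsheet before pasting them into UniD), it can help to treat the template as an ordered record. Here is a minimal sketch in Python, illustrative only; the field names mirror the template labels above and are not part of the UniD software itself:

from dataclasses import dataclass
from typing import Optional

@dataclass
class ComponentDescription:
    """One UniD-style description; fields mirror the template labels above."""
    describing: str                   # DESCRIBING: the medium orientation
    synopsis: str                     # SYNOPSIS: ~1 paragraph, 4 to 8 chunks of information
    in_depth: Optional[str] = None    # IN-DEPTH DESCRIPTION: only if needed
    caption: Optional[str] = None
    credit: Optional[str] = None
    related_text: Optional[str] = None

    def render(self) -> str:
        """Assemble the fields in UniD template order, skipping any left empty."""
        parts = [("DESCRIBING", self.describing),
                 ("SYNOPSIS", self.synopsis),
                 ("IN-DEPTH DESCRIPTION", self.in_depth),
                 ("CAPTION", self.caption),
                 ("CREDIT", self.credit),
                 ("RELATED TEXT", self.related_text)]
        return "\n\n".join(f"{label}: {text}" for label, text in parts if text)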


COMPONENT NAME: 

Start with the type of image, such as MAP: (we found the inclusion of MAP, TEXT, PHOTO, and the like, helps to set the stage for the listener in the Table of Contents view). This label then should include the basic information to tell the listeners what they will get by selecting this description, such as the title of the image being described (if it has one), who made it (if that seems important), the year it was created (if that seems important), and its physical location at the place (if that's relevant).

EXAMPLES (from Belmont-Paul Women's Equality National Monument):

  • IMAGES: Suffragist efforts
  • IMAGES and TEXT: Park access
  • IMAGE, QUOTE, and TEXT: Alice Paul and the National Woman's Party
  • TEXT: Definition of "suffrage"
  • MAP: Area around monument
  • CHART and TIMELINE: The path to equal rights
  • CHART and QUOTE: Percentage of women in Congress


DESCRIBING: 

How would you describe the artifact you are describing? In this order: Size (small / medium / large), Shape (horizontal / vertical / square / cut-out / oval / circle), Type (e.g., photograph, chart, or map; see hierarchy below), distinctive characteristics (like being the primary or only image on the page), and the point of view that the listener has (through what frame is this image being conveyed?). Note the color treatment only if the image is in black and white (not if it is in color).

EXAMPLES (from Desert National Wildlife Refuge):

  • DESCRIBING: A small, shield-shaped illustration.
  • DESCRIBING: A small, horizontal photograph.
  • DESCRIBING: A small, square photograph.
  • DESCRIBING: A column of small, square photographs that illustrate elements of the page's text.
  • DESCRIBING: Color photograph of a golden eagle in close-up, portrait style.
  • DESCRIBING: A medium-sized square map with a column of small, horizontal photographs beneath it.
  • DESCRIBING: A large map that spreads across two pages of the brochure.  


FOR EACH PIECE OF MEDIA BEING DESCRIBED

Choose the description style you will use:

  • UniD Storytelling Style, typically for people-oriented images: Tell the story of the image. Who is doing what (to whom?), in this image, when and where, and how and why? A visual story involves both a complication and a resolution. Can you determine both parts of the story in this image?
  • UniD Explanatory Style, typically for object-oriented images (e.g., artifacts, landscapes, maps): What is the primary purpose(s) of showing this image? What is it trying to communicate visually? When is it? Where is it? How does it work? How might someone use this image? Why is it important to be shown in this way?  

If the component has just a single type of media being described, here is the template for putting the description together (if more than one type, other examples follow):


COPY AND PASTE THIS TEMPLATE INTO YOUR COMPONENT

DESCRIBING: Describe the image being described (per the examples above)

SYNOPSIS: ~ 1 paragraph of Description goes here; present the highlights of the image, ideally in four chunks of information but not more than eight chunks of information to avoid cognitive overload

IN-DEPTH DESCRIPTION: If needed (not always necessary), the rest of the Description goes here, as a continuation of the Synopsis Description (so not saying the same things over again but starting from the synopsis and building from that structure, as if the listener selected "Hear More"); this can be as long as needed, but it also should be structured with the most important description first, second most second, and so on. 

CAPTION: Caption goes here

CREDIT: Credit goes here

RELATED TEXT: Related text goes here


AND, IF ... MULTIPLE MEDIA ITEMS OF THE SAME TYPE

If more than 1 of any of these, then signal with a label, like:

IMAGE 1 of 6 over the first one, IMAGE 2 of 6 over the second one, and so on ... 

EXAMPLE (from Ulysses S. Grant National Historic Site):

IMAGE 1 of 3: Ulysses S. Grant

DESCRIBING: A small, oval, black and white photograph. 

SYNOPSIS: An 1866 black and white oval photograph of Ulysses S. Grant. The 44-year-old Grant is shown in a studio setting, seated with his left arm resting on a table and his left leg crossed over his right. He has dark hair, trimmed above his ears, combed over and parted on his left. He has a neatly trimmed short beard and mustache and a thin-lipped serious expression. His head is turned slightly to his left, following his gaze. His right eyebrow is slightly raised. He is wearing a military frock coat with black cuffs and epaulettes denoting his rank as a General. He has on a white shirt with a small bowtie and a dark vest with a watch fob. His right arm lies across his body with his hand resting on his left knee. His open left hand reveals his wedding band on his little finger.

CAPTION: Ulysses S. Grant in 1866, about the time he received the rank of General of the US Army, “conferred by Act of Congress, and the will of the President of the United States.”

CREDIT: Library of Congress

IMAGE 2 of 3: Julia Dent Grant

DESCRIBING: A small, oval, black and white photograph. 

SYNOPSIS: An 1864 black and white oval photograph of Julia Dent Grant. The 38-year-old is shown seated on a wooden chair, turned to her right at almost a profile position, with her hands clasped in her lap. Julia's dark hair is parted in the middle, pulled tightly back, and tied into a bun. She has a prominent nose, her eyes are closed, and she is not smiling. She wears a dark-colored, closely buttoned dress, which is pulled tightly at the waist and flows freely to the floor. She has a white collar and white ruffled blouse sleeves. She has wide cuffs with two white bands surrounding a darker band. There are two designs on each upper arm consisting of those same white bands encircled by darker-colored ribbon.

CAPTION: Julia Dent Grant later recalled that this 1864 photograph “was taken by Brady in New York when I was on my first visit to N.Y. the spring that General Grant first came East.”

CREDIT: Library of Congress

IMAGE 3 of 3: The Grant family

DESCRIBING: A medium, rectangular, black and white photograph. 

SYNOPSIS: This black and white photograph of the Grant family was taken around 1866. The portrait shows the family against a washed-out background that appears to be a wall, with decorative panels across the bottom and a broad baseboard. The pose is stiff and formal, in contrast to the warm and loving relationship the family actually had, which still is evident from their positions in the photograph.

IN-DEPTH DESCRIPTION: The Grants are arranged in a row, with 11-year-old Ellen – nicknamed Nellie – standing at the far left. She is attired in an ankle-length, graph-checked dress that appears to be off her shoulders and has a full hoop skirt. The dress is belted at the waist. She wears a pair of what look to be leather shoes with cross straps at her ankles. Her hair is parted in the middle and lays flat against the side of her head. She is wearing a beaded necklace that hangs loosely around her neck. Nellie's left hand is resting gently on the left shoulder of her father, Ulysses Grant, who is seated to her right. He is wearing a Union officer's uniform that is open at the front, exposing a white shirt and a bow tie. He has crossed his right leg over the left at the knee. He wears a neatly trimmed beard and mustache and short hair. His left arm is draped around the waist of 8-year-old Jesse, the youngest of the Grant children. Jesse's dark hair is parted on the left side and falls to near his ears. He leans against his father in a relaxed pose. Jesse is wearing what may be a boy's version of a uniform, with white socks and dark shoes. The shirt has dark lines running down each side that meet at a wide belt. The pants are loose and are closed at the ankles. Jesse stands to the right of his older brother, Fred. Fred is 16 years old and is standing very straight with his right arm bent slightly across his waist, while his left arm is hanging at his side. His hair is parted on his right and is short, reaching just above the ears. He is wearing a military-type uniform, with epaulettes and a wide three-button cuff with dark trim. The jacket is open at the front with buttons on the left, revealing a white shirt underneath. Fred is wearing a pair of straight, loose trousers. To Fred's left is Julia Grant, Ulysses Grant's wife. Julia, like Ulysses, is seated. She is dressed in a full-length black dress with hoop skirt. The dress reaches her neck and ends with a small white collar. Her hands are held demurely on her lap. Like Nellie, her hair is parted in the middle and straight on the sides with a bun in the rear. The boy standing on Julia's left is Ulysses S. Grant Jr., about fourteen, more commonly known as Buck. He is also dressed in a military uniform, but his jacket is buttoned to the neck. His left arm is bent across his chest, and his right arm hangs at his side, partially hidden behind his mother. His uniform is almost an exact duplicate of the one worn by Fred. Nellie, Ulysses Grant, and Jesse appear to be looking at the camera, while Fred, Julia, and Buck are gazing to the left. 

CAPTION: The Grant family ca. 1866: Ellen “Nellie,” Ulysses, Jesse, Fred, Julia, and Ulysses Jr. “Buck.”

CREDIT: NPS


AND, IF ... MULTIPLE MEDIA ITEMS OF DIFFERENT TYPES (THE PRIORITIZED ORDER)

If multiple types of media are gathered together in a package that needs to be kept together to be understood fully, this is the hierarchy we use to stack the descriptions (as UniD style, not based on empirical study); a short sketch of that ordering follows the list:

A. COLLAGE / IMAGE(S) (photo or illustration)

B. MAP

C. TIMELINE

D. CHART

E. QUOTE

F. TEXT 
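
To make that ordering concrete, here is a minimal sketch in Python of how the stacking could be expressed as a sort; it is illustrative only (the rank table and example package are hypothetical, not UniD code):

# UniD-style stacking ranks; lower ranks are described first.
# COLLAGE and IMAGE share the top rank (level A above).
STACK_RANK = {"COLLAGE": 0, "IMAGE": 0, "MAP": 1, "TIMELINE": 2,
              "CHART": 3, "QUOTE": 4, "TEXT": 5}

def stack_order(media_type: str) -> int:
    """Return the stacking rank for one media type."""
    return STACK_RANK[media_type]

package = ["QUOTE", "IMAGE", "TEXT"]   # e.g., the Lincoln Memorial front panel below
package.sort(key=stack_order)          # -> ["IMAGE", "QUOTE", "TEXT"]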

EXAMPLE (from Lincoln Memorial): Image is described first, then the quote is added afterward

IMAGE and QUOTE: Lincoln Memorial

DESCRIBING: A large vertical photograph of the Lincoln Memorial at night covering the entire front side of the brochure, with a quote and a text block overlaying the bottom half of the image. 

SYNOPSIS: This full-color photograph shows the Lincoln Memorial at night, as seen from the reflecting pool. The evening sky behind the memorial fades from light fuchsia at the top to deep plum closer to the horizon. Greenish lights can be seen illuminating the black outlines of buildings of the city skyline in the distance. The top quarter of the page is filled mainly by the memorial itself. 

IN-DEPTH DESCRIPTION: The memorial is a white rectangular structure, designed to resemble an ancient Greek temple. It is fronted by 12 columns that bulge slightly in the middle before tapering at the top and bottom. These columns support a large marble roof with a smaller rectangular attic perched on top of it.  Engravings of eagles with their wings outstretched are connected by a carved garland of leaves, draped  across the top edge of this attic. This detail work is visible in the golden glow of spotlights cast upward from the lower roof. 

The lower rectangle of the building contains the main chamber of the memorial. The white marble is visible in the golden lights being cast down behind the columns. This divides the memorial into three sections, with the outer thirds strongly illuminated and the center third much darker. The front wall of the memorial opens behind the center four columns, revealing the illuminated statue of President Lincoln seated within. This statue is centered between the middle two columns of the memorial and is lit by the same golden light as the outside of the building. 

The monument appears to float in darkness, elevated from the water of the Reflecting Pool, a long rectangular body of water in front of the memorial. Light from the memorial reflects in the pink and orange water that takes up the bottom three-quarters of the photograph and extends from the first fold to the bottom of the image. The surface of the Reflecting Pool is gently ruffled by the wind.

CREDIT: Robert Lautman.

QUOTE:  "It is rather for us to be here dedicated to the great task remaining before us—that from these honored dead we take increased devotion to that cause for which they gave the last full measure of devotion—that we here highly resolve that these dead shall not have died in vain—that this nation under God shall have a new birth of freedom—and that government of the people, by the people, for the people shall not perish from the earth." – Abraham Lincoln, Gettysburg Address, November 19, 1863.


NOTE: Remove all document navigation directions in the texts, which are likely to cause confusion when disassociated from the document design. 

For example, in the text below, from Charles Young Buffalo Soldiers National Monument, I would remove "(above left)", "(above)" and "(right)."

Original:

RELATED TEXT:
The NAACP honored Young in 1916 (above left). The army was not supportive and kept Young out of World War I. He rode his horse (above) from Ohio to DC to prove his fitness. Instead, he was sent to Camp Grant in Illinois to train troops (right).

Edited:

RELATED TEXT:
The NAACP honored Young in 1916. The army was not supportive and kept Young out of World War I. He rode his horse from Ohio to DC to prove his fitness. Instead, he was sent to Camp Grant in Illinois to train troops.


Last updated by: Brett Oppegaard, Aug. 1, 2021

UniD Best Practice No. 1: Practice Hypermediacy (Medium Orientation)

Every description should begin with some sort of a medium orientation. In other words, descriptions should start with a quick summary of what precisely is being described, in terms of the medium, before the content of that medium is described. Otherwise, the listener likely will have difficulty understanding the description in its context. In academic contexts, such an approach is called "hypermediate." That term just means that the medium is recognized in the communication process as having a form (with affordances and constraints) and given its place in the process. Its opposite academic term is "immediacy," in which the intent of the communication process is to make the medium seem to disappear and not to be involved at all in the encoding, decoding, and interpretation processes.

We primarily work with two-dimensional imagery, including photographs, illustrations, maps, charts, etc., remediating those from visual to audible media. So in those cases, we recommend two specific orientation practices:

1. Give each description a Medium Type Name – When choosing the title for a description, lead that title with the medium type. In UniD projects, we use an all-caps type name, such as MAP: Yosemite Valley Map or IMAGE and TEXT: Abraham Lincoln Portrait. This helps to convey immediately to the listener what sort of a thing is being described before that deeper description happens.

2. Also, describe the Medium Type – At the start of each description, we give a bit of shape to the medium type by describing it. In UniD Style, we use the term DESCRIBING in all-caps, to separate the Type Name from the description. So we might write something like this:

DESCRIBING: A small, square black-and-white photograph

DESCRIPTION: A middle-aged Abraham Lincoln – wearing his iconic stovepipe hat – looks directly at the viewer in this portrait. He is shown from the shoulders up, in a black suit, with a black tie knotted tightly around his neck. His bearded face, with no mustache, indicates that the image was taken either right before his presidency or during it, because he only wore a beard when running for president or serving in that office. And so on. ...   


* UniD Best Practices are practices that we have developed in our Descriptathons, and in other research studies, that we feel confident will stand up to empirical research scrutiny (and we are conducting such research on them ourselves). If you are a practitioner, we encourage you to try them and let us know how they worked. If you are a researcher, we encourage you to test them and even try to disprove them.

The community of users who are blind, have low vision, have a print-related disability, or are auditory-oriented learners is diverse. They use different equipment based on their needs and technology skills. The UniD system allows for multiple outputs to make audio-described “unigrid” brochure content accessible. 

Each audio-described NPS unigrid brochure in this project has been added to the UniDescription mobile app, available for free on the App Store (Apple / iOS devices, https://goo.gl/zAWWj6) and Google Play (Android devices, https://goo.gl/EU9pjc).

The UniD system allows additional formats to be created (HTML5 for website integration, Mp3 audio files, and text files), for distribution on websites, social media, or person-to-person sharing, based on the user’s needs and available tools. These distribution formats are intended to cover all use-case scenarios involving park visitors who are blind, visually impaired, print dyslexic, or audio-oriented learners. 

Providing UniD audio described content on your organization's website is highly recommended. Two examples of how the National Park Service does this are: 

National Park Service employees may go to the Accessibility and Section 508 UniDescription Project page within the National Park Service’s Digital Community site for more instructions.

"Why audio description? 

In a society broadly shifting toward visual media, those who are blind or visually impaired are at risk of being excluded from socially and culturally important discourses, including access to primary sources of education and entertainment, such as national parks. This long-term research project addresses that issue by building audio description resources as well as accessible mobile apps for national parks."

UniDescription Tools: How do I use the basic functions and features of UniD?

This section describes how the basic functions and features work.

The UniDescription Project's website (www.unidescription.org) is an open-access and open-source resource for learning about Audio Description. Audio Description is the remediation of visual media into audible media, primarily for the benefit of people who are blind or visually impaired, but this translation process also can be useful for people who are print dyslexic or audio-oriented learners. If you are familiar with Captioning for people who are deaf or hard of hearing, and Sign Language Interpretation, Audio Description is an equivalent process of ensuring accessibility for people who cannot see or see well. To listen to our descriptions, just download our free apps for Android or iOS.

This UniD site offers an array of helpful production tools, which are open-access and open-source. In other words, anyone can create a free account and just start building Audio Description with our easy-to-use interface. Besides online help in using those tools, this site also offers open-access training on Audio Description genres (such as describing portraits or maps) as well as scholarly resources about this field, and helpful project-management tools.

To learn more about the UniD project, this site has seven main informational areas, accessible through the navigation bar at the top of every page. 

Those are:

  • Research: A storehouse for our team's scholarly work on this subject, including academic papers and posters.
  • Impact: A documentation of the real-world impacts of our work, including links to descriptions by place. 
  • Academy: A place to learn more about Audio Description and also how to use this UniD site.
  • Library: A storehouse for laws, industry standards, best practices, and other guiding documents.
  • Descriptathon: An overview of our Descriptathon training process.
  • Backstory: A collection of the stories about us, where we started, and what we've done.
  • and, About: A list of benefactors, team members, and volunteers.

UniD is open-access, meaning it is free to use (supported entirely by grant funds). To create your free account, select the "Register" link in the top navigation bar, at the far right of that bar, or, via a screenreader, at the link list's end on that bar.

The Register and Sign In processes both share this page, but you will only need to register once. Therefore, Sign In is the first set of prompts on the page, and Register is the second set of prompts, just below Sign In. If you are creating your account for the first time, you will need to register through your email. 



To do that, skip over the Sign In section, and add your full name under the Register label. This is your display name when working on the UniD project, so it should be something that you and your potential collaborators easily can recognize.

After entering your display name, there is an input box for your email, followed by an input box for your password. Enter your email carefully, because that address will receive automated messages from UniD about the use of your account and projects. The password strength is up to you, but we suggest following typical guidance about the length and diversity of the password characters. Once you are registered and signed in, you can use UniD freely.

Our Privacy Policy describes how we protect your data, but, in short, 

  • We take your data rights seriously.
  • It is your data, now and always. You are sharing it with us. We make no claim to it.
  • By using this site and sharing data, you are making the world a more-accessible place.
  • You can keep making media more accessible to more people here for as long as you want; we thank you!
  • We will never sell your data.

If you forget your password, look for the "Forgot Password" link under the "Sign In" button on the Sign In page. The link to the Sign In page is in the upper right-hand corner of this website, in the navigation bar. When you are added to a project, UniD automatically will create an account for you. It sends you an email to help you log in for the first time. So you also might want to look for that email by searching for "unidescription" in your spam folder. If you try to register or sign in, and UniD says you already have an account, you probably were added to a project by a colleague already. All you have to do to get into the project is just select the "Forgot Password" link to recover your account access.

To customize your appearance on UniD, select the "My Account" link in the upper-right corner of the site, in the navigation bar, after you have signed in. The drop-down menu will offer a "Settings" option. On that Account/Settings page, just follow the prompts.

After signing in, you can select the "Projects" drop-down menu in the upper-right corner of the screen, in the navigation bar. Then choose the "Create New Project" drop-down menu link.

Open your Project. Select the Backstage tab. Scroll down until you find the Share Project tool. Add the email of your colleague to the box and select the Share button. That will create an account for your colleague, based on the email you provide, and it also will send an email to your colleague to let that person know about the account. If that person cannot find the UniD email, then recheck that the right email was provided (and that the person is checking the right account). Also, the receiver of the email can search for "unidescription" in the spam folder, which also might be where it ended up.

If you are a part of a larger app project (such as the U.S. National Park Service's UniD app), then we will add your project directly into that app when you tell us it's ready. Even if the descriptions are shared via that project app, you also should consider adding them to your own website (via the HTML export). You also might have a patron who wants the Mp3s or Text files directly. That's also fine to share, and our tools allow you to export and share freely. Again, this is your (and your audience's) content. Give it away as liberally as possible. We want as many people to hear it as possible.

UniDescription Tools: How do I use the Backstage on UniD?

This section describes how the Backstage works.

The online UniD toolkit has three distinct views of your project and its contents: Backstage, Frontstage, and Component perspectives. The Backstage, the first of those tabs, is used for the administrative and distribution tasks. The public does not have access to any of the information in this view, except the Project Name, which can be changed here. All of the screenshots below either will show directly what is being explained in text, or, if any additional visual information is provided, it will be audio-described as a part of the explanatory text. 

So, again, the first tool available on the Backstage is the Project Name tool, which allows the user to change the name of the project. This name is also the label that appears at the top of all expressions of the content, including in the web version of the project and in the mobile-app version. So this Project Name should be considered the first text your user will read. This first screenshot shows the tabs that control the view of the project, with the Backstage tab selected. The selected tab is always green, while the unselected tabs are always brown. The third tab shown in this screenshot, called Phonetics, will be covered separately in The UniD Academy. The Component view does not have a tab visible unless a specific component is selected. At that point, there are four tabs on this page, with the Component tab labeled as the name of the Component being edited.

TOOL: Project Name


Screenshot 1: The Project Name tool, with three tabs above it, Backstage (active), Frontstage, and Phonetics.


TOOL: Version / Version Notes

Right below the Project Name tool is an open text box tool called Version / Version Notes. This box is designed to hold whatever text the team thinks will be useful, including version numbers, notes about the content, notes about the participants, etc.

Screenshot: The Version / Version Notes tool is shown as a text box with a clip of text about the Descriptathon project within it.

TOOL: Language

The Backstage tool titled "Language" allows the user to select the native language of the description, from English, Spanish, German, French, Italian, etc. ... It's just a drop-down menu.

Screenshot: The Language drop-down menu.

TOOL: Geolocation Tag

On the Backstage, the Geolocation Tag connects a GPS coordinate with the overall project, to allow the mobile app to sort by distance (what site is closest to the user) and also to alert the user when that person is within the site's boundaries (by vibrating the phone and putting the site description directly on the mobile-app screen, as long as the app is open at the time). The tool uses Decimal Degrees (DD) to connect the information with place, so users will need to convert their coordinates to Decimal Degrees. Alternatively, the user can simply stand at the place with a smartphone and this page open on the project, and select the Get My Coordinates button, which then will grab the user's coordinates and connect those with the project.

  Screenshot: This tool has two text boxes, one for latitude and one for longitude, plus a brown button that reads "Get my coordinates."
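
If your source coordinates are written in degrees, minutes, and seconds rather than Decimal Degrees, the conversion is simple arithmetic: degrees, plus minutes divided by 60, plus seconds divided by 3,600, negated for the Southern and Western hemispheres. A minimal sketch in Python (illustrative only; not part of the UniD interface, and the example coordinates are hypothetical):

def dms_to_decimal(degrees: float, minutes: float, seconds: float, hemisphere: str) -> float:
    """Convert degrees/minutes/seconds to Decimal Degrees (DD).
    Southern (S) and Western (W) hemisphere values are negative."""
    dd = degrees + minutes / 60 + seconds / 3600
    return -dd if hemisphere in ("S", "W") else dd

# Example (hypothetical coordinates near Honolulu):
lat = dms_to_decimal(21, 18, 25, "N")    # ~21.3069
lon = dms_to_decimal(157, 51, 30, "W")   # ~-157.8583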

TOOL: Project Photo 

The Project Photo uploaded here via this tool will be used inside the UniD system only, to give a visual thumbnail of what is being described, on your projects list.

Screenshot: The Project Photo tool has recommended specs (300 DPI, 1920x1280 max), plus a Choose File button to upload the photo. To replace this photo, once uploaded, the user is instructed to click Choose File again.

TOOL: Project Assets

Users can store important documents related to the project here, as a way to keep them centrally available to your team.

Screenshot: The Project Assets tool has a "Choose Files" button, for uploading documents and other related files.

TOOL: Exporting

The Exporting tool is at the top of the right-hand column of tools.

It first has text that states: "When your project is completed, you can preview the app by clicking on the Preview App button below. Only sections marked as completed will be in the export. When you're ready to upload the app to the app store, click on the Create App button." Underneath that text, the tool lists the Active Components (those on the Frontstage marked as Complete and ready for public view) and the Inactive Components, which are not marked as complete on the Frontstage, meaning they are still in drafting mode.

Underneath that text and component status are six buttons that allow UniD projects to be saved as well as exported to a unique URL ("Preview App"), to text ("Download Text Export"), to Mp3s ("Download Audio Only"), or to an HTML snippet ("Download HTML Export"). The sixth button adds the project to the UniD online game called Describe It.

Screenshot: The Exporting tool has six buttons, allowing the user to save, add it to Describe It, or export it as a URL, text, Mp3s, or HTML.

TOOL: Change Owner / Share Project

Below the exporting options, the Change Owner and Share Project tools are available, allowing a team environment to be created and a project to be shared among many people, with an option to allow each user either to view or to edit the project.

Screenshot: Both the Change Owner and Share Project tools are shown as text boxes, where an email is inserted, followed by a button to initiate the action.

UniDescription Tools: How do I use the Frontstage and Component views on UniD?

This section explains how the Frontstage and Component views work on UniD.

This screencast helps to conceptually explain the Frontstage index and the Frontstage component-level views, as compared to the Backstage view.

This screencast explains how the Table of Contents works. 

When you select an item on the Table of Contents, UniD will open a Component-level view of that item, creating a fourth tab near the top of the screen. In this screenshot, that fourth tab reads "TBADescriptathon 8: Prizes at Stake." The first three tabs are Backstage, Frontstage, and Phonetics.

Screenshot: Shows the top of the Component-level pageview. It has the four tabs, described above, plus a horizontal bar in light blue that reads "You can edit this page," and a Navigation tool that allows the user to move to the next or previous Component without exiting this pageview.

Inside this Component-level view, there is the Component Label tool, which is the name you gave this Component when you created it. So this is where you can change the name.

Screenshot: Shows a text box labeled "Component." Inside the box is text that serves as the label for the Component, "TBADescriptathon 8: Prizes at Stake," plus a Play button, a Download (Mp3) button, and an Upload Mp3 button. Uploading an Mp3 to this tool will replace the machine-voiced description generated automatically by UniD with the recording you have uploaded (typically that is a human performance of the same text).

The other main text-entry box on this page is where your main description goes. 

Screenshot: This textbox has the same functionality as the Component Label box, only this box is much larger, and it is expected to hold the bulk of the Component's description.

This screencast explains how to attach and publish (or not) a photo related to the description.

On the Backstage tab of every project, there is a tool called the "Geolocation Tag," the text of which is shown in the following screenshot (and all of which also will be explained in the text that follows).

With that tool, at the project level, you can geotag your project to allow anyone with the app open to find it simply by being in the place you have tagged (the app will vibrate the phone and bring this project to the screen).

To use the Geolocation Tag tool, you can follow the provided text directions, which are:

"This GPS coordinate is used to display how far away a person is from this location while in the UniD app. You can click the map marker icon above to get your current GPS coordinates or fill them in manually below."

And then either add the Decimal Degrees Latitude and Longitude numbers, or select the Get my coordinates button to grab the coordinates from where you are standing at that moment (with your phone or your computer). 

With that tool, at the component level (geotagging an individual description), the describer navigates to the component-level view (where the descriptions are written) and locates the Geolocation Tag tool again, with the screenshot below showing the textual interface:

This part of the interface has the same basic functions, with the text reading:

"GPS coordinates are used in UniD apps to create a locational trigger for users, in which a device (with the app open and permissions allowed) will vibrate and present particular content in a particular place, within the radius selected. Designers can either input the coordinates remotely or go to the place, open the UniD design tool on a smartphone or tablet, and select the geolocation button, which will grab the current coordinates and link those to the UniD component."

There is also a text box here for the Radius, which allows more tailoring of the experience. 
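
Conceptually, the Radius defines a circular trigger zone around the tagged coordinate: the app compares the device's current position with the tag and fires when the distance falls within the radius. Here is a minimal sketch in Python of that kind of check, using the standard haversine great-circle formula; this is our sketch of the general technique, not UniD's actual implementation:

import math

def within_radius(user_lat, user_lon, tag_lat, tag_lon, radius_m):
    """Haversine great-circle check: is the user inside the trigger zone (in meters)?"""
    R = 6371000.0  # mean Earth radius in meters
    phi1, phi2 = math.radians(user_lat), math.radians(tag_lat)
    dphi = math.radians(tag_lat - user_lat)
    dlam = math.radians(tag_lon - user_lon)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    distance = 2 * R * math.asin(math.sqrt(a))
    return distance <= radius_m

# e.g., trigger a component when a visitor is within 50 meters of the tag:
# within_radius(21.3069, -157.8583, 21.3070, -157.8584, 50) -> True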

The UniD system functions upon a few foundational principles, such as: 

1. As an acoustic medium, Audio Description is experienced in a linear fashion. So it should be designed as a linear experience. But that linear experience should be customizable.

2. Audio Description – as a remediation of media into a different medium (visual content remade in an audible medium) – needs to clearly state what the source material was before describing it. In other words, Audio Description needs to be hypermediate in its approach, describing the medium first before the contents of the media, as a form of orientation.

3. This hypermediate labeling should be included in the titles of all descriptions (the components) and at the beginning of the description (we use the all-caps label DESCRIBING to indicate we are focused on the medium). That way, the audience knows that the description refers to, say, a photograph and not a painting, and it is clear that the description is of the media artifact, not of the inspiration for the media. In other words, we are describing a photograph of a horse, not the horse itself.

4. The Frontstage allows content to be separated into smaller, more-digestible pieces or combined to make more-holistic media packages, depending on what the designer thinks will be the best experience. It also allows for the content to be sorted in different ways, including nested in groups, to create the most-cohesive and comprehensible route through the material. 

The Frontstage view is selected through the tabs near the top of the project page. There is a search bar directly underneath it, to allow for searching through the Table of Contents.

Screenshot: Three primary UniD tabs are shown, with the Backstage tab first, in brown, as in not selected. The second tab is Frontstage (active), in green, and the third tab is Phonetics. When a Component is opened, a fourth tab also appears in this place, as a way to toggle among views.

Underneath the tabs is the Table of Contents list. Its entries each represent a Component that contains description.

  

Screenshot: The Table of Contents view shows rectangular boxes that frame symbols and text. On the far left of the Table of Contents is an icon of three stacked lines (some people call it the "hamburger" icon), which allows for the Component to be moved up or down in the Table but also into nested positions, up to three levels deep. To the right of the Component name is a rounded square icon, either blank or filled in with a checkmark. If marked with a checkmark, then the Component is live to the public and available for viewing. If there is no checkmark, then the Component still is in draft stage. To the right of the checkmark box is a red X icon that allows the user to delete the Component, if so desired.

If a Component is deleted accidentally, there is a small tool at the bottom of the Table of Contents that allows the user to check a box that reads "Show Deleted Items," and the deleted Components will be shown in their archive, where they can be restored. 

Screenshot: This text, at the bottom of the Table of Contents, reads "Frontstage: Show Deleted Items," with a small checkbox that can be selected. If selected, the user can restore the deleted components.

TOOL: Project Progress

At the top right of the Project's Frontstage interface, there is a Project Progress bar. It simply counts the number of components on the Frontstage compared with the number of components marked as complete and informs the user how many of those components are left to finish.

Screenshot: This tool shows the text Project Progress and a horizontal gray bar that is only partially filled in, with green, stating 14, as in 14 percent of the components in this project have been marked as complete.
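
The math behind the bar is just a percentage. As a minimal sketch in Python (a hypothetical function, not UniD's code):

def project_progress(total_components: int, completed: int) -> int:
    """Percentage of Frontstage components marked as complete."""
    return round(100 * completed / total_components) if total_components else 0

# e.g., project_progress(50, 7) -> 14, as in the screenshot above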

TOOL: Enter a New Component Label

If you want to make a new Component, or even if you just want to make a new label over a set of components, you can navigate to the bottom of the Table of Contents stack, where you will find this New Component tool. It is simply a text box with an "Add" button that allows you to create a new piece of content. This content can be used as a top-level label for a group of Components or for any individual Component.

Screenshot: A text box with a green Add button. Adding text into the box and pressing the button will make a new Component.

UniDescription Tools: How do I use the Phonetic tool to improve the machine voice?

There are two basic options: Adjusting the phonetics of the machine voice or replacing the machine voice with a human voice. 

One of the biggest obstacles to making and sharing Audio Description is the performance aspect. Who will give it voice? The UniD project allows for (and encourages) human performances to be recorded and uploaded to the system, but, if that is not possible or practical, we also offer a machine-voiced option. In short, users of the UniD system can type in their descriptions on their projects and immediately have them voiced by an open-source machine processor. This will allow for instantaneous playback and an option for creating Mp3s of the voiced descriptions. But it's also not a perfect option, because the open-source machine voice at times has trouble with some words.

In response to that issue, the UniD system has a Phonetic tool that allows the user to change the way the description sounds in the machine voice without altering the original text. To use that system, within your project, just select the Phonetic tab, type in the word that is not being voiced properly, and then, in the second box, type in how it should sound. You can play around with the phonetic text to make it work for you. Your audience will not have direct access to the Phonetic text, so you can adjust however you need to, and this project-level adjustment will not affect other projects you or other users have in place.

In the UniD system, the text you type into your project becomes the sound your audiences hear. You have the option of uploading a human-voiced version (a different Tool Training covers that option). But you also have the option of having our open-source machine voice processor convert your text into an audible Mp3 file. That's all done automatically any time you use the system. But, what if the machine is not pronouncing a word properly, and you want to change it? That's where our Phonetic tool steps in and takes over.

The Phonetics tab is near the top of each project page (so just open your project, and it is there). UniD projects are viewed from the Backstage (first tab), Frontstage (second tab), and Phonetics (third tab), and if you have a particular component open, there will be a fourth tab that lists the title of that component. To use the Phonetic Tool, just select the Phonetics tab, among the others. 


Underneath the tab, once selected, is the following text: "The words in the phonetic library for this project will automatically substitute words found in the primary text of a component, unless that component is already overridden with custom phonetics."

What that means is that if you are hearing the machine voice mispronounce a word, you can just open the Phonetics Tool and select the ADD NEW WORD button, which is directly above the list of words already being adjusted on this project.

The Add New Word dialogue box that opens next has five options. 

1. Word. This is the text box where the original troublesome word should be typed in; the text needs to exactly match the word you want to replace phonetically.

2. Phonetic. This is the text box where you add the text as you want it to sound. Often, just adding a hyphen or a space between syllables can help the machine voice to figure out what to do. 

3. Play. This button lets you test the new phonetic version to make sure it sounds right.

4. Cancel. If you want to abandon this effort, you can cancel out of it.

5. Yes, add phonetics. Once you have the original word in place, and the phonetic replacement, and you've listened to make sure it is coming out like you want, you can add this phonetic override to your project, and in all cases where the original word is found in the project text, it will be replaced phonetically by the new version you created here.

Once you have changed a word, or words, you always can go back and edit and adjust these phonetics through the list of your contributions on the Phonetic Tool home page.
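
Under the hood, an override of this kind amounts to a whole-word find-and-replace applied to the component text just before it is voiced. Here is a minimal sketch in Python of that idea; the example word and phonetic spelling are hypothetical, and this is not UniD's actual code:

import re

def apply_phonetics(text: str, phonetic_library: dict) -> str:
    """Swap each troublesome word for its phonetic spelling before voicing.
    Matches whole words only, so substrings of longer words are untouched."""
    for word, phonetic in phonetic_library.items():
        text = re.sub(rf"\b{re.escape(word)}\b", phonetic, text)
    return text

speech_text = apply_phonetics("Welcome to Oahu.", {"Oahu": "Oh-ah-hoo"})
# -> "Welcome to Oh-ah-hoo."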


 

Most users of UniD prefer the machine voice for practical reasons (it's much quicker to produce and easier to maintain). But Audio Description audiences also enjoy human-voiced performances. If you want to convert your UniD script into a human-voiced version, all you have to do is record the performance of the script in separate MP3 files for each component (so that each component's text, including the title, is a single MP3 file). With that file, you can override and replace any machine-voiced version by selecting the Component and then the Upload option at the end of the Description line. That area of the interface has three buttons. The first is a Play button, so you can listen to the description you have written. The second is a Download button (recognizable by a down arrow), which allows the user to download that particular component in a machine-voiced version. The third button, which looks just like the Download button, only with the arrow pointing up, is the Upload option:

By selecting to Upload, your provided file will replace the machine-voiced audio on that component. The text in the component will remain the same.
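Conceptually, this override is a simple fallback rule, sketched below in Python as a hypothetical illustration (the names are ours, not the actual UniD code): if a human-voiced MP3 has been uploaded for a component, it is served; otherwise, the machine voice renders the component's text.

from dataclasses import dataclass
from typing import Optional

def synthesize_mp3(text: str) -> str:
    # Stand-in for the open-source text-to-speech processor.
    return f"tts://{abs(hash(text))}.mp3"

@dataclass
class BrochureComponent:
    title: str
    description_text: str
    human_voiced_mp3: Optional[str] = None  # set by the Upload button

    def audio_source(self) -> str:
        # An uploaded human performance replaces the machine voice;
        # the component's text stays the same either way.
        if self.human_voiced_mp3:
            return self.human_voiced_mp3
        return synthesize_mp3(self.description_text)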

UniDescription Tools: How do I describe or judge during a Descriptathon?

These are walk-through descriptions of the basic activities in a Descriptathon. 

During our Descriptathon workshops, we practice Audio Description through a rapid-prototyping exercise called a "Practice Description."

Step 1 of this process is to locate the hyperlink in your To Do list that relates to the exercise. It will be called something like this: "Descriptathon 7: Round 1, A Portrait (The Practice Description)." The key details in that link are the Round (in this case 1), the topic (Portrait), and that this is a "Practice Description" (formerly known as a "Quick Descript") and not a "Challenge."

This first image is a screenshot of that searchable text, with no images, which shows the texts "My To-Do List," "Welcome back, Brett!," "Show the To-Do List Only," and "To Do - Highly Recommended (in bold)," followed by the link you want to locate and select. Once you select that link, you will be given a binary choice about being the writer. If you are just viewing, select "I am not the writer for this round." If you plan to add text, select "I am the writer for this round."

Step 2 is to choose whether you are the "writer" or not for this activity. Only one person on your team should choose to be the writer (so decide as a team who will take that role before the activity begins). Being the writer allows that person to edit the text. If you choose not to be the writer, you will be in view-only mode. Only one person can write in this box at a time, so choosing the writer role takes over that role for your team (even if someone else already is doing it).
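In technical terms, the writer role works like a single, transferable lock per team, where the last claim wins. Here is a minimal sketch of that rule in Python (a hypothetical illustration, not the actual UniD code):

class DescriptionRound:
    def __init__(self) -> None:
        self.writer_by_team: dict[str, str] = {}

    def claim_writer(self, team: str, user: str) -> None:
        # Claiming the role takes it over, even if a teammate holds it.
        self.writer_by_team[team] = user

    def can_edit(self, team: str, user: str) -> bool:
        # Everyone except the current writer is in view-only mode.
        return self.writer_by_team.get(team) == user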

Before writing as a team, we ask that people write their own Practice Description individually, too. That way you bring ideas and models to your team discussion. For people who are blind or who have low-vision, we ask that for your Practice Description, instead of writing a description, you represent the intended audience and create a list of what you want to hear when someone tells you that they are going to describe something, like a cultural artifact (e.g., the JFK bust at The Kennedy Center). Then, when you come together with your team to talk about the Practice Description, you have your checklist of what would make a great description, and you can compare it to the descriptions your teammates are writing. In the Practice Description, for the individual contribution, each person brings either a description or a list of questions or expectations for the description. Bringing all of that together typically makes for a great discussion and submission.

This next image is a screenshot that shows the two buttons for this step of choosing to be a "writer" or not. One button, which is dark brown, says "I am not the writer for this round." Selecting that button will give you view-only access to the description. The other button is green, and it reads "I am the writer for this round." By selecting that green button, you will get read-and-write access to the description. 

Step 3 happens after you select whether you are the writer or not. With either selection, you end up on the same screen. But if you chose to be the writer, you can edit the text. If you are not the writer, you are in view-only mode. On this screen, the image that is being described will be shared, and the Component Description box will be available. When you start a Practice Description, there will be sample text to help your team shape the description. It provides a prompt for "DESCRIBING," which is intended as a place to briefly describe the image being described, as an artifact. An example of that sort of text is: A vertical black-and-white photograph. This should be only a sentence or two, focused only on the medium, not the content of the medium. Next in the text is the "DESCRIPTION:" label, which is where the team's description should go (replacing the text that reads "Description of the image goes here."). The Caption and Credit are provided as text from the source material. Those should not be altered.

The main difference in the process between the Practice Description and the Challenge is that during the Practice Description, this submission box is used as a place to gather notes and compare descriptions. In the Challenge, this is the final text that will be judged in the tourney contest, so only the final description that you want judged should be in this box when you mark it as complete. When you mark the Challenge as complete, it will be forwarded to the judge team for feedback. During the Practice Description, this is more of an online workspace for your team. You can leave questions, notes, etc., in the box, as a way to outline your process and prompt discussions among team members.

This next image shows a vertical photograph of Abraham Lincoln next to a text box labeled "Component Description," which is where the group's description should go. 

Underneath the image and Component Description box are three green buttons for the writer to use. "Save" saves the current state of the text. "Save & Return to Descriptathon home page" saves the work for the moment, but it does not submit the work as complete, for feedback, discussion, or judging. When your team is ready to complete this description and submit it for feedback, discussion, or judging, the writer should select the third button, "Mark as Complete," which forwards the description in the backend system to the admin team and the judges. If you accidentally mark the description as complete, and want to reverse that decision, you can just return to this page and select the new button that reads "Mark as Not Done," and the Practice Description will return to the To Do part of the To-Do List, rather than the Done stack.

During our Descriptathon workshops, we practice Audio Description through a hackathon-inspired activity called a "Challenge."

Step 1 of this process is to locate the hyperlink in your To Do list that relates to the exercise. It will be called something like this: "Describe - The Coconut Express (D7) - The Opener (Round 1) - Ash Meadows (9) NWR vs Cuyahoga NP (10)." The key details in that link are that the activity is to "Describe," (not to Judge), the Round (in this case 1), and the teams involved ("Ash Meadows" and "Cuyahoga"). 

This first image is a screenshot of that searchable text, with no images, which follows the texts "My To-Do List," "Welcome back, Brett!," "Show the To-Do List Only," and "To Do - Highly Recommended (in bold)," followed by the link you want to locate and select, starting with "Describe."

Step 2 is to choose whether you are the "writer" or not for this activity. Only one person on your team should choose to be the writer (so decide as a team who will take that role before the activity begins). Being the writer allows that person to edit the text. If you choose not to be the writer, you will be in view-only mode. Only one person can write in this box at a time, so choosing the writer role takes over that role for your team (even if someone else already is doing it).

This next image is a screenshot that shows the two buttons for this step. One button, which is dark brown, says "I am not the writer for this round." Selecting that button will give you view-only access to the description. The other button is green, and it reads "I am the writer for this round." By selecting that green button, you will get read-and-write access to the description. 

Step 3 happens after you select whether you are the writer or not. With either selection, you end up on the same screen. But if you chose to be the writer, you can edit the text. If you are not the writer, you are in view-only mode. On this screen, the image that is being described will be shared, and the Component Description box will be available. When you start a Challenge, there will be sample text to help your team shape the description. It provides a prompt for "DESCRIBING," which is intended as a place to briefly describe the image being described, as an artifact. An example of that sort of text is: A vertical black-and-white photograph. This should be only a sentence or two, focused only on the medium, not the content of the medium. Next in the text is the "DESCRIPTION:" label, which is where the team's description should go (replacing the text that reads "Description of the image goes here."). The Caption and Credit are provided as text from the source material. Those should not be altered.

The main difference in the process between the Practice Description (formerly the "Quick Descript") and the Challenge is that during the Practice Description, this submission box is used as a place to gather notes and compare descriptions. In the Challenge, this is the text that will be judged in the tourney contest, so only the final description that you want judged should be in this box when you mark it as complete. When you mark the Challenge as complete, it will be forwarded to the judge team for feedback. During the Practice Description, this is just an online workspace for your team.

This next image shows a vertical photograph of Abraham Lincoln next to a text box labeled "Component Description," which is where the group's description should go. 

Underneath the image and Component Description box are three green buttons for the writer to use. "Save" saves the current state of the text. "Save & Return to Descriptathon home page" saves the work for the moment, but it does not submit the work as complete, for feedback, discussion, or judging. When your team is ready to complete this description and submit it for feedback, discussion, or judging, the writer should select the third button, "Mark as Complete," which forwards the description in the backend system to the admin team and the judges. If you accidentally mark the description as complete, and want to reverse that decision, you can just return to this page and select the new button that reads "Mark as Not Done," and the Challenge will return to the To Do part of the To-Do List, rather than the Done stack.

After the writing portion of the Challenge round is over, your team next has the chance to judge and provide feedback on the descriptions created during that Challenge round.

To complete that judging, Step 1: Sign In, and open your Descriptathon Home Page. Navigate to the To Do List. Find a link that starts with "Judge" and reads something like this, "Judge - The Coconut Express (D7) - The Opener (Round 1) - Hot Springs NP (11) vs. Sleeping Bear Dunes NL (12)." The key details in that link are that you are being asked to "Judge," the Round (in this case 1), and that two teams are competing in this round. In some rounds, there are as many as eight teams competing.

This first image is a screenshot of that searchable text, with no images, which comes just after the texts: "My To-Do List," "Welcome back!," and "Show the To-Do List Only." The first line is: "To Do - Highly Recommended (in bold)," followed by the link you want to locate and select.


After that link is selected, you will be taken to a new Table of Contents, for the interface of the judging page. This next image is a screenshot of that searchable text, with no images within it. The text reads: "Table of Contents" and "Overview," followed by jump links to the descriptions being compared and judged against each other on this page (in this case, Hot Springs vs. Sleeping Bear Dunes). The first links (two in this case, but there could be more), are labeled with "Description" as the first word on the line, and they take you directly to each description that you are being asked to read (one link for each description). It might be helpful to return to the Table of Contents after using each link, to assist with orientation. 

The next section of links is for providing the feedback. The first set of questions asks: Which of those descriptions did you prefer? And why? By picking one, you are picking the "winner" of this round (your vote will be counted among the others on your judging team to determine the winner of the round). Answering the why, though, is really a chance to help set the agenda for the group discussions. Write as much of the "Why?" as you can in the time allowed, because that qualitative feedback often opens up broader discussions that carry even outside of the Descriptathon and can be some of the most valuable moments of this workshop.

The last section of feedback asks you to respond to roughly five "radio buttons," typically asking for a range of responses to a question, e.g., from strongly agree to strongly disagree. This section also will ask a single demographic question, about your level of visual acuity, which is intended to preserve your anonymity while also gathering important research data about Audio Description preferences among and between people who have been blind from birth, people who have become blind later in life, and people who are visually impaired. The image below is a screenshot of searchable text you can find on the judging page.
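For those curious about the mechanics of the voting, a reasonable way to picture the tally is that each judge's preference counts as one vote, and the description with the most votes wins the round. A minimal Python sketch, under that assumption (the team names are just examples from above):

from collections import Counter

# One preference vote per judge on the panel.
votes = ["Hot Springs", "Sleeping Bear Dunes", "Hot Springs"]
winner, count = Counter(votes).most_common(1)[0]
print(f"{winner} wins the round with {count} of {len(votes)} votes")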

Learn About Audio Description Production: How to Set Up a Brochure Project

This is the basic process for setting up a UniD project to audio describe a printed brochure.

To audio describe a brochure – like the ubiquitous Unigrid brochures at most U.S. National Park Service sites – the first step is to get a physical copy of the artifact (the brochure), to examine it carefully, and to start to deconstruct its contents as the source material for its eventual audio remediation. Make sure you have the most up-to-date version, including the digital versions of the texts (to ensure exact replication of those texts). Think about why the designer chose this particular imagery and put it in these particular places. In other words, look carefully at the brochure and think about what the designer of the brochure wanted to communicate to the site visitor and for what purposes. That's the same basic effect you want to create with the Audio Description of the brochure's components, only with a specific aim at people who are blind, visually impaired, print dyslexic, or audio-oriented learners. 

This is a design process and not a rigid transcription-only process. So there are many design choices that the audio describer will have to make, which might be made differently by a different designer. That's OK. It's not really any different procedurally than the design choices made by the designers of the original brochure. Someone had to choose what to say and how to say it. The primary difference in this context is that Audio Description is a visual-to-audible remediation of the brochure content, and the brochure content originally was filtered from a much larger set of source materials. 

As a metaphor, think of the brochure as a distillation of everything that could possibly be communicated about the site, and the Audio Description is a further filtration process, only with as little informational loss as possible. 

With Audio Description, the source material strictly is the brochure (not the vast original set of source materials that the brochure designer considered and discarded). With that brochure as the sole source material, the audio describer's role is to reimagine that brochure, as is, only as being heard not seen. The job of the audio describer is not to remake the brochure. The job is to make the existing brochure more accessible, especially to people who are blind or visually impaired.

This online training session will address some of the potential design choices that must be made. 

For this training, we will use a couple of different brochures as models.

Example 1: Lincoln Boyhood Home

Here is its Side 1:

And here is its Side 2:

Example 2: Santa Monica Mountains National Recreation Area

Here is its Side 1:

Here is its Side 2:

In addition, I have these paper brochures in my hand to match with the electronic versions. And if you have yours (the one you want to describe, at least the e-version), you are ready to move to Step 2.

This step focuses on the deconstruction of the brochure into describable components. Now that you have the brochure in hand, including its digital versions, and can look carefully at how it is designed, you can begin to make design decisions necessary to turn this piece of paper into an audible experience. What we are doing in this step is deciding what parts get described together, in the same audio-file component, and what parts get described separately. 

First important point: Any decision you make now in UniD always can be changed later (and fairly easily) in our system. You might, at first glance, for example, think the descriptions of Lincoln's parents should be combined into a single component:

But then later, you think, those really should be described separately, in separate components, for whatever reason. In the UniD system (per other training modules on this site, about how to use the Frontstage tools), such combinations or separations are easy to make. Your structure likely will change and evolve as you write your descriptions and get more deeply familiar with the content. What you say about Lincoln's parents in one component, for example, will affect what you describe in another. So my recommendation for the setup is to make the decisions that you think are best but also be prepared to update those decisions as you go.

Second important point: A physical brochure is helpful because it simulates the experience of the user. As the audio describer, you can hold the brochure, as it is folded, flip it over, and unfold it into sections, allowing those folds to help guide your decisions.

A few more examples:

On this cover of the Lincoln Boyhood Home brochure, what first appears to be just one element to describe (the main photograph) actually is at least six distinct elements that need to be either combined or separated.

Here is what I see on the page:

1. The black bar. This is a distinctive design element of NPS brochures using the Unigrid style. They all have this black bar, and that's not just negative space. It signals to the viewer that this is an NPS-branded document. It can be described as a visual element, either alone or as a container for other parts listed below in 2-4. The describer needs to decide: Should it stand alone as a component, or be described with 2, 3, or 4 below (and maybe even 5 and 6), or all of the above?

2. The NPS Arrowhead. One of the most well-known logos in U.S. history, this arrowhead also is part of the branding, communicating to the viewer that this document has the NPS stamp of approval on it. Look closely at the logo and its elements, which represent the scope of the NPS. 

3. and 4. This is standard branding text, in the same Helvetica typeface used in all Unigrid brochures. Our recommendation always is to transcribe all text on the brochure, so the listener can hear everything that was written there. That text then is part of the description, but a describer also could choose to go a step further and, for example, describe the typeface in more depth.

As in, according to the NPS, Helvetica was chosen for this design because:

"it has crisp, clean details and typographic texture that make it esthetically appealing and easy to read. It has a clearly defined hierarchy of sizes and weights with known typographic results and thus is compatible with such special applications as maps and tabular material found in NPS folders. Park names, set in Helvetica display in the title bar, establish the folder's typographic scale and serve as a logotype for the series.

Helvetica is particularly suited to the offset process used for printing NPS folders because of its line strength, consistent color, a lack of idiosyncrasies, and large x-height. Helvetica's large x-height, the height of lower case letters, such as the x, strengthens the word form and therefore the text's legibility. Helvetica is one of the few typefaces with this large x-height that is also neutral in style.

This type is available in a wide range of sizes and weights in both metal and film composition. It serves large display and small caption purposes without loss of character. Helvetica, when used as specified, promises legibility, a savings in time and money in the design process, and a consistent typographic appearance for the series."

My point in including all of that description about the typeface is that even the smallest details in a brochure are important, deeply considered and articulated by the designer, and have meaning to the audience, even if that meaning is not consciously understood. At the bare minimum, though, the listener should be able to hear all of the included text on the brochure.

5. A quote. Sometimes, these quotes are connected explicitly to another component. And sometimes they are not. In this case, the quote refers to Lincoln's boyhood home, but that original home cannot be seen in the underlying image, of the Memorial Visitor Center, creating something of a visual disconnect. So the audio describer has to decide: Is this quote explicitly associated with this particular image (creating a connection that needs to be described together, in a single component), or does this quote transcend its placement on the page, to cover more than just this other element (and should therefore be described separately, in its own component, standing alone)? Again, either way is "right." But when your audio design comes together, you might decide that the quote works better alone or with this photo, and adjustments can be made. 

6. The main image. This shows the front of the Memorial Visitor Center, not the boyhood home, which is not explained until deep into the text on the front side of the brochure. So this is where the Audio Description actually will reorder the content, to some degree, as a part of the remediation, and people listening to the description of the Memorial Visitor Center would hear about it sooner than a reader would see its description in the brochure text. That's also OK. 

7. Sometimes tiny bits of text can go undetected, but those texts might, as in this case, credit the creator of the image. That is an element that was included in the visual version, so it also should be included in the audible version.

The goal is to design an equivalent experience, not an identical experience. Again, your design choices might not match someone else's, but you have been given this job to describe the brochure. So you get to make the artistic choices. Your listeners trust that you will make the best choices you can for them.

Let's next deconstruct the rest of the Lincoln brochure's Side 1. Take a look at it, again, and think about what you see:

Here is how I see the rest of it:

(Numbers 1-7 are described in the previous step)

8., 9., and 10. The two quotes (8 and 9) – apparently by Lincoln but not attributed as such – have been put together at the top of this stack of texts. Are they related enough, in your opinion, to include together, and maybe with the text block 10? That's a choice you'll have to make. The most important part is that all text on the brochure is transcribed into the UniD system (copy and paste from your original document, rather than retyping). The first major part of the Audio Description process is to transcribe all of the included text. The second part, which is more fun, is to describe all of the included visual media. And then there are outliers, like the cream-colored box in the background.

11. Back to the portraits of Lincoln's parents. It makes sense to me to keep them together in a single component, labeled something like "Lincoln's parents," rather than splitting them into two separate components, such as "Thomas Lincoln portrait" and "Sarah Bush Lincoln portrait," but I also could argue for making them independent and giving them their own space and place in the descriptions. Up to you. When faced with portraits, this would be a good time to refer to our online training module related to portraits.

12. and 17. These are two more Lincoln quotes, presumably, interjected into the texts as thought breaks. Is that placement in those particular texts important? Or do you get an equivalent experience when they are broken out, into separate components? If broken out, I would add text that associates them explicitly as having been said by Lincoln.

13. This is a distinct text box, but it doesn't have a bold-faced intro text of any sort, so it blends into the previous piece of text and the quote above. As the audio describer, you can decide whether to combine it with the above or separate it.

14. This text block has its own bold intro, "A New Household," which helps to distinguish it as a component, but there also is the pesky 17 in the middle of it. What to do?

15. and 16. I typically include the three elements of a photograph package as such: A. The image (which still needs to be described), B. The caption (which needs to be transcribed), and C. The credit (which needs to be transcribed). All of those are present here, so they make a nice package of information together. Whether this image package is connected to other elements, such as the 17 quote, is another matter and subjective.

17. (See 12, 15 and 16, above).

18. This text has a clearly distinguished part, "Moving On," but also a tone switch at "The park preserves ...," so the describer could decide to keep those together or break them apart into distinct components.

19. A collage. With the images overlapping and clustered together, I would consider this a collage, rather than three distinct images, and keep them together in a single component. Others may have a differing opinion. 

20. and 21. This is a classic photo-text package, like 15 and 16 above, but with a text that clearly is associated with the image. In this case, I would include it all (headline, text, photo, caption, and credit) in the same component.

Let's next deconstruct the Santa Monica Mountains brochure's Side 1. Take a look at it, again, and think about what you see:

Here is how I see it:

1-2. The NPS black bar – This black bar is a part of the NPS branding on Unigrid brochures, and they pretty much all have it. So the describer should include a description of it, along with the basic text that states the name of the site, what type of a site it is (National Recreation Area), the state it is in, etc. If you are following along from the Lincoln Boyhood example, I won't repeat my advice (you can read it there) other than to say the describer at this point can decide to describe the black bar as a package of elements, which might include the iconic NPS Arrowhead logo, or individually, as in, the black bar as one thing, the logo as another thing, the text as another thing. If it were my choice, I would put all of this together, because I think it is meant to be seen as a singular design element.

3-4. This brochure is unusual because it puts English and Spanish text side by side. I don't speak Spanish, but I can pick up a few words, and it appears one is a direct translation of the other. So the describer now has the choice of making both languages stand-alone components, splitting the brochure into a Spanish version and an English version, or combining them into a single component. My first reaction is to split them, giving equal space and equal headings in the two languages, so a listener can choose one or the other. I suspect that very few people will listen to both, so it just makes sense to me to give people different paths to follow, depending on their interests.

5-6. To call this collage a single image is both true and a major understatement. This reminds me a lot of the huge maps we describe that have no intended focal point and no one way to enjoy and use them. In these cases, I recommend first determining the purpose of the image. The way I read it, the collage is intended to show the wide breadth of activities and inhabitants of the Santa Monica Mountains, including people, animals, plants, and landscapes. So I suggest first writing a Synopsis of the collage, covering the artistic style and the imagery's highlights. Then, following the structure of the Synopsis, I suggest breaking the image down into its distinct scenes and giving each of those a name. For example, in the upper-left corner, I see a cityscape at sunset, so I might name that component Los Angeles at Sunset (presuming L.A. is being shown), and then just describe that particular scene. Then, in an adjacent component, I might describe Orange Convertible on Mulholland Highway (again, presuming that is the road being shown), and in that scene, I might mention how the car is driving into another scene, which appears to represent a forest fire, giving the image action and movement (or something like that). As these scenes are described, I recommend contextualizing what is around them, too. For example, in the Convertible scene, I might write that the roadway is bordered by the L.A. cityscape at the top, the NPS black bar on the left, the abstract forest fire scene on the bottom, and the expansive hillside scene to the right, however those get labeled and described. Then, within that context, your listener will be able to picture both the scene itself and how it transitions and connects to other scenes in the collage. There are a lot of ways to approach this description challenge, and there is no empirical research yet about what works best in such cases. So I'm leaning on some of the map research we have done, in terms of non-linear imagery, and some of the related empirical studies about how the brain best uses audible information, to take a shot at it. So, collage research is on our list now. ... And don't forget about (6), audibly crediting the artist.

Now that you're getting the hang of this, let's return to the Lincoln brochure and flip it over to its second side.

It looks like this:

Here is how I see it:

22. Another black bar, with text (not just the name of the park again). So it needs at least a transcription of the text.

23. A text block with a clear headline, "Living Historical Farm," which indicates it should be a component. The tricky part of this design is that any of the three photographs in this gray box (should the gray box be described?) could be associated with this text. But is the connection strong enough to put them together in the same component? That is the question.

24. This "Crop Area" text seems distinct enough, and without any other clear connections to nearby media that it is a pretty straightforward TEXT box component.

25. and 26. Per 23 above, these photos could be associated with the entry text on the page or be described separately. Or the two in 25 could be described together (they share a cutline, which could be tricky to separate) and 26 could be a separate component. I think I would put 23, 25 and 26 all together in a single component, because they tell basically the same part of the story and work well together. Whatever way the describer goes, though, these picture packages all have captions and credits to include as well.

27. and 28. A lot of NPS brochures have this catch-all part, with a bunch of short texts conveying distinct messages. In these cases, I try to find any connections that can be made with other elements to build more of an audio structure around an idea. For example, the "Pioneer Cemetery" text looks like it relates to 28 and its marble headstone, although that's not exactly clear when reading the text. So I would check with park staff to make sure that headstone is in the Pioneer Cemetery. If so, those would go well together, and that is an example of a photo-text package that can be created in audio that doesn't necessarily exist in the same way visually.

29. There is a mention of the Memorial Visitor Center under "Things to See and Do," so I would either connect this image with that text in the Audio Description, or I would keep them in separate components (but I would lean toward combining them).

30. This accessibility text is very important to people listening to Audio Description, and this is one place where we have been authorized to add new information to the brochure. If this site has anything else specifically designed for people who are blind or visually impaired, such as tactile artifacts, touchable maps, an audio-described film, etc., this would be a good place to highlight those offerings in a separate component.

31. We also recommend breaking mobilization information into a separate component that is easy to find. You don't want people to get frustrated with buried text that answers common questions, such as "What's the site's phone number?" Boilerplate, such as the "390 parks" info, also can be attached to mobilization information.

32. We have done a significant amount of research on describing maps and have an online training module set up just for this issue. In short, though, the describer first has to determine what the purpose of the map is. In this case, it's a high-level overview of the highway system around the park. It would be used for drivers, I suppose, to get to Lincoln City, and then the more detailed map at 33 would do the rest of the navigation assistance. I think the "Getting Here" text at the top of 27 would nicely pair with this map, so I probably would include that text with this map.

33. For a more complicated map, like this one, I would refer to the academic paper we wrote about maps to help set some expectations. But this one also has a reasonable amount of information on it, all of which can be described, and all of which should be included in the description. This map, more than 32, tells the story of the place, including its boundaries, trails, and highlights. So I would approach it as a visitor might: with a grand overview of the setting (its roughly rectangular shape, stretching mostly north and south, with a railway line roughly intersecting it around the middle, and its relationship to downtown Lincoln City, and such). A visitor (who is the primary audience for the brochure; the brochure was not designed for a staff member) might wonder: Where does this place begin? To that, I would start by describing what appears to be the landing point, the parking lot, near the rail line, and then unwind the story from there; as in, just to the north is the Cabin Site Memorial and a series of trails, which then could be described in more detail. Directly to the south is the Memorial Visitor Center, and this is an example of how even a single map could be broken into multiple components, with a map overview as one component, and trails as another component, and the visitor center area, with the Pioneer Cemetery, as another component, and so on. A map tells the story of the place, and so should the audio describer of that map.

The back side of the Santa Monica Mountains brochure is actually a bit less complicated to approach, besides the always-daunting challenge of the large map. At a glance, I see six discrete images, a graphic element at the top, two text blocks (one in English, and one in Spanish), and the map.

It looks like this:

Deconstructed, here is how I see it:

7. This black bar graphic element, in the NPS black bar motif, features icons of footprints from different animals, such as frogs, humans, horses, etc., which I see as a stand-alone graphic.

8-13. Often in NPS brochures, the images fade into each other and overlap and bring into play a complicated gestalt, but when I look at these images, I instead see discrete snapshots of scenes in the National Recreation Area that can be experienced in a modular fashion. So the primary design choice here is to either include them all together in a single component, as a series of photographs or to break them apart into individual images / individual components. I have gone back and forth on this in my mind, primarily because the lack of captions makes me think they are meant to be shown together, so my suggestion is one component with six images, held together with a great Synopsis of why these images work well with each other.

14-16. These short text blocks can either be kept together or broken apart into separate pieces of information. For navigation's sake, I recommend breaking them apart.

17. The second big challenge of this brochure is the map. Probably the most-difficult part here is that a lot of details are shared in small visual spaces, so, as I've suggested before, and in our research on maps (more in the UniD Academy), the first step to great map description is to determine its purpose. Is this map navigational in nature (as in, the describer expects a person driving a car to be able to pick it up and navigate the site), or is it more likely intended as a tool for cognitive mapping, which helps a person to make sense of a place and why it is special? While I think an argument could be made for navigation, I consider this more of a cognitive-mapping tool, showing area highlights and spatial relationships among them. Therefore, I would approach this map with the idea that I want to provide the overview, the at-a-glance impressions, but I also would want to break the map down into its most important parts, where people mostly go, considering their uses for the place. For example, I spot both a cultural center and a visitor center, which I expect are places that help people make sense of the Santa Monica Mountains as a public attraction. So I recommend creating components about those, and then any other key areas on the map that the describer thinks are worth a closer look. That could be place-by-place, e.g., Point Mugu State Park, Circle X Ranch, Malibu Springs, etc., or there could be other organizing principles that the people at the site would know better than I do. Whatever approach is used, though, I recommend unspooling as much of the detail of the map as possible, in an orderly fashion, because the sighted viewer of the map has access to this information, so why shouldn't that information also be available to people who are blind or have low-vision?

Once you have carefully looked over your source material (in this case, a brochure) and decided on a coherent description plan (what is going to be described together and what is going to be described separately, and in what order, per Steps 1-4 in this Setting Up a Brochure Project training), you are ready to Create a New Project.

To do this, from any UniD page (www.unidescription.org), you simply can open the drop-down Projects menu on the top navigation bar and select the Create New Project option:



Then, you will be asked if you want to use a predesigned Template, or not. If you choose not to use a Template (by choosing No Template), the Table of Contents on your new project will start with nothing in it, and you can build from scratch. If you choose to use a Template but later decide not to use its parts, all of those templated decisions (such as the language used by the machine voice) can be changed within your project.

The Project Name will be seen by the public, and it will be the top label on your descriptions however you export them. So think of this name as the label under which people will find your descriptions (this can be changed later, if you want). In this case, since we are working with a U.S. National Park Service site, we will select the NPS template and name the project the official name of that site.



When the Save Details button is pressed, this project is created (using the Template you selected), with you as the Owner of the project (that designation can be changed on the Backstage). And your interface changes to the Table of Contents view:



This is the UniD interface for creating and ordering (or reordering) components into a linear listening experience. It's like a playlist, in which the listener starts at the top and then each file is played in order (unless the user decides to skip around). To create a new Component, just type its name into the bottom box in the stack (which reads "Enter a new ...") and press the ADD button.



To change this order, at any time, just select the "pancakes" icon of the component you want to move (holding down the left mouse button and dragging the component to the location you want). 

When you get the order the way you want it, at the levels you want, press the SAVE button at the bottom of the Table of Contents, which saves the new order. If you do not press the SAVE button, the components will not disappear, but they will revert to the original order.

 


You also can use the "pancakes" icon to create a hierarchy among the components, in which some are nested underneath others, as deep as three levels. 

Say, for example, you want a 1st-Level Component that describes the Back Side of Brochure. Underneath that, because the back side of the brochure shows several trails, you want a label called Trails (2nd Level), and then underneath Trails, you want descriptions of all of the trails in one place (i.e., Trail 1A, Trail 1B, etc.) to keep those orderly and easy to find.

This nesting function also allows descriptions to be categorized and kept together when moving them around (by grabbing the pancakes icon for Trails, you also are grabbing all of the components nested underneath it). Remember to press the SAVE button to save the order.  
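For those who think in data structures, the Table of Contents behaves like a tree, up to three levels deep, that flattens into the ordered playlist your listeners hear. Here is a hypothetical sketch of that in Python (illustrative only, using the Trails example above, not the actual UniD code):

from dataclasses import dataclass, field

@dataclass
class TocEntry:
    title: str
    children: list["TocEntry"] = field(default_factory=list)

def playlist(entries: list["TocEntry"], level: int = 1) -> list[str]:
    # Flatten the tree, top to bottom, into the playback order.
    order: list[str] = []
    for entry in entries:
        order.append(entry.title)
        if level < 3:  # nesting stops at three levels deep
            order.extend(playlist(entry.children, level + 1))
    return order

toc = [TocEntry("Back Side of Brochure", [
    TocEntry("Trails", [TocEntry("Trail 1A"), TocEntry("Trail 1B")]),
])]
# playlist(toc) -> ['Back Side of Brochure', 'Trails', 'Trail 1A', 'Trail 1B']

Note how moving the Trails entry in this structure would carry Trail 1A and Trail 1B along with it, which is exactly what grabbing the pancakes icon does in the interface.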



More about the various functions of the UniD system can be found on this UniD Academy page.

Learn About Audio Description Genres: Portraits (Describing People)

You are describing an image of a person. What are the best ways to approach that activity?

We would like to help you better understand certain aspects of Audio Description at the same time that we learn from you about how you approach it. This online training / survey module will present various Audio Description opportunities for you – as well as examples and best practices – while your answers to these survey questions will help us better understand how you learn about the topic.

“Diversity: Describing Race, Skin Color, Ethnicity, Gender, and Disability within Contemporary Audio Description"

Attendees heard a direct and thoughtful discussion by a panel of Audio Description experts on the importance of diversity within the field, how to describe in today's diverse world, and what consumers want to hear in their descriptions.

Moderator: Kim Charlson, Audio Description Project Co-Chair, Watertown, MA

Panelists:

♦ Dr. Rachel Hutchinson, Project and Community Engagement Manager, Royal Holloway, University of London, Inclusive Description for Equality and Access (IDEA) #InclusiveAD, London, UK

♦ Thomas Reid, Reid My Mind Radio, Stroudsburg, PA

♦ Cheryl Green, Audio Describer/Voicer, Portland, OR

♦ Maria Vicky Diaz, Ph.D., Dicapta Foundation, Oviedo, FL

♦ Renee Arrington-Johnson, Audio Description Consumer, Member, ACB ADP Steering Committee, Lyndhurst, OH

Co-sponsored by: ADP and MCAC


  • Recorded during the 2021 ACB National Convention on July 22

Learn About Audio Description Genres: Describing Cultural Artifacts (or Objects)

You are describing a cultural artifact (aka a non-personified object). What are the best ways to approach that activity?

We would like to help you better understand certain aspects of Audio Description at the same time that we learn from you about how you approach it. This online training / survey module will present various Audio Description opportunities for you – as well as examples and best practices – while your answers to these survey questions will help us better understand how you learn about the topic.

Learn About Audio Description Genres: Maps

You are describing a map. What are the best ways to approach that activity?

We would like to help you better understand certain aspects of Audio Description at the same time that we learn from you about how you approach it. This online training / survey module will present various Audio Description opportunities for you – as well as examples and best practices – while your answers to these survey questions will help us better understand how you learn about the topic.

Learn About Audio Description Genres: Collages (Images Combined to Make Meaning)

You are describing an amalgamation of images. What are the best ways to approach that activity?

We would like to help you better understand certain aspects of Audio Description at the same time that we learn from you about how you approach it. This online training / survey module will present various Audio Description opportunities for you – as well as examples and best practices – while your answers to these survey questions will help us better understand how you learn about the topic.

Key Terms: What are the specialized terms scholars use when we discuss Audio Description?

Interdisciplinary researchers studying Audio Description have chosen a particular vocabulary to describe what they do and how they do it. Here are some of the most-common (yet most-opaque) terms in the field: 

If we are going to focus on this research topic, then we should define it well. Here are various academic definitions:

  • "a description of visual information delivered via an audio channel ... it can be said that whereas subtitles improve media accessibility by letting audiences read what they cannot hear, audio description lets audiences hear an account of what they cannot see"
    – Salway, A. (2007). A corpus-based analysis of audio description. In Cintas, J.D., Orero, P., & Remael, A. (Eds.), Media for all: Subtitling for the deaf, audio description and sign language (pp. 151–174). New York, NY: Rodopi. DOI: https://doi.org/10.1163/9789401209564_012

  • "can refer to both product and process"
    – Szarkowska, Agnieszka (2011). "Text-to-speech audio description: Towards wider availability of AD." The Journal of Specialised Translation, 15, 142-162. URL: https://www.jostrans.org/issue15/art_szarkowska.pdf

  • "a creative writing modality to make audiovisual content accessible for all"
    – Matamala, A., & Orero, P. (2013). Standardising audio description. Italian Journal of Special Education for Inclusion, 1(1), 149-155. URL: https://ojs.pensamultimedia.it/index.php/sipes/article/view/328

  • "Traditional AD is typified by five characteristics: it is exclusive; neutral; non-auteur; third-party and post hoc."
    – Fryer, L. (2018). The independent audio describer is dead: Long live audio description!. Journal of Audiovisual Translation, 1(1), 170-186. URL: http://www.jatjournal.org/index.php/jat/article/view/52

  • "It’s an artistic process that needs to be interrogated.”
    – Cavallo, A., & Fryer, L. (2018). Integrated access inquiry 2017-18 report. URL: https://iris.ucl.ac.uk/iris/publication/1582552/1

Audio Description researchers to date have spent most of their time focused on what Sabine Braun (2007) called "dynamic" description, or description related to ephemeral or moving imagery, such as what is produced for television programs, films, theatrical performances, and opera. 

American Audio Description research originated in the 1970s, with Gregory Frazier's West Coast focus on television and film description. In the 1980s, inspired by Frazier's work, an East Coast focus emerged around live events, such as theater and opera, led by Margaret Pfanstiehl and her organization, The Washington Ear.

The UniDescription Project can be useful for those purposes, based on transcendent practices across all genres of Audio Description, but its focus primarily is on what we call "static" media description, or description of visual media that is fixed in a certain temporal state, such as photographs, illustrations, and maps.

Fryer (2018) identifies five core characteristics of traditional Audio Description: It is exclusive, neutral, non-auteur, third-party, and post-hoc. She also notes how tradition and practice are changing and creating new research tensions that need to be addressed, such as:

1. Exclusive: Traditional Audio Description is exclusive in that it transmits to a lone audience member, usually via headphones. This practice separates individuals from groups, even if the group (or subgroup) of people all are using the same Audio Description. The philosophies of Universal Design, though, push back against that idea, demanding better design, in which Audio Description listeners are integrated into the audience experience, not treated separately as an accommodation.

2. Neutral: Traditional AD prescribes a "neutral" approach to word choice and delivery. Others call that just dull, as in uninspired and uncaring. This is an area where many researchers, especially in Europe (but also within the UniD team) hypothesize that Audio Description as a medium has untapped potential in both artistry and delivery (which was the inspiration of our 2020 National Endowment for the Arts grant). 

3. Non-Auteur: Traditional AD is done independently of the artists creating the original work, which has raised many questions in the research community as to why.

4. Third-Party: Traditional AD also typically has been done by outsiders of even the field being shown, creating questions about describer knowledge of the source material as well as description validity and accuracy. 

5. Post-Hoc: Traditional AD is created afterward, at the end of the production process or as a way to retrofit an existing piece of visual media. This approach raises the question: Why not integrate this process with the original production of the source material, among the people who know it best?

Fryer, in short, flips all of those traditions around, to challenge and to interrogate them, and to propose that an integrated and more-creative approach could equate to better Audio Description. 

Source: 

Fryer, L. (2018). The independent audio describer is dead: Long live audio description!. Journal of Audiovisual Translation, 1(1), 170-186.

  • "an interpretation of verbal signs by means of signs of nonverbal sign systems."
    – Jakobson, R. (1959). Linguistic aspects of translation. In R.A. Brower (Ed.), On Translation (pp. 232-239). Cambridge, MA: Harvard University Press. URL: https://www.hup.harvard.edu/catalog.php?isbn=9780674731615&content=toc

  • Converting one set of signs (semiotics) into another, such as "turning images into words"
    – Matamala, A., & Orero, P. (2007). Designing a course on audio description and defining the main competences of the future professional. Linguistica Antverpiensia, New Series–Themes in Translation Studies, (6), 329-344. URL: https://lans-tts.uantwerpen.be/index.php/LANS-TTS/article/view/195/126

  • "a 'translation' of visual images into verbal text ... sets AD most distinctly apart from other forms of translation"
    – Braun, S. (2008). Audiodescription research: State of the art and beyond. Translation Studies in the New Millennium, 6, 14-30. URL: http://epubs.surrey.ac.uk/303022/

"a form of vivid evocation"

Source:

Webb, R. (1999). Ekphrasis ancient and modern: The invention of a genre. Word & Image, 15(1), 7-18.

"Hypotyposis is the rhetorical effect by which words succeed in rendering a visual scene."

Source:

Braun, S. (2008). Audiodescription research: State of the art and beyond. Translation Studies in the New Millennium, 6, 14-30.

As an alternative to "translation" or "accommodation" models, remediation theorizes that "new media" invariably achieves cultural significance through a process of remediation, in which the new media pays homage to, or rivals, or refashions earlier forms of media, in an evolutionary process. For example, in purely visual media, photography remediates painting. 

Audio Description, though, is a much more complicated case. Audio Description remediates visual media (such as words remediating a photograph) while also synthesizing a variety of audio-based media forms designed for sharing information and stories, such as radio programs, audio tours, and docent lectures.

Key Statistics: About blindness and visual impairment worldwide

A collection of the most important tallies about these communities, compiled by the most-reliable sources.

"Statistical Snapshots is your one-stop source for statistical facts, figures, and resources about Americans with vision loss. Relying upon the most recently available data, this regularly updated site is always evolving and should answer your most frequently asked questions."

– American Foundation for the Blind. (n.d.). Statistical Snapshots from the American Foundation for the Blind. https://www.afb.org/research-and-initiatives/statistics

"Approximately 12 million people 40 years and over in the United States have vision impairment, including 1 million who are blind, 3 million who have vision impairment after correction, and 8 million who have vision impairment due to uncorrected refractive error."

– The Centers for Disease Control and Prevention. (n.d.). Fast Facts of Common Eye Disorders. https://www.cdc.gov/visionhealth/basics/ced/fastfacts.htm

"There are estimated to be over 30 million blind and partially sighted persons in geographical Europe."

– European Blind Union. (n.d.). About Blindness and Partial Sight. http://www.euroblind.org/about-blindness-and-partial-sight/facts-and-figures

"Globally in 2020: At least 2.2 billion people have a vision impairment that may or may not be addressed. Of those, at least 1 billion people have a vision impairment that could have been prevented or has yet to be addressed."

– The International Agency for the Prevention of Blindness. (n.d.). Blindness and Visual Impairment: Global Facts. https://www.iapb.org/vision-2020/who-facts/

"Of the 285 million people in the world who are blind or have low vision, only a relatively small percentage have no light perception. For everyone else, blindness is a gradation. Some people see quite clearly, in certain light conditions. Others see only shapes and colors. For some, their field of vision is complex and hard to explain. The diversity of these extra functions is what makes blindness particularly confusing to the unacquainted observer. For those with changing vision, the daunting part is not usually the fear of darkness, but the fear of admitting that you’re different."

"There are several ways to define blindness. Many people regard blindness as the inability to see at all or, at best, to discern light from darkness. The National Federation of the Blind takes a much broader view. We encourage people to consider themselves as blind if their sight is bad enough—even with corrective lenses—that they must use alternative methods to engage in any activity that people with normal vision would do using their eyes."

– National Federation of the Blind. (n.d.). Blindness Statistics. https://www.nfb.org/resources/blindness-statistics

"With the youngest of the baby boomers hitting 65 by 2029, the number of people with visual impairment or blindness in the United States is expected to double to more than 8 million by 2050, according to projections based on the most recent census data and from studies funded by the National Eye Institute, part of the National Institutes of Health. Another 16.4 million Americans are expected to have difficulty seeing due to correctable refractive errors such as myopia (nearsightedness) or hyperopia (farsightedness) that can be fixed with glasses, contacts or surgery."

– Varma, R., et al. "Visual impairment and blindness in adults in the United States: Demographic and geographic variations from 2015 to 2050." JAMA Ophthalmology. https://www.nih.gov/news-events/news-releases/visual-impairment-blindness-cases-us-expected-double-2050

"The National Consortium on Deaf-Blindness estimated in 2008 that there are approximately 10,000 children (ages birth to 22 years) and approximately 40,000 adults who are deaf-blind in the United States. This census of the deaf-blind in the United States did not count the many senior adults with severe combined hearing and vision loss." 

"The Vision Problems in the U.S. report and database provides useful estimates of the prevalence of sight-threatening eye diseases in Americans age 40 and older. This report includes information on the prevalence of blindness and vision impairment, significant refractive error, and the four leading eye diseases affecting older Americans: age-related macular degeneration, cataract, diabetic retinopathy and glaucoma."

"The population of people with disabilities inhabit a distinct position in the U.S. economy, both for their contributions to the marketplace and roles in government policies and programs. People with disabilities bring unique sets of skills to the workplace, enhancing the strength and diversity of the U.S. labor market. In addition, they make up a significant market of consumers, representing more than $200 billion in discretionary spending and spurring technological innovation and entrepreneurship."

"Globally, 1.1 billion people were living with vision loss in 2020

- 43 million people are blind (crude prevalence: 0.5%).

- 295 million people have moderate to severe vision impairment (crude prevalence: 3.7%).

- 258 million people have mild vision impairment (crude prevalence: 3.3%).

- 510 million people have near vision impairment (crude prevalence: 6.5%)"

"Blindness and vision impairment affect at least 2.2 billion people around the world. Of those, 1 billion have a preventable vision impairment or one that has yet to be addressed.  Reduced or absent eyesight can have major and long-lasting effects on all aspects of life, including daily personal activities, interacting with the community, school and work opportunities and the ability to access public services.

Reduced eyesight can be caused by a number of factors, including disease like diabetes and trachoma, trauma to the eyes, age-related macular degeneration and cataracts. The majority of people with vision impairment are over the age of 50 years; however, vision loss can affect people of all ages. Blindness and vision loss are felt more acutely by people in low- and middle-income countries where accessibility and specific government services may be lacking. In those countries, the most common cause of vision impairment in children is congenital cataract."

– World Health Organization. (n.d.). Blindness and Vision Impairment. https://www.who.int/health-topics/blindness-and-vision-loss#tab=tab_1

Theoretical Models of Disability: How do scholars conceptualize Audio Description?

One way to better understand Audio Description is to reflect upon how we conceptualize it in relationship to theorized models of disability. These theories can be used in many ways, including to frame our perceptions. Once we choose a theoretical frame, and commit to its use, we can benefit from its power as a device for developing deeper thoughts, including describing positions, perspectives, and boundaries. Or, if none of these theoretical frames works well enough for our particular purposes, we can use them as theoretical foils and create our own distinct theory, especially through contrasts with these established paradigms.

"For most of the twentieth century in ‘Western’ societies, disability has been equated with ‘flawed’ minds and bodies. It spans people who are ‘crippled’, ‘confined’ to wheelchairs, ‘victims’ of conditions such as cerebral palsy, or ‘suffering’ from deafness, blindness, ‘mental illness’ or ‘mental handicap’. The individual’s impairment or ‘abnormality’ necessitates dependence on family, friends and welfare services, with many segregated in specialized institutions. In short, disability amounts to a ‘personal tragedy’ and a social problem or ‘burden’ for the rest of society."

– Barnes, C., & Mercer, G. (2003). Disability. Polity. http://www.blackwellpublishing.com/content/bpl_images/content_store/sample_chapter/0745625088/Barnes.pdf

"In many countries of the world, disabled people and their allies have organised over the last three decades to challenge the historical oppression and exclusion of disabled people (Driedger, 1989; Campbell and Oliver, 1996; Charlton, 1998). Key to these struggles has been the challenge to over-medicalised and individualist accounts of disability. While the problems of disabled people have been explained historically in terms of divine punishment, karma or moral failing, and post-Enlightenment in terms of biological deficit, the disability movement has focused attention onto social oppression, cultural discourse and environmental barriers."

– Shakespeare, T. (2010). The social model of disability. In L. J. Davis (Ed.), The Disability Studies Reader (2nd ed., pp. 197-204). New York: Routledge. http://thedigitalcommons.org/docs/shakespeare_social-model-of-disability.pdf

"These perspectives encourage d/Disabled people to assert their capabilities, personally and politically, rather than remain objects of pity (Mackelprang, 2012). They encourage persons with disabilities to see themselves as part of the great mosaic of diversity that makes up our society. Rather than remaining passive objects of service and service providers, d/Disabled people become active and capable producers and consumers. Rather than organizing their lives around their deficits and problems, they acknowledge and build on their strengths and take control of their lives. Personal decision making replaces passivity, and empowerment replaces powerlessness. This awareness of strength and control has resulted in significant social and political change."

– Mackelprang, R. W., Salsgiver, R. O., & Parrey, R. C. (2021). Disability: A diversity model approach in human service practice. Oxford University Press.

"The Convention on the Rights of Persons with Disabilities (CRPD) is a modern human rights treaty with innovative components. It impacts on disability studies as well as human rights law. Two innovations are scrutinized in this article: the model of disability and the equality and discrimination concepts of the CRPD. It is argued that the CRPD manifests a shift from the medical model to the human rights model of disability. Six propositions are offered why and how the human rights model differs from the social model of disability. It is further maintained that the CRPD introduces a new definition of discrimination into international public law. The underlying equality concept can be categorized as transformative equality with both individual and group-oriented components. The applied methodology of this research is legal doctrinal analysis and disability studies model analysis. The main finding is that the human rights model of disability improves the social model of disability. Three different models of disability can be attributed to different concepts of equality. The medical model corresponds with formal equality, while the social model with substantive equality and the human rights model can be linked with transformative equality."

– Degener, T. (2016). Disability in a human rights context. Laws, 5(3), 35. https://www.mdpi.com/2075-471X/5/3/35

"The economic model of disability approaches disability from the viewpoint of economic analysis, focusing on ‘the various disabling effects of an impairment on a person’s capabilities, and in particular on labour and employment capabilities’ (Armstrong, Noble & Rosenbaum 2006:151, original emphasis). While the economic model insists on the importance of ‘respect, accommodations, and civil rights to people with disabilities’, such concerns are subservient to the economic model’s estimation of a disabled person’s ability to work and contribute to the economy (Smart 2004:37)."

– Retief, M., & Letšosa, R. (2018). Models of disability: A brief overview. HTS Teologiese Studies/Theological Studies, 74(1). https://www.ajol.info/index.php/hts/article/view/177914

"This article attempts to explain why the social constructionist paradigm has failed to replace the medical model in American disability theory. The social movement led by American disability activists attempted to reframe the definition of disability using a minority group model based on the social constructionist paradigm. This paper argues that the disability movement was unable to successfully advance the social constructionist paradigm because the activists accepted the Americans With Disabilities Act (1990) despite its ideological basis in the medical model of disability, and the social constructionist theory does not adequately account for the importance of structural constraints to redefinition."

– Donoghue, C. (2003). Challenging the authority of the medical definition of disability: An analysis of the resistance to the social constructionist paradigm. Disability & Society, 18(2), 199-208. https://digitalcommons.montclair.edu/cgi/viewcontent.cgi?article=1000&context=sociology-facpubs

"What I argue in this article is that an exclusively special needs approach to disability is inevitably a short-run approach. What we need are more universal policies that recognize that the entire population is “at risk” for the concomitants of chronic illness and disability. As the following pages will show, without such a perspective we will further create and perpetuate a segregated, separate but unequal society—a society inappropriate to a larger and older “changing needs” population."

– Zola, I. K. (2005). Toward the Necessary Universalizing of a Disability Policy. The Milbank Quarterly, 83(4). https://doi.org/10.1111/j.1468-0009.2005.00436.x

"Our understanding of disability is based on the Nordic Relational Model of Disability (NRM), which has been guiding policy and practice for disabled people in Norway for approximately 40 years. According to the NRM, disability comes into existence when there is a discrepancy between the person’s capabilities and the functional demands of the environment (Tøssebro 2004). The relational understanding of disability indicates that this is not a fixed category but rather a phenomenon constructed in space and time, thus leaving a relative interactionist perspective (Gustavsson 2004). Gustavsson refers to Morten Söder: ‘It is impossible to understand the processes producing disability, and consequently exclusion and discrimination, without studying the interaction between the individual and the context.’ (Söder, in Gustavsson 2004:63). The NRM thus gives the opportunity to a multi-level approach guided by an empirical sensitivity to what is going on (Gustavsson 2004). To our understanding, the interactional perspective of the NRM is mirrored in current international policy documents (UN 2006), and has much in common with the social-relational model of disability as well, as both include both environmental and impairment factors. However, as we interpret it, the disabling elements of the social-relational are recognised as external barriers and oppression; this is in contrast to the NRM perspective that focuses on interaction (Shakespeare 2014)."

– Langørgen, E., & Magnus, E. (2018). ‘We are just ordinary people working hard to reach our goals!’ Disabled students’ participation in Norwegian higher education. Disability & Society, 33(4), 598-617. https://ntnuopen.ntnu.no/ntnu-xmlui/bitstream/handle/11250/2487117/Artikkel%2B1_Disability%2Band%2BSociety_jan18.pdf?sequence=1

"Over the last decade Amartya Senís Capability Approach (CA) has emerged as the leading alternative to standard economic frameworks for thinking about poverty, inequality and human development generally. In countless articles and several books that tackle a range of economic, social and ethical questions (beginning with the Tanner Lecture ëEquality of What?í delivered at Stanford University in 1979), Professor Sen has developed, refined and defended a framework that is directly concerned with human capability and freedom (e.g. Sen, 1980; 1984; 1985; 1987; 1992; 1999)."

– Clark, D. A. (2006). The Capability Approach. In The Elgar Companion to Development Studies. Cheltenham: Edward Elgar.

"Disability is defined as a deprivation in terms of functioning and/or capability among persons with health conditions and/or impairments. The human development model highlights in relation to wellbeing the roles of resources, conversion functions, agency, and it uses capabilities and/or functionings as metric for wellbeing. It does not consider impairments/health conditions as individual characteristics; instead, they are themselves determined by resources, structural factors, and personal characteristics, and thus the model is informed by the socioeconomic determinants of health literature."

– Mitra, S. (2018). Disability, Health and Human Development. New York: Palgrave Pivot. https://link.springer.com/content/pdf/10.1057%2F978-1-137-53638-9.pdf


Research Questions: What RQs in Audio Description have we (and haven't we) addressed?

Audio Description is a relatively new field of academic inquiry. It formally began in the 1970s in the United States (pioneered by Gregory T. Frazier of San Francisco State University), but it has only been developed widely by interdisciplinary scholars since the 2000s, mostly by Audiovisual Translation (AVT) scholars in Western Europe.

How long should an Audio Description be?

This is a complicated question with a simple answer: It depends. The quality of the description and the importance of the information to the listener will dictate how long the connection between the two can be maintained. Listeners will have physical constraints, such as how long they are willing to stay in place to listen, and mental constraints, with complex and lengthy descriptions creating a potential for cognitive overload.

In English, people talk at roughly 140 to 180 words per minute, but recent research (i.e., the Bragg, et al. study below) has found that they can listen to and comprehend a human voice at around 300 words per minute. In screen-reader contexts, with a machine voice, that rate typically is even faster for people who are blind or have low vision, with some "super listeners" able to comprehend at almost 1,000 words per minute. The rate also will depend upon the technical level of the description, the listener's familiarity with the topics, aptitude for audio learning, engagement with the material, etc. But, as a very rough estimate, we can project people listening comfortably to descriptions at about 250 words per minute (or 500 words for every two minutes). That information can be used as a benchmark.

According to Art Beyond Sight's Verbal Description guidelines (full guidelines linked), audio tour companies aim for 90 seconds to two minutes (300-500 words) per stop. But those guidelines were generated from private research studies (so we can't investigate the data), and they examined people with sight who were using audio as a supplement to visual information, rather than as the primary channel. So this dynamic, we think, could be significantly different for people who are blind or have low vision. More research is needed, in other words.
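As a back-of-the-envelope check, the benchmark above can be turned into a quick calculation. Here is a minimal Python sketch; the 250 words-per-minute rate is the rough estimate discussed above, not a validated constant, and the sample text is our own:

```python
# Back-of-the-envelope listening-time estimate for a draft description.
# The 250 words-per-minute rate is the rough benchmark discussed above,
# not a validated constant; adjust it for your audience and material.

WORDS_PER_MINUTE = 250

def estimated_listening_seconds(text: str, wpm: int = WORDS_PER_MINUTE) -> float:
    """Estimate how many seconds a listener needs for this text."""
    return len(text.split()) / wpm * 60

draft = "A brown wooden footbridge arches over a narrow, rocky stream."
print(f"About {estimated_listening_seconds(draft):.0f} seconds at {WORDS_PER_MINUTE} wpm")
# At this rate, the 300-500 word guideline above works out to
# roughly 72 to 120 seconds per stop.
```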

From our Constructivist perspective, though, we will build from the cognitive theories of memory as outlined by Nazaret Fresno (in "Is a picture worth a thousand words? The role of memory in audio description," linked below). In that piece, Dr. Fresno categorizes memory as operating in three distinct but sometimes overlapping realms: Sensory Memory (like smells or touch sensations), Working Memory, and Long-Term Memory.

The entire piece is worth reading, but, in short, Working Memory is where new non-sensory information is processed and either converted into Long-Term Memory or discarded. This is the cognitive act in which someone conveys a piece of information to you (in some medium), and you either process, learn, and remember it (filing that information into Long-Term Memory), or it slips quickly out of your mind. Working Memory has a very short span in which it can hold information, maybe 30 seconds, and it can hold only a limited amount of information at a time (when chunked, about four pieces at once, up to eight chunks or so).

In terms of capacity, this brief moment of holding and processing information accounts for the Cognitive Load. The more a person is trying to process, the greater the Cognitive Load, and at a certain point, relatively quickly, the brain just gets overloaded (which relates to feelings of exhaustion and overstimulation). All of this is a long-winded way of saying that descriptions should take into account the listener and the amount of information being shared, including at what pace and with what load. A very dense and complicated sentence with a lot of precise facts is going to be harder to process than a simple and straightforward sentence, especially for someone listening to it. That doesn't mean the information has to be simplified or simple-minded, just that it should take into account the reception process and the information load it creates.

So in our template, we recommend first a DESCRIBING statement that tells the listener exactly what will be described. Stop. The listener then can hear the SYNOPSIS description, which is relatively short and hits the highlights. Synopsis, by the way, originated as a Greek word that meant "together" (sun) "seeing" (opsis), and we think "together-seeing" is a great way to describe Audio Description in general.

So as a heuristic, 90 seconds (up to two minutes) is a helpful flag for describers and designers. We have anecdotally found that our listeners like to be able to navigate around the content, to find the most interesting or most useful sections, and to dive in, rather than listen to lengthy components that mix different themes. So at the 90-second mark, and especially if a description runs double that or longer, we would ask ourselves as describers: Is there a way this information could be broken down into smaller segments? Think Cognitive Load: Can you break your ideas into four to eight chunks every 30 seconds or so, and also give people chances to pause, process, and think between them?

Then, after the SYNOPSIS is delivered, the listener can choose to hear more, which we call the IN-DEPTH DESCRIPTION, and those chunks can go on as long as they hold the listener's attention. Description is easier to write in small chunks like that, and, maybe more importantly, it better connects with the listener. That's the ultimate goal: not to blast a firehose of information into a person's ear but to connect with your listener and help that person learn.
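To make that template concrete, here is a minimal Python sketch of how the DESCRIBING / SYNOPSIS / IN-DEPTH structure could be modeled and checked against the 90-second heuristic above. The class, its field names, and the threshold are our own illustrative assumptions, not an official UniD schema:

```python
# Minimal sketch of the DESCRIBING / SYNOPSIS / IN-DEPTH template.
# The class, its field names, and the 90-second flag are illustrative
# assumptions, not an official UniD schema.

from dataclasses import dataclass, field

WORDS_PER_MINUTE = 250  # rough listening benchmark from above
FLAG_SECONDS = 90       # heuristic point at which to consider splitting

@dataclass
class Description:
    describing: str                 # tells the listener what will be described
    synopsis: str                   # short description that hits the highlights
    in_depth: list[str] = field(default_factory=list)  # optional deeper chunks

    def overlong_segments(self) -> list[str]:
        """Return any segments a describer may want to break into smaller chunks."""
        segments = [self.describing, self.synopsis, *self.in_depth]
        return [s for s in segments
                if len(s.split()) / WORDS_PER_MINUTE * 60 > FLAG_SECONDS]
```

A describer (or a tool) could run overlong_segments() on a draft and treat any flagged segment as a candidate for splitting into smaller chunks.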

In that vein, the UniD system is designed to allow describers great flexibility in how the description is structured, including length. 

Storytelling, actor-focused

This needs more empirical research, but we have adopted and adapted Art Beyond Sight's Verbal Description guidelines (full guidelines linked), which argue for the following, in this order:

First, the basic information, including the title of the work, the artist’s name, the medium, maybe the year it was done, maybe where it can be seen.

Then, describe the artifact as an object of observation (what does it look like as an image), including its shape, dimensions, and point of view.

Then, describe the content of the image. Who is doing what (and to whom), when and where and how and why?

In what order should that information be presented? This is a matter of significant debate.

Marza Ibanez (2010, p. 147) suggests the content unspools in this order for dynamic AD (we are not aware of ordering studies in static AD, so we think ordering needs more research):

1. Where and When: The spatio-temporal setting is recognized first. Establish the setting and the spatial relationships between the characters; in other words, set the scene. Then establish when this visual is happening (during a specific event, sometime in history, during an unarticulated time in the recent past?).

2. Who: Who are the characters in the scene? What do they look like? What facial and corporeal expressions are they making? What types of clothing are they wearing? 

3. What: What action is happening here? What are these people doing, and why?

4. How: How are they carrying this out, and with what sort of intentionality and attitudes?

So, in that approach, the organization would be: Where and When was the image made, and then, within that scene in that spatio-historical context, Who is doing What (and to whom), How and Why?
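Purely as an illustration, that suggested ordering could be encoded as a simple assembly function. The field names and sample text in this Python sketch are hypothetical, and, as noted, this ordering has only been studied for dynamic AD:

```python
# Hypothetical sketch: assemble a draft description in the
# Where/When -> Who -> What -> How order suggested for dynamic AD.

ORDER = ("where_when", "who", "what", "how")

def assemble_description(parts: dict[str, str]) -> str:
    """Join the provided parts in the suggested narrative order,
    skipping any element the describer left out."""
    return " ".join(parts[key] for key in ORDER if parts.get(key))

draft = assemble_description({
    "where_when": "On a snowy town square at dusk,",
    "who": "three children in bright parkas",
    "what": "roll a lopsided snowball toward a half-built snowman,",
    "how": "laughing and stumbling as they push.",
})
print(draft)
```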

Or, Explanatory, fact-focused

As another option, we also have the Explanatory approach, which focuses more on facts intended to create a larger picture of what the thing in the image is and why it's important.

How should facial expressions be described?

Sometimes an expression is clear and can be communicated efficiently, but often the ambiguity in the expression leads to individual interpretation. If the expression is open to interpretation, and interpretation is a part of the meaning-making process of viewing the image, then that activity also should be extended and afforded to the Audio Description listener. Other times, though, saying that a person "smiles" is all that is needed to get the point of the visual image across.

Facial expressions are a complicated and well-studied area of human communication. We still need to know more through empirical testing about how much of this type of description is enough in Audio Description studies, along the lines of Leung's research below. At this point, our recommendation is to focus on general and commonly recognized expressions, such as saying a person is smiling to indicate happiness, or frowning to indicate sadness, rather than getting too technical with facial muscle descriptions that might not be understood without detailed explanation.

But to be more technical about it, starting with Darwin and developed by others, such as Hjortsjö, and then Ekman, Friesen, et al., the Facial Action Coding System (FACS) has been used by many researchers to isolate and identify expressions and the intent behind them. So that could be an avenue for further research.

An emerging line of inquiry about how animals should be audio described has developed around this 2019 article:

Kim, J. S., Elli, G. V., & Bedny, M. (2019). Knowledge of animal appearance among sighted and blind adults. Proceedings of the National Academy of Sciences, 116(23), 11213-11222.

Abstract:
"How does first-person sensory experience contribute to knowledge? Contrary to the suppositions of early empiricist philosophers, people who are born blind know about phenomena that cannot be perceived directly, such as color and light. Exactly what is learned and how remains an open question. We compared knowledge of animal appearance across congenitally blind (n = 20) and sighted individuals (two groups, n = 20 and n = 35) using a battery of tasks, including ordering (size and height), sorting (shape, skin texture, and color), odd-one-out (shape), and feature choice (texture). On all tested dimensions apart from color, sighted and blind individuals showed substantial albeit imperfect agreement, suggesting that linguistic communication and visual perception convey partially redundant appearance information. To test the hypothesis that blind individuals learn about appearance primarily by remembering sighted people’s descriptions of what they see (e.g., “elephants are gray”), we measured verbalizability of animal shape, texture, and color in the sighted. Contrary to the learn-from-description hypothesis, blind and sighted groups disagreed most about the appearance dimension that was easiest for sighted people to verbalize: color. Analysis of disagreement patterns across all tasks suggest that blind individuals infer physical features from non-appearance properties of animals such as folk taxonomy and habitat (e.g., bats are textured like mammals but shaped like birds). These findings suggest that in the absence of sensory access, structured appearance knowledge is acquired through inference from ontological kind."


With at least two follow-up pieces:

Lewis, M., Zettersten, M., & Lupyan, G. (2019). Distributional semantics as a source of visual knowledge. Proceedings of the National Academy of Sciences, 116(39), 19237-19238.

And, 

Ostarek, M., Van Paridon, J., & Montero-Melis, G. (2019). Sighted people’s language is not helpful for blind individuals’ acquisition of typical animal colors. Proceedings of the National Academy of Sciences, 116(44), 21972-21973.


* Thanks to Research Assistant Andreas Miguel for originally finding these open-access articles.

You want to make Audio Description?

Nothing is stopping you now! This open-access site offers robust training sessions as well as the production tools you need to create and share Audio Description; just sign in and get started.

By using this site, you agree to follow our Terms, Conditions, License, Privacy Policy, and Research Protocols.