Scott Carter | SeeReader

SeeReader: Reading on the go

SeeReader was a tool that combined text-to-speech (TTS) services with automatic content recognition and document presentation control, letting users listen to documents while also being notified of important visual content. The goal of the work was to allow users to read rich documents on mobile devices while maintaining awareness of their visual environment.

I built the SeeReader mobile client in J2ME. The application also relied on significant pre-processing provided by a number of tools developed by other FXPAL researchers (Laurent Denoue in particular).


SeeReader relied on several preprocessing steps to decompose a document into regions, generate summaries for regions, and link text content to figures and text. For example, if the preprocessing engine analyzed the body text "the redwood tree (see Figure 1) has fire retardant bark," it would create a link between the text "Figure 1" and the actual location of Figure 1 in the document. Every sentence was also marked with its document location and converted to audio.
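The linking step can be illustrated with a minimal sketch. It is written in present-day standard Java, not J2ME, and it is not the FXPAL preprocessing engine. The class `FigureLinker`, its `Link` type, and the regular expression are all hypothetical names and choices made for this example. It only finds mentions such as "Figure 1" in body text and records their character span and figure number. Resolving the number to the figure's location in the document is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: locate "Figure N" mentions in body text.
final class FigureLinker {

    /** One mention of a figure: its character span in the text and the figure number. */
    static final class Link {
        final int start;
        final int end;
        final int figureNumber;

        Link(int start, int end, int figureNumber) {
            this.start = start;
            this.end = end;
            this.figureNumber = figureNumber;
        }
    }

    // Assumed reference styles: "Figure 3" and "Fig. 3" (up to four digits).
    private static final Pattern FIGURE_REF =
            Pattern.compile("\\b(?:Figure|Fig\\.)\\s*(\\d{1,4})");

    /** Returns every figure mention found in the text, in order of appearance. */
    static List<Link> findLinks(String text) {
        List<Link> links = new ArrayList<>();
        Matcher m = FIGURE_REF.matcher(text);
        while (m.find()) {
            links.add(new Link(m.start(), m.end(), Integer.parseInt(m.group(1))));
        }
        return links;
    }

    public static void main(String[] args) {
        String sentence = "the redwood tree (see Figure 1) has fire retardant bark";
        for (Link link : findLinks(sentence)) {
            System.out.println(sentence.substring(link.start, link.end)
                    + " -> figure " + link.figureNumber);
        }
    }
}
```

A real pipeline would also need the page coordinates of each figure, which this sketch assumes are produced elsewhere.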

The user could then "play" the document, and the app would speak the document's text while showing on screen the current region and sentence being spoken (top). When the app encountered a sentence that had been linked to an image, it could zoom and frame the image in the view (middle). In this way, the user would always see the most relevant part of the document when they had a moment to glance at the screen. In another mode, the interface could simply draw a reference to the linked figure (bottom).
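As an illustration of what "zoom and frame" might involve, here is a minimal sketch that computes a scale and offset so a figure fits the screen, centered, with its aspect ratio preserved. This is an assumption about one reasonable approach, not SeeReader's actual code. `FigureFramer`, `Viewport`, and all parameter names are hypothetical. Document coordinates map to screen coordinates as `screen = doc * scale + offset`.

```java
// Hypothetical sketch: fit a figure's bounding box to the screen.
final class FigureFramer {

    /** Transform from document to screen coordinates: screen = doc * scale + offset. */
    static final class Viewport {
        final double scale;
        final double offsetX;
        final double offsetY;

        Viewport(double scale, double offsetX, double offsetY) {
            this.scale = scale;
            this.offsetX = offsetX;
            this.offsetY = offsetY;
        }
    }

    /**
     * Computes the viewport that shows the whole figure, as large as possible
     * and centered on the screen.
     */
    static Viewport frame(double figX, double figY, double figW, double figH,
                          double screenW, double screenH) {
        if (figW <= 0 || figH <= 0 || screenW <= 0 || screenH <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        // The tighter of the two axes limits the zoom.
        double scale = Math.min(screenW / figW, screenH / figH);
        // Center the scaled figure, then shift by its scaled document position.
        double offsetX = (screenW - figW * scale) / 2 - figX * scale;
        double offsetY = (screenH - figH * scale) / 2 - figY * scale;
        return new Viewport(scale, offsetX, offsetY);
    }

    public static void main(String[] args) {
        Viewport v = frame(100, 200, 400, 200, 240, 320);
        System.out.println("scale=" + v.scale
                + " offsetX=" + v.offsetX + " offsetY=" + v.offsetY);
    }
}
```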


SeeReader also supported eyes-free document skimming. Users could move a finger in a circle (similar to Apple's click wheel) on the display to skip back or forward among regions.
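One way such a circular gesture could be turned into skip commands is sketched below. It is an assumed design in present-day standard Java, not the original J2ME client code. `CircularSkimmer`, `onDrag`, `onRelease`, and the step size are hypothetical. The sketch tracks the angle of the touch point around a center, accumulates the change in angle, and emits one skip for every `stepDegrees` of rotation. With screen coordinates where y grows downward, clockwise motion yields positive (forward) skips.

```java
// Hypothetical sketch: convert circular finger motion into region skips.
final class CircularSkimmer {
    private final double centerX;
    private final double centerY;
    private final double stepRadians;
    private double lastAngle = Double.NaN; // NaN means no touch in progress
    private double accumulated = 0.0;      // rotation not yet turned into skips

    CircularSkimmer(double centerX, double centerY, double stepDegrees) {
        if (stepDegrees <= 0) {
            throw new IllegalArgumentException("stepDegrees must be positive");
        }
        this.centerX = centerX;
        this.centerY = centerY;
        this.stepRadians = Math.toRadians(stepDegrees);
    }

    /**
     * Feeds one touch point. Returns how many regions to skip:
     * positive for forward, negative for back, zero for none yet.
     */
    int onDrag(double x, double y) {
        double angle = Math.atan2(y - centerY, x - centerX);
        if (Double.isNaN(lastAngle)) {
            lastAngle = angle; // first point of the gesture: nothing to compare
            return 0;
        }
        double delta = angle - lastAngle;
        // Unwrap the jump that occurs when the angle crosses +/- pi.
        if (delta > Math.PI) {
            delta -= 2 * Math.PI;
        } else if (delta < -Math.PI) {
            delta += 2 * Math.PI;
        }
        lastAngle = angle;
        accumulated += delta;
        int steps = (int) (accumulated / stepRadians); // truncates toward zero
        accumulated -= steps * stepRadians;            // keep the remainder
        return steps;
    }

    /** Resets the gesture when the finger leaves the screen. */
    void onRelease() {
        lastAngle = Double.NaN;
        accumulated = 0.0;
    }
}
```

Keeping the remainder in `accumulated` means slow, small movements still add up to a skip eventually, which matters when the user is not looking at the screen.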

SeeReader's preprocessing engine generated summaries for regions. Like the body text, these summaries were converted to audio, and they were played as the user skimmed to a new region.


SeeReader could also inject other audio content, such as page transitions, to augment eyes-free document navigation. Here, the user skims through the document, then lifts her finger off the screen. The app immediately begins reading the document aloud from the location she selected.

(When playing this video, be sure to turn up the volume on your headphones.)