Reaching Into a Webpage

Ever want to point someone not just to the top of a web article, but to a particularly good quote – or even several? I’ve done a lot of thinking about that.

The InSite system I’ve been working on for Duke University was built to enable interactive transcripts of audio and video interviews. Interactive transcripts let you navigate media and share excerpts at the sentence level. Click on an excerpt link like this one…

Because it seemed to me that if they have weapons of mass destruction then there has to be an order of battle for their use. I mean I knew this from our own system when I was involved. Who has authority, at what level, what kind of authority to initiate the use of weapons of mass destruction? And I was struck by the fact that they didn’t know.

… and you’re taken right to the sentence you want so you can see it in the full context of the interview, like this:


During the Putin Files project that I helped FRONTLINE with, we realized that it would also be useful to enable sharing and annotation abilities for interviews that weren’t connected to media. That project included 32 source interview transcripts connected to media. We also enabled an additional 24 source interview transcripts that weren’t connected to media so they could also include annotations and be shared at the sentence level.

Building on that experience, we expanded this concept to the Rutherfurd Living History website at Duke, and automated the process.

Using this method, we’ve turned our interview report into an interactive page. Interactive pages don’t have media attached, but, like interactive transcripts, they can be annotated and shared at the sentence level.

So you can click on an excerpt link like this one…

There was a strong consensus among the journalists that managing interviews is more difficult than it should be.

… and get right to that sentence so you can see it in the full context of the report, like this:

Here’s another:

“The big thing that’s missing – it’s half of source material – you can search and report on text, but not audio, video and images,” said Cohen.

The mechanism behind the sentence-level URLs is built into the backend of the InSite publishing system and piggybacks on the WebVTT-based system we already use for interactive video transcripts. Content creators press a “Format No-Media Text” button that automatically separates the text by sentence, preserves paragraphing, and assigns each sentence a WebVTT timecode. The timecode makes it possible to give each sentence its own URL for sharing, and it serves as a place to attach annotations.
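
For readers who want to picture the formatting step, here is a rough sketch in Python. It is not the InSite code, just an illustration of the idea described above. Everything in it is an assumption made for the example: the function names, the naive rule for finding sentence ends, and the choice to give each sentence a made-up one-second slot so that every timecode is unique.

```python
import re

# Naive sentence boundary: whitespace after ., ! or ?, optionally followed by
# a closing quote. Assumption: a production splitter would also have to cope
# with abbreviations and HTML tags.
SENTENCE_END = re.compile(r'(?<=[.!?])\s+|(?<=[.!?]["”’])\s+')


def vtt_timestamp(seconds: int) -> str:
    """Format whole seconds as a WebVTT timestamp (hh:mm:ss.ttt)."""
    hours, rem = divmod(seconds, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.000"


def format_no_media_text(text: str) -> list[dict]:
    """Split text into sentences and give each one a synthetic timecode.

    Paragraphs are separated by blank lines; each cue remembers which
    paragraph it came from so the original paragraphing can be rebuilt.
    """
    cues = []
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    for para_index, paragraph in enumerate(paragraphs):
        for sentence in SENTENCE_END.split(paragraph):
            sentence = sentence.strip()
            if not sentence:
                continue
            start = len(cues)  # hypothetical: one second per sentence
            cues.append({
                "start": vtt_timestamp(start),
                "end": vtt_timestamp(start + 1),
                "paragraph": para_index,
                "text": sentence,
            })
    return cues


def to_webvtt(cues: list[dict]) -> str:
    """Serialize the cues as a minimal WebVTT file."""
    lines = ["WEBVTT", ""]
    for cue in cues:
        lines.append(f"{cue['start']} --> {cue['end']}")
        lines.append(cue["text"])
        lines.append("")
    return "\n".join(lines)


if __name__ == "__main__":
    sample = "First sentence. Second one? Yes!\n\nNew paragraph here."
    print(to_webvtt(format_no_media_text(sample)))
```

Because each sentence ends up with its own start time, that time can double as the sentence’s address: a share link only needs to carry the timecode, and an annotation only needs to point at it.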

The following image shows the Format No-Media button and what the formatted report looks like after the button has been pressed:


Some details: We have the Format No-Media button working reasonably well with HTML tags, but it’s not perfect yet. I formatted the report in the WordPress page editor, copied the HTML to the interactive page tab, and hit the Format No-Media button. The report had a fair amount of formatting. The tags came through unscathed, but I still had to fix some spacing.

This is a neat feature of the InSite system, but it would be even better if sentence-level URLs were available everywhere. We could come closer if WordPress were to work out a way to enable them…