Why am I writing a post about Verbatim Speech to Text?
My lipspeaking post was really helpful for me to write and seemed to generate positive responses from people. I want to raise awareness of a variety of communication support options available to assist deaf (and other where appropriate) people access spoken information as I think many of these things are not widely known about.
I am using the word deaf to mean anyone who has a hearing impairment and/or any kind of difficulty accessing spoken speech sounds, e.g. auditory processing disorder.
What is Verbatim Speech to Text Reporting?
Verbatim Speech to Text Reporting (VSTTR), also known as "STTR", "STT" and palantypy (especially amongst UK deaf folk), derives from court reporting and uses the same technologies. I'm going to use VSTTR throughout this post to emphasise the verbatimness of this communication support option.
A highly trained operator uses a customised chorded keyboard input system to output realtime text of what is spoken. A UK VSTTR operator should be able to output 180 wpm at an accuracy of at least 98%. Many are additionally qualified up to 200 or 250 wpm. However, each word they output has to be in their dictionaries, so most mistakes will be on custom words which haven't been added yet.
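To put the 180 wpm and 98% figures into perspective, here is a rough back-of-envelope sketch. It assumes the speaker talks continuously at exactly 180 wpm and that "accuracy" means the fraction of words output correctly; both are my simplifying assumptions rather than how operators are formally assessed, and the function name is just for illustration.

```python
def expected_errors(words_per_minute: float, accuracy: float, minutes: float) -> float:
    """Estimate how many words come out wrong over a session.

    Assumes continuous speech at a constant rate and that accuracy is
    the fraction of words output correctly (illustrative assumptions).
    """
    total_words = words_per_minute * minutes
    return total_words * (1.0 - accuracy)


per_minute = expected_errors(180, 0.98, 1)
per_hour = expected_errors(180, 0.98, 60)
print(f"About {per_minute:.1f} wrong words per minute, {per_hour:.0f} per hour")
```

So even at the minimum standard, a reader would meet only a handful of wrong words per minute, and in practice speakers pause, so the real number is likely to be lower.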
Palantype and Stenography are different input systems which produce similar/equivalent output.
What terms are used in North America?
CART (Communication Access Real-Time Translation) is the most common term I've seen used in the USA and Canada. Others include open captioning and realtime stenography.
What does VSTTR look like?
When used for large conferences with more than one anticipated user VSTTR output should be on a large screen that is visible to everyone in the room. Anyone who attended BiCon 2010 plenaries would have seen VSTTR in action. The London 2012 Olympics also had some VSTTR (Scroll to about 4/5 of the way down the page).
For use by individuals or those wanting more discretion, VSTTR output can also be transmitted to individual devices like laptops, tablets and smartphones.
I can't find any UK VSTTR videos and the only US ones I can find aren't themselves subtitled which is offensively annoying.
There are also other slightly different ways of presenting VSTTR output, such as an overlay onto a video screen, slides or similar which only shows 3 or so lines at a time, which some people prefer. This tends to be known as "captions" but not always, so do be careful to check what the output options are.
What is VSTTR useful for?
VSTTR can be used for most situations where there is someone or a group of people speaking. Lectures and presentations are relatively simple to manage even with questions from the floor. Group discussions and seminars can also be managed but do need excellent sound and careful chairing with one person speaking at a time. Speakers also need to introduce themselves before speaking each time so the VSTTR operator can label people individually.
VSTTR isn't so ideal for times when a user needs to move around a lot, say standing around in small group chatting sessions, where a lipspeaker or sign language interpreter could walk around with someone. I have seen VSTTR operators be creative about this too where needed.
Like lipspeaking, VSTTR requires a decent level of English for the user to be able to follow it properly. It isn't a replacement for sign language interpreting as British Sign Language (BSL) is a different language. Some sign language users who have good English skills may find VSTTR useful so suitability will be down to individuals.
Ultimately the choice to use VSTTR will depend on a number of factors: suitability for the deaf users, other communication support requested/provided, availability of VSTTR operators and things like available funding.
There is some interesting work being done about realtime text access like VSTTR benefiting people who have English as a second language as it helps them understand spoken information ensuring they haven't misheard what was said thus consolidating their vocabulary. Much the same as subtitles on TV or DVDs.
Professional bodies for VSTTR
The professional body which represents VSTTR operators for deaf people is the Association of Verbatim Speech to Text Reporters (AVSTTR) http://avsttr.org.uk/.
VSTTR operators train for approximately 5 years, usually being court reporters first and then doing extra deaf awareness and technology courses before becoming registered with the NRCPD (National Registers of Communication Professionals working with Deaf and Deafblind People) in the UK or the NCRA (National Court Reporters Association) in the USA.
You can request booking information via AVSTTR's website at no additional cost: complete the webform and it will get sent out to members. VSTTR operators also have their own websites, which you can find linked from the main site with photos, and you can contact them directly based on their listed region.
You can also search for VSTTR operators on the National Registers of Communication Professionals (NRCPD) website at http://www.nrcpd.org.uk/. Tip: you have to provide a location before it'll let you select professional type in the next field.
Other agencies I have used are Sign Solutions in Birmingham, BID in Birmingham and RNID (now Action on Hearing Loss), but they will charge extra fees on top compared to going directly to AVSTTR or individuals.
The VSTTR operators on AVSTTR all set their own fees which are about £200 per half day or £300 per full day plus travel/accommodation/subsistence if needed - they may need to charge extra if they travel a long way.
Remote VSTTR
In the last few years remote VSTTR has become available. This requires
- Good sound,
- Good Internet and
- Reliable technology.
Basically the sound of the speakers is transmitted via the Internet using Skype or similar to the remote VSTTR operator who outputs the text which can be accessed via various secure or custom webpage environments for the user. This can be shown on a large screen or smaller device.
121 Captions was the first UK remote VSTTR provider, run by Tina Lannin who is herself deaf. 121 also specialise in a number of overseas services and in 16 different languages.
There is also Bee Communications run by Beth Abbott who I have worked with over the last year to pilot remote captioning for students at my workplace.
Both 121 Captions and Bee Communications use VSTTR operators from all over the world and have slightly different focuses but having seen work by both I am impressed by the quality and standard.
Remote costs seem to be about £70 per full hour and some providers will break down the cost into shorter chunks after an initial hour or two.
Remote VSTTR has advantages in that you can book for shorter periods of time without paying the operator's travel and other costs, so you're not restricted to half or full days and can get costs down. There is also added privacy/discretion for the deaf user: they don't have a visible "support worker" in the room with them and can use a laptop, iPad/tablet or even smartphone to access the content. Remote VSTTR is also bookable at shorter notice than in-person VSTTR, which is especially useful when the user is not always able to get their schedule in advance.
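As a rough illustration of where remote booking saves money, here is a small sketch using the approximate fees quoted in this post (£200 per half day, £300 per full day, £70 per remote hour). The four-hour half day, billing remote work in whole hours and the function names are my own assumptions for the sake of the example; real operators and providers set their own terms.

```python
import math

# Approximate figures quoted in this post, in GBP.
IN_PERSON_HALF_DAY = 200
IN_PERSON_FULL_DAY = 300
REMOTE_PER_HOUR = 70

# Assumption for illustration: a "half day" covers up to 4 hours.
HALF_DAY_HOURS = 4


def in_person_cost(hours: float, travel: float = 0.0) -> float:
    """In-person fee: half-day or full-day rate plus any travel costs."""
    base = IN_PERSON_HALF_DAY if hours <= HALF_DAY_HOURS else IN_PERSON_FULL_DAY
    return base + travel


def remote_cost(hours: float) -> float:
    """Remote fee, assuming every started hour is billed as a full hour."""
    return REMOTE_PER_HOUR * math.ceil(hours)


for hours in (1, 2, 3, 5, 7):
    print(f"{hours}h: in-person £{in_person_cost(hours):.0f}, remote £{remote_cost(hours):.0f}")
```

On these assumptions remote is cheaper for sessions of an hour or two, while for longer bookings the in-person day rates catch up unless travel and accommodation are significant.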
However, there are disadvantages as well. The Internet *needs* to be good, preferably wired and reliable. WiFi is often not good enough or randomly goes unreliable and flaky (contention on the network etc). MiFi/4G dongles or devices can work sometimes but again are limited by reception. The chance of technical issues is a lot higher than in-person VSTTR in a new place with unfamiliar users.
In general I will still aim for in-person VSTTR for whole-day events and places where I'm not familiar with the tech setup. Remote VSTTR is however very useful for students as it allows flexibility and discretion and if we're able to provide the students with MiFi dongles as well as university WiFi it seems to be usable for most students most of the time.
How many VSTTR operators?
In the UK we only have about 25 qualified VSTTR operators listed on the AVSTTR website. However, there seem to be more in the US and Australia, and timezone differences can increase availability, so remote services open up a wider pool of available operators.
The AVSTTR operators work closely together to share vocabulary dictionaries and if using remote and not booking too late it may be possible to find operators with relevant specialist knowledge for the subject.
For complex work, sessions of longer than a few hours total, or where 5-10 minute breaks every hour or so are not possible, it is recommended that two VSTTR operators are booked. This is about personal health and safety for the operator: their work is cognitively very demanding and they need breaks to rest their hands. You may need to think about structuring your event to allow for regular breaks, which benefit a number of people and allow for concentration realities anyway.
My views on VSTTR
I *like* VSTTR, I think that's obvious from this entire post. For me it's ideal cos not only is it text, which I do consider to be my first and preferred language, but if it's full-screen output then there's a whole screen worth of text. It's like scrollback on tap! It solves the issue of access to the spoken content AND note-making as I have a transcript for use later if I want.
With a speaker I can hear reasonably well I use VSTTR as a visual supplement, listening to and lipreading the speaker while glancing at the VSTTR output to ensure I've understood what is said and consolidate it in memory. My reading speed is *fast*. Access to VSTTR increases my stamina for spoken content drastically and means I have a good memory of what has been said and is going on. Where a speaker is harder to hear (i.e. quietly spoken, male, or has an accent) I rely more on the VSTTR and don't struggle to lipread or hear.
How I found VSTTR
I was never offered VSTTR at university (it was uncommon in the UK in the early noughties although I knew of people who had it in the US as CART) and I suspect it'd have been difficult to arrange for classes. I also wasn't offered VSTTR in the workplace despite having 3 AtW assessments (once in 2007 and twice in 2008/9) with assessors who knew BSL was not my first or preferred language. I wonder if my ability to understand and converse with my two deaf assessors in BSL (well more SSE speech and SSEish sign at the same time) meant they believed my sign was more durable and reliable than it is. I didn't use my AtW BSL budget allocated in 2008 because it didn't quite meet the need I had and it was a lot of effort to arrange and not possible at short notice. I am still often told that my BSL is better than I think it is, certainly my reception of interpreted spoken information.
Over the last 6 years in full-time employment I have gained a better idea of what my difficulties with hearing and processing speech are. I know it's endurance of about 60 mins without a break and 3 hours total before I get exhausted and dizzy. I also have a poor memory for things I have heard and struggle to remember the sense and meaning as well as detail of things delivered via conferences or presentation. In contrast, my visual memory for things in text is superb, which is why I like notes and being able to read as much in text as possible.
I am hoping to get Access to Work (AtW) funded VSTTR for work conferences, meetings with people I don't know and webinars soon.
Remote VSTTR and Webinars
Webinars, which are becoming more common in my field of Assistive Technology, are extremely difficult for me to access as they tend to involve one or two speakers on a poor video link which isn't lipreadable, plus a web chat which I can access. I find after only a few minutes of trying to follow speakers that I get headaches and feel unwell from the effort of understanding audio alone. Remote VSTTR will be ideal as the operator can log into the webinar and get good audio and I can follow the typed output on one of my work screens (I have two).