A three-year-old attack technique to bypass Google’s audio reCAPTCHA by using its own Speech-to-Text API has been found to still work.
A researcher uses an old unCAPTCHA trick against the latest audio version of reCAPTCHA, with a 97% accuracy rate.
Researcher Nikolai Tschacher disclosed his findings in a proof-of-concept (PoC) video of the attack on January 2.
An old attack method dating back to 2017 that uses voice-to-text to bypass CAPTCHA protection turns out to still work on Google’s latest reCAPTCHA v3.
“The idea of the attack is very simple: You grab the MP3 file of the audio reCAPTCHA and you submit it to Google’s own speech-to-text API,” Tschacher said in a write-up. “Google will return the correct answer in over 97% of all cases.”
CAPTCHA, a term coined in the early 2000s, is an acronym for Completely Automated Public Turing test to tell Computers and Humans Apart.
reCAPTCHA is Google’s name for its own technology and free service that uses image, audio, or text challenges to verify that a human, not a bot, is signing into an account.
The service is a bit of code available free of charge from Google for accounts that handle fewer than 1 million queries a month. Google recently started charging for larger reCAPTCHA accounts.
reCAPTCHA is a popular version of the CAPTCHA technology that was acquired by Google in 2009. The search giant released the third iteration of reCAPTCHA in October 2018. Instead of interrupting users with challenges, it returns a score (0 to 1) based on a visitor’s behavior on the website, all without user interaction.
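To make the score-based model concrete, here is a minimal sketch of how a site might act on a v3 result. It assumes the verification response has already been parsed into a dictionary with the `success`, `score`, and `action` fields; the function name `judge_v3_response`, the `Verdict` type, and the 0.5 default threshold are illustrative assumptions, not something taken from Tschacher’s write-up.

```python
from dataclasses import dataclass


@dataclass
class Verdict:
    allow: bool
    reason: str


def judge_v3_response(payload: dict, expected_action: str,
                      threshold: float = 0.5) -> Verdict:
    """Decide whether to let a request through, given a parsed v3 result.

    The threshold is an assumption: each site picks its own cut-off.
    """
    # A failed verification is rejected regardless of any score present.
    if not payload.get("success", False):
        return Verdict(False, "verification failed")
    # The token should have been issued for the action we expected.
    if payload.get("action") != expected_action:
        return Verdict(False, "unexpected action")
    # Scores run from 0 to 1; higher means more likely human.
    score = float(payload.get("score", 0.0))
    if score < threshold:
        return Verdict(False, "score below threshold")
    return Verdict(True, "ok")
```

A site could call this with the decoded response and, for low scores, fall back to a stricter check rather than blocking outright.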
The report includes a video showing how Tschacher’s bot works. He added that this attack method works on even the latest version, reCAPTCHA v3.
Tschacher pointed out that his bot wouldn’t be easy to use at scale for three specific reasons: Google rate-limits audio CAPTCHA access, Google is likely tracking bot metrics, and Google creates a fingerprint of each browsing device to stop bots.
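Google has not published how its rate limiting works, so the following is only a generic illustration of the idea: a sliding-window limiter that caps how many requests one client can make in a given period. The class name `SlidingWindowLimiter` and its parameters are hypothetical and do not describe Google’s implementation.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per client within window_seconds."""

    def __init__(self, max_requests: int, window_seconds: float):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        # Per-client queue of timestamps for recently allowed requests.
        self._hits = defaultdict(deque)

    def allow(self, client_id: str, now: float) -> bool:
        hits = self._hits[client_id]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

Passing the timestamp in explicitly keeps the sketch deterministic; a real service would read a clock and key on something sturdier than a client-supplied identifier.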
To carry out the attack, the audio payload is programmatically identified on the page using tools like Selenium, then downloaded and fed into an online audio transcription service such as Google Speech-to-Text API, the results of which are ultimately used to defeat the audio CAPTCHA.
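The data flow of those steps can be sketched as follows. This is not Tschacher’s code: every external interaction (locating the audio on the page, downloading it, transcribing it, submitting the answer) is represented by a hypothetical callable passed in by the caller, so the sketch shows only the order of the steps and where the pipeline can stop early.

```python
from typing import Callable, Optional


def solve_audio_challenge(
    find_audio_url: Callable[[], Optional[str]],
    download: Callable[[str], bytes],
    transcribe: Callable[[bytes], str],
    submit: Callable[[str], bool],
) -> bool:
    """Run the four steps in order; all four callables are placeholders."""
    # Step 1: locate the audio payload on the page (browser automation).
    audio_url = find_audio_url()
    if audio_url is None:
        return False
    # Step 2: download the audio file.
    audio = download(audio_url)
    # Step 3: hand the audio to a speech-to-text service.
    answer = transcribe(audio).strip().lower()
    if not answer:
        return False
    # Step 4: send the transcript back as the challenge answer.
    return submit(answer)
```

Structuring it this way also makes the defensive picture clear: each of the three obstacles Tschacher lists would surface as a failure in one of these injected steps.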
The whole attack hinges on research dubbed “unCaptcha,” published in April 2017 by University of Maryland researchers, who targeted the audio version of reCAPTCHA and reported that they “achieved 85 percent accuracy.”
Google responded with improved browser automation detection and the use of spoken phrases instead of numbers, according to the researchers’ GitHub reports. But by June 2018 researchers found the latest reCAPTCHA was easier to trick than its predecessor.
Following the attack’s disclosure, Google updated reCAPTCHA in June 2018 with improved bot detection and support for spoken phrases rather than digits. That was not enough to thwart the attack: the researchers released “unCaptcha2” as a PoC with even better accuracy (91%, compared with unCaptcha’s 85%) by using a “screen clicker to move to certain pixels on the screen and move around the page like a human.”
In March 2018, Google addressed a separate flaw in reCAPTCHA: when a web application using the technology crafted its request to “/recaptcha/api/siteverify” in an insecure manner, the protection could be bypassed every time.
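The article does not spell out what “insecure” meant here. One plausible reading, offered as an assumption, is that the application pasted the user-supplied token straight into the request string, letting an attacker smuggle in extra parameters. A minimal sketch of the safer pattern encodes every value before it reaches the request body; the helper name `build_siteverify_body` is hypothetical.

```python
from urllib.parse import urlencode

# Verification endpoint; the path matches the one quoted in the article.
SITEVERIFY_URL = "https://www.google.com/recaptcha/api/siteverify"


def build_siteverify_body(secret: str, client_token: str) -> bytes:
    """Build a form-encoded POST body for the verification request.

    urlencode percent-encodes '&' and '=' inside values, so a token such as
    'x&secret=other' stays a single 'response' value instead of adding a
    second 'secret' parameter.
    """
    return urlencode({"secret": secret, "response": client_token}).encode("ascii")
```

The body would then be sent with an HTTP POST to `SITEVERIFY_URL`; the network call is left out so the sketch stays self-contained.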
The German researcher published the PoC code.
The automatic solving of CAPTCHA challenges has become a very popular area of research. Free browser extensions have even been developed that help users respond to these tests at the push of a button.