Publication
Big Data 2022
Conference paper

Robust Text CAPTCHAs Using Adversarial Examples

View publication

Abstract

CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) is a widely used technology to distinguish real users and automated users such as bots. However, the advance of AI technologies weakens many CAPTCHA tests and can induce security concerns. In this paper, we propose a user-friendly text-based CAPTCHA generation method named Robust Text CAPTCHA (RTC). At the first stage, the foregrounds and backgrounds are constructed with font and background images respectively sampled from font and image libraries, and they are then synthesized into identifiable pseudo adversarial CAPTCHAs. At the second stage, we utilize a highly transferable adversarial attack designed for text CAPTCHAs to better obstruct CAPTCHA solvers. Our experiments cover comprehensive models including shallow models such as KNN, SVM and random forest, as well as various deep neural networks and OCR models. Experiments show that our CAPTCHAs have a failure rate lower than one millionth in general and high usability. They are also robust against various defensive techniques that attackers may employ, including adversarially trained CAPTCHA solvers and solvers trained with collected RTCs using manual annotation. Codes available at https://github.com/RulinShao/RTC.