CAPTCHA Solving With Neural Networks
Abstract
A CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) is a test that determines whether a user is a human or a machine. Variations include audio and visual CAPTCHAs, which are often found on registration webpages to prevent automated (spam) registration. This project focuses on visual CAPTCHAs, which consist of an image containing letters or numbers that the user must type into a form. The goal is to devise a system that breaks a particular CAPTCHA found on captchas.net.
Background
A CAPTCHA's purpose is to distinguish a human from a computer by presenting a challenge that is easy for most humans but difficult for computers. The visual CAPTCHA, often an image containing a series of letters and/or numbers, prompts the user to decipher its message. The image often contains distortions that make it difficult for a computer to read, including rotation, translation, scaling, background noise, and color. To beat a CAPTCHA, a computer must identify which pixels make up the letters, usually after removing the background clutter. Once the letters are separated from the image, they must be identified, which is often done with a neural network.
Neural networks model biological neural systems. A network typically has an input layer, an output layer, and possibly many hidden layers. The input layer takes in data, the output layer produces the result, and the hidden layers do intermediate processing. Each neuron fires with a value between 0 and 1: when it receives input, it weights the values from the neurons connected to it and fires depending on the sum of those weighted inputs. Through training, the network adjusts the weights in each layer according to the error it produces. The rate at which the network corrects its weights is called the learning rate. Because the network is highly connected, neural networks can model highly complex, nonlinear systems and can be proficient at classification and pattern recognition.
Research on this subject has been carried out for decades. Sherin M. Youssef and Shaza B. AbdelRahman researched license plate recognition in their paper, A Smart Access Control Using An Efficient License Plate Location And Recognition Approach.
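The neuron model and weight correction described in this section can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual network: the class name TinyNetwork, the layer sizes, the sigmoid activation, the backpropagation update, and the toy logical-OR training data are all assumptions chosen for brevity. In a CAPTCHA solver the inputs would instead be the pixels of a segmented character and there would be one output per character class.

```python
import math
import random


def sigmoid(x):
    """Squash a weighted sum into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


class TinyNetwork:
    """Hypothetical network: one hidden layer, sigmoid neurons, backpropagation."""

    def __init__(self, n_in, n_hidden, n_out, learning_rate=0.5, seed=0):
        rng = random.Random(seed)
        self.learning_rate = learning_rate
        # One row per neuron; the last entry of each row is the bias weight.
        self.w_hidden = [[rng.uniform(-1, 1) for _ in range(n_in + 1)]
                         for _ in range(n_hidden)]
        self.w_out = [[rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
                      for _ in range(n_out)]

    @staticmethod
    def _fire(weights, inputs):
        # Weighted sum of the incoming values plus the bias, then sigmoid.
        total = sum(w * x for w, x in zip(weights, inputs)) + weights[-1]
        return sigmoid(total)

    def forward(self, inputs):
        hidden = [self._fire(w, inputs) for w in self.w_hidden]
        output = [self._fire(w, hidden) for w in self.w_out]
        return hidden, output

    def train_step(self, inputs, targets):
        """Adjust every weight in proportion to the error and the learning rate."""
        hidden, output = self.forward(inputs)
        # Output deltas: error times the sigmoid derivative o * (1 - o).
        d_out = [(t - o) * o * (1 - o) for t, o in zip(targets, output)]
        # Hidden deltas: output deltas propagated back through the old weights.
        d_hid = [h * (1 - h) * sum(d * w[j] for d, w in zip(d_out, self.w_out))
                 for j, h in enumerate(hidden)]
        for d, w in zip(d_out, self.w_out):
            for j, h in enumerate(hidden):
                w[j] += self.learning_rate * d * h
            w[-1] += self.learning_rate * d
        for d, w in zip(d_hid, self.w_hidden):
            for i, x in enumerate(inputs):
                w[i] += self.learning_rate * d * x
            w[-1] += self.learning_rate * d


def total_error(net, samples):
    """Sum of squared errors over all samples."""
    return sum((t - o) ** 2
               for x, ts in samples
               for t, o in zip(ts, net.forward(x)[1]))


# Toy data (logical OR), standing in for labelled character images.
samples = [([0, 0], [0]), ([0, 1], [1]), ([1, 0], [1]), ([1, 1], [1])]
net = TinyNetwork(n_in=2, n_hidden=3, n_out=1)
error_before = total_error(net, samples)
for _ in range(2000):
    for x, t in samples:
        net.train_step(x, t)
error_after = total_error(net, samples)
print(f"squared error before training: {error_before:.4f}, after: {error_after:.4f}")
```

Every neuron's output stays between 0 and 1 because of the sigmoid, and the learning_rate parameter scales how far each weight moves per correction.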
Accuracy improves with the number of training iterations up to a point, after which it stabilizes at a maximum. Accuracy also improves with the learning rate up to a point, beyond which the network overshoots and accuracy drops. The program often mistook q's for g's and i's for l's, probably because of the low quality of the images fed into it. Future versions of this program could handle more distortions, segment characters better, remove noise more efficiently, and use more sophisticated neural network structures.
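The overshoot effect can be illustrated with a toy example that is much simpler than the project's network. The sketch below assumes plain gradient descent on the one-dimensional function f(w) = w**2, whose minimum is at w = 0. The function name descend and the specific learning rates are hypothetical and chosen only to show the trend: a larger rate gets closer to the minimum in the same number of steps, until the rate becomes so large that each step jumps past the minimum and the error grows.

```python
def descend(learning_rate, steps=20, start=1.0):
    """Minimize f(w) = w**2 by gradient descent; return the final distance from 0."""
    w = start
    for _ in range(steps):
        gradient = 2.0 * w           # derivative of w**2
        w -= learning_rate * gradient
    return abs(w)


# Illustrative rates only; they are not taken from the project's experiments.
for rate in (0.01, 0.1, 0.4, 1.1):
    distance = descend(rate)
    print(f"learning rate {rate}: distance from minimum after 20 steps = {distance:.4g}")
```

With this function each step multiplies w by (1 - 2 * learning_rate), so any rate above 1.0 makes the distance grow with every step instead of shrinking.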