C.A.P.T.C.H.A Completely Automated Public Turing Test to Tell Computers and Humans Ap
21-04-2010, 11:42 PM
Completely Automated Public Turing Test to Tell Computers and Humans Apart
A CAPTCHA (an acronym for "completely automated public Turing test to tell computers and humans apart", trademarked by Carnegie Mellon University) or a MAPTCHA (Mathematical) is a type of challenge-response test used in computing to determine whether or not the user is human. The term was coined in 2000 by Luis von Ahn, Manuel Blum, Nicholas J. Hopper of Carnegie Mellon University, and John Langford of IBM. A common type of CAPTCHA requires that the user type the letters of a distorted image, sometimes with the addition of an obscured sequence of letters or digits that appears on the screen. Because the test is administered by a computer, in contrast to the standard Turing test that is administered by a human, a CAPTCHA is sometimes described as a reverse Turing test. This term, however, is ambiguous because it could also mean a Turing test in which the participants are both attempting to prove they are the computer.
Since the early days of the Internet, users have wanted to make text illegible to computers. The first such people were hackers, posting about sensitive topics to online forums they thought were being automatically monitored for keywords. To circumvent such filters, they would replace a word with look-alike characters. HELLO could become |-| 3 |_ |_ () or )-( 3 £ £ 0, as well as numerous other variants, such that a filter could not possibly detect all of them. This later became known as leetspeak.
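To see why a keyword filter cannot keep up, here is a small Python sketch. The substitution table and the naive filter are made up for illustration (only the two HELLO spellings come from the text above); real filters and real leetspeak are both richer than this.

```python
from itertools import product

# Hypothetical look-alike table; only a few substitutions per letter.
LOOKALIKES = {
    "H": ["H", "|-|", ")-("],
    "E": ["E", "3"],
    "L": ["L", "|_", "£"],
    "O": ["O", "0", "()"],
}

def variants(word):
    """Every spelling of `word` obtainable from the table above."""
    choices = [LOOKALIKES.get(ch, [ch]) for ch in word]
    return [" ".join(combo) for combo in product(*choices)]

def naive_filter(text, banned=("HELLO",)):
    """A hypothetical keyword filter: exact match after removing spaces."""
    squeezed = text.replace(" ", "")
    return any(word in squeezed for word in banned)

all_variants = variants("HELLO")
caught = [v for v in all_variants if naive_filter(v)]
print(len(all_variants), len(caught))  # 162 1
```

Even with two or three substitutions per letter, a five-letter word already has 162 spellings, and the exact-match filter catches only the unmodified one.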
The first discussion of automated tests which distinguish humans from computers for the purpose of controlling access to web services appears in a 1996 manuscript of Moni Naor from the Weizmann Institute of Science, entitled "Verification of a human in the loop, or Identification via the Turing Test". Primitive CAPTCHAs seem to have been later developed in 1997 at AltaVista by Andrei Broder and his colleagues in order to prevent bots from adding URLs to their search engine. Looking for a way to make their images resistant to OCR attack, the team looked at the manual to their Brother scanner, which had recommendations for improving OCR's results (similar typefaces, plain backgrounds, etc.). The team created puzzles by attempting to simulate what the manual claimed would cause bad OCR. In 2000, von Ahn and Blum developed and publicized the notion of a CAPTCHA, which included any program that can distinguish humans from computers. They invented multiple examples of CAPTCHAs, including the first CAPTCHAs to be widely used (at Yahoo!).
CAPTCHAs are used to prevent bots from using various types of computing services. Applications include preventing bots from taking part in online polls, registering for free email accounts (which may then be used to send spam), and, more recently, preventing bot-generated spam by requiring that the (unrecognized) sender pass a CAPTCHA test before the email message is delivered. They have also been used to prevent people from using bots to assist with massive downloading of content from multimedia websites. CAPTCHAs are used in online message boards and blog comments to prevent bots from posting spam links as a comment or message.
CAPTCHAs are by definition fully automated, requiring little human maintenance or intervention in administering the test. This has obvious benefits in cost and reliability.
The algorithm used to create the CAPTCHA is often made public, though it may be covered by a patent. This is done to demonstrate that breaking it requires the solution to a difficult problem in the field of artificial intelligence (AI) rather than just the discovery of the (secret) algorithm, which could be obtained through reverse engineering or other means.
CAPTCHAs based on reading text, or other visual-perception tasks, prevent visually impaired users from accessing the protected resource. However, CAPTCHAs do not have to be visual. Any hard artificial intelligence problem, such as speech recognition, can be used as the basis of a CAPTCHA. Some implementations of CAPTCHAs permit users to opt for an audio CAPTCHA.
The development of audio CAPTCHAs appears to have lagged behind that of visual CAPTCHAs, however, and presently may not be as effective. Other kinds of challenges, such as those that require understanding the meaning of some text (e.g., a logic puzzle, trivia question, or instructions on how to create a password) can also be used as a CAPTCHA. Again, there is little research into their resistance against countermeasures.
For users with visual impairments (for example, blind users, or colour-blind users facing a colour-based test), visual CAPTCHAs present some serious problems. Because CAPTCHAs are designed to be unreadable by machines, common assistive technology tools such as screen readers cannot interpret them. Since sites use CAPTCHAs as part of the initial registration process, or even at every login, this challenge can completely block access. In other cases, those with sight difficulties can choose to identify a word being read to them.
One alternative method involves displaying to the user a simple mathematical equation and requiring the user to enter the solution as verification. Although these are much easier to defeat using software, they are suitable for scenarios where graphical imagery is not appropriate, and they provide a much higher level of accessibility for visually impaired users than the image-based CAPTCHAs. These are sometimes referred to as MAPTCHAs (M = 'Mathematical').
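A mathematical challenge of this kind could look roughly like the Python sketch below. The function names, the operand range and the wording of the question are all assumptions made for illustration, not part of any particular CAPTCHA product.

```python
import random

def make_challenge(rng=random):
    """Return (question, expected_answer) for a simple arithmetic challenge."""
    a, b = rng.randint(1, 9), rng.randint(1, 9)
    op = rng.choice("+-*")
    answer = {"+": a + b, "-": a - b, "*": a * b}[op]
    return f"What is {a} {op} {b}?", str(answer)

def check_answer(expected, submitted):
    """Compare the user's reply with the stored answer, ignoring stray spaces."""
    return submitted.strip() == expected

question, expected = make_challenge()
print(question)                        # shown to the user as plain text
print(check_answer(expected, expected))  # True when the reply matches
```

Because the question is plain text, a screen reader can read it aloud. For the same reason a bot can parse and solve it with a few lines of code, which is the weakness noted above. In a real deployment the expected answer would have to stay on the server, for example in the session, and never be sent to the client.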
There are a few approaches to defeating CAPTCHAs: using cheap human labour to recognize them, exploiting bugs in the implementation that allow the attacker to completely bypass the CAPTCHA, and finally improving character recognition software.
Computer Character Recognition
Although CAPTCHAs were originally designed to defeat standard OCR software designed for document scanning, a number of research projects and implementations have proven that it is possible to defeat many CAPTCHAs with programs that are specifically tuned for a particular type of CAPTCHA. For CAPTCHAs with distorted letters, the approach typically consists of the following steps:
1. Removal of background clutter, for example with colour filters and detection of thin lines.
2. Segmentation, i.e. splitting the image into segments containing a single letter.
3. Identifying the letter for each segment.
Step 1 is typically very easy to do automatically. In 2005, it was shown that neural network algorithms have a lower error rate than humans in step 3. The only part where humans still outperform computers is step 2. If the background clutter consists of shapes similar to letter shapes, and the letters are connected by this clutter, the segmentation becomes nearly impossible with current software. Hence, an effective CAPTCHA should focus on step 2, the segmentation.
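The three steps can be illustrated with a toy Python sketch. Everything in it is an assumption made for illustration: images are lists of strings, the 3x5 letter templates and all function names are hypothetical, removal of isolated pixels stands in for clutter removal, splitting at blank columns stands in for segmentation, and pixel-wise template matching stands in for the recognition step (a much simpler substitute for the neural networks mentioned above).

```python
# Hypothetical 3x5 templates; '#' is ink, '.' is background.
TEMPLATES = {
    "I": ["###", ".#.", ".#.", ".#.", "###"],
    "L": ["#..", "#..", "#..", "#..", "###"],
    "T": ["###", ".#.", ".#.", ".#.", ".#."],
}

def remove_clutter(img):
    """Step 1: drop ink pixels with no ink neighbour above, below, left or right."""
    h, w = len(img), len(img[0])

    def ink(r, c):
        return 0 <= r < h and 0 <= c < w and img[r][c] == "#"

    cleaned = []
    for r in range(h):
        row = ""
        for c in range(w):
            has_neighbour = ink(r - 1, c) or ink(r + 1, c) or ink(r, c - 1) or ink(r, c + 1)
            row += "#" if ink(r, c) and has_neighbour else "."
        cleaned.append(row)
    return cleaned

def segment(img):
    """Step 2: split the image wherever a column contains no ink."""
    w = len(img[0])
    segments, start = [], None
    for c in range(w + 1):
        has_ink = c < w and any(row[c] == "#" for row in img)
        if has_ink and start is None:
            start = c
        elif not has_ink and start is not None:
            segments.append([row[start:c] for row in img])
            start = None
    return segments

def identify(seg):
    """Step 3: pick the template with the fewest mismatching pixels.
    Assumes the segment has the same size as the templates."""
    def distance(a, b):
        return sum(x != y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    return min(TEMPLATES, key=lambda ch: distance(seg, TEMPLATES[ch]))

def solve(img):
    return "".join(identify(s) for s in segment(remove_clutter(img)))

# "LIT" with one speck of clutter in the gap between L and I.
IMAGE = [
    "#...###.###",
    "#....#...#.",
    "#..#.#...#.",
    "#....#...#.",
    "###.###..#.",
]

print(solve(IMAGE))         # LIT
print(len(segment(IMAGE)))  # 2: without step 1 the speck joins L and I
```

The last line shows the point made about step 2 on a small scale: a single pixel of clutter in the gap is enough to merge two letters when segmentation relies on blank columns, and clutter that actually touches the letters would survive this simple clean-up as well.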
Neural networks have been used with great success to defeat CAPTCHAs as they generally are indifferent to both affine and non-linear transformations. As they learn by example rather than through explicit coding, with appropriate tools very limited technical knowledge is required to defeat more complex CAPTCHAs.
Since CAPTCHAs are based on open problems in artificial intelligence (AI), they also offer well-defined challenges for the AI community, and induce security researchers, as well as otherwise malicious programmers, to advance the field of AI. (This is similar to research in cryptography advancing algorithms for factoring large numbers.) Several groups have created programs that can pass many CAPTCHAs over 80% of the time (see below). These algorithms represent significant progress in the area of text recognition. CAPTCHAs are thus a win-win situation: either a CAPTCHA is not broken and there is a way to differentiate humans from computers, or the CAPTCHA is broken and an AI problem is solved. Using harder AI problems, our newly developed CAPTCHAs are still not broken.
Search Engine Bots.
Worms and Spam.
Preventing Dictionary Attacks.
Proposed Idea (Implementation):
The idea is to devise a CAPTCHA toolkit that can be used as a plug-in or independently to create customized CAPTCHAs. As a plug-in to standard software packages, the toolkit can protect them from custom brute-force attacks. Even operating system logins and similarly fragile login mechanisms can be strengthened using CAPTCHAs. Moreover, by using advanced AI algorithms that are created on the fly, the random nature of the CAPTCHAs can be achieved.
Any standard OCR software will be used to detect whether the generated CAPTCHAs are strong or weak.
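One way this strength check might be organised is sketched below in Python. The `ocr` argument is a placeholder for whatever OCR software is chosen (any callable that takes an image and returns the text it read); the function names and the 1% threshold are assumptions for illustration, not values from the proposal.

```python
def solve_rate(samples, ocr):
    """Fraction of (image, answer) pairs that the OCR callable reads correctly."""
    if not samples:
        raise ValueError("need at least one sample")
    solved = sum(
        1 for image, answer in samples
        if ocr(image).strip().lower() == answer.lower()
    )
    return solved / len(samples)

def classify(rate, threshold=0.01):
    """Label a CAPTCHA scheme 'weak' if OCR solves more than `threshold` of samples."""
    return "weak" if rate > threshold else "strong"

# Stand-in OCR for demonstration only: it "reads" one of two images correctly.
fake_ocr = {"img1": "ABC", "img2": "???"}.get
samples = [("img1", "abc"), ("img2", "xyz")]

rate = solve_rate(samples, fake_ocr)
print(rate, classify(rate))  # 0.5 weak
```

A meaningful result would need many generated samples per scheme, since a rate measured on a handful of images says little.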