AI Detection Software – Useful Tool or Inaccurate Application?

Artificial intelligence (AI) has taken the world by storm. With varied applications across many industries, AI tools are revolutionizing the world as we know it. One of the industries most heavily affected is content creation, where a number of AI tools now exist to help produce content. However, this introduces issues when you need original content, or when creating the content is itself part of the learning experience.

To combat the use of AI and large language models (LLMs) to create entire articles, research papers, and more, developers have also released tools to detect the use of AI during the writing process. These AI detection tools are becoming an integral part of our daily lives. If you work as a marketer, SEO manager, content creator, or educator, you often need a way to determine whether a third-party tool was used to write the content being submitted.

Yet, just as we think we have a solution to the problem, a new one creeps in. While AI detection tools seemed to be an answer to identifying the harmful or inappropriate use of AI, their accuracy and reliability have been called into question. As such, we’ve decided to run an extensive test on various AI detection tools to see for ourselves whether these programs are an effective way to detect AI.

What is AI Detection Software?

AI detection tools are specialized apps developed to identify and classify patterns within a piece of written text. These tools analyze large amounts of written data using artificial intelligence algorithms to make predictions. The goal of these tools is to accurately identify whether content was written by AI or a human.

If you’ve been in the content creation industry for the past few years, you’ve probably had a front-row seat to the evolution of AI and detection tools in our industry. Perhaps, like us, you’ve already noticed the influx of writers using these tools to improve or speed up the production of articles. Perhaps, like us, you’d prefer to have all of your content original and written by a human. After all, if a client is paying for original work, that’s what you want to deliver.

Being able to accurately detect the use of large language model (LLM) content is essential in many industries.

Unfortunately, the use of LLM content can have a number of societal effects that many aren’t aware of. Some of the societal risks posed by LLMs include:

  • Mass propaganda
  • Fake news
  • Toxic spam
  • Academic dishonesty/AI plagiarism
  • Cheating writers
  • Cheating content agencies
  • Fake product reviews
  • Fake job applications
  • Fake university application essays
  • Fake scholarship applications

Clearly, it’s vital that there’s a reliable and accurate way to detect the use of AI in situations where it’s uncalled for. Hence the rise of AI detection tools. While these tools have advanced remarkably in recent years, their effectiveness hasn’t really been tested. This uncertainty affects a number of industries. Errors in AI detectors can have far-reaching consequences, some of which have already become apparent. One of the most notable is original human-written work getting flagged as AI, leading to erroneous accusations. This has already happened, particularly within the education sector.

These situations have placed a spotlight on the efficacy of such tools and whether they’re reliable to use in real-world scenarios. Through thorough and independent testing, we hope to discover just how accurate and reliable these tools are.

How Do AI Detection Tools Work?

Perhaps you’re like us and you started using these tools without much thought about how they work. After all, the name tells you exactly what the tool does. We want to be able to deliver the best original content to our clients, and using these tools was a great option to ensure the articles we submitted were original. But we noticed shortly after starting to use these tools that they aren’t all equally good. That’s when we decided to start investigating how these tools really work.

There are three general approaches that AI detectors, or ‘classifiers’ in machine learning talk, can use to differentiate between human-written and AI-generated content. These are known as feature-based, zero-shot, or fine tuning AI model approaches. 

Basically, a feature-based approach relies on certain identifiable differences between AI and human content. These tools look for such patterns and base their evaluations on how many times the patterns repeat themselves. The zero-shot approach works a little differently. In this case the tool uses its own pre-trained language model to detect whether content was created by a similar tool. Finally, the fine tuning approach utilizes large datasets across different language models that contain both AI and human-written content for training.

Let’s take a brief look at how each one of these works in a bit more detail.

Feature-Based Approach

A feature-based approach relies on the fact that there are known, identifiable differences between content written by a human and content generated by AI or an LLM, like ChatGPT. These features include:

  • Burstiness – Refers to the appearance of certain words in clusters, or bursts, rather than being spread out evenly throughout the text. Higher burstiness usually indicates a human writer.
  • Perplexity – Measures how well a probability model can predict the next word. Higher perplexity usually indicates a human writer.
  • Frequency features – Refers to the count of how frequently words, phrases, or word types appear in the text. AI typically overuses or underuses certain words or phrases at a rate not consistent with a human writer.
  • Punctuation – Evaluates the use and distribution of punctuation in the text. Typically, AI uses punctuation correctly, but in stylistically unusual ways.
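
As a rough illustration of the first two features, the sketch below computes perplexity with a toy unigram model and measures burstiness as the variance-to-mean ratio of a word's counts across fixed-size windows. The function names, the add-one smoothing, and the window size are all assumptions made for this example; commercial detectors rely on far larger language models and don't publish their exact formulas.

```python
import math
from collections import Counter


def unigram_perplexity(train_tokens, test_tokens):
    """Perplexity of test_tokens under an add-one-smoothed unigram model.

    Toy stand-in for a real language model: each word's probability
    depends only on how often it appeared in train_tokens.
    """
    if not test_tokens:
        raise ValueError("test_tokens must not be empty")
    counts = Counter(train_tokens)
    vocab = set(train_tokens) | set(test_tokens)
    denominator = len(train_tokens) + len(vocab)  # add-one smoothing
    log_prob = sum(math.log((counts[tok] + 1) / denominator) for tok in test_tokens)
    # Perplexity is the exponential of the average negative log-probability.
    return math.exp(-log_prob / len(test_tokens))


def burstiness(tokens, word, window=50):
    """Variance-to-mean ratio of a word's counts across fixed-size windows.

    0.0 means the word is spread perfectly evenly; larger values mean
    the word shows up in clusters.
    """
    chunks = [tokens[i:i + window] for i in range(0, len(tokens), window)]
    counts = [chunk.count(word) for chunk in chunks]
    if not counts:
        return 0.0
    mean = sum(counts) / len(counts)
    if mean == 0:
        return 0.0
    variance = sum((c - mean) ** 2 for c in counts) / len(counts)
    return variance / mean


train = "the cat sat on the mat".split()
predictable = unigram_perplexity(train, "the cat sat".split())
surprising = unigram_perplexity(train, "quantum zebra sat".split())
print(surprising > predictable)  # True: unseen words are harder to predict

even = (["x"] + ["a"] * 4) * 4        # "x" once in every window of 5
clustered = ["x"] * 4 + ["a"] * 16    # all four "x" in the first window
print(burstiness(even, "x", window=5))       # 0.0
print(burstiness(clustered, "x", window=5))  # 3.0
```

A feature-based detector would combine scores like these with frequency and punctuation features and compare them against thresholds learned from known human and AI text.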

The advantage of a feature-based approach is that once patterns have been identified, the tool can check for them again and again. This makes for a much faster and more cost-effective tool. However, it also means that more advanced LLMs such as GPT-4 and Bard can create more varied content to bypass the tools.

Popular examples of AI detectors that utilize this approach include Winston AI and GPTZero.

Zero-Shot Approach

The zero-shot approach uses a pre-trained language model to identify how the text was created. A simple way to explain how this works is as follows. Basically the tool asks itself what the likelihood is that the content it’s seeing was generated by a tool similar to itself. 

This is a much simpler approach that relies entirely on how the tool was trained rather than looking at things like burstiness, perplexity, and more. This means that the tool is much easier to build and it doesn’t require as much supervised training as a feature-based approach. However, this also means the tool is easier to bypass through paraphrasing and human editing.

One example of an AI detector that uses the zero-shot approach is ZeroGPT.

Fine Tuning AI Model Approach

The fine tuning AI model approach is one of the most effective approaches to AI detection. These tools take language models such as BERT or RoBERTa and train them further on sets of AI-generated and human-written text. The model learns to identify the differences between the two in order to detect whether content was created by AI or a human.

Because it relies on such large datasets for training, it’s one of the most effective approaches to detecting AI. However, this also makes it more expensive to train and operate. There can also be a lag in these tools’ capabilities when it comes to the newest AI generators, since their training needs to be updated first, which takes longer because of the large datasets required.

One example of an AI detector that uses the fine tuning AI model approach is Originality.AI.

Challenges with Using AI Detection Software

As we started looking into AI detection tools more closely, we also started to realize there were a few challenges with using these tools. Being able to use them accurately wasn’t as much about how the user inputs their content, but more about how the tool was trained.

Since AI technology is progressing so fast, developers are struggling to keep up with the demand for these detection tools. Here are a few of the most notable challenges that developers face with AI detection software.

Training Data Quality

AI detection tools rely on vast amounts of training data. In this case, it can’t just be quantity over quality. The quality of the training data also needs to be high in order to produce the most accurate results. Having access to large volumes of high quality training data is essential during model development. Data that contains bias or that’s inaccurate will result in a model that produces unreliable results, or displays discriminatory behavior. 

Developers need to employ rigorous data collection methods, followed by preprocessing techniques to ensure the data is free from any bias and of a high quality. This endeavor can be expensive and time-consuming.

Model Complexity and Algorithm Selection

The complexity of the model being used for the AI detector also plays a crucial role in the accuracy of these tools. More complex models are able to capture more intricate patterns, which leads to more accurate results. However, they are also computationally more expensive, not to mention that they require more extensive training data.

Alternatively, simpler models require fewer resources and are faster to run, at the expense of accuracy. This makes you think about which models free AI detectors use. At some point, the resources developers invest in creating the tool need to be earned back. This is when you start noticing one of two things: ads or paid versions.

Another important consideration is the algorithm selection. Different algorithms perform in different ways depending on the nature of the problem at hand. This means that it’s essential to choose the right algorithm to detect the use of AI for optimal accuracy. 

Developers who write these algorithms themselves have the advantage of creating one that exactly matches their needs. But they’ll also be responsible for constantly updating the algorithm as the technology progresses. Alternatively, using open-source algorithms means you’re not responsible for all the nitty-gritty, but it also means you might have to sacrifice some performance.

Ethical Considerations

When it comes to AI detectors there are certain ethical considerations that should also be taken into account as these tools could have a direct impact on people’s lives. Developers need to address potential biases and ensure fairness. They also need to be honest regarding the true accuracy of their tools. 

Biased datasets can have a discriminatory outcome which can affect already marginalized groups. Additionally, false positives and false negatives can lead to serious consequences in certain industries. Consider the effects that a false positive or negative can have in education, healthcare, or the justice system.

The accuracy of AI detectors should continually be improved, striving to get as close to perfection as possible. Developers should follow ethical guidelines and include diverse perspectives in the development process. Clear accountability and transparency are vital in ensuring that AI systems are created to benefit mankind while minimizing harm.

Picking the Best AI Detection Software

When we started to set up the test, we had to decide what features were important, not only to us but to other users as well. After all, a tool is no good if it cannot deliver the services you need. There are so many options when it comes to AI detection tools and each one will try to convince you that it’s the best.

When it comes to choosing the best AI detection tool, there are a few key considerations to ensure optimal performance and alignment with specific needs. Let’s take a look at what you, as a user, should look at when choosing an AI detector.

Accuracy

Accuracy is probably the most important criterion you’ll be looking at. The effectiveness of any AI detector hinges on its ability to accurately identify and classify patterns. However, you can’t merely rely on the accuracy figure the provider posts. There are various ways of calculating accuracy that can make a tool appear more accurate than it really is. As such, evaluating the tool’s track record and trying it out for yourself is the best way to gain insight into its precision.
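
To see how a headline accuracy figure can flatter a tool, consider a hypothetical detector that simply labels every text as AI. The sketch below scores it on a made-up, deliberately lopsided test set of 95 AI texts and 5 human texts. The numbers and names are illustrative assumptions, not results from any real tool.

```python
def accuracy(labels, predictions):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


# Hypothetical test set: mostly AI-generated text, very little human text.
labels = ["ai"] * 95 + ["human"] * 5
# A "detector" that flags everything as AI without reading it.
predictions = ["ai"] * len(labels)

overall = accuracy(labels, predictions)

# Accuracy measured only on the human-written texts.
human_pairs = [(y, p) for y, p in zip(labels, predictions) if y == "human"]
human_only = accuracy([y for y, _ in human_pairs], [p for _, p in human_pairs])

print(f"Overall accuracy: {overall:.0%}")          # Overall accuracy: 95%
print(f"Accuracy on human texts: {human_only:.0%}")  # Accuracy on human texts: 0%
```

The tool looks 95% accurate while wrongly accusing every human writer in the set, which is why it helps to ask what mix of texts a published accuracy figure was measured on.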

Real Time Analysis

Being able to analyze content in real time is crucial for any application that requires a timely response. A tool’s speed can significantly impact its practical utility, particularly in dynamic situations.

Adaptive Learning

Adaptive learning mechanisms are essential for AI detection tools. These tools need to improve continuously to stay effective in various scenarios. A tool that’s able to adjust its models dynamically based on new data ensures long term relevance and performance.

User Interface (UI)

A user-friendly interface contributes largely to a tool’s accessibility and ease of use. In turn, this facilitates effective utilization across diverse groups. The easier a tool is to use and understand, the more effective it will be.

How We Tested These Tools

In order to get a good overview of how accurate and reliable AI detectors really are, we set up our own test to see it for ourselves. 

To do this, we started by creating our own set of testing data. For the best results we decided to cover three levels of intricacy in the articles we were testing. The first batch was a general or simple topic, something you’d find in a general blog. Next, we had articles that covered a more technical topic, but that was still aimed at beginners. Finally, we chose a topic that was more intricate and technical.

After defining our scope of testing, we started creating the articles. The tests included articles written by a human, ones written by AI and then edited by a human, and ones that are purely AI generated with no human intervention. 

Then we put the articles through each of the tools and recorded the results in a table. We used the following quantitative metrics to evaluate the accuracy of the AI detectors:

  • True positive (TP)
  • True negative (TN)
  • False positive (FP)
  • False negative (FN)

These results can then be used to calculate the following metrics:

  • Precision
  • Recall
  • F1 score
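
As a minimal sketch of how these metrics follow from the four counts, the function below uses the standard textbook definitions, treating "positive" as "flagged as AI" (so a false positive is human writing flagged as AI). The function name and the example counts are illustrative assumptions, not the exact spreadsheet used for this review.

```python
def detection_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall, and F1 from confusion counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    # Precision: of everything flagged as AI, how much really was AI?
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of all the AI texts, how many did the tool catch?
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Illustrative counts: 12 articles, 2 human articles wrongly flagged as AI.
metrics = detection_metrics(tp=6, tn=4, fp=2, fn=0)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
# accuracy: 83.33%
# precision: 75.00%
# recall: 100.00%
# f1: 85.71%
```

Because F1 ignores true negatives, a tool that catches every AI text can still score well on F1 while flagging human work, which is why precision is worth reading alongside it.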

The above metrics will provide us with evidence-based results to demonstrate the true accuracy and reliability of these tools. Most developers claim that their tools are accurate at least 99% of the time. With our own independent testing we can see whether this still applies in real-life scenarios.

For a more in-depth dive into the testing methodology we used and how we calculated the various metrics, take a look at our article on our full testing methodology here.

Test Results of AI Detection Tools

Now, the part that you’ve been waiting for. I have to admit that actually testing these tools was quite the rollercoaster. There were a number of surprises mixed with a few expected results. I wasn’t aware of the extensive number of tools available. I also wasn’t aware of how many offered additional features and what that could mean for content creators. These features were also very varied, with some offering plagiarism and fact checkers, whereas others offered additional AI tools. 

Here are the results of our independent testing of the most popular AI detection tools. This list is not definitive and will continue to grow as new AI detectors are released that show promise, or if the existing ones make improvements or changes.

Keep an eye on this list to see how changes in the industry affect the top ranking AI detectors.

The tools in our review below are listed according to their accuracy and will be updated to reflect any changes in accuracy and reliability. For a quick reference, please consult the table below.

| Rank | Tool | Free Version (Limited) | Paid Version | F1 Score (Overall Accuracy) |
|------|------|------------------------|--------------|-----------------------------|
| 1 | Content at Scale | Yes | Yes | 92.31% |
| 2 | Winston AI | Yes | Yes | 85.71% |
| 3 | Originality.AI | No | Yes | 85.71% |
| 4 | ZeroGPT | Yes | Yes | 80.00% |
| 5 | CrossPlag | Yes | Yes | 80.00% |
| 6 | AI Detector Pro | Yes | Yes | 80.00% |
| 7 | GPT-2 Output Detector | Yes | No | 80.00% |
| 8 | GPTZero | Yes | No | 76.92% |
| 9 | CopyLeaks | Yes | Yes | 66.67% |
| 10 | Corrector App | Yes | No | 66.67% |
| 11 | Sapling | Yes | Yes | 57.14% |
| 12 | Undetectable.AI | Yes | No | 54.55% |
| 13 | Kazan SEO | Yes | No | 40.00% |
| 14 | GPTRadar | Yes | Yes | 40.00% |
| 15 | Detecting-AI.com | Yes | No | 33.33% |

Rankings last updated on: 22 January 2024

1. Content at Scale

Content at Scale is one of my personal favorites when it comes to reliable AI detectors. However, I must admit that the paid plan is slightly expensive for my taste. That being said, the free version does a great job, even if it takes more effort for articles longer than 2,500 characters.

The latest update to Content at Scale is modeled on content from GPT-4, Bard, Gemini, and Claude, which brings it in line with the recent generation of AI generators. It has been trained on a diverse range of content such as blog posts, Wikipedia pages, essays, and more to understand human writing better.

Content at Scale does provide a number of features when you get the paid version. These features are mostly related to AI, rather than writing. Some of them include a prompt library, AI blueprints, personalized AI, and the ability to rewrite AI content to make it more human.

Core Features

  • AI scan

Additional Features

  • Rewrites
  • Personalized AI
  • AI blueprints
  • Prompt library
  • AI agents
  • Chrome extension (coming soon)

Paid vs. Free

When it comes to AI content, Content at Scale is one of the front-runners. There is a free version, but it’s limited to 2,500 characters per scan. This means that users might need to break up longer texts, which can complicate the results. The free version also highlights the AI in your text, giving you an opportunity to change it and create more human-like content.

The paid version is quite expensive. However, it’s important to note that with Content at Scale you don’t just get an AI detector, but an AI suite. This means you can use the paid version to create AI texts that read like a human. Generating undetectable AI content is expensive, so if this is part of your goal, this tool is for you. If you’re only looking for an AI-detector, it might be too expensive.

A screenshot showing Content at Scale in action.

Test Results

Content at Scale performed very well in our testing. There were very few false positives, which helped them maintain a high overall accuracy rating.

Their accuracy according to our testing was 92%. Due to the small number of false positives, the precision of the tool was 86%. Finally, because there were no false negatives, the recall was 100%.

The F1 score and overall accuracy and reliability for Content at Scale is 92%. This is a good score compared to many of the competitors on this list. However, bear in mind that the tool is better suited to businesses that want to use AI in general.

Pros and Cons

Pros

  • Accurate
  • Many useful AI features

Cons

  • Very expensive
  • Free version is limited

2. Winston AI

Winston AI is an AI detector that can be used to check content for the use of AI. According to its developers, the tool is able to detect content created by some of the leading AI generators, such as ChatGPT, GPT-4, and Bard.

Winston AI is trained on a large dataset that has been human-reviewed to remove biases and inaccuracies. This helps to minimize the number of false positives. They also train their model with diverse content generated by all the well-known LLMs so it’s able to recognize synthetic writing.

The team at Winston AI updates their detection algorithm every week to stay up to date with the latest developments. This also enables them to stay ahead of the newest bypassing technologies such as paraphrasing and AI humanizers.

Core Features

  • AI scan for text, uploaded documents, pictures, and handwriting
  • Email and online chat support available

Additional Features

  • AI detector Chrome extension
  • Plagiarism checker
  • Team management
  • Shareable reports

Paid vs. Free

Winston AI has a free version that’s limited to a 2,000-word scan. The free version uses the same advanced AI detector as the paid option; only the number of words you can check is limited. You can still scan text, documents, pictures, or handwriting, and you still have access to chat and email support.

The paid version starts with an Essential package. On this package you can scan up to 80,000 words. As with the free plan, you can scan text, documents, pictures, and handwriting. However, the Essential plan adds a feature that isn’t available in the free plan: the ability to generate shareable PDF reports.

The final plan they have on offer is the Advanced plan. With this package you get everything from the previous plan, but you can scan up to 200,000 words. You also have access to the plagiarism checker and you can add unlimited team members to your account.

A screenshot showing Winston AI in action.

Test Results

During our testing, Winston AI performed quite well. Most of the articles were accurately classified as either human or AI generated. However, there were a few human articles that the tool registered as being created by AI.

The tool’s accuracy was decent at 83%, while the precision dropped a bit to 75% because of the number of false positives. The tool excelled in recall with 100%, which indicates few false negatives; in the case of our test, there were none.

Overall, the accuracy of this AI detector was 85%, which isn’t a bad score when compared to competitors. We found the free version quite useful, but limited if you need to do bulk scans. The paid versions are also slightly more expensive than those of many competitors; however, given the comparative accuracy, I feel this may be justified.

Pros and Cons

Pros

  • Fairly accurate
  • Ability to upload and scan documents, pictures, and handwriting
  • Nice additional features

Cons

  • Free version is limited
  • Noted a few false positives

3. Originality.AI

Originality.AI is another well-known name when it comes to AI detectors. This tool is also able to detect content from some of the best-known LLMs such as ChatGPT, GPT-4, and Bard.

Originality.AI uses an advanced algorithm that utilizes natural language processing techniques when scanning content. This makes the results quite accurate; however, it can also slow down the tool a bit when compared to some competitors. Despite this, you still get results in real time.

The team at Originality.AI take their tool seriously and are transparent about their accuracy. Their dedication to transparency also means that they stay ahead of the trends and update their tool frequently to ensure it stays at the top of its game. 

Originality.AI is marketed towards web publishers, content marketing agencies, and writers. The Chrome extension that’s provided is also very useful in proving that your work was human-written, even if a tool picks it up as AI-generated. It enables you to share a visualization of the writing process with supervisors or clients to show you wrote the article yourself. This is a very nice feature, as AI detectors sometimes misclassify human work as AI-written.

Core Features

  • AI checker
  • Fact checker
  • Plagiarism checker
  • Readability checker

Additional Features

  • Chrome extension
  • Unlimited scan history
  • Shareable reports
  • API
  • Full site scans
  • Scan from URL
  • Team management
  • Scan tags

Paid vs. Free

Originality.AI doesn’t have a free version available. Due to the robust algorithm the tool uses, a free or ad-supported option isn’t realistic for them. The base subscription gives the user 2,000 credits. For AI or plagiarism checks, one credit covers 100 words, whereas for fact-checking one credit covers only 10 words. You can top up your credits at any time for only $0.01/credit.
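
To get a feel for what the credit system means in practice, here is a small back-of-the-envelope sketch based on the rates quoted above. The 1,500-word article, the helper name, and the assumption that partial credits are rounded up are all hypothetical; check the provider’s pricing page for how credits are actually charged.

```python
import math

AI_WORDS_PER_CREDIT = 100     # AI or plagiarism check (rate quoted above)
FACT_WORDS_PER_CREDIT = 10    # fact-checking (rate quoted above)
TOP_UP_CENTS_PER_CREDIT = 1   # $0.01 per credit


def credits_needed(word_count, words_per_credit):
    """Credits for one scan, assuming partial credits round up."""
    return math.ceil(word_count / words_per_credit)


article_words = 1500  # hypothetical article length

ai_credits = credits_needed(article_words, AI_WORDS_PER_CREDIT)
fact_credits = credits_needed(article_words, FACT_WORDS_PER_CREDIT)

print(f"AI scan: {ai_credits} credits "
      f"(${ai_credits * TOP_UP_CENTS_PER_CREDIT / 100:.2f} at the top-up rate)")
print(f"Fact check: {fact_credits} credits "
      f"(${fact_credits * TOP_UP_CENTS_PER_CREDIT / 100:.2f} at the top-up rate)")
# AI scan: 15 credits ($0.15 at the top-up rate)
# Fact check: 150 credits ($1.50 at the top-up rate)
```

Under these assumptions, fact-checking uses credits ten times faster than an AI scan of the same text.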

Originality.AI also offers a once-off option if you won’t be using the tool long term. This is slightly more expensive than the monthly subscription. It includes most of the features that are included with the monthly subscription, except it doesn’t offer file upload, site scans, an API, or team management.

A screenshot showing Originality.AI in action.

Test Results

Originality.AI performed quite well in our testing. In most cases it was able to accurately predict whether the text was written by a human or AI. However, there were a couple of instances where human-written articles were flagged as being AI-written.

The results of our tests with Originality.AI concluded with an accuracy of 83%. Furthermore, the precision of the tool was 75%. During testing the tool didn’t produce any false negatives, providing a recall of 100%.

Overall, when considering all of the above metrics, the general accuracy for Originality.AI was 85.71%. This is a decent score when compared to competitors. I do wish there was at least a trial version for the tool to give users an opportunity to see how it works before committing. At the moment, though, you’ll have to pay before you can start using the tool so if it doesn’t work for you, you’ll have to complete the month you paid for or lose your money. The website does include a lot of video content showing how the tool works, so maybe that’s enough for some to make a decision.

Pros and Cons

Pros

  • Fairly accurate
  • Chrome extension that also proves you wrote the content
  • Can check an entire site or URL
  • Includes a plagiarism, readability, and fact checker

Cons

  • No trial or free version

4. ZeroGPT

ZeroGPT is another popular tool used for detecting AI. Their website is very simple, which makes it easy to use. However, there were a lot of ads, which did impact the user experience. These ads are removed if you purchase a paid plan.

ZeroGPT uses DeepAnalyse Technology to train their detection model. This includes a multistage methodology to improve the accuracy and results. The tool is trained on an extensive collection of text sourced from the internet, educational datasets, and their own proprietary synthetic AI datasets. 

It’s also one of the few tools where the free version has quite a bit of functionality beyond just scanning and evaluating the text. For example, the free version keeps track of your result history (excluding the text) and you’re able to upload five batches to be scanned. 

Core Features

  • AI scan
  • Batch scanning
  • AI summarizer and paraphraser
  • Grammar and spell checker

Additional Features

  • API

Paid vs. Free

The free version includes all of the features of the paid plans, except that it includes ads and has certain limitations. These limitations include a cap on the number of characters that can be scanned per text, bulk actions that can be completed, and the number of words that can be changed in the AI summarizer and paraphraser.

The paid version of ZeroGPT is affordable and matches the industry standard for similar tools. The biggest difference between the free and paid version of the tool is that the paid version has no ads, and the limitations are raised significantly. This means you’ll be able to process more text when using the paid version.

A screenshot showing ZeroGPT in action.

Test Results

For such a simple-looking tool, we were quite surprised at the results delivered by ZeroGPT. This was one of the few tools in our test that didn’t deliver any false positives; however, there were a few false negatives.

The accuracy of the tool was decent at 83% and because there were no false positives during our test, the precision was 100%. In this case the recall dropped because of the number of false negatives and ended up at 67%. We noticed that the false negatives were mostly attributed to the more technical articles we tested.

The overall accuracy and reliability represented by the F1 score was 80%. The tool was easy to use, and having access to features such as a paraphraser and summarizer, even in the free version, was a great bonus. 

Pros and Cons

Pros

  • Affordable
  • Decent accuracy and reliability
  • Many additional features to improve writing (even in free version)

Cons

  • Lots of ads in free version
  • Free version has limited characters and words

5. CrossPlag

CrossPlag was an interesting tool to test. Unlike some of the other tools, with CrossPlag you immediately need to sign into an account – even when using the limited free version. They also market their tool towards individuals rather than businesses. 

According to their website, the AI detector is trained using a combination of machine learning (ML) algorithms and natural language processing techniques. It’s trained on a large dataset that includes both human written and AI generated texts.

Once you create an account, it’s nice to see all of your scans stored in one place. Also, the dashboard keeps track of how similar your different documents are. This can be very useful when you’re trying to improve your writing skills. 

Core Features

  • AI scan

Additional Features

  • Useful dashboard with document history
  • Document statistics
  • Live chat and email support

Paid vs. Free

CrossPlag provides users with ten free credits to test out the tool. These credits can scan up to 1,000 words. Once your credits have been depleted, you’ll need to buy more for additional scanning. To use the free version you’ll have to create an account.

Buying more credits can become expensive: the pay-as-you-go option costs more than the industry standard, and it only lets you scan an additional 5,000 words (50 credits). If you had to scan numerous documents, it could become very expensive.

A screenshot demonstrating CrossPlag in action.

Test Results

CrossPlag turned out to be a decent AI detector geared towards individuals. During our testing we noticed that it was quite accurate in detecting the use of AI. However, it also classified many of the human-written texts as AI. In fact, in our test batch, it didn’t return any true negatives, meaning it never correctly identified human-written content.

Due to the lack of any true negatives, the accuracy score was only 67%. The number of false positives also brought down the precision score to 67%. However, the tool didn’t return any false negatives, which gave it a recall of 100% and raised the F1 score significantly.

The F1 score which reflects the overall accuracy and reliability was 80%. This isn’t a bad score, but I am concerned with the number of false positives detected by the tool. The tool is also quite expensive for only an AI detector with no additional features or tools.

Pros and Cons

Pros

  • Fairly accurate at detecting AI
  • Easy to use dashboard

Cons

  • Expensive
  • Limited credits even on paid version
  • No additional features

6. AI Detector Pro

AI Detector Pro is another AI detection tool that is highly recommended and it’s easy to see why. They have a nicely designed website and the tool itself is easy to use. The design of both the website, and the dashboard when using the tool is clean and uncluttered.

The dashboard is simple to navigate and it’s easy to see the results of previous projects. The tool highlights text that AI detectors might flag so that you can learn how to create completely unique content even if your writing is reminiscent of AI. It also gives suggestions for word changes and rewrites to help replace the parts that are picked up as AI-generated.

Core Features

  • AI scan
  • AI eraser

Additional Features

  • Reports
  • Research tools
  • Developer tools
  • Content tools
  • API

AI Detector Pro has a free version available, although it’s probably more like a trial as you only get three scans. However, these limited scans also include the AI eraser, which gives users suggestions for replacing words, sentences, or sections that get picked up as AI generated.

The paid version is quite expensive, especially if you consider that it’s only an AI detector. There are no additional tools such as plagiarism or fact checkers that can be used alongside the checker to warrant the high price. Also, for the high price you pay, you only get 100 reports/articles a month on the first paid plan. The first paid plan also doesn’t even include all the features. Despite the price tag for the first tier, you don’t get branded reports or access to the developer tools. You don’t even get the API.

AI Detector Pro Screenshot
A screenshot showing AI Detector Pro in action.

Test Results

This tool performed quite well, but I can’t justify the price tag. There are other tools on this list that perform equally well and have a price tag that’s more in line with the industry norm, or that offer more features.

Despite accurately identifying many of the texts, there were still a few false positives that reduced the accuracy to 75%. The number of false positives also had a direct impact on the precision which, according to our tests, is only 67%. The tool didn’t return any false negatives, so the recall was 100%.

After our testing AI Detector Pro had an F1 score and ultimate accuracy and reliability of 80%. While this is a decent score compared to the other AI detectors on this list, I just can’t get over the price tag. The AI eraser is a great feature, but I don’t think it justifies this massive price hike.

Pros and Cons

Pros

  • AI eraser with suggestions to replace AI
  • Report history
  • API (only available with Unlimited plan)

Cons

  • Very expensive
  • Limited features

7. GPT-2 Output Detector 

GPT-2 Output Detector is one of the first AI detectors I came across and used before this test. It’s a simple free AI detector that’s still classified as a demo. There are no extras when using the tool; it’s purely an AI detector.

The tool uses the RoBERTa model in its algorithm. This is a state-of-the-art language representation model that was originally developed by Facebook AI. It is an improvement on the original BERT language model.

Core Features

  • AI scan

Additional Features

  • None

GPT-2 Output Detector is still a demo version. As such, it’s free to use with no current paid plans in place. Because it’s still only a demo, there are no additional features or limitations on the tool. However, they do recommend adding text that’s longer than 50 tokens for more accurate results.

GPT-2 Output Detector screenshot
A screenshot showing how GPT-2 Output Detector works.

Test Results

We were fairly surprised by the results we collected when testing the GPT-2 Output Detector. Despite still being a demo version, the tool performed quite well. In most cases it was able to accurately predict the source of the text with only a few false positives.

The accuracy of the tool according to our testing was 75%. Due to the number of false positives the precision dropped to 67%. The tool didn’t identify any false negatives and therefore the recall was 100%.

The overall accuracy and reliability presented by the F1 score for GPT-2 Output Detector was 80%. We look forward to seeing if they’ll be adding any additional features such as highlighting AI sentences and more.

Pros and Cons

Pros

  • Fairly accurate
  • Free

Cons

  • Limited features due to demo version

8. GPTZero

GPTZero, not to be confused with ZeroGPT, is another popular AI detector marketed specifically towards educators. The developers over at GPTZero use a multi-step approach containing 7 different components to train their model. Their website also states that they’re able to accurately detect content from ChatGPT, GPT-3, GPT-4, Bard, and more.

The tool also includes quite a few nice features that add more value to it. This includes features like source scanning, a breakdown of your results, document history, writing feedback, plagiarism checker, and more.

Core Features

  • AI scan
  • Batch scanning

Additional Features

  • Plagiarism checker
  • Document history
  • Source scanning
  • Data security
  • Team management
  • Reports
  • API and Chrome extension

GPTZero has a limited free version which enables users to scan up to 15,000 words per month. With the free account, you’re also able to batch scan up to 10 files at a time. A surprising benefit of the tool is that you can use the Chrome extension with the free version as well.

The paid version is slightly more expensive than the industry standard, but includes a plagiarism and source checker. The word and batch scanning limits are also increased significantly. With the paid versions you also get access to a dashboard where you can manage your document history and gain even more interpretations of your results.

GPTZero screenshot
A screenshot showing GPTZero in action.

Test Results

GPTZero performed quite well. In most cases it was able to accurately identify the source of the text. There were only a few instances of false positives.

During our testing the accuracy of the tool was 75%. Due to the false positives the precision score dropped a bit to 63%. There were no false negatives during our testing, which means the recall is 100%.

The overall accuracy and reliability as presented by the F1 score for GPTZero is 77%. We appreciated how simple the tool was to use. The free version offers quite a bit for infrequent users. The paid versions, however, give access to many more features that educators especially will find useful.

Pros and Cons

Pros

  • Decent free version available
  • Paid version has many useful features
  • Dashboard to manage and store documents

Cons

  • Basic plan is more expensive than industry standard

9. CopyLeaks

CopyLeaks is another AI detector that comes highly recommended. One of the things that makes CopyLeaks different from many other AI detectors is that their model focuses on identifying human-written content rather than AI. This means that the tool isn’t trained on AI generated copy, but rather content written by humans.

The training for the CopyLeaks AI detector started in 2015. Since then, it’s been trained on trillions of pages of human written content. The purpose of this is to teach the tool to identify how humans write. The team at CopyLeaks believes that this approach produces fewer false positives.

CopyLeaks is also one of the few tools that’s able to detect AI generated source code. This additional feature means that the AI detector can be used by software developers and content creators alike.

Core Features

  • AI scan
  • Source code detection
  • URL and sitemap scans

Additional Features

  • Plagiarism detection
  • PDF reports
  • Ability to schedule recurring scans
  • Detects mixed human and AI content
  • Detects manipulated text
  • Grammar checker
  • API
  • Chrome extension

CopyLeaks has a limited free version that can be used to scan the occasional piece of text. One consideration is that the text must be at least 350 characters. This is also beneficial to get a feel for the tool before committing to a paid plan. However, it should be noted that the free version doesn’t provide an in-depth report, and only shows a quick result of the scan.

The paid version starts at an industry standard price, but you have the option to choose a plan that includes a premium plagiarism checker for slightly more. The paid version of the AI detector includes many more features that aren’t included in the free version, such as the ability to detect text in images (OCR).

CopyLeaks screenshot
A screenshot showing the CopyLeaks interface.

Test Results

CopyLeaks produced quite mixed results during our testing process. We were surprised by how scattered the results were with equal amounts of true positives, true negatives, and false positives.

During the testing the accuracy measured was only 67% and the precision dropped to 50% because of the number of false positives. The tool didn’t return any false negatives and ended up with a recall of 100%.

The overall accuracy and reliability of the tool as represented by the F1 score was only 66.67% in our independent testing. The tool was easy to use and we valued the developers’ transparency regarding how the tool works. It’s also one of the most affordable AI detectors available for bulk scanning.

Pros and Cons

Pros

  • Affordable
  • Option to include a plagiarism detector
  • Ability to detect AI generated source code
  • API and Chrome extension

Cons

  • Free version is limited
  • Accuracy is low

10. Corrector App

Corrector App is a 100% free AI detector that can be used to spot AI in written text. It has a simple user interface where you copy your text and get results in real time. Due to the simplicity, the tool produces results quickly. Unfortunately, because it’s a 100% free app there are some ads to deal with. 

Core Features

  • AI scan

Additional Features

  • None

Corrector App is a 100% free AI detector and, as such, doesn’t have a paid version. With the free version you can scan as many articles as you want without limitations. However, the tool only scans 800 words at a time, so for longer articles you might have to break things up and scan sections individually.
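Breaking a long article into sections that fit the 800-word limit is easy to automate. The sketch below is one simple way to do it in Python; the function name `split_into_chunks` is hypothetical, and it assumes the tool counts words roughly the way a whitespace split does, which may not match the app exactly.

```python
def split_into_chunks(text: str, max_words: int = 800) -> list[str]:
    """Split text into consecutive pieces of at most max_words words each."""
    words = text.split()  # whitespace split; paragraph breaks are not preserved
    return [
        " ".join(words[start:start + max_words])
        for start in range(0, len(words), max_words)
    ]


# Usage: a 2,000-word placeholder article becomes three scannable sections.
article = "word " * 2000
chunks = split_into_chunks(article)
print([len(chunk.split()) for chunk in chunks])  # [800, 800, 400]
```

Note that this naive split can cut a sentence in half at a chunk boundary, which might affect how a detector scores the surrounding text, so splitting at paragraph breaks by hand may still give cleaner results.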

Corrector App screenshot
A screenshot showing Corrector App in action.

Test Results

For a 100% free tool, the Corrector App performed decently. However, it doesn’t beat out the paid competitors at this stage. It was able to accurately predict many true positives and negatives, but there were still some false positives. 

The accuracy of the tool based on our testing was 67%. The precision is slightly lower at 50% because of the false positives. During our testing, the tool didn’t have any false negatives which gives it a recall of 100%.

Unfortunately, since this AI detector had slightly more false positives than some of its higher ranked competitors, the F1 score and final accuracy and reliability is only 67%. Personally, I wish there were fewer ads, but being a completely free tool with no paid options I can understand the need for them.

Pros and Cons

Pros

  • Simple to use
  • 100% free

Cons

  • Ads

11. Sapling

Sapling is another AI detector that comes recommended by some of the top websites in recent best AI-detection tool articles. This is an easy to use detector with a simple interface. However, the website is a little lacking when it comes to providing information on the tool. We know very little about which model the system uses or what datasets were used for training.

Sapling states that it’s able to detect AI usage from LLMs like ChatGPT and Bard. It markets the tool to educators, SEO practitioners, and reviewers of AI-generated content. According to the change log on their website, it appears that they update the tool once a month.

Core Features

  • AI scan
  • Encryption

Additional Features

  • Premium suggestions
  • Autocomplete feature
  • Site scan
  • Email and ticket support and chat assist
  • Custom integrations
  • API
  • Team analytics
  • Domain administration

Sapling has a very handy free version that you can use when you need to scan content for AI. The free version isn’t limited, and provides you with a brief summary of the likelihood of the content being AI generated. The free version also provides a breakdown of the overall text and sentences so that you can see which sections appear to be AI written.

They also offer a few paid options; however, the only one that has a visible cost is the Pro plan. The other plans depend on your usage and you need to arrange a meeting with the team for a quote. The Pro plan is slightly more expensive than most competing paid plans. There also doesn’t appear to be any difference between the AI tool in the free and paid versions, with no limitations or restrictions on either. The biggest difference is that the paid version has additional features such as the ability to scan an entire website and the autocomplete feature.

Sapling screenshot
A screenshot demonstrating how Sapling works.

Test Results

We had high hopes for Sapling since it does come recommended. However, in our testing the tool didn’t perform according to our expectations. It accurately predicted half of the test articles, but the remaining half wasn’t identified correctly.

At the end of our testing, Sapling ended up with an accuracy score of only 50%. Furthermore, the precision of the tool dropped to 40% because of the sheer number of false positives. The only saving grace was that the tool didn’t return any false negatives that wrongly identified AI content as human written. Therefore the recall was 100%. 

Overall, the F1 score representing the accuracy and reliability of the tool was only 57%. Definitely not what we were expecting, especially if you consider the above average cost of their paid version. I wish the website was more transparent regarding their models and training protocols. Also, despite the tool itself being simple to use, the website is not. I wish the website overall was more user friendly.

Pros and Cons

Pros

  • Free version is easy to use
  • Site scans (Pro version only)
  • Customer support (Pro version only)
  • Team Analytics (Pro version only)
  • API (Pro version only)
  • Integrations (Pro version only)

Cons

  • Not very reliable – a lot of false positives
  • Paid version is more expensive than most 

12. Undetectable AI

Undetectable AI is a tool recommended by a number of big brands. As such, we were excited to include it in our tests. However, Undetectable AI isn’t just about detecting AI, it’s also a tool to transform AI to human-written content with its AI humanizer feature.

Undetectable AI integrates with a number of powerful AI detectors to identify text created with tools like GPT-3, GPT-4, Claude, and Bard. This makes Undetectable AI different from many other tools on this list. By integrating with some of the most well known detectors, you benefit from all of their algorithms in one tool.

It should be noted that the focus of this tool is more on creating undetectable AI content than it is on identifying it. 

Core Features

  • AI scan

Additional Features

  • API
  • AI humanizer

Undetectable AI has a free version that has no limitations to the detector. This means you can scan as many texts as you need to for free.

If you want to use the AI humanizer feature, you’ll need to sign up for a paid version. The paid version is reasonably priced when compared to some of the other humanizers we’ve encountered on this list. 

Undetectable AI screenshot
A screenshot showing the Undetectable AI tool in action.

Test Results

Despite the tool’s claims, it didn’t perform as well as we thought it would in our testing. There were false positives and false negatives that impacted the final scores.

The accuracy of the tool in our testing was only 58% due to the number of false results. The tool did deliver fewer false positives, which means it had a decent precision score at 60%. However, there were quite a number of false negatives which resulted in a recall score of 50%.

The final F1-score representing the overall accuracy and reliability of the tool was 55%. This was a rather unexpected score from a tool that seemingly gets good ratings. However, I suspect that some of those ratings are more related to the features of the humanizer than the detector.

Pros and Cons

Pros

  • Free AI detector
  • Humanizer available

Cons

  • Detector not very accurate

13. Kazan SEO

Kazan SEO is a website that offers a few different services such as email snipers, content optimizers, and keyword tools, all of them related to improving your SEO. One of their tools is a free AI Detector.

Unfortunately the website isn’t very clear about how the AI model works and how the tool was trained. It’s purely a free AI detector without any fuss.

Core Features

  • AI scan

Additional Features

  • Content optimizer
  • Keyword tool
  • Text extractor
  • Email verifier
  • AI text generator

This tool is 100% free and has some ads running in the side banners. There are currently no restrictions and users can run multiple scans.

Kazan SEO screenshot
A screenshot showing Kazan SEO in action.

Test Results

Unfortunately this tool did not perform well in our tests. It barely detected AI written text and there were a number of false positives and false negatives. 

The accuracy of the tool in our test was only 50%. Also, due to the number of false positives, the precision was 40%. The tool also returned a number of false negatives, which means the recall was only 40%.

The overall accuracy and reliability of the tool, which is represented by the F1-score, is only 40% for Kazan SEO’s AI detector.

Pros and Cons

Pros

  • 100% free

Cons

  • Not very accurate

14. GpTRadar

GpTRadar was another tool that came recommended and when I first looked at it I was impressed. It is clear that the developers put a lot of effort into creating a website and tool that looked good and was easy to use. 

However, once you start using the app, you start running into some annoyances. First off, the free version is limited, but it isn’t stated anywhere what the limit is. This means you’ll be busy scanning your text and the tool will suddenly stop working, forcing you to buy more credits.

After buying credits, there was a bit of a delay in the tool registering that you can scan longer articles. This delay went on for a couple of days and I kept getting an error saying that my text was too long. All of the test articles are approximately 1000 words, so this was a little frustrating.

That being said, the paid version does provide you with some very useful information such as perplexity scores and token distribution. This is very useful if you want to address any sections that get picked up as created by AI.
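For readers unfamiliar with the term, perplexity is a standard measure of how "surprised" a language model is by a text: the exponential of the average negative log-probability the model assigned to each token. The sketch below illustrates the formula with made-up token probabilities; it is not GpTRadar’s implementation, and the name `perplexity` and the sample values are assumptions for illustration only.

```python
import math


def perplexity(token_probs: list[float]) -> float:
    """Return exp of the mean negative log-probability of the tokens."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    mean_negative_log = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(mean_negative_log)


# Hypothetical probabilities a language model might assign to each token.
predictable = [0.9, 0.8, 0.9, 0.85]   # the model found every token likely
surprising = [0.2, 0.05, 0.3, 0.1]    # the model found the tokens unlikely

print(perplexity(predictable) < perplexity(surprising))  # True
```

Detectors that report perplexity commonly treat very predictable (low-perplexity) passages as a hint of machine-generated text, which is why the score can help you find the sections worth rewriting.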

Core Features

  • AI scan

Additional Features

  • Reports

GpTRadar does have a free version, but it’s very limited with no apparent indication of the cap. Once you’ve reached the maximum limit the tool simply stops working with no redirection to the paid version or an error message stating you’ve exceeded your limit.

The paid version works on a different model than many of the other AI detectors on this list. Instead of a monthly fee, you just buy credits. This is quite convenient as you can buy only as many credits as you need. Your credits also don’t vanish at the end of the month, so you can continue scanning for as long as you have credits. As far as the costs go, it’s in line with industry standards.

GpTRadar screenshot
A screenshot showing how GpTRadar works.

Test Results

Unfortunately this tool was all over the place during our testing. We had something of everything coming up. There were true positives, true negatives, false positives and false negatives. 

The accuracy score for GpTRadar was only 50% due to the wide spread of results. Due to the high number of false positives the tool had a precision score of 33%. Because there were also a number of false negatives, the recall was 50%.

The F1-score that represents the overall accuracy and reliability of the tool is only 40%. Unfortunately, that indicates that the AI detector really isn’t accurate enough to rely on for detecting AI. The tool shows promise, but the developers need to work on the model and algorithm to improve the accuracy.

Pros and Cons

Pros

  • Good looking interface
  • Ability to view reports

Cons

  • Not very accurate

15. Detecting-AI 

Detecting-AI is another free AI detection tool that we looked at during our test. When first opening the tool, it looks quite promising. You’re met with a clean and simple web page with the AI detector front and center. You also have the option of creating an account if you want to keep track of your scans.

This free AI detector doesn’t have any word limitations for scans. The results are also displayed in a very beneficial way. Sentences are highlighted to show where AI is detected so that you can fix it. This is a rather unusual feature for a free tool.

Core Features

  • AI scan
  • Scans text, URLs, and document uploads
  • Chrome extension

Additional Features

  • AI Humanizer – different tool linked to this one

Detecting-AI is a 100% free tool that’s able to scan copied text, URLs, and uploaded documents. As far as 100% free tools go, this one offers quite a bit. There are no limitations, you can scan various formats, and it provides a sentence-by-sentence breakdown of the content.

Detecting AI screenshot
A screenshot showing the Detecting AI tool in action.

Test Results

As a free AI-detector, Detecting-AI has a lot of potential. Unfortunately, from our independent testing, it doesn’t seem to be the most accurate or reliable tool to use, as it returned far more incorrect results than correct ones.

The accuracy for this AI detector was only 33%, meaning only a third of the texts were classified correctly. Due to the number of false positives, the precision is also only 33%. Our results also showed quite a few false negatives which made the recall 33%.

Can you guess what the F1-score is? Our tests revealed that the overall accuracy and reliability is only 33%. This means the tool is really not very accurate at identifying AI content correctly.
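The F1 score is the harmonic mean of precision and recall, which is why it lands on 33% when both inputs are 33%. As a rough sanity check, the sketch below recomputes F1 from the rounded precision and recall figures quoted for several of the detectors in this article and prints it next to the reported score. The function name `f1_from` is hypothetical, and small differences are expected because the inputs are already rounded.

```python
def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions (0 to 1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision %, recall %, reported F1 %) as quoted in the test results above.
reported = {
    "CrossPlag": (67, 100, 80),
    "GPTZero": (63, 100, 77),
    "CopyLeaks": (50, 100, 66.67),
    "Sapling": (40, 100, 57),
    "Undetectable AI": (60, 50, 55),
    "GpTRadar": (33, 50, 40),
    "Detecting-AI": (33, 33, 33),
}

for tool, (precision, recall, reported_f1) in reported.items():
    computed = 100 * f1_from(precision / 100, recall / 100)
    print(f"{tool}: computed {computed:.1f}%, reported {reported_f1}%")
```

Each recomputed value falls within one percentage point of the reported score, so the quoted F1 figures are consistent with the quoted precision and recall.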

Pros and Cons

Pros

  • 100% free

Cons

  • Not very accurate

Conclusion of Test Results

Not quite what you were expecting, right? When we started this project, we had certain expectations of what we’d find. As content creators ourselves, we’ve worked with a few of these tools before. There’s no doubt that they’re becoming more and more popular. People are realizing that AI content can have a detrimental effect on SEO, readability, and even the user’s experience. As such, educators, content agencies and even clients are starting to use these tools to identify instances of AI tools being used to automatically generate content. 

Our results showed one true winner: Content at Scale. Not only did this tool deliver the best results during testing, which shows it’s the most accurate tool in our tests when it comes to detecting AI content, it’s also easy to use, with a user-friendly interface. The number of additional features to help improve your work is also very beneficial for writers and other content creators.

However, as many content creators already know, AI detectors just aren’t accurate enough to be used on their own to identify the presence of AI in text. There are just too many variables that affect the results. Limited training data, limitations in the algorithms, and many other factors play such a big role that there’s no single AI detector currently available that’s 100% accurate.

Our test results reflect this, but they also show how biased developers can be about the accuracy of their own tools, promising accuracy ratings of 98% or higher and charging exorbitant fees for tools that can’t guarantee a result.

Our advice is to take the results of AI detection tools with a grain of salt and not rely on them entirely. Set up your own methodology for identifying AI content, one that uses AI detectors alongside other measures to establish whether AI was truly used.
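As one illustration of what such a methodology could look like, the sketch below only sends a text for manual review when several detectors agree that it looks AI generated. This is a hypothetical example, not a tested workflow: the detector names, the function `needs_manual_review`, and the threshold of two agreeing tools are all assumptions you would tune to your own process.

```python
def needs_manual_review(verdicts: dict[str, bool], min_agreeing: int = 2) -> bool:
    """Return True when at least min_agreeing detectors flagged the text as AI."""
    flagged = sum(1 for is_ai in verdicts.values() if is_ai)
    return flagged >= min_agreeing


# Hypothetical verdicts collected by hand from three different detectors.
verdicts = {"detector_a": True, "detector_b": False, "detector_c": True}
print(needs_manual_review(verdicts))  # True: two of the three tools agree
```

Requiring agreement between tools reduces the impact of a single detector’s false positives, but a human check (for example, reviewing drafts or version history) should still make the final call.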

Alternatively, you could choose an AI detection tool like Originality.AI, which includes a Chrome extension that can be used to prove your work is human written. That way, even if one of the many AI detectors in use picks up AI content in your work, you can show with facts why it isn’t.

Final Thoughts on AI Detection Software

AI can be very beneficial in a number of different industries. However, when it comes to content creation it can cause major issues. Search engines are starting to penalize AI generated content, which means everyone is focused on having original human written text. However, making sure that content is written by a human can be challenging. While AI detectors aim to solve this problem, they just aren’t accurate enough at this stage to ensure 100% that the content you have is free from AI.

If you’re struggling to create content without using AI generators, consider reaching out to Captain Words. We’re a content creation agency that specializes in writing, editing, and translating a wide variety of texts from different industries without the use of AI.

Leri Koen

After spending several years in the fields of Education, Child Development, and Hospitality, Leri decided to embrace her passion for content. Today, she is helping businesses grow digitally through her skills as a content specialist. Follow on LinkedIn.
