Welcome! This mega data labeling pricing guide covers:
- Diffgram Pricing
- Scale AI Pricing (Scale Rapid)
- Labelbox Pricing
- V7 Labs Darwin Pricing
- Clarifai AI Pricing
- AWS SageMaker Ground Truth Pricing
Do you know how much it costs to annotate this image?
On average, this one image costs about $5.79. The prices range from $0.00 to over $12.
Why is there such a wide price range?
- What’s the difference?
- How do the pricing models work?
- What is Unlimited pricing vs Per Annotation pricing?
You will learn all of that in this data labeling pricing article.
In this pricing article:
- Comparison of Prices, Usage Scaling, and Free Trials.
- Compare the Pricing Models: Unlimited Per User and Per Annotation Pricing.
- The Impact of Data Labeling Pricing Models on your business.
Let’s dive in!
Custom Pricing Disclaimer
Diffgram offers custom pricing models. At the end of the day our intent is fair pricing for actual and projected usage and being flexible to update that pricing as your needs change. The prices listed here are meant to convey an example for the sake of comparison and understanding other options.
Diffgram is open source. This pricing article is for Enterprise customers. Compare Diffgram pricing here.
Diffgram Enterprise Pricing Philosophy
Like Apple, pay once and get the whole device.
- All you can eat
- Free of artificial restrictions
- Value leader
Diffgram offers customer-centric pricing, such as per server license, per user, and more.
Data Labeling Pricing Summary
The following tables compare the price and key value drivers of various data labeling solutions, also called Training Data Platforms, available in the market. We truly worked hard to ensure this was a fair apples-to-apples comparison and invite folks to send us updates and errors.
Pricing Summary Table
Because some offer unlimited annotations and some charge per annotation we provide a normalized per user estimate.
Diffgram offers unlimited users on other pricing models like per server. The per server model works especially well when leaning on Datalake (Data Catalog) and other features where only a small percentage of the data may be annotated. Please note the following example is normalized to Full Time users.
|Vendor||Default Pricing Model||Cost Per Annotation or File||Unlimited?||Half Usage|
|Scale AI||Per Annotation||$0.025||❌||✅|
|Labelbox||Per Annotation and Per User; sometimes Per File if you push them|
|V7 Labs Darwin||Per Annotation and mixed and Per Automation||$0.015||❌||⚠️|
|Clarifai AI||10+ Chargeable Items||$0.050||❌||⚠️|
|AWS SageMaker Ground Truth||Per Annotation||$0.080 to $0.020||❌||❌|
Why is there such a difference? Consider that Diffgram, Scale Rapid, and Labelbox all claim to have a superior product over Clarifai AI and AWS. Yet AWS and Clarifai AI are priced higher than any of them. The simple answer here is “Buyer Beware”. Clearly the price points are not a reflection of product quality.
The more you use any data labeling platform the better your AI systems become. We want you to use Diffgram as much as possible because we know more use is a win for you. To make this happen we set up Diffgram with a different cost model. We don’t have to pay the same high cloud bills other providers pay due to our different design such as running automations locally. This allows us to have a lower cost structure and pass those savings on to you. With Diffgram you get the best product and the best price.
Unlimited Does Not Have to Be Per User
The intent behind Unlimited is that there is fair pricing for actual and projected usage. This means annotating as much as you want and not tying the pricing specifically to how much you annotate. Usually the volume of users is a good benchmark for scoping overall pricing. But there are other ways we are happy to talk about structuring it to your needs.
Unlimited Offers Better Value
There are two major pricing models for data annotation at play.
- Unlimited
- Per Annotation
What is Unlimited Annotation Pricing?
In the Unlimited Annotation pricing model, there is no per annotation cost. Each user can annotate as much as they can.
- Do you pay per word for Google Docs?
- Do you pay per slide for PowerPoint?
- Do you pay per open window in Microsoft Windows?
No? Then why pay per annotation?
The Diffgram Unlimited cost model is a win-win that’s better aligned with your goals.
- Unlimited is better aligned with your goals compared to the Per Annotation price models.
- Unlimited provides more consistent value of unlimited use with less risk of unexpectedly high costs.
- The Unlimited model allows you to better manage your data labeling and data annotation costs and bring them down drastically.
Diffgram is the only vendor to offer Unlimited Annotations
This means Diffgram is the lowest cost and best value. Switching to Diffgram Unlimited is Easy.
Unlimited Pricing is Better in Most Scenarios:
Cases where the Unlimited model was better include:
- Using it often, even with light use (“low productivity”). Unlimited is less.
- Using it once a week and medium productivity. Unlimited is less.
- Any case with high usage, high productivity, or video.
- Using it heavily for a few months, then not at all for a few months. (Inactive users are $0).
Per Annotation Unit Pricing Visually Explained
Per Annotation pricing can burn deeper holes in your pocket, faster and more steeply than you think. Let’s demystify Per Annotation pricing. How does Per Annotation pricing really work?
Let’s take the example of the image shown above and break down its pricing:
In a Per Annotation model, each annotation, each sub annotation (attributes etc), each image, each frame, each word annotated, and each automation would cost you money.
In this image, there are 96 annotations and 96 automations (the box to polygon), plus the image itself, for 193 billable units. At $0.03 per annotation this is $5.79. At Amazon’s $0.08 per annotation it’s $15.44, and at their lowest $0.02 per annotation it’s $3.86.
As you can see there is really no way to “Win”. Even if the pricing was half a cent (4x lower than Amazon’s current highest volume offer), it would still be nearly $1. What happens when your company has hundreds of thousands, millions, or even billions of these samples (text, images, video, etc)? Is the volume of data unknown? Then in Per Annotation the cost structure is unknown too.
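The arithmetic above can be checked with a short sketch. The unit counts and per-unit rates are the ones quoted in this article; the function name `image_cost` and the rate labels are only illustrative.

```python
# Billable units for the example image: 96 annotations, 96 automations
# (box to polygon), plus the image itself.
ANNOTATIONS = 96
AUTOMATIONS = 96
IMAGE = 1
UNITS = ANNOTATIONS + AUTOMATIONS + IMAGE  # 193 annotation units


def image_cost(units: int, price_per_unit: float) -> float:
    """Cost in USD of one image when every unit is billed at the same rate."""
    return round(units * price_per_unit, 3)


# Per-unit rates mentioned in the article (USD). Labels are illustrative.
rates = {
    "example rate": 0.03,
    "AWS highest rate": 0.08,
    "AWS lowest rate": 0.02,
    "hypothetical half cent": 0.005,
}

for label, rate in rates.items():
    print(f"{label}: ${image_cost(UNITS, rate):.3f} for {UNITS} units")
```

Multiplying any of these per-image figures by your expected number of samples gives a first rough estimate of a Per Annotation bill.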
Accurately Calculating Annotation Volume
The best proof here will be your own calculations. But to do accurate calculations we must consider a few key differences.
- Annotation Units vs What we Picture as an Annotation
- How to Calculate it
- Data Lake vs Only Annotation
Annotation Units vs What we Picture as an Annotation
An Annotation Unit is (usually) not what we picture as an Annotation.
Part of what makes it really confusing to try and reason about “per annotation” schemes is that there are so many things that cost money:
- Each geometric annotation costs money
- Each attribute (form) of detail on that same annotation costs money
- Computer generated (PreLabel) annotations cost money
So saying “per annotation” isn’t accurate. There’s a multiple of what we picture in our heads as an annotation and what gets billed as an annotation unit.
1 Annotation = 3-5+ Annotation Units
Again this means for practical purposes that what we may think of conceptually as a single annotation gets counted multiple times, sometimes 10s of times in those schemes.
A good rule of thumb here is to take whatever you visually picture as an “annotation” and multiply it by 3-5x+ to get a more accurate value of “Annotation Units”.
Some vendors may change or update to make the concept of an annotation align better with what we intuit but the core point will remain. What if you want to compare (and/or correct) 3 data-science models each with slightly different annotations?
What if something changes in your project and each annotation now has an extra attribute? Does that double your cost?
Again – per annotation by itself is hard to justify cost wise, but when some schemes also start including sub annotations (attributes), prelabels, etc. the cost skyrockets.
It’s just common sense here that trying to get overly granular in this area does not make sense!
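To make the multiplier concrete, here is a minimal sketch of how pictured annotations could turn into billed units. It assumes, purely for illustration, that each geometric annotation, each attribute, and each prelabel is billed as exactly one unit at the same rate; actual vendors may count differently. The function name `billed_units` is hypothetical.

```python
def billed_units(annotations: int,
                 attributes_per_annotation: int = 0,
                 prelabeled: bool = False) -> int:
    """Count billable units under an ASSUMED rule: one unit per geometric
    annotation, one per attribute, and one per computer generated prelabel."""
    units_per_annotation = 1 + attributes_per_annotation + (1 if prelabeled else 0)
    return annotations * units_per_annotation


pictured = 100  # what we picture as "100 annotations"

scenarios = {
    "geometry only": billed_units(pictured),
    "one attribute each": billed_units(pictured, attributes_per_annotation=1),
    "prelabel + two attributes": billed_units(pictured, 2, prelabeled=True),
}

for name, units in scenarios.items():
    print(f"{name}: {units} units ({units / pictured:.0f}x what we picture)")
```

Under this assumed counting rule, adding a single attribute to annotations that had none doubles the billed units, which is the kind of jump the questions above are pointing at.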
How to Calculate it
To get a monthly per user value the core formula is:
Monthly Hours Per User * Cost Per Annotation * Annotation Units Per Hour
This ignores base costs, per file costs etc.
So for example if you had a part time annotator, who spent 4 hours per work day annotating, that’s about 80 hours a month. So for AWS’s case that’s:
80 * 0.08 * Annotation Units
Your usage will vary but 300 annotation units (about 30-100 real annotations) per hour is extraordinarily conservative.
See the one example above, where a single image is 193 units. If that image had half the annotations but one attribute for each it would still be 193 units.
So now we have:
80 * 0.08 * 300 = $1,920
We can torture the estimated usage, how many annotation units we have to use etc, but it’s just really hard to get a reasonable number starting from this approach. And by the time you do get a reasonable looking number, the usage will be severely restricted.
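The core formula can be written as a small function so you can plug in your own numbers. This is a sketch: the 20 work days per month used to reach "about 80 hours" is an assumption, and like the formula itself it ignores base costs and per file costs.

```python
def monthly_cost_per_user(hours_per_month: float,
                          cost_per_unit: float,
                          units_per_hour: float) -> float:
    """Monthly Hours Per User * Cost Per Annotation * Annotation Units Per Hour."""
    return hours_per_month * cost_per_unit * units_per_hour


# Part time annotator: 4 hours per work day, ASSUMING 20 work days a month.
hours = 4 * 20            # about 80 hours a month
units_per_hour = 300      # the conservative figure used in this article

high_rate = monthly_cost_per_user(hours, 0.08, units_per_hour)
low_rate = monthly_cost_per_user(hours, 0.02, units_per_hour)

print(f"At $0.08 per unit: ${high_rate:,.2f} per user per month")
print(f"At $0.02 per unit: ${low_rate:,.2f} per user per month")
```

Changing `units_per_hour` or `hours` shows how quickly the monthly figure moves with productivity and time spent annotating.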
Usage Based Concepts and Unlimited Work Together
Unlimited is always in the context of something. By default it’s unlimited usage per user. Another example is unlimited usage per cluster. As part of the Enterprise Sales process we will work with you to determine an appropriate Unlimited model.
The point is not to insert an artificial limit, while still reflecting the overall scope of usage.
Per Annotation Unit pricing is a form of usage based pricing. However it artificially restricts the thing you most want to do – annotate!
Further, as most providers in this space still have contractual minimums, tiered pricing, etc. etc. there is really no such thing as “pure” zero to usage pricing.
With Diffgram, commercial models are in all cases aligned with maximizing usage and doing the most annotation and prediction. They are limited only by real world limits such as hardware costs, and by the consideration that as the impact of usage grows over time, Diffgram’s revenue can grow too in a reasonable way.
Data Lake vs Only Annotation
From a product usage perspective, the more you can keep all your training data in one place the better.
- Search and query
- Data science functions fully integrated, such as model comparison
- Only managing one platform
And other benefits. It just makes a lot more sense to think of your training data system as an overall storage of your training data, not just the human annotations.
So far we have made the argument of better value for unlimited primarily in the context of human annotation. Those arguments hold in either case, but when we start to think of bringing all the data into the annotation store and treating it more like a data lake for training data the value is even greater.
Easy Work Still Uses Lots of Annotation Units.
If the work is easier it will take less time. Meaning you will do more work in an hour.
The time per sample goes down, the volume of samples goes up, and the total annotation units used remain similar.
Therefore simpler work will cost a similar amount per hour as complex work in the Per Annotation model.
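A quick worked example of this effect, as a sketch. The images-per-hour figures (30 easy or 10 hard) match the ones used in the methodology section of this article; the units per image are hypothetical numbers chosen only to illustrate the trade-off, and the $0.08 rate is the higher AWS rate quoted earlier.

```python
def units_per_hour(samples_per_hour: int, units_per_sample: int) -> int:
    """Annotation units consumed in one hour of work."""
    return samples_per_hour * units_per_sample


PRICE_PER_UNIT = 0.08  # USD

# HYPOTHETICAL workloads: easy images need fewer units but go faster.
easy = units_per_hour(samples_per_hour=30, units_per_sample=10)
hard = units_per_hour(samples_per_hour=10, units_per_sample=30)

for name, units in (("easy work", easy), ("hard work", hard)):
    print(f"{name}: {units} units/hour, ${units * PRICE_PER_UNIT:.2f}/hour")
```

With these assumed numbers both workloads consume the same units per hour, so the hourly bill is the same even though the work differs in difficulty.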
No more paying for every annotation, every image, or every minute worked.
Why is Unlimited the Best Value?
- Unlimited Annotations means worry free growth of your AI team and services.
- ‘Per Annotation Unit’ pricing means that the more productive your team is, the more you have to pay.
- Be aware that with Per Annotation pricing your bill could easily double, triple, or even grow 10x+ over time.
Zooming out to the Strategic View
Of course, the specifics of the price will vary.
But the point is why take the risk?
Why risk overages that hinder your team’s productivity?
Now you know about the option to go straight for the Unlimited pricing model that’s better aligned with your interests from day one.
Diffgram Unlimited is:
- A common sense pricing model centered around you getting the most value.
- Standards Based: Diffgram is leading the industry in setting the de facto new standard. In part because it’s the only modern platform that’s fully Open Source.
- Best of the best: Diffgram compares favorably against any competing option.
Setting the De Facto Standard
Diffgram is leading the industry in setting the de facto new standard. In part because it’s the only modern platform that’s fully open source. This is leading to a boom in the rapidly growing Diffgram community. Bottom line is that Diffgram is becoming the Linux or WordPress of Training Data making it the safest long-term choice.
Comparison of Usage Based Price Scaling
How does it affect your budget if your data annotation usage is reduced?
All vendors scale based on usage – the devil is in the details. So to more accurately represent a common scenario we use “Half Usage Halves Cost”.
In AWS’s pricing model the starter tier is 4x higher in price. This means that unless you are still in the highest usage tier with half usage, you will pay around 80% of the cost even if you halve your usage. Diffgram and Scale Rapid fare best, usually halving cost for half usage.
Usage Based Price Scaling Table
|Vendor||Scales based on Usage Method||Half Usage Halves Cost?||Detail|
|Diffgram||Active Users||Usually||Assuming less active users, not the same users annotating less each.|
|Scale AI Rapid||API usage||Usually||Volume discounts may require minimum usage commitments.|
|Labelbox||Contract-Based||Maybe||There is a mix of the base cost, per user, minimums, and per Annotation.|
|V7 Labs Darwin||Contract-Based||Maybe||There is a mix of the base cost, per user, minimums, and per Annotation.|
|Clarifai AI||Contract-Based||Maybe||Over 10 separate pricing aspects, please see the details below.|
|AWS SageMaker Ground Truth||API usage||No||Sliding scale pricing, 4x (8 cents vs 2 cents) higher cost per unit in lower tiers.|
- On Diffgram Pricing if you had 30 active users one month, and then 15 the next, your cost would halve.
- On Scale Rapid Pricing, if you are not subject to any contractual commitments, half usage will halve cost.
- On Labelbox Pricing, V7 Darwin Pricing, and Clarifai Pricing there are multiple elements to the pricing, making it a “maybe”. For example, some V7 plans are tier based, so less usage will not automatically lower the price.
- AWS Pricing has a 4x delta between its usage tiers, so half usage will not halve the cost in most cases.
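The effect of sliding scale tiers on "half usage" can be sketched in a few lines. Only the 8 cent and 2 cent rates come from this article; the tier breakpoints and the middle rate below are assumptions for illustration and should be replaced with the vendor's current published tiers.

```python
# ASSUMED tiers as (units in tier, price per unit in USD), billed in order.
# Only the $0.08 and $0.02 rates are taken from the article.
TIERS = [
    (50_000, 0.08),
    (950_000, 0.04),
    (float("inf"), 0.02),
]


def tiered_cost(units: int) -> float:
    """Total cost when each tier's units are billed at that tier's rate."""
    remaining = units
    total = 0.0
    for tier_size, price in TIERS:
        used = min(remaining, tier_size)
        total += used * price
        remaining -= used
        if remaining <= 0:
            break
    return total


def half_usage_ratio(units: int) -> float:
    """Cost at half usage divided by cost at full usage."""
    return tiered_cost(units // 2) / tiered_cost(units)


for monthly_units in (40_000, 100_000, 2_000_000):
    ratio = half_usage_ratio(monthly_units)
    print(f"{monthly_units:>9,} units: half usage costs {ratio:.0%} of full usage")
```

In this sketch the ratio is one half only while all usage sits inside the first tier. Once usage reaches the cheaper tiers, halving it removes the cheapest units first, so the bill falls by less than half.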
Free Demo Plans Of Various Labeling Software Compared
As part of getting familiar with and learning a new system, it’s common for various people at your company to try it out. Most companies offer free trials, plans, or similar.
Most offer similar free plan limits. However, most have a variety of fine print, excluding certain features or requiring contacting sales.
Table of Demo plans
|Vendor||Free Demo Plans||Example||Fine Print Of Note|
|Diffgram||100 Files Per Dataset.||300 Images or Videos||None|
|Scale AI Rapid||10,000 Annotations||50-200 Images, or 1 Video||Some file types may not be available|
|Labelbox||10,000 Annotations||50-200 Images, or 1 Video||Some Features Limited|
|V7 Labs Darwin||Free trial up to 14 days.||N/A||Requires Approval|
|Clarifai AI||1,000 Annotations||5-20 Images||Charges per Operation (e.g. Ingest) Excluded|
|AWS SageMaker Ground Truth||1,000 Annotations||2-10 Images||The first two months only, 500 per month, excludes hardware costs|
- Diffgram Pricing offers free accounts on diffgram.com with up to 100 files per dataset (up to 3 datasets) of any type, images, video etc. There is no fine print of note – all features are available.
- Scale Rapid Pricing offers up to 10,000 annotations, which is about 50-100 Images, or 1 Video. Some file types may not be available.
- Labelbox Pricing offers up to 10,000 annotations, which is about 50-100 Images, or 1 Video. About 8 features are limited including QA Quality Assurance, video segmentation, medical, and tiled editors. These are not included without contacting sales.
- V7 Labs Darwin Pricing does not offer an ongoing free plan, instead a 14 day free trial. This requires approval. It is not clear what limits are placed on the trial.
- Clarifai AI Pricing has a 1000 annotation limit. In theory this is 5-10 images, but may be less since they also take other operational charges out of that 1000 units (such as ingesting files).
- AWS Pricing has one of the most limited demo plans (10-30x less than all others). It includes 500 annotation units per month for 2 months, meaning that if you use 1,000 in the first month you will still be billed for 500 units. It also excludes hardware costs. This means AWS offers no completely free trial since you must still pay for hardware.
Frequently Asked Questions (FAQs) Regarding Pricing Models Of Data Labeling Software
In this section we cover common FAQs regarding pricing.
What is the Methodology for Normalized Pricing?
The intent is to normalize Per Annotation to regular usage costs. This is to better estimate what you actually see on your bill.
The high level process is:
- Determine how long common media elements, like image, text, video etc. take to annotate and how many annotation units they use. e.g. 30 easy images in one hour, or 10 hard ones, or 1 video.
- Smooth and average out these values.
- Assume part time use – e.g. 2-4 hours a day
- Assume some automation is being used.
- Assume some buffer room. For example the single image above uses 193 annotation units and you could do 5-10 per hour easily (965 to 1930 units). But we only used 300 units per hour, a small fraction of what it could be.
- Ignore base costs in most cases under assumption that at high usage all base costs will be waived.
- Assume that any heavier user would be so heavily volume discounted as to amount to $0. For detailed raw comparison see the above volume discount section.
Please note that virtually every assumption here is for the benefit of a fair comparison to Per Annotation. In reality you may easily wish to have your team use it full time, use more automations, be more productive (using more annotation units), others may not waive base prices, or may not provide bulk discounts etc.
For anyone skeptical, we encourage you to check the math on the above scenarios. Please report any errors to us and we will update this.
What counts as an In-Active user?
A user that has not logged in or annotated on Diffgram at least once in a 30 day time period is considered In-Active.
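As an illustration of how active user counting could work, here is a minimal sketch. It is not Diffgram's actual billing code: the function names, the sample users, and the exact handling of the 30 day boundary are all assumptions.

```python
from datetime import date, timedelta


def is_active(last_activity: date, as_of: date, window_days: int = 30) -> bool:
    """True if the user logged in or annotated within the window.
    ASSUMPTION: activity exactly `window_days` ago no longer counts."""
    age = as_of - last_activity
    return timedelta(0) <= age < timedelta(days=window_days)


def billable_users(last_activity_by_user: dict[str, date], as_of: date) -> int:
    """Number of users that would count as active on the `as_of` date."""
    return sum(is_active(d, as_of) for d in last_activity_by_user.values())


# HYPOTHETICAL users and dates, for illustration only.
users = {
    "alice": date(2024, 3, 30),   # annotated yesterday
    "bob": date(2024, 1, 15),     # idle for months
    "carol": date(2024, 3, 5),    # logged in 26 days ago
}

as_of = date(2024, 3, 31)
print(f"{billable_users(users, as_of)} of {len(users)} users are active")
```

Because the count is driven by recent activity, a team that pauses annotation for a few months would see the active user count fall without any manual changes.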
What is Unlimited Annotation?
The idea with unlimited annotation is really simple. There is no per annotation cost. Each user can annotate as much as they can. You can use as many automations as you want.
Naturally there are hardware bounds. The expense for each additional annotation is marginal, so when a vendor charges per annotation it is a business model decision, not a technical decision.
What is Sustainable Pricing?
This technology is key infrastructure that will likely be with you for years, or even decades, if you are happy with it.
Therefore, we think it’s important that the pricing is sustainable for everyone.
There’s no point in someone being shocked by an unexpectedly high bill. And as much as we believe in our product, we realize there are limits to what is in the budget in this space.
Note that we included the hardware costs in the comparison. Because of the superior design of Diffgram each install costs less to run than trying to build up massive capacity in a centralized cloud.
We want to pay more – will you take our money?
In some cases we understand you may need to achieve things outside the scope of our baseline Enterprise offering.
Rest assured we are here to support you in every way we can!
We are happy to quote mission critical support use cases including:
- 15 minute response time for production critical issues. And similar SLAs/SLOs.
- Prioritized feature requests.
- Exclusive features.
Our baseline pricing already includes reasonable support levels similar to Enterprise support offered by competitors.
We are in a Government Agency, does that change anything?
It makes it easy to adopt Diffgram, because your IT teams can have complete control to ensure it meets DOD or similar levels of standards. With Diffgram, because you store your data on your hardware, you set the security controls.
Scale AI Rapid charges extra for sensitive information. Diffgram doesn’t.
Some vendors frankly pay lip service to on-premise installs. Diffgram has the most actual live running installs in the world. We have people creating and fixing issues for obscure operating system specific issues.
Some months we don’t use it so then what happens?
Diffgram automatically reduces user counts, using the In-Active user model described above. So you get the same benefits as the Per Annotation cost model.
How can I lower my Data Labeling costs?
With Diffgram’s Unlimited pricing, you can stop worrying about variable bills. A single fixed cost per user per month is sufficient to annotate huge workloads. Diffgram is a leader in automation for data labeling with $0 cost automations.
Why do Smarter models cost more in Per Annotation?
The smarter you want your model to be, the more media objects and the more annotations you need. With Unlimited you can make your models as smart as possible!
Why do we want more annotations?
- The more you annotate the better your models are
- The more you use automations the more you must do annotations. (Try saying that 3 times fast!)
- The more models you have, the more annotations and volume of annotations.
How will Automations affect this?
The better your automations get, the more your costs go up in Per Annotation pricing models.
Depending on the vendor, most charge for each automation, and sometimes charge twice, once for the automation and once for the annotation. You sometimes get charged many times for the same image or entity. Diffgram charges $0.00 per use for automations and includes the automation product in its base price.
More Annotations Per Hour Expected as We Progress
That’s why it’s important to make the right choice and get Diffgram Unlimited today. Any Per Annotation pricing scheme will only cause you even more headaches down the road.
I’m feeling the pain of high annotation costs, how do I switch?
Switch to Diffgram Unlimited today!
I’m on one of these other platforms and my bills are not that high yet?
Yet is the key word here.
- The more you progress with your AI development, the more the bill with alternatives will go up.
- The more you use automations, the more the bill will go up.
With some, the more production pre-labels you send to review, they count that too. So if you are about to:
- go live with a new product launch
- Use more automations
- Start new projects
… get ready for a 10x, 100x, or even 1,000x price increase as real production data starts to come in!
The solution is easy! Switch to Diffgram stat!
And really, whether it’s 25%, 50% higher, or 2x higher, why risk it going up based on usage in the future?
Do you really want your team to annotate slower? “Hey Bob, forget about productivity goals – it will just cost us more in the tool if they annotate faster, so, uh, just have your team coast along”
With Diffgram you get unlimited annotations and worry free pricing.
Where are the hidden costs with a Per Annotation pricing?
Examples of hidden costs include:
- Effort to purchase
- Hidden fees (Per Annotation Costs)
Showing $0.03 Per Annotation sounds reasonable at first, until a person starts to really think about how much actual annotation work they are doing. Per Annotation costs are a huge hidden cost.
Remember all these things will add up:
- More objects or entities (bounding boxes etc)
- Attributes – each question about each item
- Each frame
With Diffgram’s Unlimited plan, you need not worry about any hidden costs: Unlimited annotations for a worry free, fixed, and transparent price.
Disclaimers & Notes
- This is the first true public price comparison of its kind. Please independently validate it.
- Want to be on this comparison? Please contact us.
- If you are a vendor already on the list and want to update your info or correct any error please contact us.
- This is meant to be accurate as of time of writing but prices are subject to change.
- Some statements here are made in context of the article. Some numbers are estimated.
- Diffgram Inc is not affiliated with any of the companies listed. Each company sets their own pricing and may update it at any time without notice.
- Some companies list pricing that may appear to contradict this article. We encourage you to look closely at the total cost as we have done here.
- This comparison is focused on Enterprise Pricing. Diffgram is free for up to 20 users with self install. And Diffgram Premium is available at a lower cost for self service.
- We included hardware estimates of $20/user/month for all appropriate services. Naturally your specific situation may vary with this. Unlimited refers to software licensing costs and is unlimited annotation per user. If you have an extremely heavy use case the hardware may be slightly more – think rounding error it’s so small.
Vendors covered included:
- Diffgram Pricing
- Scale AI Pricing (Also called Scale Rapid Pricing)
- Labelbox Pricing
- V7 Labs Darwin Pricing
- Clarifai AI Pricing
- AWS SageMaker Ground Truth Pricing
Vendors not covered:
- SuperAnnotate Pricing
- Google Cloud AI Data Labeling Pricing
- LabelStudio Pricing (Also called Heartex Pricing)
- Dataloop Pricing
- Superb AI pricing
- Hasty AI Pricing
- Encord Pricing
- Supervisely Pricing