We are an independent IT service provider and do not receive any commissions or benefits for mentioning specific products, technologies or brands.

Article written by: Tomas Polach, Tino Bächtold

Updated at: 13 December 2025, published at: 7 December 2025

Dataset: LLM Token Usage in Everyday Office Tasks

This dataset provides realistic token usage estimates for 64 common AI tasks in enterprise office environments, grouped into six categories: Communication, Coding, Analysis, Planning, Document Processing, and Multimodal (Vision, Audio, Mixed).

About This Dataset

The dataset compares token consumption between standard models (like GPT-4o) and reasoning models (like OpenAI o1). Reasoning models use additional hidden tokens to “think” through problems step by step before generating a response, which tends to improve accuracy on complex tasks at the cost of higher token consumption.

This dataset has been compiled from real-world use cases in our enterprise AI implementation projects, then validated and extended with data from authoritative sources listed below. The token estimates reflect actual production workloads and can be used for modeling theoretical scenarios.

Key Insights:

Simple tasks (e.g., “Hello World”) use ~300-500 additional tokens for reasoning
Complex tasks (math, debugging, logic) benefit most: reasoning models use 10-20x more output tokens but deliver significantly better accuracy
Multimodal tasks (images, audio) carry a high base input cost before any reasoning takes place
Audio is extremely token-dense: ~1,000-1,200 tokens per minute
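
The overhead figures above can be checked against individual rows of the dataset. The following is a minimal sketch, assuming that "Output Tokens (Reasoning)" is the total output in reasoning mode, so the difference to "Output Tokens (Normal)" is the additional cost of reasoning. The function name `reasoning_overhead` is illustrative and not part of the dataset.

```python
# A few rows copied from the dataset table:
# (task, output tokens normal, output tokens reasoning)
ROWS = [
    ("Hello World Script", 20, 350),
    ("Drafting a Short Email", 100, 450),
    ("Debugging Code", 200, 4500),
    ("Summarize Contract", 500, 9000),
]


def reasoning_overhead(normal: int, reasoning: int) -> tuple[int, float]:
    """Return (additional tokens, multiplier) of reasoning output versus normal output."""
    return reasoning - normal, reasoning / normal


for task, normal, reasoning in ROWS:
    extra, factor = reasoning_overhead(normal, reasoning)
    print(f"{task}: +{extra} tokens, {factor:.1f}x")
```

For short tasks the multiplier looks large only because the normal output is tiny; the absolute number of additional tokens is usually the more useful figure for budgeting.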

Token Usage Dataset

The complete dataset is available for download in CSV file format.

Category	Task	Description	Input Tokens	Output Tokens (Normal)	Output Tokens (Reasoning)	Images	Audio (min)
Communication	Drafting a Short Email	Write a sick leave email to boss	50	100	450	0	0
Communication	Polite Rejection	Decline a wedding invitation politely	60	120	500	0	0
Communication	Rewrite for Tone	Make paragraph sound more professional	150	150	800	0	0
Communication	Cover Letter Generation	Write cover letter for Sales role	200	400	1800	0	0
Communication	Replying to a Text	Give 3 witty replies to text message	40	60	500	0	0
Communication	Grammar Check	Fix grammar in 200-word memo	200	150	900	0	0
Coding	Hello World Script	Write a Python Hello World script	30	20	350	0	0
Coding	Excel Formula Help	Formula to VLOOKUP column A in Sheet 2	50	50	1200	0	0
Coding	Regex Generation	Regex to validate email address	80	70	1500	0	0
Coding	SQL Query Generation	Select top 5 users by spend from tables	100	100	1800	0	0
Coding	Debugging Code	Find error in 50-line Python function	600	200	4500	0	0
Coding	Code Refactoring	Rewrite code to be more efficient	700	300	5000	0	0
Coding	Explain Error Log	What does this stack trace mean?	350	150	2500	0	0
Analysis	Summarize Article	Summarize 1000-word article	1400	200	3500	0	0
Analysis	Extract Data	List all dates and names from text	1000	200	2800	0	0
Analysis	Math Word Problem	If train leaves Chicago at 60mph…	100	50	2500	0	0
Analysis	Logic Riddle Solving	Solve two doors two guards riddle	120	80	2000	0	0
Analysis	Financial Analysis	Analyze CSV rows for trends	600	200	3500	0	0
Analysis	Sentiment Analysis	Is customer review positive?	80	70	600	0	0
Planning	Meal Plan	Create healthy 3-day meal plan	150	350	1500	0	0
Planning	Trip Itinerary	Plan 3-day weekend in Tokyo	200	600	2500	0	0
Planning	Brainstorm Titles	10 catchy titles for AI blog	100	100	800	0	0
Planning	Write a Haiku	Write haiku about the ocean	30	30	400	0	0
Planning	Gift Ideas	Gift ideas for dad who likes golf	100	200	1000	0	0
Planning	Roleplay Scenario	Pretend you are a career coach	150	450	1500	0	0
Document Processing	Extract Invoice Data	Extract vendor total date from 2-page invoice	900	300	2800	0	0
Document Processing	Summarize Contract	Summarize key terms from 10-page legal contract	4000	500	9000	0	0
Document Processing	Resume Screening	Extract relevant skills from 2-page resume	1100	400	3200	0	0
Document Processing	Translate Document	Translate 5-page Spanish document to English	2300	700	6500	0	0
Document Processing	Format Markdown	Convert 500-word Word doc to structured Markdown	1200	600	4000	0	0
Document Processing	Parse JSON Schema	Validate and fix malformed JSON document	500	300	2200	0	0
Document Processing	CSV to SQL	Convert 100-row CSV to INSERT statements	1200	800	4500	0	0
Document Processing	Extract Table Data	Extract and restructure table from PDF (500 rows)	2800	700	7000	0	0
Document Processing	Compare Versions	Identify changes between 2 versions of 5-page doc	1700	500	5500	0	0
Document Processing	Review Code PR	Review 200-line code pull request for bugs	1300	500	4500	0	0
Document Processing	Generate API Docs	Create documentation from 50-function source file	1800	700	5500	0	0
Multimodal (Vision)	Describe Image	Describe content of single photograph	800	150	2200	1	0
Multimodal (Vision)	OCR Document	Extract text from image of handwritten note	850	200	2400	1	0
Multimodal (Vision)	Analyze Chart	Interpret data trends from bar chart image	950	350	3000	1	0
Multimodal (Vision)	Screenshot Analysis	Debug UI from application screenshot	900	350	3800	1	0
Multimodal (Vision)	Identify Objects	List all objects in image of warehouse	800	300	2800	1	0
Multimodal (Vision)	Compare Images	Find differences between 2 product photos	1600	600	4500	2	0
Multimodal (Vision)	Read Whiteboard	Transcribe equation written on whiteboard photo	800	250	2600	1	0
Multimodal (Audio)	Transcribe Audio	Transcribe 5-minute audio interview	5000	800	11000	0	5
Multimodal (Audio)	Extract Meeting Notes	Generate summary and action items from 30-min meeting	30000	1000	58000	0	30
Multimodal (Audio)	Identify Speaker	Identify speaker and emotion in 2-min audio clip	2000	300	4800	0	2
Multimodal (Audio)	Translate Audio	Transcribe and translate 10-min German audio to English	10000	1000	21000	0	10
Multimodal (Mixed)	Document + Image	Match text document to related photos	1500	1000	5500	2	0
Multimodal (Mixed)	Video Description	Describe content from 2-min video (frames + audio)	2300	2200	9500	3	2
Multimodal (Mixed)	Multi-Image Comparison	Compare changes across 5 product design mockups	4200	600	9500	5	0
Document Processing	Summarize 50-page Technical Report	Summarize key findings from 50-page technical PDF without images	20000	1200	26000	0	0
Document Processing	Extract KPIs from 50-page Annual Report	Extract revenue profit and growth KPIs from 50-page annual report	22000	1500	28000	0	0
Document Processing	Summarize 100-page Regulatory Filing	Create executive summary of 100-page regulatory filing (10-K/10-Q)	40000	2000	52000	0	0
Document Processing	Compare Two 50-page Contracts	Identify differences and risks between two 50-page legal contracts	38000	2500	60000	0	0
Document Processing	Audit 5k-line Codebase File	Review a 5000-line single code file for bugs and architecture issues	35000	3000	70000	0	0
Multimodal (Vision)	Process 20-page Scanned PDF	OCR and structure 20-page scanned PDF (image-only)	16000	2000	30000	20	0
Multimodal (Mixed)	Analyze 50-page Report with Charts	Summarize 50-page PDF containing text plus 10 chart images	23000	2000	32000	10	0
Multimodal (Audio)	Transcribe 60-min Podcast	Full transcription of a 60-minute podcast episode	60000	3000	75000	0	60
Multimodal (Audio)	Summarize 90-min University Lecture	Generate structured notes and sections from a 90-minute lecture recording	90000	4000	90000	0	90
Multimodal (Audio)	Analyze 2-hour Support Call Log	Extract issues sentiments and escalation points from 2-hour support call	120000	5000	110000	0	120
Multimodal (Mixed)	Describe 10-min Product Demo Video	Summarize features and UX from 10-minute demo video (screen + narration)	18000	3000	22000	10	10
Multimodal (Mixed)	Summarize 45-min Webinar with Slides	Generate structured summary from 45-min webinar audio plus 30 slide images	75000	4000	80000	30	45
Multimodal (Mixed)	Review 60-min Security Camera Footage	Identify key events in 60-min silent security recording	48000	2500	52000	40	0
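
To work with the table programmatically, it can be parsed with Python's standard `csv` module. The sketch below uses a three-row excerpt embedded as a string and assumes tab-separated columns with the header names shown above. The downloadable CSV may use a different delimiter, so `delimiter` is a parameter, and `totals_by_category` is a hypothetical helper name.

```python
import csv
import io
from collections import defaultdict

# Excerpt of the table, tab-separated, with the same header row.
SAMPLE = (
    "Category\tTask\tDescription\tInput Tokens\tOutput Tokens (Normal)\t"
    "Output Tokens (Reasoning)\tImages\tAudio (min)\n"
    "Coding\tHello World Script\tWrite a Python Hello World script\t30\t20\t350\t0\t0\n"
    "Coding\tDebugging Code\tFind error in 50-line Python function\t600\t200\t4500\t0\t0\n"
    "Analysis\tSummarize Article\tSummarize 1000-word article\t1400\t200\t3500\t0\t0\n"
)


def totals_by_category(text: str, delimiter: str = "\t") -> dict[str, dict[str, int]]:
    """Sum input, normal-output and reasoning-output tokens per category."""
    totals = defaultdict(lambda: {"input": 0, "normal": 0, "reasoning": 0})
    for row in csv.DictReader(io.StringIO(text), delimiter=delimiter):
        bucket = totals[row["Category"]]
        bucket["input"] += int(row["Input Tokens"])
        bucket["normal"] += int(row["Output Tokens (Normal)"])
        bucket["reasoning"] += int(row["Output Tokens (Reasoning)"])
    return dict(totals)


for category, sums in totals_by_category(SAMPLE).items():
    print(category, sums)
```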

Data Sources & Methodology

This dataset was compiled from the following authoritative sources:

General Tokenizer

Tiktokenizer (OpenAI): Standard text tokenization rule: 1 word ≈ 1.3 tokens (1000 tokens ≈ 750 words).
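
As a rough illustration of this rule of thumb, the sketch below converts a word count into an estimated token count. It is an approximation for English text only; actual counts depend on the tokenizer and the language, and the function name `estimate_text_tokens` is hypothetical.

```python
def estimate_text_tokens(word_count: int) -> int:
    """Estimate tokens from words using the rule 1000 tokens ≈ 750 words."""
    return round(word_count * 1000 / 750)


# A 1000-word article comes out at roughly 1333 tokens of text.
print(estimate_text_tokens(1000))
```

The "Summarize Article" row lists 1400 input tokens for a 1000-word article, which is in line with this estimate plus a short instruction.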

Reasoning Models

OpenAI o1 System Card: Reasoning tokens are hidden output tokens used by the model to “think” before answering. They can range from hundreds to tens of thousands, depending on task complexity.
PromptLayer Analysis (o1 vs GPT-4o): Reasoning models often use 3-10x more tokens for complex tasks like coding or math due to internal chain-of-thought generation.
Reddit Community Analysis (Hidden Tokens): User benchmarks showing that simple tasks might use ~300 hidden tokens, while complex coding tasks can exceed 5,000 hidden tokens.
Arxiv: Comparative Study on Reasoning Patterns: Comparative benchmarks showing reasoning models consuming 10x-20x more tokens on complex logical tasks.
Clarifai Reasoning Model Comparison: Benchmarks for hard math/logic problems showing reasoning token usage often exceeding 30,000 for difficult queries.
Databricks: Long Context RAG & o1: Highlights that reasoning models can fail or hit output limits when reasoning over very large contexts (e.g., 100+ pages).

Vision Tasks

OpenAI Vision Documentation: Images are processed in 512x512 tiles. High-detail mode costs ~85 tokens base + 170 tokens per tile. A standard 1080p image is often ~765-1105 tokens.
Cursor IDE Blog (GPT-4o Image Costs): Practical breakdown of image costs: Low detail is fixed at 85 tokens. High detail scales with resolution.
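
The tile-based rule can be sketched as a small calculation. The base cost (85), per-tile cost (170) and tile size (512) come from the sources above. The two resizing steps (fit within 2048x2048, then scale the shortest side down to 768) are an assumption based on OpenAI's published guidance for GPT-4o-era models and may differ for other models. The function name `image_tokens` is hypothetical.

```python
import math


def image_tokens(width: int, height: int, detail: str = "high") -> int:
    """Estimate image input tokens under the 512x512 tile rule."""
    if detail == "low":
        return 85  # low detail is a fixed cost

    # Assumed step 1: shrink to fit inside a 2048x2048 square (never upscale).
    scale = min(1.0, 2048 / max(width, height))
    w, h = width * scale, height * scale

    # Assumed step 2: shrink so the shortest side is at most 768 px.
    scale = min(1.0, 768 / min(w, h))
    w, h = w * scale, h * scale

    # Count the 512x512 tiles needed to cover the resized image.
    tiles = math.ceil(w / 512) * math.ceil(h / 512)
    return 85 + 170 * tiles


print(image_tokens(1024, 1024))  # square image, 4 tiles
print(image_tokens(1920, 1080))  # 1080p image, 6 tiles
```

Under these assumptions the two example sizes reproduce the 765 and 1105 token figures quoted above.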

Audio Tasks

OpenAI Pricing (Audio): Audio inputs are billed separately from text. GPT-4o Audio input is ~$0.06/min (Realtime).
Microsoft Azure AI Blog (Audio Tokens): Audio tokenization is dense. 1 minute of audio ≈ 1,000-1,200 audio tokens for billing purposes.
OpenAI GPT-4o Audio Guide: Technical details on how audio is tokenized and processed, confirming the distinction between input audio tokens and output text tokens.
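
These figures translate into a simple estimate per recording. The sketch below uses the 1,000-1,200 tokens per minute range and the ~$0.06/min Realtime input price quoted above as defaults; both are approximations that change over time, and the function names are hypothetical.

```python
def audio_token_range(
    minutes: float, low_per_min: int = 1000, high_per_min: int = 1200
) -> tuple[int, int]:
    """Return the (low, high) estimate of audio input tokens for a recording."""
    return round(minutes * low_per_min), round(minutes * high_per_min)


def realtime_audio_input_cost(minutes: float, usd_per_min: float = 0.06) -> float:
    """Approximate audio input cost in USD at a per-minute rate."""
    return minutes * usd_per_min


low, high = audio_token_range(30)
print(f"30-min meeting: {low}-{high} audio tokens, "
      f"about ${realtime_audio_input_cost(30):.2f} input cost")
```

The dataset's audio rows use the lower bound of this range, for example 30,000 input tokens for a 30-minute meeting.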

Document Processing

Arxiv: Chain of Draft: Discusses token efficiency in reasoning models for drafting and document tasks, highlighting the overhead of “thinking” steps.

General Tasks

Awesome LLM Tasks (GitHub): Curated list of practical LLM tasks used to derive the common daily task list categories.

Use Cases

Cost Estimation: Calculate expected API costs for your AI applications
Model Selection: Choose between standard and reasoning models based on task complexity
Budgeting: Plan AI infrastructure costs for production workloads
Research: Benchmark and compare token efficiency across different task types
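
For cost estimation, the token counts can be combined with per-token prices. The following is a minimal sketch: the prices are placeholders chosen for easy arithmetic, not any provider's actual rates, and it assumes reasoning tokens are billed as output tokens. The function name `request_cost_usd` is hypothetical.

```python
def request_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_price_per_m: float,
    output_price_per_m: float,
) -> float:
    """Cost of one request, given prices in USD per million tokens."""
    return (
        input_tokens * input_price_per_m + output_tokens * output_price_per_m
    ) / 1_000_000


# Placeholder prices (USD per million tokens); replace with your provider's rates.
INPUT_PRICE = 1.00
OUTPUT_PRICE = 4.00

# "Debugging Code" row: 600 input, 200 normal output, 4500 reasoning output.
standard = request_cost_usd(600, 200, INPUT_PRICE, OUTPUT_PRICE)
reasoning = request_cost_usd(600, 4500, INPUT_PRICE, OUTPUT_PRICE)
print(f"standard: ${standard:.4f}, reasoning: ${reasoning:.4f}")
```

In practice reasoning models are often priced differently from standard models, so each mode would get its own pair of prices.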

Related Resources

Want to see how these tasks translate to real-world workloads? Check out our detailed analysis:

AI Costs by Office Role - We use this dataset to calculate typical daily token consumption for different business roles (Executive Assistant, Recruiter, Financial Analyst, Corporate Counsel, Software Engineer) and reveal what drives AI costs in your organization.

Citation

If you use this dataset in your research or applications, please cite:

onprem.ai Research (2025). Real-World LLM Token Usage Dataset.
Retrieved from https://onprem.ai/en/knowhow/llm-token-usage-dataset/

Last Updated: December 2025
Version: 1.0
License: Creative Commons BY 4.0
