• 3 mins read
  • Published
  • updated

Small Language Models Promise Faster, Cheaper Ad Tech Workflows

Ken Doctor media analyst FAYFO.com

by Ken Doctor

Small Language Models Promise Faster, Cheaper Ad Tech Workflows FAYFO.com
Small Language Models Promise Faster, Cheaper Ad Tech Workflows

ZeroGPU launches specialized small language models for ad tech, aiming to cut costs and speed up repetitive marketing tasks. Dappier reports a 50% expense reduction after switching from large models.

Media and marketing teams facing rising AI costs now have a new option for handling high-volume, repetitive tasks. ZeroGPU has introduced a suite of small language models (SLMs) designed specifically for ad tech, offering a faster and more affordable alternative to large language models (LLMs) for content classification, document summaries, and moderation.

According to ZeroGPU founder and CEO Maddy Arvapally, many ad tech workflows do not require the massive scale of LLMs, which are trained on vast internet datasets and contain trillions of parameters. Instead, SLMs with fewer than 10 billion parameters can deliver the necessary performance for specialized tasks, while reducing both processing time and infrastructure costs.

Unlike LLMs that depend on expensive GPUs and cloud infrastructure, ZeroGPU’s SLMs run efficiently on standard CPUs and can even operate within web browsers. This shift has led to significant savings for clients. Dappier, an AI monetization company, reported a 50% drop in overall expenses after adopting three of ZeroGPU’s SLMs for content classification, intent classification, and moderation. Dappier’s CEO Dan Goikhman said these models power brand-specific chatbots and AI agents that must quickly classify conversations and extract commercial intent from user interactions.

For example, when a user reads an article on a parenting site, Dappier’s chatbot-powered by an SLM-can suggest conversation prompts tailored to the context and continuously map both the article and the chat to IAB contextual categories. This helps publishers and advertisers understand the nature of the content and user engagement, informing which prompts to display and which advertisers might be interested in the resulting interactions. This approach to contextual targeting echoes strategies seen in other marketing initiatives, such as Lipton’s use of local creators to scale content across regions, as described in this related report.

Transitioning to ZeroGPU’s SLMs is designed to be straightforward. The company provides OpenAI-compatible endpoints, so clients only need to update a backend URL to switch from OpenAI’s API to ZeroGPU’s. Goikhman noted that Dappier completed the migration in about five minutes. Previously, Dappier relied on major LLMs like OpenAI and Claude for prompt generation and context analysis, but now trains SLMs on its own chatbot conversations and best practices for more targeted results.

ZeroGPU’s SLMs are trained for specific use cases, such as classifying articles and conversations into over 1,500 IAB categories. However, these models are not generalists; a content classification SLM, for instance, cannot perform sentiment analysis unless specifically trained for it. Arvapally explained that this specialization reduces hallucinations and speeds up response times, with SLMs delivering sub-50 millisecond results-something frontier models struggle to match due to their broader scope and slower processing.

Related articles