## Classification Types
| Type | Example | Output | Use Cases |
|---|---|---|---|
| Binary | spam / not-spam, approve / reject | Single label (2 options) | Content moderation, fraud detection, A/B testing |
| Multi-Class | news topic → Politics / Sports / Tech | Single label (3+ options) | Document routing, sentiment analysis, intent detection |
| Multi-Label | article tags → AI + Healthcare + Ethics | Multiple labels | Product tagging, skill extraction, content categorization |
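The practical difference between these types shows up when you parse the model's raw text output. As a minimal sketch (the function name and comma-separated multi-label convention are illustrative assumptions, not a standard):

```python
def parse_classification(raw: str, task: str = "single"):
    """Normalize a model's raw text output into labels.

    task="single" covers binary and multi-class (one label);
    task="multi" covers multi-label (assumes comma-separated labels).
    """
    cleaned = raw.strip().rstrip(".")
    if task == "multi":
        return [label.strip() for label in cleaned.split(",") if label.strip()]
    return cleaned

# Binary / multi-class: one label back.
print(parse_classification("spam"))
# Multi-label: a list of labels back.
print(parse_classification("AI, Healthcare, Ethics", task="multi"))
```

In practice you would also validate the parsed labels against your allowed label set, since models occasionally invent labels outside it.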
## Why Use LLMs for Classification?
- **Zero-shot:** No training data needed: just describe your labels and go. Great for quick prototyping and new domains.
- **Few-shot:** Add 2–5 labeled examples to boost accuracy, especially for niche or tricky cases.
- **Multilingual:** One model, all languages. Handles code-switching and mixed content out of the box.
- **Reasoning:** Understands nuance, context, and even sarcasm, and can explain its decisions when you need transparency.
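The zero-shot and few-shot modes above differ only in prompt construction. A minimal sketch (the prompt wording and `build_prompt` helper are assumptions for illustration; real prompts would also constrain output format):

```python
def build_prompt(text, labels, examples=None):
    """Build a classification prompt.

    With examples=None this is zero-shot: the label names alone describe
    the task. Passing a few (text, label) pairs makes it few-shot.
    """
    lines = [f"Classify the text into exactly one of: {', '.join(labels)}."]
    for ex_text, ex_label in (examples or []):
        lines.append(f"Text: {ex_text}\nLabel: {ex_label}")
    lines.append(f"Text: {text}\nLabel:")
    return "\n\n".join(lines)

# Zero-shot: no examples, just label descriptions.
zero_shot = build_prompt("The senate passed the bill.", ["Politics", "Sports", "Tech"])

# Few-shot: the same prompt plus a couple of worked examples.
few_shot = build_prompt(
    "New GPU doubles training speed.",
    ["Politics", "Sports", "Tech"],
    examples=[("The striker scored twice.", "Sports")],
)
```

Ending the prompt with `Label:` nudges the model to complete with a bare label, which keeps parsing trivial.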
## Common Use Cases
| Domain | Task | Traditional ML Challenge | LLM Advantage |
|---|---|---|---|
| Customer Support | Intent classification | Requires large labeled dataset | Works with just intent descriptions |
| Content Moderation | Toxicity detection | Struggles with context/sarcasm | Understands nuanced harmful content |
| E-commerce | Product categorization | Needs product-specific training | Generalizes across product types |
| HR/Recruiting | Resume screening | Biased on historical data | Can focus on skills vs. demographics |
| Finance | Document classification | Regulatory compliance complexity | Adapts to new regulations quickly |
| Healthcare | Symptom triage | Requires medical expertise | Leverages medical knowledge from training |
## When to Use LLMs vs. Traditional ML
**Choose LLMs when:**
- Limited labeled data (< 1,000 examples)
- Rapid iteration needed on label definitions
- Complex reasoning required (context, tone, implications)
- Multilingual support needed
- Explainable decisions are important
**Choose Traditional ML when:**
- Massive labeled datasets available (> 10,000 examples)
- Ultra-low latency required (< 10ms)
- Extreme cost sensitivity (millions of predictions/day)
- Simple pattern matching sufficient
## Hybrid Approach: LLM → Small Model Distillation
1. Use an LLM to generate training data and initial predictions.
2. Fine-tune a smaller, faster model on the LLM's outputs.
3. Serve the student model, which typically recovers 80–90% of the LLM's accuracy at roughly 10x lower cost and latency.
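The distillation pipeline above can be sketched end to end. This is a toy, dependency-free stand-in: the LLM labels are hardcoded (a real pipeline would collect them via API calls), and a word-overlap centroid classifier substitutes for a real student model such as logistic regression or a small fine-tuned transformer:

```python
from collections import Counter

# Step 1 (simulated): the LLM labels a pool of unlabeled texts.
# These hardcoded pairs stand in for real LLM API outputs.
llm_labeled = [
    ("refund my order now", "billing"),
    ("charged twice this month", "billing"),
    ("app crashes on startup", "technical"),
    ("login button does nothing", "technical"),
]

# Step 2: fit a small, fast student model on the LLM's outputs.
# Per-label word-count centroids keep this sketch dependency-free.
centroids = {}
for text, label in llm_labeled:
    centroids.setdefault(label, Counter()).update(text.split())

def student_predict(text: str) -> str:
    """Classify by word overlap with each label's centroid."""
    words = set(text.split())
    return max(centroids, key=lambda lbl: sum(centroids[lbl][w] for w in words))

# Step 3: serve the cheap student model instead of calling the LLM.
print(student_predict("the app crashes when I click login"))  # -> "technical"
```

The key property the sketch preserves is that the LLM is only paid for once, at labeling time; every subsequent prediction runs on the cheap student model.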
Need multiple labels per input? See our Multi-Label guide for handling non-exclusive categories.