Prompting large models as instant labelers
Zero-shot means you provide no labeled examples—the model relies solely on its pre-training plus the instructions (and optional label names or schema) you include.
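A zero-shot setup can be as simple as a system prompt that states the task and enumerates the allowed labels. A minimal sketch (the label set and example text are illustrative):

```python
# Build a zero-shot classification request: no labeled examples,
# just instructions plus the label schema.
LABELS = ["positive", "negative", "neutral"]

def zero_shot_messages(text: str) -> list[dict]:
    system = (
        "You are a sentiment labeler. "
        f"Reply with exactly one of: {', '.join(LABELS)}."
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": text},
    ]

msgs = zero_shot_messages("The checkout flow kept crashing on my phone.")
```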
Few-shot adds a small set (typically 2-5) of labeled examples to the prompt—these can be hard-coded or retrieved automatically with vector search (dynamic few-shot)—and often boosts accuracy by ~5-15%.
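Dynamic few-shot retrieval can be sketched as nearest-neighbor search over embeddings of your labeled pool. The tiny 2-d vectors below are stand-ins for whatever embedding model you actually use:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve_examples(query_vec: list[float], pool: list[dict], k: int = 3) -> list[dict]:
    """Pick the k labeled examples most similar to the input text."""
    ranked = sorted(pool, key=lambda ex: cosine(query_vec, ex["vec"]), reverse=True)
    return ranked[:k]

# Toy labeled pool; real vectors would come from an embedding model.
pool = [
    {"text": "Love it!", "label": "positive", "vec": [0.9, 0.1]},
    {"text": "Terrible.", "label": "negative", "vec": [0.1, 0.9]},
    {"text": "It's okay.", "label": "neutral", "vec": [0.5, 0.5]},
]
shots = retrieve_examples([0.8, 0.2], pool, k=2)
```

The retrieved examples then get formatted into the prompt ahead of the input to label.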
Examples help most on domain-specific tasks. Model size is the other key lever:
| Model Size | Use Case | Accuracy | Speed | Cost |
|---|---|---|---|---|
| Small (1-3B) | Simple binary classification, high volume | Good | Fast | $ |
| Medium (7-8B) | Multi-class, nuanced sentiment | Better | Medium | $$ |
| Large (70B+) | Complex reasoning, few-shot learning | Best | Slower | $$$ |
There are two good ways to get a confidence score from an LLM: inspect the log probabilities (logprobs) of the predicted label tokens, or have the model output a verbalized confidence alongside its label.
To learn more about using logprobs to assess confidence for classification tasks, check out the OpenAI cookbook: https://cookbook.openai.com/examples/using_logprobs.
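With the Chat Completions API you can pass `logprobs=True` and read each content token's log probability from the response; turning those into a confidence score is just exponentiation. A minimal sketch of the conversion step (it assumes the label spans the returned content tokens):

```python
import math

def label_confidence(token_logprobs: list[float]) -> float:
    """Confidence for a (possibly multi-token) label: the joint
    probability of its tokens, i.e. exp of the summed logprobs."""
    return math.exp(sum(token_logprobs))

# A single-token label with logprob -0.05 is roughly 95% confidence.
conf = label_confidence([-0.05])
```

In a real response you would collect the values from `choice.logprobs.content[i].logprob` and feed them in.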
Once we have a confidence score, we can use it to flag low-confidence results for human review or escalate them to a more powerful model.
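That routing logic is only a few lines. A sketch with illustrative thresholds:

```python
def route(label: str, confidence: float,
          review_threshold: float = 0.6,
          accept_threshold: float = 0.85) -> tuple[str, str]:
    """Accept high-confidence labels, escalate mid-confidence ones to a
    larger model, and queue the rest for human review."""
    if confidence >= accept_threshold:
        return ("accept", label)
    if confidence >= review_threshold:
        return ("escalate_to_large_model", label)
    return ("human_review", label)

decision = route("negative", 0.72)
```

Tune the thresholds against a held-out labeled sample rather than picking them by feel.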
For high-volume jobs, use the Batch API to process requests asynchronously instead of calling the model one input at a time.
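Batch jobs take a JSONL file where each line is one request tagged with a `custom_id` for matching results back. A sketch of building that file (the request shape follows the OpenAI Batch API; the model name and prompt are placeholders):

```python
import json

def build_batch_lines(texts: list[str], model: str = "gpt-4o-mini") -> str:
    """One JSONL line per input; custom_id lets you join results
    back to inputs after the batch completes."""
    lines = []
    for i, text in enumerate(texts):
        lines.append(json.dumps({
            "custom_id": f"item-{i}",
            "method": "POST",
            "url": "/v1/chat/completions",
            "body": {
                "model": model,
                "messages": [
                    {"role": "system",
                     "content": "Classify sentiment as positive, negative, or neutral."},
                    {"role": "user", "content": text},
                ],
            },
        }))
    return "\n".join(lines)

jsonl = build_batch_lines(["Great app", "It crashes constantly"])
```

You would write this string to a `.jsonl` file, upload it, and create a batch job pointing at the uploaded file.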
| Problem | Cause | Solution |
|---|---|---|
| Inconsistent labels | Vague categories | Use specific, non-overlapping labels or add label descriptions to the prompt |
| Low confidence | Ambiguous input text | Add few-shot examples for edge cases |
| Wrong language | Model defaults to English | Specify the output language in the system prompt |
| Slow responses | Large model for a simple task | Use a smaller model or batch processing |