Phone Agent Training Pipeline

What it is

A data generation and fine-tuning pipeline that uses knowledge distillation to teach small language models (3.8B-7B parameters) how to handle inbound phone calls for a plumbing business. It converts raw conversational data into structured training examples with slot injection, normalizes conversation phases, and produces MLX-compatible datasets for efficient local training on Mac hardware.

Features

Generates structured training data with slot injection for context-aware responses
Normalizes conversation phases to align with finite state machine workflows
Converts data into multiple chat template formats including MLX and Phi-4
Fine-tunes local models on Apple Silicon using LoRA for efficient adaptation
Includes edge case handling for after-hours calls and emergency scenarios
Produces stratified train/val/test splits to ensure balanced phase coverage

Quickstart

python3 scripts/normalize_and_split.py
python3 scripts/convert_to_chat_templates.py -i data/splits/train.json -f mlx -o data/splits/train_mlx.jsonl
python3 scripts/convert_to_chat_templates.py -i data/splits/val.json -f mlx -o data/splits/val_mlx.jsonl
pip install mlx-lm
python -m mlx_lm.lora --model microsoft/phi-4-mini-instruct --data data/splits/ --train --batch-size 2 --lora-rank 8 --iters 600 --adapter-path adapters/phi4-mini

Architecture

flowchart TD
    RawData[Raw Conversations] --> Normalize[Normalize and Split]
    Normalize --> Splits[Train Val Test Splits]
    Splits --> Convert[Convert to Chat Templates]
    Convert --> MLXData[MLX Ready JSONL]
    MLXData --> FineTune[LoRA Fine Tuning]
    FineTune --> Adapters[LoRA Adapters]
    Adapters --> Deploy[Local Phone Agent]

How it's built

The pipeline starts with raw conversation data which is normalized and split using Python scripts. It injects structured context including current time, available slots, caller mood, and FSM state into every training example. Scripts convert these examples into turn-level prompt/completion pairs compatible with MLX, Phi-4, Mistral, or Gemma templates. Finally, the model is fine-tuned using mlx-lm with LoRA adapters to learn specific receptionist behaviors like slot filling and phase transitions.

How it runs

sequenceDiagram
    participant Developer
    participant NormalizeScript
    participant ConvertScript
    participant MLXLibrary
    participant Model
    Developer->>NormalizeScript: Run normalize_and_split.py
    NormalizeScript->>NormalizeScript: Phase normalization
    NormalizeScript->>NormalizeScript: Stratified splitting
    NormalizeScript-->>Developer: Train/Val/Test splits
    Developer->>ConvertScript: Run convert_to_chat_templates.py
    ConvertScript->>ConvertScript: Inject slot context
    ConvertScript->>ConvertScript: Format for MLX
    ConvertScript-->>Developer: MLX JSONL files
    Developer->>MLXLibrary: Run mlx_lm.lora
    MLXLibrary->>Model: Load base model
    MLXLibrary->>Model: Apply LoRA updates
    Model-->>Developer: Save adapter weights

How to apply & reuse

Use this pipeline when you need to customize a small language model for a specific voice agent workflow without relying on expensive cloud APIs. It is ideal for developers with Apple Silicon hardware who want to iterate quickly on domain-specific dialogue policies, such as scheduling logic or empathy handling, by generating synthetic data that mirrors production FSM states.

At a glance

CapabilitiesSynthetic data generationSlot injection trainingPhase normalizationLoRA fine-tuningTemplate conversionLocal deployment preparation

Componentsnormalize_and_split.pyconvert_to_chat_templates.pybuild_final_training.pyrewrite_with_slots.pyretrain_from_fsm.pyadd_json_responses.py

TechPythonMLXLoRAJSONLApple SiliconPhi-4 Mini

Depends onmlx-lmpython3jsonpathlibrerandom

Integrates withphone-agent-schedulerClaude APIGGUF quantizationMistral templatesGemma templates

PatternsKnowledge distillationSlot fillingFinite state machine alignmentTurn-level decompositionStratified sampling

Reuse tagsvoice-agentlocal-llmfine-tuningdata-pipelineapple-siliconcustomer-service

⚠ Needs attention

unmerged_branch: dependabot/pip/pip-804f26c466 is 1 commit ahead of the default branch
open_pr: PR #1: Bump transformers from 4.57.6 to 5.5.0 in the pip group across 1 directory