The three options
Prompt engineering: guide the model with text. Free, immediate. RAG: retrieve relevant documents and pass them to the model as context. Moderate cost, dynamic. Fine-tuning: further train the model on your own data. High cost, deep changes.
Prompt engineering: start here
Roughly 80% of use cases can be solved with good prompt engineering: a system prompt with clear instructions, few-shot examples, chain-of-thought reasoning, and a specific output format. If your use case needs no private data and no behavioral retraining, this is the answer.
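The ingredients above can be sketched in code. This is a minimal illustration of assembling a structured prompt (system instructions, few-shot examples, explicit output format); the classification task and message layout are illustrative assumptions, not a fixed API.

```python
# Illustrative system prompt with clear instructions and a strict output format.
SYSTEM = (
    "You are a support-ticket classifier. "
    "Answer with exactly one word: positive, negative, or neutral."
)

# Few-shot examples: (input, desired output) pairs shown to the model.
FEW_SHOT = [
    ("The new dashboard is fantastic!", "positive"),
    ("My order arrived broken and nobody replies.", "negative"),
]

def build_messages(user_text: str) -> list[dict]:
    """Assemble a chat-style message list: system + few-shot pairs + real query."""
    messages = [{"role": "system", "content": SYSTEM}]
    for example_in, example_out in FEW_SHOT:
        messages.append({"role": "user", "content": example_in})
        messages.append({"role": "assistant", "content": example_out})
    messages.append({"role": "user", "content": user_text})
    return messages
```

The resulting list can be sent to any chat-style model endpoint; the point is that all the guidance lives in text, with no training involved.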
RAG: for dynamic data
If the model must answer with private or frequently updated information (product catalog, documentation, knowledge base), RAG is the right architecture. Setup cost: ~$20-100. Monthly: $100-500. Time: 1-2 weeks for a solid implementation.
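The core RAG loop is simple: score documents against the query, take the best matches, and stuff them into the prompt. The toy sketch below uses bag-of-words cosine similarity as a stand-in for a real embedding model and vector store.

```python
import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    q = Counter(query.lower().split())
    ranked = sorted(docs, key=lambda d: cosine(q, Counter(d.lower().split())),
                    reverse=True)
    return ranked[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    """Stuff the retrieved context into the prompt sent to the model."""
    context = "\n".join(retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

A production version swaps the scoring for embeddings and adds chunking, but the shape of the pipeline (retrieve, then prompt) stays the same.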
Fine-tuning: when you really need it
Fine-tuning earns its cost in three situations: style/voice (the model must write exactly like your brand), specialized domains (medical, legal, scientific jargon), and narrow complex tasks (highly specialized code, less common languages). Cost: $500-$50K depending on scale.
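Most of the fine-tuning effort is data preparation: pairs of (input, desired output) serialized as training examples. The sketch below emits JSONL in the common chat-messages convention; the exact schema varies by provider, so treat the field names as an assumption and check your provider's docs.

```python
import json

# Illustrative system prompt baked into every training example.
BRAND_VOICE = "You write in the brand's voice: short, warm, no jargon."

def to_jsonl(pairs: list[tuple[str, str]]) -> str:
    """Serialize (prompt, completion) pairs as chat-format JSONL training data."""
    lines = []
    for prompt, completion in pairs:
        record = {"messages": [
            {"role": "system", "content": BRAND_VOICE},
            {"role": "user", "content": prompt},
            {"role": "assistant", "content": completion},
        ]}
        lines.append(json.dumps(record))
    return "\n".join(lines)
```

Hundreds to thousands of such examples, reviewed by hand, matter far more than tweaking training hyperparameters.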
Hybrid approach
Many production solutions combine all three: a fine-tuned model for style/voice + RAG for up-to-date data + prompt engineering per request. Best of all three worlds; higher complexity, but a very competitive solution.
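The wiring of the hybrid can be sketched in a few lines: the fine-tuned model supplies the voice, a retriever supplies fresh context, and the per-request prompt ties them together. The model name and the retriever here are placeholders, not real identifiers.

```python
def hybrid_request(query: str, retriever, model: str = "ft:my-brand-voice") -> dict:
    """Assemble one request that combines all three techniques."""
    context = "\n".join(retriever(query))       # RAG: fresh/private data
    return {
        "model": model,                          # fine-tuned for style/voice
        "messages": [
            {"role": "system",                   # prompt engineering per request
             "content": f"Use only this context:\n{context}"},
            {"role": "user", "content": query},
        ],
    }
```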
Comparison
Prompt engineering. Time: hours.
RAG. Time: weeks.
Fine-tuning. Time: months.
Decision tree
1. Try prompt engineering first. Always. If it solves the problem, you're done.
2. If you need dynamic or private data: add RAG.
3. If RAG + prompt engineering still fall short on style or domain: consider fine-tuning.
4. Never start with fine-tuning. It's the most expensive option and usually unnecessary.
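The decision tree above is small enough to write down as code. The boolean inputs are assumptions about what you already know of your use case.

```python
def choose_approach(needs_private_data: bool, needs_style_or_domain: bool) -> list[str]:
    """Return the stack of techniques to adopt, cheapest first."""
    stack = ["prompt engineering"]      # step 1: always start here
    if needs_private_data:
        stack.append("RAG")             # step 2: dynamic/private data
    if needs_style_or_domain:
        stack.append("fine-tuning")     # step 3: only if still falling short
    return stack
```

Note that fine-tuning only ever appears at the end of the list, never as the starting point.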
Conclusion
The intuition that "you have to fine-tune your model" is usually wrong. Modern frontier models with good prompt engineering + RAG solve 90% of enterprise cases. Fine-tuning makes sense in the niche cases where the other two don't suffice. The skill is knowing which case you're in.