Queryloop

Latest from Queryloop

Stay updated with our latest research findings, product developments, and insights into AI optimization

Queryloop Team
March 3, 2025
5 min read
Product

Why Building Production-Grade RAG Applications Is So Hard

Learn why creating demo RAG applications is easy, but building production-grade systems is exponentially harder, and how Queryloop solves these challenges.

Creating a demo for Retrieval Augmented Generation (RAG) is easy, but building a production-grade app is 10x harder—if not more. For every blog or tutorial claiming you can launch a RAG app in under an hour, there are hundreds discussing the complexities of building LLM and RAG systems that reliably deliver acceptable accuracy, latency, and cost.

RAG
LLM
Optimization
Production
AI Applications
Queryloop
Zain ul Abideen
July 7, 2024
6 min read
Research

Align Phi3 with CPO-SimPO

Align your LLM with an approach that is more memory- and speed-efficient than DPO

Aligning LLMs for optimal performance typically starts with Supervised Fine-Tuning (SFT). The standard practice is to load the model in 4-bit mode and apply a configuration for LoRA (Low-Rank Adaptation) training. Direct Preference Optimization (DPO) is another prominent technique for optimizing models at lower cost. SFT and DPO are commonly coupled to further improve model performance, but this pipeline can be costly. Odds Ratio Preference Optimization (ORPO) collapses SFT+DPO into a single step with enhanced performance by adding an odds ratio-based penalty to the conventional negative log-likelihood (NLL) loss, differentiating the generation styles of favored and disfavored responses. Another technique for more stable training and improved performance is CPO-SimPO. It aims to counter SFT's dependency on training-data quality, DPO's memory and speed inefficiency (when maintaining both a parametrized and a reference policy), and the tendency to generate long but low-quality sequences. In this blog, I will introduce this technique in detail and train Phi3-Mini-4K-Instruct with CPO-SimPO.
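To make the combined objective concrete, here is a minimal, hedged sketch of a CPO-SimPO-style loss on a single preference pair. It assumes per-token log-probabilities from the policy only (no reference model, as in SimPO/CPO), combines SimPO's length-normalized reward with a target margin `gamma`, and adds CPO's NLL regularizer on the chosen response. The function name, signature, and default hyperparameters are illustrative, not the actual implementation used in the CPO-SimPO repository or TRL.

```python
import math

def cpo_simpo_loss(chosen_logps, rejected_logps,
                   beta=2.0, gamma=0.5, nll_weight=1.0):
    """Sketch of a CPO-SimPO-style objective for one preference pair.

    chosen_logps / rejected_logps: lists of per-token log-probabilities
    under the current policy for the favored and disfavored responses.
    No reference policy is needed, which is the memory saving over DPO.
    """
    # SimPO-style implicit rewards: length-normalized policy log-probs.
    r_chosen = beta * sum(chosen_logps) / len(chosen_logps)
    r_rejected = beta * sum(rejected_logps) / len(rejected_logps)

    # Bradley-Terry preference term with SimPO's target margin gamma:
    # -log sigmoid(r_chosen - r_rejected - gamma).
    margin = r_chosen - r_rejected - gamma
    preference_loss = math.log(1.0 + math.exp(-margin))

    # CPO-style NLL regularizer on the chosen response, discouraging
    # drift away from high-quality favored generations.
    nll_loss = -sum(chosen_logps) / len(chosen_logps)

    return preference_loss + nll_weight * nll_loss
```

As expected, the loss is small when the favored response is much more likely than the disfavored one, and grows when the ranking is inverted; the length normalization is what penalizes long but low-probability (low-quality) chosen sequences.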

AI
Machine Learning
Deep Learning
Optimization
CPO
SimPO