Adapting Large Language Models: In-Context Learning, Fine-Tuning, and RLHF
In-Context Learning and Indexing
Modern generative Large Language Models (LLMs) are capable of in-context learning: they can perform new tasks without any weight updates. Given a few examples in the input prompt, the model infers the desired pattern and generates responses that follow it. This approach is particularly useful when the model's internals are inaccessible, for example when the model is reachable only through an API.
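As a minimal sketch of the idea, a few-shot prompt can be assembled by placing labeled demonstrations ahead of the new input. The helper name `build_few_shot_prompt`, the `Input:`/`Output:` template, and the sentiment examples are illustrative assumptions, not a format required by any particular model or API.

```python
def build_few_shot_prompt(examples, query, instruction=""):
    """Assemble a few-shot prompt from (input, output) demonstration pairs."""
    parts = [instruction] if instruction else []
    for text, label in examples:
        parts.append(f"Input: {text}\nOutput: {label}")
    # The final block leaves the output blank for the model to complete.
    parts.append(f"Input: {query}\nOutput:")
    return "\n\n".join(parts)


demos = [
    ("The plot was gripping from start to finish.", "positive"),
    ("I want my two hours back.", "negative"),
]
prompt = build_few_shot_prompt(
    demos,
    "A charming, well-acted film.",
    instruction="Classify the sentiment of each review.",
)
print(prompt)
```

The resulting string would be sent to the model as-is; no parameters change, so swapping the demonstrations is enough to target a different task.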
A related concept is prompt modification. Hard prompting means manually editing the input tokens to steer the output; because every variant is written and evaluated by hand, it is labor-intensive and unlikely to find the best prompt. Soft prompting (or prompt tuning), by contrast, prepends trainable continuous embeddings to the input and optimizes them algorithmically. It is a parameter-efficient alternative, though it may struggle to adapt the model to complex tasks.
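The mechanics of soft prompting can be sketched with plain arrays: a small block of trainable vectors is prepended to the embedded input tokens. The sizes, the random stand-in embedding table, and the function name `embed_with_soft_prompt` are assumptions for illustration; in practice the embedding table would come from a pretrained model and only `soft_prompt` would receive gradient updates.

```python
import numpy as np

rng = np.random.default_rng(0)

vocab_size, embed_dim, num_virtual_tokens = 100, 16, 4

# Frozen token-embedding table (random stand-in for a pretrained model's).
token_embeddings = rng.normal(size=(vocab_size, embed_dim))

# Trainable soft prompt: the only parameters prompt tuning would update.
soft_prompt = rng.normal(scale=0.02, size=(num_virtual_tokens, embed_dim))


def embed_with_soft_prompt(input_ids):
    """Look up token embeddings and prepend the soft prompt vectors."""
    tokens = token_embeddings[np.asarray(input_ids)]  # (seq_len, embed_dim)
    return np.concatenate([soft_prompt, tokens], axis=0)


inputs = embed_with_soft_prompt([5, 17, 42])
print(inputs.shape)  # (7, 16): 4 virtual tokens + 3 real tokens
```

Here the trainable state is 4 x 16 = 64 numbers, regardless of how large the underlying model is.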
Indexing, commonly associated with Retrieval-Augmented Generation (RAG), extends in-context learning by turning the LLM into an information retrieval engine over external data. External documents are split into chunks, converted into vector embeddings, and stored in a vector database. When a query arrives, the system embeds it, computes its similarity to the stored vectors, and retrieves the top-k matches, which are then added to the prompt as context for the LLM's response.
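The retrieval step can be sketched as follows. This is a toy illustration under stated assumptions: the `embed` function is a simple bag-of-words counter standing in for a real embedding model, an in-memory array stands in for the vector database, and the chunk texts and prompt template are made up for the example.

```python
import numpy as np

chunks = [
    "LoRA adds low-rank matrices to frozen weights",
    "RLHF trains a reward model from human rankings",
    "Vector databases store document embeddings",
]

# Toy vocabulary built from the chunks (stand-in for a learned embedding model).
vocab = sorted({word for chunk in chunks for word in chunk.lower().split()})
index_of = {word: i for i, word in enumerate(vocab)}


def embed(text):
    """Bag-of-words count vector, L2-normalized; unknown words are ignored."""
    vec = np.zeros(len(vocab))
    for word in text.lower().split():
        if word in index_of:
            vec[index_of[word]] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec


# The "vector database": one unit-length row per chunk.
index = np.stack([embed(chunk) for chunk in chunks])


def retrieve(query, k=2):
    """Return the k chunks with the highest cosine similarity to the query."""
    scores = index @ embed(query)  # dot product of unit vectors = cosine
    top = np.argsort(scores)[::-1][:k]
    return [(chunks[i], float(scores[i])) for i in top]


query = "how does a reward model use human rankings"
results = retrieve(query, k=2)

# Retrieved chunks are placed in the prompt as context for the LLM.
context = "\n".join(text for text, _ in results)
prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
print(prompt)
```

A production system would swap in a trained embedding model and an approximate nearest-neighbor index, but the flow (embed, score, take top-k, build the prompt) stays the same.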
Three Feature-Based and Fine-Tuning Strategies
When full access to the model is available, adapting the LLM using domain-specific data typically yields superior results. Three primary methodologies exist for this adaptation, applicable to both encoder and decoder architectures.
Embedding Extraction (Feature-Based Approach)
This method utilizes the pre-trained LLM as a frozen feature extractor. The model processes the target dataset to generate output embeddings, which then serve as input features for a downstream classifier, such as a Random Forest or Logistic Regression model.
import numpy as np
import torch
from sklearn.ensemble import RandomForestClassifier
from transformers import AutoModel

backbone = AutoModel.from_pretrained('distilroberta-base')

# Tokenization steps omitted for brevity; tokenized_data is assumed to be a
# tokenized DatasetDict whose columns are returned as torch tensors

def compute_embeddings(data_batch):
    # No gradients are needed because the backbone stays frozen
    with torch.no_grad():
        outputs = backbone(
            input_ids=data_batch['input_ids'],
            attention_mask=data_batch['attention_mask']
        )
    # Hidden state of the first token serves as the sequence embedding
    cls_embeddings = outputs.last_hidden_state[:, 0, :]
    return {'embedding_vectors': cls_embeddings}

vectorized_dataset = tokenized_data.map(compute_embeddings, batched=True, batch_size=16)

train_x = np.array(vectorized_dataset['train']['embedding_vectors'])
train_y = np.array(vectorized_dataset['train']['targets'])
test_x = np.array(vectorized_dataset['test']['embedding_vectors'])
test_y = np.array(vectorized_dataset['test']['targets'])

# Train a conventional classifier on the frozen embeddings
rf_classifier = RandomForestClassifier(n_estimators=100)
rf_classifier.fit(train_x, train_y)
print(f'Test Score: {rf_classifier.score(test_x, test_y)}')
Classifier Head Training (Output Layer Updating)
Rather than training an external classifier, this strategy attaches a new classification head to the LLM. The base model's parameters remain frozen, and only the newly added output layers are trained. This mimics the feature-based approach but integrates the classifier training directly into the neural network pipeline.
from transformers import AutoModelForSequenceClassification
import pytorch_lightning as pl

classifier_model = AutoModelForSequenceClassification.from_pretrained(
    'distilroberta-base', num_labels=2
)

# Freeze the entire backbone
for param in classifier_model.base_model.parameters():
    param.requires_grad = False

# Enable gradients for the classification head only
for param in classifier_model.classifier.parameters():
    param.requires_grad = True

# Training loop
# pl.Trainer expects a LightningModule: lit_model is a hypothetical name for a
# LightningModule wrapping classifier_model (wrapper and data loaders omitted)
trainer = pl.Trainer(max_epochs=5)
trainer.fit(lit_model, train_dataloaders=train_loader, val_dataloaders=val_loader)
trainer.test(lit_model, dataloaders=test_loader)
Full Network Training (All Layers Updating)
Updating all parameters of the LLM represents the gold standard for maximizing performance, especially when the target domain diverges significantly from the pretraining data. While computationally expensive, unfreezing the entire network allows the model to deeply internalize the nuances of the new task.
from transformers import AutoModelForSequenceClassification
import pytorch_lightning as pl

full_model = AutoModelForSequenceClassification.from_pretrained(
    'distilroberta-base', num_labels=2
)

# Ensure all parameters are trainable (already the default after loading)
for param in full_model.parameters():
    param.requires_grad = True

# Training loop
# pl.Trainer expects a LightningModule: lit_model is a hypothetical name for a
# LightningModule wrapping full_model (wrapper and data loaders omitted)
trainer = pl.Trainer(max_epochs=5)
trainer.fit(lit_model, train_dataloaders=train_loader, val_dataloaders=val_loader)
trainer.test(lit_model, dataloaders=test_loader)
Parameter-Efficient Fine-Tuning (PEFT)
Full network training demands immense computational resources. PEFT techniques enable adapting massive models by updating only a tiny fraction of parameters, yielding five core benefits: reduced computational overhead, faster training cycles, lower hardware barriers, mitigated catastrophic forgetting, and efficient storage sharing across tasks.
Libraries like Hugging Face PEFT facilitate these strategies, supporting methods such as Low-Rank Adaptation (LoRA), Prefix Tuning, P-Tuning, and Prompt Tuning. Instead of modifying all weights, these approaches introduce small, trainable auxiliary modules or prefixes across various layers, achieving high performance at a fraction of the cost.
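To make the idea of a small trainable auxiliary module concrete, the following sketch shows the arithmetic behind LoRA on a single linear layer, written with plain arrays rather than the Hugging Face PEFT API. The layer size, rank, scaling value, and the name `lora_forward` are assumptions chosen for illustration, and the weight matrix is random rather than pretrained.

```python
import numpy as np

rng = np.random.default_rng(0)

d_out, d_in, rank, alpha = 64, 64, 4, 8

# Frozen weight matrix (random stand-in for a pretrained layer).
W = rng.normal(size=(d_out, d_in))

# Trainable low-rank factors. B starts at zero, so the adapted layer
# initially reproduces the frozen layer exactly.
A = rng.normal(scale=0.01, size=(rank, d_in))
B = np.zeros((d_out, rank))


def lora_forward(x):
    """Frozen projection x W^T plus the scaled low-rank update x (B A)^T."""
    base = x @ W.T
    update = (x @ A.T) @ B.T
    return base + (alpha / rank) * update


x = rng.normal(size=(2, d_in))
y = lora_forward(x)

full_params = W.size            # 64 * 64 = 4096
lora_params = A.size + B.size   # 4 * 64 + 64 * 4 = 512
print(f"trainable fraction: {lora_params / full_params:.3f}")  # 0.125
```

Only `A` and `B` would be updated during training. The fraction shrinks further as layers grow, since the adapter scales with rank x (d_in + d_out) while the full matrix scales with d_in x d_out.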
Reinforcement Learning from Human Feedback (RLHF)
RLHF aligns LLMs with human preferences using a combination of supervised and reinforcement learning. Popularized by InstructGPT and ChatGPT, the process starts from a model that has been fine-tuned with supervision, then collects human rankings of different model outputs for the same prompt. These rankings train a separate reward model, which automates the scoring of LLM responses. The LLM is then optimized with Proximal Policy Optimization (PPO), using the reward model's scores as the reward signal. This indirect approach removes the bottleneck of requiring real-time human feedback during training.
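The reward-model step can be illustrated with a pairwise preference loss, one common formulation in which a ranking is broken into (preferred, rejected) pairs and the model is penalized when the preferred response does not score higher. The response names and reward values below are hypothetical placeholders; a real reward model would produce the scores from the prompt and response text.

```python
import math
from itertools import combinations


def ranking_to_pairs(ranked_responses):
    """Turn a best-to-worst ranking into (preferred, rejected) pairs."""
    return list(combinations(ranked_responses, 2))


def pairwise_loss(reward_preferred, reward_rejected):
    """-log(sigmoid(r_preferred - r_rejected)), written as log(1 + exp(-diff))."""
    diff = reward_preferred - reward_rejected
    return math.log1p(math.exp(-diff))


ranking = ["response_a", "response_b", "response_c"]  # best to worst
pairs = ranking_to_pairs(ranking)

# Hypothetical scalar scores a reward model might assign.
rewards = {"response_a": 1.5, "response_b": 0.2, "response_c": -0.7}

mean_loss = sum(
    pairwise_loss(rewards[preferred], rewards[rejected])
    for preferred, rejected in pairs
) / len(pairs)
print(f"{len(pairs)} pairs, mean loss = {mean_loss:.3f}")
```

The loss falls as the gap between preferred and rejected scores widens, so minimizing it pushes the reward model to reproduce the human ordering. The trained scores then serve as the reward signal during PPO.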