TECHNIQUE
Model Adaptation
| Name | Kind | When | Maturity |
|---|---|---|---|
| DPO via TRL | library | pairwise preference data exists; simpler and more stable than RLHF | established |
| KTO | pattern | only thumbs-up/down signals exist, not pairwise comparisons | emerging |
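
The "When" column turns on the shape of the feedback data, so a small sketch may help. The block below is illustrative only: it shows one plausible record shape for each case and computes the standard per-pair DPO loss from summed sequence log-probabilities. The field names, example strings, numbers, and the `dpo_loss` helper are assumptions made for illustration. It is not TRL's implementation and does not call TRL.

```python
import math

# Hypothetical record shapes (field names are illustrative assumptions).
# Pairwise preference data: two completions ranked against each other.
pairwise_example = {
    "prompt": "Summarize the ticket.",
    "chosen": "Customer cannot log in after password reset.",
    "rejected": "There is a ticket.",
}
# Unary feedback: one completion with a thumbs-up/down label.
unary_example = {
    "prompt": "Summarize the ticket.",
    "completion": "Customer cannot log in after password reset.",
    "label": True,  # True = thumbs-up, False = thumbs-down
}


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-pair DPO loss: -log(sigmoid(beta * (chosen_margin - rejected_margin))).

    Each argument is the summed log-probability of a completion under the
    trainable policy or the frozen reference model.
    """
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    x = beta * (chosen_margin - rejected_margin)
    # Numerically stable form of -log(sigmoid(x)).
    if x >= 0:
        return math.log1p(math.exp(-x))
    return -x + math.log1p(math.exp(x))


# Made-up log-probabilities: the policy prefers the chosen completion
# slightly more than the reference does, so the loss drops below log(2).
loss = dpo_loss(-10.0, -12.0, -11.0, -11.0, beta=0.1)
```

The loss needs both a chosen and a rejected completion for the same prompt, which is why records shaped like `unary_example` point toward KTO instead.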
No published applications observed using this technique yet.
Teardown coverage accumulates over time: the taxonomy is the map, and the count is an honest record of how much of it has been covered so far.