
Stories by Research Graph on Medium

Author: Xuzeng He

Supervised Fine-tuning, Reinforcement Learning from Human Feedback and the latest SteerLM

Author · Xuzeng He (ORCID: 0009-0005-7317-7426)

Introduction

Large Language Models (LLMs), typically trained on extensive text data, demonstrate remarkable capabilities across a wide range of tasks, often with state-of-the-art performance. However, users increasingly want models tailored to their own needs rather than a general-purpose solution.