
Stories by Research Graph on Medium


Supervised Fine-tuning, Reinforcement Learning from Human Feedback and the latest SteerLM

Author · Xuzeng He (ORCID: 0009-0005-7317-7426)

Introduction

Large Language Models (LLMs), typically trained on extensive text data, can achieve state-of-the-art performance across a wide variety of tasks. However, users increasingly want a model tailored to their own needs rather than a general-purpose solution.