How was GPT-3 trained by OpenAI? How do I train GPT-3 on data and content specific to my company?

10 min read · May 26, 2023

OpenAI trained the GPT-3.5 model (a pre-trained Transformer model) and built ChatGPT from it using supervised fine-tuning and Reinforcement Learning from Human Feedback (RLHF).

To create a reward model for reinforcement learning, the OpenAI team randomly selected a model-written message during training, sampled several alternative completions, and had human AI trainers rank them. Using these reward models, OpenAI fine-tuned the model with Proximal Policy Optimization (PPO). OpenAI performed several iterations of this process.
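To make the two steps above concrete, here is a minimal sketch in plain Python. It is not OpenAI's code. It assumes the pairwise ranking loss commonly used for reward models (negative log-sigmoid of the reward difference between a preferred and a rejected completion) and the standard clipped surrogate term from PPO. The names `ranking_to_pairs`, `pairwise_ranking_loss`, `ppo_clipped_objective`, and `toy_reward` are hypothetical and chosen only for illustration; a real reward model would be a neural network, not a hand-written function.

```python
import math
from itertools import combinations


def ranking_to_pairs(ranked_completions):
    """Turn a best-to-worst ranking into (preferred, rejected) pairs.

    combinations() keeps input order, so the first item of each pair
    is always the one the human trainer ranked higher.
    """
    return list(combinations(ranked_completions, 2))


def pairwise_ranking_loss(reward_fn, pairs):
    """Average of -log(sigmoid(r_preferred - r_rejected)) over all pairs.

    The loss is small when the reward model scores preferred completions
    higher than rejected ones, and large when it disagrees with the humans.
    """
    total = 0.0
    for preferred, rejected in pairs:
        x = reward_fn(rejected) - reward_fn(preferred)  # negative margin
        # Numerically stable softplus(x) == -log(sigmoid(-x))
        total += max(x, 0.0) + math.log1p(math.exp(-abs(x)))
    return total / len(pairs)


def ppo_clipped_objective(prob_ratio, advantage, clip_eps=0.2):
    """Clipped surrogate term from PPO for a single sample.

    prob_ratio is new_policy_prob / old_policy_prob for the sampled action.
    Clipping keeps one update from moving the policy too far.
    """
    clipped_ratio = max(1.0 - clip_eps, min(prob_ratio, 1.0 + clip_eps))
    return min(prob_ratio * advantage, clipped_ratio * advantage)


def toy_reward(text):
    """Hypothetical stand-in for a learned reward model (assumption)."""
    return len(text) / 10.0


if __name__ == "__main__":
    # A trainer ranked three sampled completions from best to worst.
    ranking = [
        "a detailed, helpful answer",
        "a short answer",
        "no",
    ]
    pairs = ranking_to_pairs(ranking)
    print("pairs:", len(pairs))
    print("ranking loss:", pairwise_ranking_loss(toy_reward, pairs))
    print("ppo term:", ppo_clipped_objective(prob_ratio=1.5, advantage=1.0))
```

In an actual RLHF pipeline, the ranking loss would be minimized by gradient descent over the reward model's parameters, and the PPO term would be averaged over many sampled tokens and maximized with respect to the policy (the language model).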

How should AI systems behave, and who should decide?

We're clarifying how ChatGPT's behavior is shaped and our plans for improving that behavior, allowing more user…

openai.com

ChatGPT is fine-tuned from a model in the GPT-3.5 series, which finished training in early 2022. ChatGPT were trained on an…

Written by Girish Kurup