Training Large Language Models to Reason in a Continuous Latent Space
Paper: arXiv:2412.06769
This checkpoint was reproduced using the code provided in the facebookresearch/coconut repository and the experimental settings described in the paper *Training Large Language Models to Reason in a Continuous Latent Space*. Please refer to those sources for further details on the methodology and configuration used in this experiment.
Base model: openai-community/gpt2