Text Synth - Technical Notes
Text Synth is built using the GPT-2 language model released by OpenAI. It is a neural network of 1.5 billion parameters based on the Transformer architecture.
GPT-2 was trained to predict the next word on a large dataset of 40 GB of internet text.
- Model: select the model size. The medium model (345M parameters) is fast but generates slightly less accurate results than the large one (1558M parameters).
- Top-k, top-p, temperature: sampling parameters. They determine how the next token is selected from the probabilities computed by the model. Top-k selects the next token among the k most probable ones. Lowering it is useful when a precise answer to a question is needed. The downside is that the text tends to be repetitive. Too large a value tends to give bad results because the model generates very unlikely tokens. Top-p is quite similar, but selects the next token among the smallest set of most probable ones whose cumulative probability is larger than p. For most uses, there is no need to change the temperature. More information is available in this article.
- Random seed: the next token is selected using a deterministic random number generator, which is initialized with the random seed. Hence the same text is generated for a given seed. The default value 0 means that a new random seed is used for each try.
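As a rough illustration of how the sampling parameters and the random seed interact, here is a minimal Python sketch. It is not the Text Synth code (which is written in C). The function names sample_next_token and make_rng, the default values, and the order in which the filters are applied are all assumptions made for the example.

```python
import math
import random
import time


def sample_next_token(logits, top_k=40, top_p=0.9, temperature=1.0, rng=None):
    """Pick one token index from raw model scores (hypothetical helper)."""
    rng = rng or random.Random()

    # Temperature rescales the scores: values below 1 sharpen the
    # distribution, values above 1 flatten it.
    scaled = [x / temperature for x in logits]

    # Softmax, shifted by the maximum for numerical stability.
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Rank token indices from most to least probable.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    # Top-k: keep only the k most probable tokens.
    ranked = ranked[:top_k]

    # Top-p: keep the shortest prefix whose cumulative probability reaches p.
    # Assumption: p is measured on the full distribution, after top-k.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # Draw among the surviving tokens; choices() renormalizes the weights.
    return rng.choices(kept, weights=[probs[i] for i in kept], k=1)[0]


def make_rng(seed=0):
    """Seed 0 is taken to mean 'use a new seed for each try' (assumption)."""
    if seed == 0:
        seed = time.time_ns()
    return random.Random(seed)


if __name__ == "__main__":
    toy_logits = [2.0, 1.0, 0.5, -1.0, -3.0]  # made-up scores for 5 tokens
    rng = make_rng(1234)
    print([sample_next_token(toy_logits, top_k=3, rng=rng) for _ in range(10)])
```

With a fixed non-zero seed, two runs over the same scores and parameters return the same sequence of tokens. Setting top_k to 1 always returns the most probable token, which shows in miniature why low values give precise but repetitive text.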
This implementation is original because instead of using a GPU, it runs using only 4 cores of a Xeon E5-2640 v3 CPU at 2.60GHz. With a single user, it generates 10 tokens per second. It is programmed in plain C using the LibNC library. A standalone command line version can be downloaded here.
Thanks to OpenAI for providing their GPT-2 model.
News:
- 2020-08-09: the size of the model, the sampling parameters (top-k, top-p, temperature) and the random number seed can now be modified.