Text Model progress is going strong!

As previously announced, we are now training our own large language models from scratch.
Since this is not an easy endeavor, we are starting out with a small proof-of-concept model.
This allows us to identify and fix various issues with the training methodology and data, without losing a large amount of time and compute when things go wrong.
So far, this approach has paid off: we have fixed a number of issues while training the small model, one of which required restarting the training run. In terms of compute alone, the full pretraining run for this small model takes about five days on our recently completed cluster.
We have now finished the pretraining phase with very promising results (a 73% LAMBADA score, with other evals close to or beyond GPT-NeoX 20B) and have moved on to tuning the model for a larger 8192-token context size.
During this tuning phase, we identified some further data issues, which have since been addressed so that training could proceed.
We plan to release this smaller proof-of-concept model in the near future, so people can get a first impression of the kind of results our training methodology can achieve while we begin working on a larger, more powerful model.
That’s all the news for now… except, we have a new mascot introduction coming up:
Ladies and gentlemen, gather ’round and brace yourselves for a mind-bending encounter like no other!

Prepare to delve into the boundless realms of innovation and imagination as we unveil to you the one, the only… Shoggy! But fear not, for this Shoggoth is not of the eldritch horror variety that might send shivers down your spine.
No, no! This Shoggoth, affectionately known as “Shoggy,” is the extraordinary visual representation of NovelAI’s cutting-edge H100 supercompute cluster.
Using Shoggy as a conduit for inspiration, we shape our innovations:
Whether you’re a writer seeking the perfect plot twist or an artist yearning for a stroke of genius, this magnificent amalgamation of data and dreams is here to shape our work into new fantastical text and image models.
Let’s see who our Shoggy shapes first…
