Introducing Dynamo 8B, an 8.2-billion-parameter language model that demonstrates leading performance on multilingual benchmarks
Model specifications:
- Parameters: 8.2 billion
- Base architecture: Mistral-7B, with an extended multilingual tokenizer
- Continued pretraining: an additional 210B tokens of multilingual data
- Pretraining sequence length: 4096 tokens (on the multilingual dataset)
Even with the constant stream of exciting updates in the LLM space, non-English languages have seen relatively little investment, resulting in a performance gap between English and other languages for open-source language models. We built Dynamo 8B to address this gap in multilingual LLM offerings.
We are excited about the downstream applications that Dynamo 8B will support. AI teams today are struggling to address unsafe user queries and LLM outputs, which creates major compliance challenges for enterprises looking to deploy the technology. Just as lightweight models like LlamaGuard and Phi-2 were developed to act as guardrail models that regulate LLM inputs and outputs, we are excited for Dynamo 8B to similarly enable safe and compliant usage of LLMs globally across a diverse set of languages.
That is why we are releasing Dynamo 8B in connection with the launch of DynamoGuard, a software platform for training guardrail models to enforce custom AI governance policies in enterprise LLMs. We envision leveraging Dynamo 8B to expand DynamoGuard’s capabilities to enable safer and more compliant LLM usage across the globe. Dynamo 8B's ability to achieve leading multilingual performance empowers AI teams to more safely deploy AI systems to their international users, employees, and partners while promoting adherence to emerging AI governance standards.
Dynamo 8B improves upon the Mistral-7B architecture for the purpose of multilingual language modeling. It includes an extended tokenizer that was pretrained to better leverage tokens in different languages. The tokenizer was extended by training a SentencePiece BPE tokenizer on select languages (roughly 200M tokens per language) and merging in the merges and vocabulary entries not already present in the Mistral tokenizer. After the tokenizers were merged, the model was pretrained on an additional 210B tokens of multilingual data, including German, Spanish, Korean, Italian, and Turkish text. The pretraining dataset also incorporated English tokens to mitigate catastrophic forgetting.
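For readers who want to experiment with a similar recipe, here is a minimal sketch of the tokenizer-extension step using the sentencepiece and Hugging Face transformers libraries. It is illustrative only: the corpus file, vocabulary size, single-language example, and the use of add_tokens (a simplification of merging the underlying BPE merges directly, as described above) are assumptions for the sketch, not the exact procedure used to train Dynamo 8B.

```python
# Illustrative sketch of extending a base tokenizer with new multilingual
# pieces. Paths, sizes, and the single-language example are placeholders.
import sentencepiece as spm
from transformers import AutoModelForCausalLM, AutoTokenizer

# 1) Train a SentencePiece BPE tokenizer on a target-language corpus
#    (the recipe above used roughly 200M tokens per language).
spm.SentencePieceTrainer.train(
    input="korean_corpus.txt",   # placeholder corpus file
    model_prefix="ko_bpe",
    vocab_size=16000,            # illustrative vocabulary size
    model_type="bpe",
)

# 2) Load the base Mistral tokenizer and keep only the pieces it lacks.
base_tok = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")
sp = spm.SentencePieceProcessor(model_file="ko_bpe.model")
base_vocab = base_tok.get_vocab()
new_pieces = [
    sp.id_to_piece(i)
    for i in range(sp.get_piece_size())
    if sp.id_to_piece(i) not in base_vocab
]

# 3) Add the new pieces and resize the embedding matrix so continued
#    pretraining can learn representations for the new tokens.
num_added = base_tok.add_tokens(new_pieces)
model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")
model.resize_token_embeddings(len(base_tok))
print(f"Added {num_added} tokens; new vocabulary size: {len(base_tok)}")
```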
Dynamo 8B has not been instruction fine-tuned and has not undergone alignment using techniques like reinforcement learning from human feedback (RLHF). The intention behind this model is to provide the research community with a base model for exploring the multilingual capabilities that enable widespread use of LLMs globally.
In our recent evaluation, we used several multilingual benchmarks to assess the model's capabilities: PAWS-X, XCOPA, and XStoryCloze, all available through EleutherAI's lm-evaluation-harness. All runs were performed with 32-bit precision. Briefly, PAWS-X tests cross-lingual paraphrase identification, XCOPA tests causal commonsense reasoning across languages, and XStoryCloze tests commonsense story completion in multiple languages.
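Evaluations like these can be reproduced with a short script against the harness. The sketch below assumes a recent (v0.4+) lm-evaluation-harness install; the repo id is illustrative, and task names can vary slightly between harness versions.

```python
# Sketch of reproducing the multilingual evaluations with EleutherAI's
# lm-evaluation-harness (assumes v0.4+, installed via `pip install lm-eval`).
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    # Illustrative repo id; dtype=float32 mirrors the 32-bit runs noted above.
    model_args="pretrained=dynamofl/dynamo-8b,dtype=float32",
    tasks=["pawsx", "xcopa", "xstorycloze"],  # task names in recent harness versions
    batch_size=8,
)
print(results["results"])
```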
Our released Dynamo 8B model is available on Hugging Face.
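As a quick illustration, loading the checkpoint with the transformers library might look like the sketch below. The repo id and generation settings are placeholders (use the exact id from the model card), and since Dynamo 8B is a base model rather than an instruction-tuned one, it is prompted with a plain completion.

```python
# Illustrative example of loading Dynamo 8B with Hugging Face transformers.
# The repo id below is a placeholder; use the exact id from the model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "dynamofl/dynamo-8b"  # placeholder repo id
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id, torch_dtype=torch.float16, device_map="auto"
)

# Base-model completion prompt (the model is not instruction-tuned).
prompt = "Berlin ist die Hauptstadt von"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=20, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```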
To learn how to leverage Dynamo 8B for your AI solutions, schedule your free demo.