

Apr 26, 2023

Dolly 2.0 - open source language model with ChatGPT-like interactivity

Written By:

Alex McKinney


Dolly 2.0, an open-source large language model (LLM) that delivers ChatGPT-like instruction-following interactivity, is now available to run as a Paperspace Gradient Notebook, powered by Graphcore IPUs.

The 12 billion parameter model is based on EleutherAI’s Pythia. 

The trained weights, source code, and dataset for Dolly 2.0 have been released under an open-source and commercial-use license, making it the first truly open, instruction fine-tuned LLM. Prior models are subject to more stringent licensing, making them unusable for commercial applications.

Training LLMs for human-computer interaction 

Attempting to elicit answers from LLMs that haven’t been appropriately fine-tuned requires prompt engineering to produce consistently useful responses – an experience that can be frustrating for users. This is because base LLMs are trained simply to predict the next token, which does not necessarily correlate with a good, or even correct, response. 

Fine-tuning on an instruction-following dataset makes the model more suited to human interaction: simply ask the model a question and get a response back. This makes it ideal for Q&A applications. 
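In practice, “instruction-following” means the model has been fine-tuned on prompts wrapped in a fixed template, so at inference time a plain question is simply slotted into that template. The sketch below assumes an Alpaca-style wrapper of the kind Dolly-family models commonly use; the exact wording in Dolly 2.0’s pipeline may differ.

```python
# Illustrative sketch only: an Alpaca-style instruction template of the kind
# Dolly-family models are fine-tuned on. The exact wording Dolly 2.0 uses may differ.
INSTRUCTION_TEMPLATE = """Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Response:
"""

def build_prompt(instruction: str) -> str:
    """Wrap a plain question in the instruction format the model was tuned on."""
    return INSTRUCTION_TEMPLATE.format(instruction=instruction)

print(build_prompt("What is the capital of France?"))
```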

LLMs and commercial restrictions 

Dolly 2.0’s predecessor Dolly 1.0 was trained using the Stanford Alpaca dataset. Alpaca, in turn, uses some outputs from OpenAI’s ChatGPT. As a result, Dolly 1.0 was bound by ChatGPT’s licence restrictions regarding commercial use – preventing Dolly users from building products and services around the model. 

Dolly 1.0 is not alone in this respect. Similar limitations affect many recently released instruction-following LLMs, including Koala, GPT4All, and Vicuna.

Generating an original dataset

To address these problems, the Databricks team needed to generate a corpus of training data, written by humans, of a similar size to the 13,000 prompt-response pairs used by OpenAI to train InstructGPT, a sibling model to ChatGPT. 

The company turned to its 5,000 employees, gamifying the process of creating training data by running a contest to write the instruction and response pairs. Prizes were offered for the top 20 labelers, across seven specific dataset tasks. Using this approach, combined with a competitive leaderboard, Databricks managed to create a dataset with more than 15,000 instruction-response pairs. 

The dataset tasks encompass the following (from Databricks’ blog), with a sketch of a single dataset record shown after the list: 

Open Q&A: For instance, “Why do people like comedy movies?” or “What is the capital of France?” In some cases, there’s not a correct answer, and in others, it requires drawing on knowledge of the world at large. 

Closed Q&A: These are questions that can be answered using only the information contained in a passage of reference text. For instance, given a paragraph from Wikipedia on the atom, one might ask, “What is the ratio between protons and neutrons in the nucleus?” 

Extract information from Wikipedia: Here an annotator would copy a paragraph from Wikipedia and extract entities or other factual information such as weights or measurements from the passage. 

Summarize information from Wikipedia: For this, annotators provided a passage from Wikipedia and were asked to distil it to a short summary. 

Brainstorming: This task asked for open-ended ideation and an associated list of possible options. For instance, “What are some fun activities I can do with my friends this weekend?”. 

Classification: For this task, annotators were asked to make judgments about class membership (e.g. are the items in a list animals, minerals or vegetables) or to judge the properties of a short passage of text, such as the sentiment of a movie review. 

Creative writing: This task would include things like writing a poem or a love letter. 
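Concretely, each entry in the resulting dataset pairs an instruction with a human-written response, plus an optional reference passage for the closed Q&A, extraction and summarisation tasks. The field names below reflect the published databricks-dolly-15k dataset as we understand it; treat this as an illustrative sketch rather than a specification.

```python
# Illustrative sketch of a single databricks-dolly-15k record.
# Field names (instruction, context, response, category) are our reading of the
# published dataset; treat them as assumptions, not a specification.
example_record = {
    "instruction": "What is the ratio between protons and neutrons in the nucleus?",
    "context": "A paragraph copied from the Wikipedia article on the atom ...",
    "response": "A human-written answer drawn only from the passage above ...",
    "category": "closed_qa",  # one of the seven task types listed above
}
```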

The resulting model, Dolly 2.0, is released under a licence that allows anyone to use it, modify it, and create a commercial application using it. 

It is worth noting that Dolly 2.0 is still under active development, so expect to see further versions with better performance in the near future.

Getting Started: How to use Dolly 2.0 on IPUs

You can try out Dolly 2.0 for free, using a Paperspace Gradient Notebook, powered by Graphcore IPUs.

The notebook takes you through the process of downloading the model weights, creating an inference pipeline, and querying Dolly 2.0 with instructions and questions. 
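As a rough illustration of what the notebook walks through, the snippet below shows the generic Hugging Face pipeline pattern for querying an instruction-tuned model such as Dolly 2.0. It is a sketch of the overall flow rather than the notebook’s exact code, which uses IPU-specific wrappers; the model name and generation settings are assumptions.

```python
# Minimal sketch of querying Dolly 2.0 with the Hugging Face transformers API.
# The Paperspace notebook uses IPU-specific wrappers; this generic version only
# shows the overall flow: download weights -> build a pipeline -> send a query.
import torch
from transformers import pipeline

generate = pipeline(
    model="databricks/dolly-v2-12b",  # 12B-parameter Dolly 2.0 checkpoint
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,           # loads the model's custom instruction pipeline
    device_map="auto",
)

result = generate("Explain the difference between open Q&A and closed Q&A.")
print(result[0]["generated_text"])
```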

Dolly 2.0 fits in our Paperspace free tier environment, using a Graphcore POD4 system. Users can also scale up to a paid POD16 system for faster inference. Follow the Paperspace Gradient link below to begin.

Dolly is fun to interact with and could easily be fine-tuned to take on a particular personality as it answers user questions. As is, it could be used with an appropriate safety filter for real-world information, or simply as a fun, on-brand character.

If you are interested in fine-tuning Dolly 2.0 on IPUs with your own datasets, please let us know using this form and we’ll help you get started.