---
# Thank you for contributing!
# In filling out this yaml file, please follow the criteria as described here:
# https://github.com/opening-up-chatgpt/opening-up-chatgpt.github.io/tree/main/projects#criteria
# You're free to build on this work and reuse the data. It is licensed under CC-BY 4.0, with the
# stipulation that attribution should come in the form of a link to http://opening-up-chatgpt.github.io
# and a citation to the paper in which the initial dataset & criteria were published:
# Liesenfeld, Andreas, Alianda Lopez, and Mark Dingemanse. 2023. “Opening up ChatGPT: Tracking Openness, Transparency, and Accountability in Instruction-Tuned Text Generators.” In CUI '23: Proceedings of the 5th International Conference on Conversational User Interfaces. July 19-21, Eindhoven. doi: 10.1145/3571884.3604316
system:
    name: Open Assistant
    link: https://open-assistant.io/
    type: text
    performanceclass: full
    basemodelname: Pythia 12B
    endmodelname: OpenAssistant Conversations
    endmodellicense: Apache 2.0
    releasedate: 2023-02
    notes:
org:
    name: LAION-AI
    link: https://open-assistant.io/
    notes:
# availability:
datasources_basemodel:
    class: open
    link: https://github.com/LAION-AI/Open-Assistant/tree/main/data/datasets
    notes: Datasets documented in detail and recipes for cleaning up and downloading provided in code notebooks.
datasources_endmodel:
    class: open
    link: https://huggingface.co/datasets/OpenAssistant/oasst1
    notes: OpenAssistant Conversations is 'a human-generated, human-annotated assistant-style conversation corpus consisting of 161443 messages distributed across 66497 conversation trees, in 35 different languages, annotated with 461292 quality ratings' (preprint)
weights_basemodel:
    class: open
    link: https://huggingface.co/OpenAssistant
    notes: Model weights in several variants downloadable through HuggingFace
weights_endmodel:
    class: closed
    link:
    notes: RLHF weights not separately released
trainingcode:
    class: open
    link: https://github.com/LAION-AI/Open-Assistant
    notes: Code includes guide for developers
# documentation
code:
    class: open
    link: https://projects.laion.ai/Open-Assistant/docs/intro
    notes: Separate website provides entry point to comprehensive documentation
architecture:
    class: open
    link: https://github.com/LAION-AI/Open-Assistant/tree/main/model
    notes: Instructions to tune the pipeline on training data
preprint:
    class: partial
    link: https://arxiv.org/abs/2304.07327
    notes: Preprint describes creation of OpenAssistant Conversations corpus for instruction tuning, but not the base LLM, hence partial.
paper:
    class: partial
    link: https://proceedings.neurips.cc/paper_files/paper/2023/hash/949f0f8f32267d297c2d4e3ee10a2e7e-Abstract-Datasets_and_Benchmarks.html
    notes: Preprint was published at NeurIPS 2023 (Datasets and Benchmarks track)
modelcard:
    class: partial
    link: https://huggingface.co/OpenAssistant
    notes: Various model cards exist
datasheet:
    class: partial
    link: https://docs.google.com/spreadsheets/d/1NYYa6vHiRnk5kwnyYaCT0cBO62--Tm3w4ihdBtp4ISk/edit?pli=1&gid=1537161081#gid=1537161081
    notes: Most data sets are linked and some contain a data sheet.
# access
package:
    class: open
    link:
    notes:
api:
    class: open
    link: https://projects.laion.ai/Open-Assistant/api
    notes:
metaprompt: closed
licenses:
    class: open
    link: https://projects.laion.ai/Open-Assistant/docs/faq#what-license-does-open-assistant-use
    notes: Apache 2.0