List of open LLMs on GitHub

  • Thread starter: AlexZir

Source: GitHub - eugeneyan/open-llms: 🤖 A list of open LLMs available for commercial use.

Open LLMs

These LLMs are licensed for use in commercial projects (e.g., Apache 2.0, MIT, OpenRAIL-M).

Language model | Release date | Page/Blog | License
T5 | 2019/10 | Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | Apache 2.0
UL2 | 2022/10 | UL2 20B: An Open Source Unified Language Learner | Apache 2.0
Cerebras-GPT | 2023/03 | Cerebras-GPT: A Family of Open, Compute-efficient, Large Language Models (Paper) | Apache 2.0
Open Assistant (Pythia family) | 2023/03 | Democratizing Large Language Model Alignment | Apache 2.0
Pythia | 2023/04 | Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling | Apache 2.0
Dolly | 2023/04 | Free Dolly: Introducing the World's First Truly Open Instruction-Tuned LLM | MIT
DLite | 2023/05 | Announcing DLite V2: Lightweight, Open LLMs That Can Run Anywhere | Apache 2.0
RWKV | 2021/08 | The RWKV Language Model (and my LM tricks) | Apache 2.0
GPT-J-6B | 2021/06 | GPT-J-6B: 6B JAX-Based Transformer | Apache 2.0
GPT-NeoX-20B | 2022/04 | GPT-NeoX-20B: An Open-Source Autoregressive Language Model | Apache 2.0
Bloom | 2022/11 | BLOOM: A 176B-Parameter Open-Access Multilingual Language Model | OpenRAIL-M v1
StableLM-Alpha | 2023/04 | Stability AI Launches the First of its StableLM Suite of Language Models | CC BY-SA-4.0
FastChat-T5 | 2023/04 | We are excited to release FastChat-T5: our compact and commercial-friendly chatbot! | Apache 2.0
h2oGPT | 2023/05 | Building the World’s Best Open-Source Large Language Model: H2O.ai’s Journey | Apache 2.0
MPT-7B | 2023/05 | Introducing MPT-7B: A New Standard for Open-Source, Commercially Usable LLMs | Apache 2.0, CC BY-SA-3.0
RedPajama-INCITE | 2023/05 | RedPajama-INCITE family of models including base, instruction-tuned & chat models | Apache 2.0
OpenLLaMA | 2023/05 | OpenLLaMA: An Open Reproduction of LLaMA | Apache 2.0
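
If you need to pick models by license, the list is easy to group in a few lines. Below is a rough Python sketch, not part of the original repository: the `MODELS` list and the `group_by_license` helper are my own hypothetical names, and the rows are just a handful copied from the list above.

```python
from collections import defaultdict

# A few rows copied from the list above: (model, release date, license).
# MODELS is a hypothetical name used only for this illustration.
MODELS = [
    ("T5", "2019/10", "Apache 2.0"),
    ("Dolly", "2023/04", "MIT"),
    ("Bloom", "2022/11", "OpenRAIL-M v1"),
    ("StableLM-Alpha", "2023/04", "CC BY-SA-4.0"),
    ("MPT-7B", "2023/05", "Apache 2.0, CC BY-SA-3.0"),
    ("OpenLLaMA", "2023/05", "Apache 2.0"),
]


def group_by_license(models):
    """Map each license name to the models released under it."""
    groups = defaultdict(list)
    for name, _released, licenses in models:
        # Some entries list several licenses separated by commas.
        for lic in licenses.split(","):
            groups[lic.strip()].append(name)
    return dict(groups)


if __name__ == "__main__":
    for lic, names in sorted(group_by_license(MODELS).items()):
        print(f"{lic}: {', '.join(names)}")
```

A model with two licenses (MPT-7B here) ends up in both groups, so check the model's own page to see which license covers the weights you actually plan to use.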

Open LLMs for code

Language model | Release date | Page/Blog | License
SantaCoder | 2023/01 | SantaCoder: don't reach for the stars! | OpenRAIL-M v1
StarCoder | 2023/05 | StarCoder: A State-of-the-Art LLM for Code; StarCoder: May the source be with you! | OpenRAIL-M v1
StarChat Alpha | 2023/05 | Creating a Coding Assistant with StarCoder | OpenRAIL-M v1
Replit Code | 2023/05 | Training a SOTA Code LLM in 1 week and Quantifying the Vibes - with Reza Shabani of Replit | CC BY-SA-4.0
CodeGen2 | 2023/04 | CodeGen2: Lessons for Training LLMs on Programming and Natural Languages | Apache 2.0
CodeT5+ | 2023/05 | CodeT5+: Open Code Large Language Models for Code Understanding and Generation | BSD-3-Clause

Open LLM datasets for pre-training

Name | Release date | Page/Blog | Dataset | Tokens (T) | License
starcoderdata | 2023/05 | StarCoder: A State-of-the-Art LLM for Code | starcoderdata | 0.25 | Apache 2.0
RedPajama | 2023/04 | RedPajama, a project to create leading open-source models, starts by reproducing LLaMA training dataset of over 1.2 trillion tokens | RedPajama-Data | 1.2 | Apache 2.0

P.S. Apologies for the plain-text layout; I couldn't get this to format as a proper table.
 