Open Source

StarCoder2

Provided by: BigCode
Framework: PyTorch

StarCoder2 is a large-scale open-source AI model developed by BigCode for code generation and comprehension tasks. Built with PyTorch and licensed under Apache 2.0, it supports multiple programming languages and is optimized for both code completion and code generation. The model is designed to help developers automate code writing, improve productivity, and power advanced programming assistance.
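
For orientation, below is a minimal code-completion sketch using the Hugging Face transformers library. The checkpoint name bigcode/starcoder2-7b, the bfloat16/device settings, and the prompt are illustrative assumptions, not details taken from this listing.

# Minimal code-completion sketch for StarCoder2 (assumed checkpoint and settings).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "bigcode/starcoder2-7b"  # assumed; smaller and larger variants also exist
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(
    checkpoint,
    torch_dtype=torch.bfloat16,  # half precision to reduce memory use
    device_map="auto",           # requires the accelerate package
)

# Give the model the start of a function and let it complete the body.
prompt = "def fibonacci(n: int) -> int:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Greedy decoding (do_sample=False) keeps completions deterministic, which is usually preferable for code.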

Model Performance Statistics


Released: May 12, 2024
Last Checked: Jul 20, 2025
Version: 2.0

Capabilities
  • Code Completion
  • Debugging
  • Doc Generation
Performance Benchmarks
HumanEval: 81.5%
MultiPL-E: 76.9%
Technical Specifications
Parameter Count: N/A
Training & Dataset

Dataset Used: The Stack v2, public GitHub repositories

Related AI Models

Discover similar AI models that might interest you


DeepSeek-Coder

DeepSeek AI

DeepSeek-Coder is a series of open-source code language models developed by DeepSeek AI using PyTorch. Trained from scratch on 2 trillion tokens (87% code, 13% natural language), the series ranges from 1.3B to 33B parameters with a 16K context window. It excels at project-level code completion and infilling, supports dozens of programming languages, and consistently leads open-source comparisons on benchmarks such as HumanEval, MultiPL-E, and MBPP. A minimal usage sketch follows this card.

Code Generation, Developer
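
For orientation, here is a minimal completion sketch for a small DeepSeek-Coder checkpoint via the Hugging Face transformers library. The model id deepseek-ai/deepseek-coder-1.3b-base and the trust_remote_code flag reflect common usage of this series and should be read as assumptions.

# Minimal completion sketch for DeepSeek-Coder (assumed model id and flags).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/deepseek-coder-1.3b-base"  # assumed 1.3B base checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

# Prompt with a comment and a function signature; the model fills in the body.
prompt = "# Implement quicksort in Python\ndef quicksort(arr):\n"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))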

Emu2-Chat

Beijing Academy of Artificial Intelligence (BAAI)

Emu2-Chat is a conversational multimodal AI model designed for engaging, context-aware chat interactions. It is optimized for natural language understanding and for generating human-like responses across a range of domains, making it well suited to chatbots, virtual assistants, and customer-support automation.

Multimodal, Conversational

GPT-Neo

EleutherAI

GPT-Neo is an open-source large language model developed by EleutherAI as an alternative to OpenAI’s GPT-3. Built on the Transformer architecture, it generates coherent, human-like text from a given prompt. It is trained on the Pile, a diverse, large-scale text corpus, which makes it capable of many NLP tasks such as text generation, summarization, translation, and question answering. GPT-Neo comes in several sizes, the most popular being the 1.3B and 2.7B parameter versions; a minimal usage sketch follows this card.

Natural Language Processing
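
For orientation, here is a minimal text-generation sketch using the 1.3B GPT-Neo checkpoint through the transformers pipeline API; the prompt and sampling settings are illustrative assumptions.

# Minimal text-generation sketch for GPT-Neo via the pipeline API.
from transformers import pipeline

generator = pipeline("text-generation", model="EleutherAI/gpt-neo-1.3B")
output = generator(
    "In a distant future, artificial intelligence",
    max_new_tokens=50,   # length of the continuation
    do_sample=True,      # sample rather than decode greedily
    temperature=0.9,     # mild randomness for varied text
)
print(output[0]["generated_text"])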