Ollama API

Ollama is an open-source framework for running large language models such as Llama 2, Mistral, and Qwen locally on your machine. It exposes a native REST API, along with an OpenAI-compatible Chat Completions endpoint, so developers can integrate LLMs into their applications without relying on external servers.

Endpoints

2

Last Checked

Jul 20, 2025

API Endpoints

Generates text using local language models

Full URL

http://localhost:11434/api/generate

Code Examples

curl -X POST 'http://localhost:11434/api/generate' \
  -H 'Content-Type: application/json' \
  -d '{"model": "llama2", "prompt": "Explain quantum computing basics", "stream": false}'

Parameters

{
  "model": "llama2",
  "prompt": "Explain quantum computing basics",
  "stream": false
}
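When `stream` is true, the server instead returns one JSON object per line, each carrying a `response` fragment and a `done` flag. A minimal sketch of reassembling the full text from such chunks (the sample chunk bytes here are illustrative, not a captured server response):

```python
import json

# Illustrative NDJSON chunks of the kind a streaming /api/generate call emits.
raw_stream = (
    b'{"response": "Quantum ", "done": false}\n'
    b'{"response": "computing...", "done": true}\n'
)

def collect(stream_bytes: bytes) -> str:
    """Concatenate the 'response' field of each streamed chunk until done."""
    parts = []
    for line in stream_bytes.splitlines():
        chunk = json.loads(line)
        parts.append(chunk["response"])
        if chunk["done"]:
            break
    return "".join(parts)

full_text = collect(raw_stream)  # "Quantum computing..."
```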

Example Response

{
  "done": true,
  "model": "llama2",
  "response": "Quantum computing uses qubits...",
  "created_at": "2023-07-18T16:00:00Z"
}
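The request above can be issued from Python with only the standard library. This is a sketch, not an official client: the `build_payload` and `generate` names are illustrative, and `generate` assumes a local `ollama serve` is running with the model already pulled.

```python
import json
from urllib import request

OLLAMA_URL = "http://localhost:11434/api/generate"  # default local port

def build_payload(model: str, prompt: str, stream: bool = False) -> dict:
    """Assemble the request body documented above."""
    return {"model": model, "prompt": prompt, "stream": stream}

def generate(model: str, prompt: str) -> str:
    """POST to the local Ollama server and return the generated text.

    Requires a running `ollama serve` with the model pulled locally.
    """
    body = json.dumps(build_payload(model, prompt)).encode("utf-8")
    req = request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with request.urlopen(req) as resp:
        return json.load(resp)["response"]
```

With `stream` left at `false`, the whole completion arrives in a single JSON object, so the `response` field can be read directly.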

Version

v1

Tags
llm, local-ai
Technical Details
Authentication: None
Response Formats: JSON
Availability: Global
Status: Published
Rate Limits: None (local execution)
Supported Languages: Python, JavaScript
Use Cases: private AI, local inference
Website: https://ollama.com/

Related APIs

Discover similar APIs that might interest you

API · freemium

Google Cloud Vision AI

Google Cloud Vision AI API provides powe...

Category
Machine Learning
Endpoints
2
computer-vision, image-recognition
API · open source

SpeechBrain

SpeechBrain is a comprehensive open-sour...

Category
Machine Learning
Endpoints
1
asr, speech-processing
API · open source

Haystack API

Haystack is a robust open-source Python/...

Category
Machine Learning
Endpoints
1
nlp, search