Ollama API

Ollama is an open-source framework for running large language models such as Llama 2, Mistral, and Qwen locally on your machine. It exposes a native REST API as well as an OpenAI-compatible Chat Completions endpoint, letting developers integrate LLMs into their applications without sending data to external servers.
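Since Ollama mirrors OpenAI's Chat Completions API, an OpenAI-style request can be sent straight to the local server. A minimal sketch using only the Python standard library, assuming Ollama's default port (11434) and its `/v1/chat/completions` compatibility endpoint:

```python
import json
import urllib.request

OLLAMA_BASE = "http://localhost:11434"  # Ollama's default port

def build_chat_request(model: str, user_message: str) -> dict:
    """Build an OpenAI-style Chat Completions payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "stream": False,
    }

def chat(model: str, user_message: str) -> str:
    """POST the payload to a locally running Ollama server and return
    the assistant's reply. Requires `ollama serve` to be running."""
    payload = build_chat_request(model, user_message)
    req = urllib.request.Request(
        f"{OLLAMA_BASE}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # OpenAI-compatible responses put the reply under choices[0].message.content
    return body["choices"][0]["message"]["content"]

# Usage (requires a running `ollama serve` with the model pulled):
#   print(chat("llama2", "Explain quantum computing basics"))
```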

Endpoints: 2

Last Checked: Jul 20, 2025

API Endpoints

Generates text using local language models.

Full URL

http://localhost:11434/api/generate

Code Examples

curl -X POST 'http://localhost:11434/api/generate' \
  -H 'Content-Type: application/json' \
  -d '{"model": "llama2", "prompt": "Explain quantum computing basics", "stream": false}'

Parameters

{
  "model": "llama2",
  "prompt": "Explain quantum computing basics",
  "stream": false
}
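With `"stream": true`, `/api/generate` instead returns newline-delimited JSON: one object per line, each carrying a `"response"` fragment, with `"done": true` on the final object. A sketch of reassembling the full completion from such a stream (the sample chunks below are illustrative, not captured output):

```python
import json

def join_stream(ndjson_text: str) -> str:
    """Reassemble the full completion from Ollama's streaming output.

    With "stream": true, /api/generate emits one JSON object per line;
    each carries a "response" fragment and the last has "done": true.
    """
    parts = []
    for line in ndjson_text.splitlines():
        if not line.strip():
            continue
        chunk = json.loads(line)
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):
            break
    return "".join(parts)

# Illustrative chunks in the shape a streaming call produces:
sample = "\n".join([
    '{"model": "llama2", "response": "Quantum ", "done": false}',
    '{"model": "llama2", "response": "computing...", "done": false}',
    '{"model": "llama2", "response": "", "done": true}',
])
# join_stream(sample) -> "Quantum computing..."
```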

Example Response

{
  "done": true,
  "model": "llama2",
  "response": "Quantum computing uses qubits...",
  "created_at": "2023-07-18T16:00:00Z"
}
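The request/response pair above can be exercised from Python with only the standard library. A minimal sketch, assuming a local Ollama server on the default port:

```python
import json
import urllib.request

def generate(model: str, prompt: str,
             base_url: str = "http://localhost:11434") -> dict:
    """POST to Ollama's native /api/generate endpoint and return the
    parsed JSON body (the shape shown in Example Response above)."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    req = urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def extract_text(response: dict) -> str:
    """Pull the generated text out of a /api/generate response."""
    return response["response"]

# Usage (requires a running `ollama serve` with the model pulled):
#   result = generate("llama2", "Explain quantum computing basics")
#   print(extract_text(result))
```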

Version

v1

Tags
llm, local-ai
Technical Details
Authentication
None
Response Formats
JSON
Availability
global
Status
Published
Rate Limits

None (local execution)

Supported Languages
Python, JavaScript
Use Cases
private AI
local inference

Related APIs

- Google Cloud Vision AI (Machine Learning, 2 endpoints, freemium): computer-vision, image-recognition
- SpeechBrain (Machine Learning, 1 endpoint, open source): asr, speech-processing
- Haystack API (Machine Learning, 1 endpoint, open source): nlp, search