Ollama API
Ollama is an open-source framework for running large language models such as Llama 2, Mistral, and Qwen locally on your machine. It exposes a native REST API as well as an OpenAI-compatible Chat Completions endpoint, so developers can integrate LLMs into their applications without relying on external servers.
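Because the OpenAI-compatible endpoint mirrors the Chat Completions schema, a minimal sketch of calling it needs only the Python standard library. This assumes an Ollama server running on the default port 11434; the helper names (`build_chat_request`, `extract_reply`, `chat`) are illustrative, not part of any official client.

```python
import json
import urllib.request

# Ollama's OpenAI-compatible endpoint (assumes a local server on the
# default port). The payload shape mirrors OpenAI's Chat Completions API.
CHAT_URL = "http://localhost:11434/v1/chat/completions"


def build_chat_request(prompt, model="llama2"):
    """Build a Chat Completions request body for a single user message."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }


def extract_reply(response):
    """Pull the assistant's text out of a Chat Completions response."""
    return response["choices"][0]["message"]["content"]


def chat(prompt, model="llama2"):
    """POST the prompt to the local server and return the reply text."""
    body = json.dumps(build_chat_request(prompt, model)).encode("utf-8")
    req = urllib.request.Request(
        CHAT_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return extract_reply(json.loads(resp.read()))
```

Calling `chat("Explain quantum computing basics")` requires a running `ollama serve` with the model pulled; the request and response helpers work standalone.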
Endpoints: 2
Views: 71
Last Checked: Jul 20, 2025
Rate Limit: none (local execution)
API Endpoints
Generates text using local language models
Full URL
http://localhost:11434/api/generate
Code Examples
curl -X POST 'http://localhost:11434/api/generate' \
  -H 'Content-Type: application/json' \
  -d '{ "model": "llama2", "prompt": "Explain quantum computing basics", "stream": false }'
Parameters
{ "model": "llama2", "prompt": "Explain quantum computing basics", "stream": false }
Example Response
{
"done": true,
"model": "llama2",
"response": "Quantum computing uses qubits...",
"created_at": "2023-07-18T16:00:00Z"
}
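The request/response pair above can be exercised from Python using only the standard library. This is a sketch assuming a local server on the default port 11434; the helper names are illustrative.

```python
import json
import urllib.request

# Native Ollama generate endpoint (assumes a local server on the default port).
GENERATE_URL = "http://localhost:11434/api/generate"


def build_generate_request(prompt, model="llama2", stream=False):
    """Body for /api/generate. With stream=True the server returns
    newline-delimited JSON chunks instead of a single object."""
    return {"model": model, "prompt": prompt, "stream": stream}


def parse_generate_response(raw):
    """Decode a non-streaming /api/generate reply and return its text."""
    data = json.loads(raw)
    return data["response"]


def generate(prompt, model="llama2"):
    """POST the prompt to the local server and return the generated text."""
    body = json.dumps(build_generate_request(prompt, model)).encode("utf-8")
    req = urllib.request.Request(
        GENERATE_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return parse_generate_response(resp.read())
```

`generate("Explain quantum computing basics")` reproduces the curl example above; it needs a running `ollama serve` with the `llama2` model pulled.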
Version
v1
Tags
llm, local-ai
Technical Details
Authentication: None
Response Formats: JSON
Availability: Global
Status: Published
Rate Limits: Local execution
Supported Languages: Python, JavaScript
Use Cases
Private AI
Local inference