Table of Contents

  1. What Is Detectron2 and Why Use an API for It?
  2. Why Use This Free Object Detection API
  3. Step-by-Step Setup
  4. Code Examples: Python and JavaScript
  5. Python Example: Basic Fetch
  6. Python Example: Practical Object Detection Script
  7. JavaScript Example: Fetch Detections With Error Handling
  8. Understanding the Output
  9. Error Handling: What Actually Breaks
  10. Real-World Use Cases
  11. Comparison: Hosted API vs Local Detectron2
  12. FAQ
  13. Do I need to install Detectron2 locally to follow this tutorial?
  14. Is this really a free computer vision API?
  15. What's the difference between object detection and instance segmentation?
  16. Why is my first request so slow?
  17. Can I use my own images instead of a URL?
  18. What object classes does the model recognise?
  19. Conclusion


AI APIs
May 14, 2026

Detectron2 Tutorial API: Free Object Detection in Python

A practical Detectron2 tutorial API guide for object detection. Covers a free hosted endpoint, bounding boxes, instance segmentation, full Python and JavaScript code, plus real error handling.

Developer workstation showing a Python script calling an object detection API with bounding box results printed in a terminal

FreeAPIHub

You've probably hit the same wall every computer vision beginner hits. You read about Detectron2, you see the cool segmentation demos, and then you try to install it locally, and your laptop fan starts sounding like a jet engine. This guide skips that pain. We'll call a hosted inference endpoint instead, so you can run real object detection without compiling CUDA or fighting PyTorch versions.

By the end of this post, you'll send an image URL to an API, get back bounding boxes, class labels, and confidence scores, and print them out. We'll use Python first, then do the same thing in JavaScript. Both examples handle errors the way you'd actually want them handled in a real app.

The goal here isn't to teach you the math behind Mask R-CNN. It's to get a working object detection script running on your machine in the next 15 minutes.

What Is Detectron2 and Why Use an API for It?

Detectron2 is Facebook AI's open-source library for object detection, instance segmentation, and keypoint detection. It's the backbone behind a lot of production vision systems — retail analytics, security feeds, medical imaging. The library is fast and accurate. It's also famously annoying to set up.

That's where a hosted inference API comes in. Instead of installing 4GB of dependencies, you POST an image to an endpoint and get JSON back. For this tutorial we'll use the public Hugging Face Inference API. It doesn't serve Detectron2 itself. Instead, we'll call facebook/detr-resnet-50, a DETR object detector from the same Facebook AI research group, which is hosted there for free. No credit card. No GPU.

One thing to note up front: the free tier is rate-limited and has a model cold-start delay. We'll handle both later in the error section.

Why Use This Free Object Detection API

  • No local install: No PyTorch, no CUDA, no version mismatches.
  • Free tier: Hugging Face's Inference API runs without payment for low-volume use.
  • No API key needed for public models: Anonymous requests work, though you get higher limits if you add a free token.
  • Real model output: Same architecture family used in Facebook AI object detection research.
  • Beginner-friendly JSON: The response is a flat list of detections — easy to loop through.

Compared to running Detectron2 locally, you trade some latency for zero setup time. Honestly, for prototyping, that's a great trade.

Step-by-Step Setup

Here's what you need before writing any code.

  1. Python 3.8 or newer installed.
  2. The requests library for HTTP calls.
  3. An image URL to test with (we'll use a public sample).
  4. Optional: a free Hugging Face account if you want a personal token for higher rate limits.

Install the one dependency:

pip install requests

That's it. No torch, no detectron2, no opencv-python. Just requests.

Code Examples: Python and JavaScript

We'll build this up in three steps. First a basic Python fetch so you can see the response. Then a practical Python version with error handling. Then the JavaScript equivalent.

Python Example: Basic Fetch

This sends an image URL to the model endpoint and prints whatever comes back. Minimal code, no error handling yet — just to confirm the API responds.

import requests

# Public DETR model — same family used in Facebook AI object detection research
API_URL = "https://api-inference.huggingface.co/models/facebook/detr-resnet-50"

# Any public image URL works. This is a sample cat photo from Wikimedia Commons.
IMAGE_URL = "https://upload.wikimedia.org/wikipedia/commons/thumb/3/3a/Cat03.jpg/640px-Cat03.jpg"

# The API accepts raw image bytes in the request body.
# A timeout keeps the download from hanging if the image host is slow.
image_bytes = requests.get(IMAGE_URL, timeout=15).content

response = requests.post(API_URL, data=image_bytes, timeout=30)
print(response.status_code)
print(response.json())

Run that. If the model is warm, you'll see a JSON list of detections. If you see a message saying the model is loading, give it 20 seconds and try again — that's normal on the free tier.

Python Example: Practical Object Detection Script

Now let's make it production-shaped. We add error handling, a retry for cold starts, and a clean print of each detection. Free tier note: each request carries a single image, and you should space calls out to avoid 429 errors. There's no published per-minute limit, so a time.sleep(1) between calls is a reasonable starting point rather than a guarantee.

import requests
import time

API_URL = "https://api-inference.huggingface.co/models/facebook/detr-resnet-50"
IMAGE_URL = "https://upload.wikimedia.org/wikipedia/commons/thumb/3/3a/Cat03.jpg/640px-Cat03.jpg"

# Free tier caps: 1 image per request, ~30 requests/hour anonymous.
# Model cold-start can take 15-25 seconds the first call.
MAX_RETRIES = 3
COLD_START_WAIT = 20

def detect_objects(image_url):
    # Download the image first so we can send raw bytes
    image_response = requests.get(image_url, timeout=15)
    image_response.raise_for_status()
    image_bytes = image_response.content

    for attempt in range(MAX_RETRIES):
        try:
            response = requests.post(API_URL, data=image_bytes, timeout=30)

            # 503 means the model is loading — wait and retry
            if response.status_code == 503:
                print(f"Model is warming up. Waiting {COLD_START_WAIT}s...")
                time.sleep(COLD_START_WAIT)
                continue

            # 429 means we hit the rate limit
            if response.status_code == 429:
                print("Rate limited. Backing off for 60s...")
                time.sleep(60)
                continue

            response.raise_for_status()
            return response.json()

        except requests.exceptions.Timeout:
            print(f"Timeout on attempt {attempt + 1}. Retrying...")
            time.sleep(2)

    raise RuntimeError("Could not get a response after retries.")

detections = detect_objects(IMAGE_URL)

# Filter out low-confidence boxes — anything under 0.7 is usually noise
for item in detections:
    if item["score"] >= 0.7:
        label = item["label"]
        score = round(item["score"], 3)
        box = item["box"]
        print(f"{label} ({score}) at x:{box['xmin']}-{box['xmax']} y:{box['ymin']}-{box['ymax']}")

Here's what the script prints when the model has detected a cat in the sample image:

cat (0.998) at x:14-621 y:43-475
couch (0.812) at x:0-639 y:200-479

Two detections, both with confidence well above our threshold. The box values are pixel coordinates in the original image — top-left is (xmin, ymin), bottom-right is (xmax, ymax).
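If you need the size of a box rather than its corners, the arithmetic is just subtraction. Here's a minimal sketch; box_size is a hypothetical helper name, not part of the API, and the box values are copied from the cat detection above.

```python
def box_size(box):
    """Return (width, height, area) in pixels for a box with xmin/ymin/xmax/ymax."""
    width = box["xmax"] - box["xmin"]
    height = box["ymax"] - box["ymin"]
    return width, height, width * height


# Box values copied from the sample cat detection above
cat_box = {"xmin": 14, "ymin": 43, "xmax": 621, "ymax": 475}
print(box_size(cat_box))  # (607, 432, 262224)
```

Area is handy as a cheap filter: a "person" box covering a few hundred pixels is probably too small to act on.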

JavaScript Example: Fetch Detections With Error Handling

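The same flow, written for Node.js 18+ (which has fetch built in), so no npm install is needed. One difference from the Python script: on a 503 this version logs a message and returns instead of retrying.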

// Node.js 18+ or any modern browser
const API_URL = "https://api-inference.huggingface.co/models/facebook/detr-resnet-50";
const IMAGE_URL = "https://upload.wikimedia.org/wikipedia/commons/thumb/3/3a/Cat03.jpg/640px-Cat03.jpg";

// Free tier caps: 1 image per request. Cold start can take ~20s.
async function detectObjects(imageUrl) {
  try {
    // Step 1 — download the image as a binary Blob
    const imageResponse = await fetch(imageUrl);
    if (!imageResponse.ok) {
      throw new Error(`Image download failed — HTTP ${imageResponse.status}`);
    }
    const imageBlob = await imageResponse.blob();

    // Step 2 — POST raw bytes to the inference endpoint
    const response = await fetch(API_URL, {
      method: "POST",
      body: imageBlob
    });

    // 503 means the model is still loading on the free tier
    if (response.status === 503) {
      console.log("Model warming up. Try again in ~20 seconds.");
      return;
    }

    if (!response.ok) {
      throw new Error(`Inference failed — HTTP ${response.status}`);
    }

    const detections = await response.json();

    if (!Array.isArray(detections) || detections.length === 0) {
      console.log("No objects detected.");
      return;
    }

    // Only print confident predictions
    detections
      .filter(item => item.score >= 0.7)
      .forEach(item => {
        const score = item.score.toFixed(3);
        const { xmin, ymin, xmax, ymax } = item.box;
        console.log(`${item.label} (${score}) at x:${xmin}-${xmax} y:${ymin}-${ymax}`);
      });

  } catch (error) {
    console.error("Detection failed:", error.message);
  }
}

detectObjects(IMAGE_URL);

And the console output once the model is warm:

cat (0.998) at x:14-621 y:43-475
couch (0.812) at x:0-639 y:200-479

Same shape as the Python output. That's by design — the API contract is the same regardless of client.

Understanding the Output

Each item in the response array represents one detected object. Here's the field breakdown:

  • score: A float between 0 and 1. Higher means the model is more confident. The examples in this post discard anything under 0.7 as likely noise.
  • label: The class name as a string ("cat", "person", "car"). DETR uses the COCO label set — 80 common classes.
  • box: An object with xmin, ymin, xmax, ymax — pixel coordinates of the bounding box in the original image.

Sample raw JSON for one detection:

{
  "score": 0.998,
  "label": "cat",
  "box": {"xmin": 14, "ymin": 43, "xmax": 621, "ymax": 475}
}
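If you'd rather work with typed objects than nested dicts, one option is to flatten each item into a small dataclass. This is a sketch only: Detection and parse_detection are names invented for this example, and the JSON string is the sample detection shown above.

```python
import json
from dataclasses import dataclass


@dataclass
class Detection:
    label: str
    score: float
    xmin: int
    ymin: int
    xmax: int
    ymax: int


def parse_detection(item):
    """Flatten one API item (score, label, nested box) into a Detection."""
    box = item["box"]
    return Detection(
        label=item["label"],
        score=float(item["score"]),
        xmin=box["xmin"],
        ymin=box["ymin"],
        xmax=box["xmax"],
        ymax=box["ymax"],
    )


# The sample detection from above, as a raw JSON string
raw = '{"score": 0.998, "label": "cat", "box": {"xmin": 14, "ymin": 43, "xmax": 621, "ymax": 475}}'
detection = parse_detection(json.loads(raw))
print(detection.label, detection.score)  # cat 0.998
```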

If you want instance segmentation output instead of bounding boxes, swap the model name for facebook/maskformer-swin-base-coco. The response shape changes: you get a per-pixel mask as a base64 PNG instead of a box. We'll keep the box version here because it's simpler to parse.
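Decoding that mask takes only the standard library. The sketch below assumes each segmentation item carries the PNG under a "mask" key; that field name is an assumption, so check it against a real response before relying on it. decode_mask is a hypothetical helper name.

```python
import base64

# Every PNG file starts with these eight bytes
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def decode_mask(item):
    """Decode the base64 'mask' field (assumed name) of one result into PNG bytes."""
    png_bytes = base64.b64decode(item["mask"])
    if not png_bytes.startswith(PNG_SIGNATURE):
        raise ValueError("mask field did not decode to a PNG image")
    return png_bytes
```

From there, writing the bytes to disk with open("mask.png", "wb") gives you a file any image viewer can open.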

Error Handling: What Actually Breaks

This is the part most tutorials skip, and it's the part that'll save you two hours of debugging.

503 — Model is loading. The first request after a cold period boots the model on Hugging Face's servers. Response body says something like {"error": "Model is currently loading", "estimated_time": 20}. Don't treat this as a failure — sleep and retry.
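Instead of a fixed 20-second sleep, you could use the estimated_time hint from the 503 body when it's present. A minimal sketch, assuming the body looks like the example above; cold_start_wait, the default, and the cap are all choices made for this example.

```python
def cold_start_wait(body, default=20.0, cap=60.0):
    """Pick a sleep time in seconds from a 503 body, falling back to a default."""
    estimate = body.get("estimated_time") if isinstance(body, dict) else None
    if not isinstance(estimate, (int, float)) or estimate <= 0:
        return default
    # Cap the wait so one odd estimate can't stall the script for minutes
    return min(float(estimate), cap)


print(cold_start_wait({"error": "Model is currently loading", "estimated_time": 20}))  # 20.0
```

In the practical script you would pass response.json() to this helper and hand the result to time.sleep.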

429 — Rate limited. The free tier doesn't publish exact per-minute limits, but anonymous requests get throttled fast. Add time.sleep(1) between calls. If you're processing more than ~30 images an hour, sign up for a free token and pass it as Authorization: Bearer YOUR_TOKEN.
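Here's one way to wire in the optional token, plus an exponential backoff schedule as an alternative to the fixed 60-second sleep used in the practical script. Both helpers are hypothetical, and HF_TOKEN is just an environment variable name picked for this sketch.

```python
import os


def build_headers(token=None):
    """Return request headers, adding a bearer token only when one is supplied."""
    headers = {}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return headers


def backoff_delays(retries, base=1.0, cap=60.0):
    """Exponential backoff schedule: base, 2*base, 4*base, ... capped at cap."""
    return [min(base * (2 ** attempt), cap) for attempt in range(retries)]


# HF_TOKEN is an assumed variable name; anonymous requests get empty headers
headers = build_headers(os.environ.get("HF_TOKEN"))
print(backoff_delays(4))  # [1.0, 2.0, 4.0, 8.0]
```

Pass headers=headers to requests.post, and sleep for the next value in the schedule each time you see a 429.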

Empty list response. Sometimes the model returns []. That's not an error. It just means no objects passed the model's internal confidence floor. Lowering your own 0.7 filter won't help, because there's nothing left to filter, so try a clearer image instead.
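It helps to tell the three shapes apart before you loop over anything. This sketch assumes error responses arrive as a dict with an "error" key, like the 503 body quoted earlier; classify_response is a made-up helper name.

```python
def classify_response(payload):
    """Label a decoded JSON payload as 'error', 'empty', 'detections' or 'unexpected'."""
    if isinstance(payload, dict) and "error" in payload:
        return "error"
    if isinstance(payload, list):
        return "detections" if payload else "empty"
    return "unexpected"


print(classify_response([]))  # empty
```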

Image too large. The endpoint accepts up to about 10MB per request. Bigger images get rejected with a 413. Resize before sending if you're working with high-res photos.
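A cheap guard is to check the byte count before you send anything. The sketch below uses the roughly 10MB figure quoted above as its default; treat that number as approximate. fits_size_limit is a hypothetical helper.

```python
# The ~10 MB figure from this post; the real limit may differ
MAX_BYTES = 10 * 1024 * 1024


def fits_size_limit(image_bytes, max_bytes=MAX_BYTES):
    """Return True when the payload is small enough to send."""
    return len(image_bytes) <= max_bytes


print(fits_size_limit(b"\xff\xd8\xff" * 1000))  # True
```

If the check fails, resize or recompress the image first, for example with Pillow, then check again.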

Timeouts. Always set timeout=30 on Python requests. Without it, a stuck request will hang your script forever. Trust me on this one.

Real-World Use Cases

Where would you actually use a free object detection API like this? A few honest examples:

  • Retail shelf monitoring: Snap a photo of a store shelf and count product instances. Flag empty slots.
  • Security camera triage: Run frames through detection and only alert humans when a "person" appears outside business hours.
  • Content moderation prep: Pre-filter user-uploaded images to detect what's in them before a human reviewer sees them.
  • Wildlife camera traps: Process SD card dumps and auto-tag images with "deer", "bird", or "empty" so you don't scroll through 4,000 blank frames.
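Most of these boil down to counting confident detections per label. A minimal sketch with made-up sample data; count_labels is a hypothetical helper, and 0.7 is the same threshold used in the scripts above.

```python
from collections import Counter


def count_labels(detections, min_score=0.7):
    """Count detections per label, ignoring anything below min_score."""
    return Counter(d["label"] for d in detections if d["score"] >= min_score)


# Invented detections for illustration, not real API output
sample = [
    {"label": "bottle", "score": 0.95},
    {"label": "bottle", "score": 0.91},
    {"label": "bottle", "score": 0.40},
    {"label": "person", "score": 0.88},
]
counts = count_labels(sample)
print(counts["bottle"], counts["person"])  # 2 1
```

For the security case, an alert is then just a check on counts["person"] > 0 combined with the frame's timestamp.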

Comparison: Hosted API vs Local Detectron2

| Factor | Hosted API (this tutorial) | Local Detectron2 |
| --- | --- | --- |
| Setup time | 2 minutes | 1-3 hours (CUDA + PyTorch) |
| Cost | Free, ~30 req/hour anonymous | Free, but needs a GPU |
| Latency per image | 500-2000 ms | 50-150 ms on GPU |
| Max image size | 10 MB | Limited by GPU RAM |
| Offline use | No | Yes |

FAQ

Do I need to install Detectron2 locally to follow this tutorial?

No. The whole point of using a hosted endpoint is to skip the local install. If you later need offline inference or sub-100ms latency, that's when installing Detectron2 locally makes sense.

Is this really a free computer vision API?

Yes, with limits. The Hugging Face Inference API serves these models for free at low volume. For production traffic you'd want a paid tier or a self-hosted setup, but for learning and prototyping it's genuinely free.

What's the difference between object detection and instance segmentation?

Object detection draws a rectangle around each object. Instance segmentation draws the exact outline, pixel by pixel. This tutorial focuses on detection because the response is easier to parse. To try instance segmentation, swap to a MaskFormer or Mask R-CNN model on the same API.

Why is my first request so slow?

Free-tier models go to sleep when nobody calls them. Your first request wakes the model up, which takes 15-25 seconds. After that, follow-up requests are fast. The retry logic in the practical example handles this for you.

Can I use my own images instead of a URL?

Yes. In Python, open the file with open("photo.jpg", "rb").read() and send those bytes as the data argument. In JavaScript, use a File or Blob object as the body in fetch. The API doesn't care where the bytes came from.
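For the Python side, a small wrapper around the file read catches the most common mistake, an empty or half-written file. A sketch only: read_image_bytes is a hypothetical helper and photo.jpg is a placeholder path.

```python
from pathlib import Path


def read_image_bytes(path):
    """Read a local image file and refuse to return an empty payload."""
    data = Path(path).read_bytes()
    if not data:
        raise ValueError(f"{path} is empty")
    return data


# Usage, with API_URL as defined in the scripts above:
# requests.post(API_URL, data=read_image_bytes("photo.jpg"), timeout=30)
```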

What object classes does the model recognise?

The DETR ResNet-50 model is trained on COCO, which has 80 classes — common things like person, car, dog, chair, laptop, bottle. It won't recognise specific products or faces. For custom classes you'd need to fine-tune the model.

Conclusion

You now have a working object detection pipeline in two languages, with proper error handling, and you didn't install a single deep learning library. That's the practical value of the hosted API approach: you get to focus on what you do with the results instead of fighting the setup.

The logical next step is to wire this into something useful. Take the detection output and draw boxes on the image using Pillow in Python, or canvas in the browser. Or batch-process a folder of images and dump results to CSV. Both are about 20 lines of code from where you are now.
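The CSV idea might look something like this. It's a sketch using only the standard library; detections_to_csv and the column names are choices made for this example, and the sample row reuses the cat detection from earlier.

```python
import csv
import io

FIELDS = ["image", "label", "score", "xmin", "ymin", "xmax", "ymax"]


def detections_to_csv(image_name, detections):
    """Render one image's detections as CSV text, one row per detection."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(FIELDS)
    for item in detections:
        box = item["box"]
        writer.writerow([
            image_name, item["label"], round(item["score"], 3),
            box["xmin"], box["ymin"], box["xmax"], box["ymax"],
        ])
    return buffer.getvalue()


sample = [{"score": 0.998, "label": "cat",
           "box": {"xmin": 14, "ymin": 43, "xmax": 621, "ymax": 475}}]
print(detections_to_csv("cat.jpg", sample))
```

For a whole folder, you'd write the header once and append the rows for each image to the same file.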

Looking for more no-key APIs to plug into your projects? Browse the Free API Hub directory for hosted endpoints that work the same way.
