What is Remotorch?

Remotorch is a platform that makes AI accessible. We provide two complementary ways to work with ML models: REST APIs for app developers and remote GPU access for Python developers.

REST API Inference

For app developers

Call pre-configured AI models via simple REST APIs. No ML knowledge required. Upload an image, get back an upscaled version. Send text, get embeddings. Works from any language that can make HTTP requests.

  • Interactive playground to test and learn
  • Copy-paste code generation
  • Works with Python, JavaScript, Go, Ruby, etc.
  • Serverless - containers scale up/down automatically
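As a minimal sketch, here is what one of those calls looks like from Python using only the standard library. The endpoint path and `rk_` key format mirror the examples further down this page; the request is built but not actually sent:

```python
import os
import urllib.request

# Endpoint shape modeled on the swinir example later on this page;
# substitute your own model name and API key.
API_BASE = "https://api.remotorch.io/v1"

def build_infer_request(model: str, payload: bytes, api_key: str) -> urllib.request.Request:
    """Build (but don't send) a POST request to a model's infer endpoint."""
    return urllib.request.Request(
        url=f"{API_BASE}/models/{model}/infer",
        data=payload,
        method="POST",
        headers={"Authorization": f"Bearer {api_key}"},
    )

req = build_infer_request("swinir", b"<image bytes>",
                          os.environ.get("REMOTORCH_KEY", "rk_test"))
print(req.full_url)  # https://api.remotorch.io/v1/models/swinir/infer
# To actually run it: urllib.request.urlopen(req).read()
```

The same three pieces - model URL, bearer token, raw payload - translate directly to any HTTP client in any language.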

Remote GPU Sessions

For Python/PyTorch developers

Run PyTorch code on remote GPUs from any device. Your code executes locally, but tensor operations happen on our GPUs. Perfect for IoT devices, development machines without GPUs, and edge deployments.

  • Use TorchScript models on devices without GPUs
  • Raspberry Pi, Chromebook, $6 VPS - all work
  • Familiar PyTorch syntax
  • Upload models once, load instantly

How We Compare

Understanding where Remotorch fits in the AI ecosystem

vs HuggingFace

HuggingFace is...

  • A model repository - download weights and run them yourself
  • Requires you to set up Python, PyTorch, CUDA, and dependencies
  • You need a GPU to run most models
  • Great for ML engineers who know what they're doing
  • An inference API exists, but it's limited and expensive

Remotorch is...

  • A model execution platform - we run the models for you
  • No setup required - make API calls from any language
  • We provide the GPU - you just send requests
  • Great for app developers integrating AI features
  • Interactive playground to learn and test

Bottom line: HuggingFace is where you find models. Remotorch is where you run them without needing ML infrastructure expertise. We actually import many models from HuggingFace and make them accessible via simple APIs.

vs Cloud GPU Providers (AWS, GCP, Lambda Labs, RunPod)

Cloud GPU Providers are...

  • Virtual machines with GPUs - you SSH in and configure everything
  • Hourly billing even when idle (forgot to stop that instance?)
  • Requires DevOps knowledge (Docker, networking, storage)
  • Great for training and heavy production workloads
  • Typically $1-4+/hour minimum

Remotorch is...

  • Managed inference - we handle all the infrastructure
  • Pay per request or per minute of actual use
  • No DevOps required - just API calls
  • Great for inference, prototyping, and learning
  • Start at $0 - pay only when you use GPU time

Bottom line: Cloud GPU providers give you raw compute. Remotorch gives you ready-to-use AI capabilities. Use cloud GPUs for training large models or running 24/7 production workloads. Use Remotorch for everything else.
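As a rough illustration of the billing difference (the per-minute rate below is a hypothetical placeholder, and the $2/hour cloud figure is mid-range of the $1-4/hour above):

```python
# Illustrative cost comparison. The per-minute rate is hypothetical;
# check current pricing before relying on these numbers.
cloud_hourly = 2.00          # $/hour, billed even while idle
remotorch_per_min = 0.05     # $/minute of actual GPU use (assumed rate)

# Scenario: 20 minutes of real GPU work per day, 30 days.
minutes_used = 20 * 30                   # 600 minutes of actual use
cloud_cost = cloud_hourly * 24 * 30      # instance left running all month
remotorch_cost = remotorch_per_min * minutes_used

print(f"always-on cloud GPU: ${cloud_cost:.2f}")      # $1440.00
print(f"pay-per-use:         ${remotorch_cost:.2f}")  # $30.00
```

The gap comes entirely from idle time: per-use billing only charges for the 600 minutes the GPU is actually working.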

vs Local GPU (Your Own Hardware)

Local GPU is...

  • The best option if you have one and need low latency
  • $1,000-$2,000+ upfront for a decent GPU
  • Limited to one machine - can't use it from your laptop if the GPU is in your desktop
  • Power consumption, heat, noise
  • Not available on MacBooks, Chromebooks, Raspberry Pi, etc.

Remotorch is...

  • GPU access from anywhere - any device, any location
  • $0 upfront - pay only for what you use
  • Use from laptop, phone, IoT device - anywhere with internet
  • No power/heat/noise concerns
  • Access high-end GPUs you couldn't afford to buy

Bottom line: If you have a local GPU and need low latency, use it. Remotorch is for everyone else: developers without GPUs, IoT deployments, occasional GPU users, and anyone who wants to learn without investing in hardware first.

Use Cases

Who uses Remotorch and why

REST API Inference

App Developers Adding AI Features

You're building an app and want to add AI features: image processing, text analysis, object detection. You don't want to become an ML engineer to do it.

Test in Playground First

Upload a sample file, see the output, understand what you're working with before writing code.

Copy Working Code

The playground generates the exact API call. Copy it, paste it into your app, and it works.

Works in Any Stack

Node.js, Python, Ruby, Go, Rust - if it can make HTTP requests, it works.

Example: Adding image upscaling to a web app
// Node.js / Express endpoint (Node 18+, built-in fetch)
// express.raw() gives us the uploaded image as a Buffer in req.body
app.post('/upscale', express.raw({ type: 'image/*', limit: '25mb' }), async (req, res) => {
  const response = await fetch(
    'https://api.remotorch.io/v1/models/swinir/infer',
    {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${process.env.REMOTORCH_KEY}`,
        'Content-Type': req.get('Content-Type')
      },
      body: req.body
    }
  );

  const upscaledImage = Buffer.from(await response.arrayBuffer());
  res.type(response.headers.get('Content-Type') || 'image/png').send(upscaledImage);
});
Example: Object detection on a Raspberry Pi (via a remote GPU session)
# Running on a Raspberry Pi 4
import remotorch
from picamera2 import Picamera2

# Connect to remote GPU (RTX 4090)
remotorch.connect(
    api_key="rk_...",
    gpu_type="rtx4090"
)

# Load your detection model (stored on Remotorch)
model = remotorch.load_model("my-yolo-model")

# Capture from Pi camera
camera = Picamera2()
camera.start()
frame = camera.capture_array()

# Run inference on remote GPU
detections = model(remotorch.tensor(frame))

print(detections.cpu())  # Results back on Pi
Remote GPU Sessions

IoT & Embedded Developers

You're building something on a Raspberry Pi, Jetson Nano, or other embedded device. You need to run ML models but don't have local GPU resources.

TorchScript Models

Export your model once, upload it, then load and run it from any device.
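A minimal sketch of the export half of that workflow, using PyTorch's own torch.jit (uploading the saved file to Remotorch happens out of band; this only shows the TorchScript side):

```python
import torch

# A toy model standing in for your real one.
class Doubler(torch.nn.Module):
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * 2

# Script the model once and save it as a self-contained .pt file.
scripted = torch.jit.script(Doubler())
scripted.save("doubler.pt")

# On the target device (or server), load it without the original class:
loaded = torch.jit.load("doubler.pt")
out = loaded(torch.ones(3))
print(out)  # tensor([2., 2., 2.])
```

Because the saved file carries its own serialized graph, the loading side needs only PyTorch, not your model's source code.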

Familiar PyTorch Syntax

Your existing PyTorch code works with minimal changes. Tensors live on remote GPU.

Works Anywhere

Raspberry Pi, Chromebook, WSL, old laptop - if it runs Python, it can access GPUs.

Education

Students & Learners

You want to learn how to integrate AI into applications but don't have the budget for GPU hardware and don't want to spend weeks setting up environments.

Learn by Doing

The playground lets you experiment immediately without any setup.

See Real API Calls

Watch the actual HTTP requests being made. Understand the protocol.

Affordable Experimentation

Start free, pay cents per experiment. No $1000 GPU investment to start learning.

The Learning Path

1 Play with models in the playground
2 Copy the generated code
3 Integrate into your own project
4 Build something amazing

Setting Expectations

We want you to succeed, which means being honest about what we're good at

What We're Great For

  • Learning AI integration - The playground makes it easy to understand how AI APIs work
  • Prototyping - Test ideas quickly before committing to infrastructure
  • IoT & edge devices - Run ML on devices without GPUs
  • Occasional GPU access - Need a GPU for minutes, not hours
  • Development without local GPU - MacBook users, cheap VPS deployments
  • Adding AI features to apps - Simple REST APIs for common ML tasks

What We're Not Designed For

  • Training large models - Network overhead makes this impractical
  • Ultra-low-latency inference - ~10-50ms network overhead per operation
  • 24/7 high-volume production - Dedicated hardware is more cost-effective
  • Replacing your local GPU - If you have one and need speed, use it
  • Real-time gaming/streaming - Latency matters too much
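A quick back-of-envelope using the figure above shows why per-operation round trips are fine for single inference requests but prohibitive for training-style workloads (the 20 ms round trip and the op count per step are illustrative assumptions):

```python
# Why network overhead rules out training: each remote tensor op
# pays a round trip, and a training step issues many of them.
round_trip_ms = 20              # assumed mid-range of the ~10-50 ms above
ops_per_training_step = 1_000   # rough count of tensor ops in one step

overhead_per_step_s = round_trip_ms * ops_per_training_step / 1000
print(f"{overhead_per_step_s:.0f} s of network overhead per training step")  # 20 s

# A single batched inference call pays the round trip once:
inference_overhead_ms = round_trip_ms * 1
print(f"{inference_overhead_ms} ms for one inference request")  # 20 ms
```

Twenty seconds of pure latency per step swamps any GPU speedup, which is why training belongs on hardware that sits next to the data.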

We'd rather tell you upfront than have you discover limitations after investing time. For high-volume production, consider dedicated GPU instances once you've validated your approach with us.

Ready to Get Started?

Try the playground, explore our models, and see if Remotorch is right for you.

Signups are temporarily disabled. In the meantime, you can browse our models.