In today’s AI-driven world, deploying large language models (LLMs) like Meta’s Llama 3, Google’s Gemma, or Mistral locally offers unparalleled control over data privacy and customization. However, sharing these tools securely over the internet unlocks collaborative potential—whether you’re a developer showcasing a prototype, a researcher collaborating with peers, or a business integrating AI into customer-facing apps.
This comprehensive guide will walk you through exposing Ollama’s API and Open WebUI online using Pinggy, a powerful tunneling service. You’ll learn to turn your local AI setup into a globally accessible resource—no cloud servers or complex configurations required.
Install Ollama & Download a Model
ollama run llama3:8b
Share Ollama API via Pinggy
ssh -p 443 -R0:localhost:11434 -t qr@a.pinggy.io "u:Host:localhost:11434"
(You’ll get a public URL such as https://abc123.pinggy.link.)
Deploy Open WebUI
docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway ghcr.io/open-webui/open-webui:main
Expose WebUI Online
ssh -p 443 -R0:localhost:3000 a.pinggy.io
With growing concerns about data privacy and API costs, tools like Ollama and Open WebUI have become essential for running LLMs locally. However, limiting access to your local network restricts their utility. By sharing them online, you can collaborate with distributed teams, power prototypes and customer-facing apps through Ollama’s API, and give peers access to your models without exposing your internal infrastructure.
Pinggy simplifies port forwarding by creating secure tunnels. Unlike alternatives such as ngrok, it requires no client installation: a single SSH command gives you an HTTPS public URL, and the same workflow supports password protection and, on the Pro plan, custom domains.
To install Ollama, download the installer from ollama.com (an .exe on Windows) or use the install script:
curl -fsSL https://ollama.com/install.sh | sh
Then verify the installation:
ollama --version # Should return "ollama version 0.1.30" or higher
Ollama supports 100+ models. Start with a lightweight option:
ollama run qwen:0.5b
For multimodal capabilities (text + images), try llava or bakllava:
ollama run llava:13b
Open WebUI provides a ChatGPT-like interface for Ollama. Install via Docker:
docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main
Access the UI at http://localhost:3000 and create an admin account.
By default, Ollama listens on port 11434. Start the server:
ollama serve # Keep this terminal open
Use this SSH command to tunnel Ollama’s API:
ssh -p 443 -R0:localhost:11434 -t qr@a.pinggy.io "u:Host:localhost:11434"
Command Breakdown:
-p 443: Connects over port 443 (the HTTPS port) so the tunnel works behind restrictive firewalls.
-R0:localhost:11434: Forwards Ollama’s local port through the Pinggy tunnel.
qr@a.pinggy.io: Pinggy’s tunneling endpoint.
"u:Host:localhost:11434": Rewrites the Host header to localhost:11434 so Ollama accepts requests arriving through the tunnel.
After running the command, you’ll see a public URL like https://abc123.pinggy.link.
Verify access with curl or a web browser:
curl https://abc123.pinggy.link/api/tags
To test the Ollama API using JavaScript, create a main.js file in a small Node.js project, run npm install to pull in any dependencies, then run node main.js to call the API through your public URL.
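The contents of main.js aren’t shown here, but a minimal sketch along these lines works with Node 18+ (whose built-in fetch needs no extra packages); it assumes your tunnel URL is https://abc123.pinggy.link and that llama3:8b has already been pulled:

// main.js - a minimal sketch, not the original file; replace the URL with your own tunnel
const OLLAMA_URL = "https://abc123.pinggy.link";

async function main() {
  // List the models available behind the tunnel
  const tags = await fetch(`${OLLAMA_URL}/api/tags`).then((r) => r.json());
  console.log("Models:", tags.models.map((m) => m.name));

  // Ask one question via the generate endpoint
  const res = await fetch(`${OLLAMA_URL}/api/generate`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "llama3:8b",                               // any model you have pulled
      prompt: "Explain SSH tunneling in one sentence.",
      stream: false,                                    // one JSON object instead of a token stream
    }),
  });
  const data = await res.json();
  console.log(data.response);
}

main().catch(console.error);

Because fetch ships with Node 18+, npm install only matters if you layer extra packages on top of this sketch.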
Next, expose Open WebUI itself. Run this command to share port 3000:
ssh -p 443 -R0:localhost:3000 a.pinggy.io
You’ll receive a URL like https://xyz456.pinggy.link. Open it in a browser to access Open WebUI.
To password-protect the tunnel, add a username and password to the command:
ssh -p 443 -R0:localhost:3000 user:pass@a.pinggy.io
Upgrade to Pinggy Pro (INR 204.89/month) for custom domains:
ssh -p 443 -R0:localhost:3000 -T yourdomain.com@a.pinggy.io
Distributed teams can work against the same models through a single shared Open WebUI link instead of each member running their own setup.
Expose Ollama’s API to power prototypes and customer-facing apps, as sketched below.
Researchers can securely share access to proprietary models with peers without exposing internal infrastructure.
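To illustrate the app-integration use case, here is a sketch of a backend helper that calls the tunnel’s /api/chat endpoint; the URL, model name, and ask() helper are illustrative assumptions, not part of the original setup:

// chat.js - hypothetical helper; URL, model, and ask() are illustrative assumptions
const OLLAMA_URL = "https://abc123.pinggy.link"; // your Pinggy tunnel or custom domain

async function ask(question) {
  const res = await fetch(`${OLLAMA_URL}/api/chat`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "llama3:8b",
      messages: [
        { role: "system", content: "You are a concise support assistant." },
        { role: "user", content: question },
      ],
      stream: false, // return a single JSON reply
    }),
  });
  const data = await res.json();
  return data.message.content; // the assistant's answer
}

ask("How do I reset my password?").then(console.log).catch(console.error);

In production you would point this at a Pro-plan custom domain rather than a temporary tunnel URL.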
If something doesn’t work, confirm that ollama serve is still running, that nothing else is occupying ports 11434 and 3000, and that your machine has enough memory for the model you chose (llama3:70b, for example, requires 40+ GB).

By combining Ollama, Open WebUI, and Pinggy, you’ve created a secure, shareable AI platform without relying on cloud services. This setup is ideal for startups, researchers, or anyone prioritizing data control.