
Set Up Your First Server

Fill out two simple forms, complete the payment, and within 24 hours your server will be fully configured with all the models and tools you selected. You will then receive access through your personal dashboard.

Choose server
Choose a server configuration that best fits your needs. You’ll pay a fixed monthly fee for the server rental with no hidden charges.
Mac Mini (M4 CPU, 16 GB RAM, 256 GB NVMe SSD): 100 / month
Mac Mini (M4 CPU, 24 GB RAM, 256 GB NVMe SSD): 135 / month
Mac Mini (M4 CPU, 32 GB RAM, 256 GB NVMe SSD): 170 / month
None of the options suit your needs? Request a custom configuration for your server!
Choose LLMs
Select the models you want to install from the list. Please note that some models are available only on specific server configurations.
Open-Sora
Open-Sora-v2 is an open-source text-to-video generation model developed by HPC-AI Technology. The project aims to democratize high-quality video creation by offering a publicly accessible alternative to closed systems like OpenAI's Sora. It is built on the Diffusion Transformer (DiT) architecture and optimized with frameworks like ColossalAI for the high computational efficiency crucial to training and inference. Key features include generating dynamic video clips from text prompts (text-to-video) or from a source image (image-to-video). It uses components such as StabilityAI's VAE for video compression and T5/CLIP for robust text and image encoding. The model prioritizes scalability and accessibility, bringing advanced video synthesis to researchers and users with substantial, though not exclusive, computing resources.
Choose version
Kimi K2
Kimi K2 is Moonshot AI's latest Mixture-of-Experts model, with 32 billion activated parameters and 1 trillion total parameters. It achieves state-of-the-art performance in frontier knowledge, math, and coding among non-thinking models, and it is meticulously optimized for agentic tasks: Kimi K2 does not just answer, it acts. Two variants are open-sourced: Kimi-K2-Base, the foundation model and a strong starting point for researchers and builders who want full control for fine-tuning and custom solutions, and Kimi-K2-Instruct, the post-trained model best for drop-in, general-purpose chat and agentic experiences; it is a reflex-grade model without long thinking.
Choose version
Phi
Phi is a family of small, open-weight language models developed by Microsoft. They are designed to be compact, fast, and efficient for local or edge use. Models like Phi-2 (2.7B) and Phi-3-mini (3.8B) offer strong performance on reasoning and coding tasks despite their small size. Phi uses a mix of curated web data, academic texts, and synthetic datasets. These models run well on laptops and devices with limited memory, especially when quantized. Developers use Phi for fast prototypes, embedded AI tools, or private inference.
Choose version
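A quick way to see why compact models like Phi-3-mini run on limited memory is the back-of-envelope rule that quantized weights occupy roughly parameters × bits / 8 bytes. A minimal sketch of that arithmetic; it ignores the KV cache and runtime overhead, so treat the results as a lower bound:

```python
# Rough weight-memory estimate for quantized models: parameters × bits / 8.
# Overheads (KV cache, activations, runtime) are deliberately ignored.

def weight_memory_gb(params_billion: float, bits: int) -> float:
    """Approximate weight memory in GB (10^9 bytes)."""
    return params_billion * bits / 8

# Phi-3-mini (3.8B parameters) at different precisions:
for bits in (16, 8, 4):
    print(f"{bits}-bit: ~{weight_memory_gb(3.8, bits):.1f} GB")
```

At 4-bit quantization the weights of a 3.8B-parameter model fit in under 2 GB, which is why such models are practical even on the 16 GB server configuration.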
Gemma
Google Gemma is a family of open-weight language models designed for efficient reasoning and high-quality text generation. Built on the same research foundation as Gemini, it offers strong performance while remaining lightweight and privacy-friendly. Gemma models come in multiple sizes, such as 1B, 2B, and 9B, optimized for chat, coding, and multilingual tasks. They run efficiently on local devices like laptops or Mac mini, making them ideal for open and private AI applications.
Choose version
FLUX
FLUX is a new family of image generation models by Black Forest Labs, built on Flow Matching Transformer technology. It produces highly realistic, detailed, and coherent images from text prompts with faster inference than diffusion models. Versions like FLUX.1-dev focus on quality, while FLUX.1-schnell prioritizes speed and efficiency. It’s optimized for both local setups and professional workflows, supporting tools like ComfyUI and Hugging Face Diffusers.
Choose version
Can't find the LLM you need in the list? Suggest your option and we will find a way to install it on your server.
Choose LLM inference tool
Select the inference tool you want to install from the list. Please note that some tools are available only on specific server configurations.
MLC LLM
MLC LLM lets you run large language models on your own devices, including Macs with Apple Silicon. It uses Apple's Metal API for GPU acceleration and supports 4-bit or 8-bit quantized models to save memory. You can run models of up to around 30 billion parameters on a Mac Studio with 128 GB of memory. It supports Python, REST APIs, and even mobile or browser use. The tool is fast, open source, and works well for developers building apps or testing models locally. You don't need cloud servers: just install it, load your model, and start chatting or generating text right away.
Read more
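Since MLC LLM's serve mode exposes an OpenAI-compatible REST API, a plain HTTP client is enough to query it. A minimal sketch using only the Python standard library; the URL, port, and model identifier are illustrative assumptions, not fixed values:

```python
import json
import urllib.request

# `mlc_llm serve` exposes an OpenAI-compatible REST endpoint; the port and
# the model identifier used below are placeholders for illustration.
SERVER_URL = "http://localhost:8000/v1/chat/completions"

def build_chat_request(model: str, prompt: str, max_tokens: int = 256) -> dict:
    """Build an OpenAI-style chat-completion payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

def send_chat_request(payload: dict) -> dict:
    """POST the payload to the local server and return the parsed JSON reply."""
    req = urllib.request.Request(
        SERVER_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# Against a running server you would call, e.g.:
# send_chat_request(build_chat_request("Llama-3-8B-q4f16_1-MLC", "Hello!"))
```

Because the request shape is the standard OpenAI one, the same client code works unchanged against any of the OpenAI-compatible tools listed here.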
BentoML
BentoML is an open-source framework that makes it easy to serve and deploy AI models in production. It lets developers turn any machine learning or LLM model into a scalable API service with just a few lines of Python. BentoML automatically handles packaging, dependency management, and Docker containerization. It’s designed for high-performance inference, multi-model pipelines, and seamless deployment to cloud or on-prem environments via BentoCloud.
Read more
Agenta.ai
Agenta.ai is a platform for managing, testing, and optimizing LLM-powered applications. It lets developers experiment with prompts, models, and parameters in one interface. The tool supports evaluation, version control, and performance tracking. It’s built for teams that want reliable, measurable LLM workflows.
Read more
EXO
Exo is an open-source tool that lets you build a personal AI cluster by aggregating the computing power of multiple everyday devices, such as phones, laptops, and Apple Watches. This decentralized approach enables you to run large AI models locally without specialized, expensive hardware like high-end GPUs.
Read more
LM Studio
LM Studio is a desktop application that lets you discover, download, and run a variety of large language models (LLMs) locally on your computer. It provides a user-friendly graphical interface, so you don't need any coding experience to start interacting with powerful AI models. All processing happens offline, ensuring your interactions remain completely private and on your own machine. For developers, it offers an OpenAI-compatible API, making it easy to integrate local LLMs into other applications. The app works with both CPUs and GPUs, supports quantized models for efficiency, and includes a model catalog for finding compatible open-source models.
Read more
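Because LM Studio's local server is OpenAI-compatible, its responses follow the standard chat-completion shape. A small sketch of parsing such a response; the sample payload below is fabricated for illustration, and port 1234 is the app's documented default:

```python
# LM Studio's local server (default http://localhost:1234/v1) speaks the
# OpenAI chat-completions format, so responses share the standard shape.
# The sample response below is illustrative, not captured from a real run.
LMSTUDIO_URL = "http://localhost:1234/v1/chat/completions"

def extract_reply(response: dict) -> str:
    """Return the assistant's text from an OpenAI-style chat completion."""
    return response["choices"][0]["message"]["content"]

sample = {
    "choices": [
        {"message": {"role": "assistant", "content": "Hi from a local model."}}
    ],
    "usage": {"prompt_tokens": 9, "completion_tokens": 7, "total_tokens": 16},
}

print(extract_reply(sample))                      # Hi from a local model.
print(sample["usage"]["total_tokens"], "tokens")  # 16 tokens
```

The `usage` field is part of the same standard response, which makes it easy to track token consumption when integrating a local model into another application.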
Can't find the tool you need in the list? Suggest your option and we will find a way to install it on your server.
Choose Tools
Select the tools you want to install from the list, and you’ll access them through a user-friendly web interface. All DNS settings will already be configured for you.
Open WebUI
Open WebUI is a self-hosted web interface for working with large language models offline or via APIs. It supports multiple backends like Ollama and OpenAI-compatible services, offers chat, file ingestion, RAG workflows, and extensibility with Python plugins, making it ideal for secure, customizable AI deployments. It includes user management, Markdown and LaTeX rendering, and fast setup with Docker.
Read more
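For the Docker setup the description mentions, the project's quickstart is a single command; the host port (3000) and volume name below follow the documented defaults and can be adjusted:

```shell
docker run -d \
  -p 3000:8080 \
  -v open-webui:/app/backend/data \
  --name open-webui \
  --restart always \
  ghcr.io/open-webui/open-webui:main
```

The web interface is then reachable at http://localhost:3000, with chat history and settings persisted in the named volume.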
Unsloth AI
Unsloth AI lets you train and fine-tune LLMs faster on your own hardware, cutting memory use and speeding up experiments. It supports many open models and works with 4-bit and 8-bit formats, making it a good fit for custom AI tools, private data, and lower GPU costs. Ideal for developers and teams.
Read more
LanceDB
LanceDB is an open-source vector database built on the high-performance Lance columnar format. It enables local, serverless storage and search of embeddings with easy integration into Python and JavaScript environments. Designed for developers building AI applications, it offers fast similarity search, lightweight setup, and seamless use with frameworks like LangChain and LlamaIndex.
Read more
pgvector
pgvector is an open-source PostgreSQL extension that adds native support for vector similarity search within a traditional relational database. It lets you store, index, and query embeddings directly alongside structured data using familiar SQL syntax. Ideal for smaller or integrated AI systems, it offers a cost-efficient way to add semantic search without deploying a separate vector database.
Read more
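To make the "familiar SQL syntax" concrete, here is a sketch of the DDL and a nearest-neighbour query, plus a small helper that formats a Python list as a pgvector literal. The table and column names are illustrative:

```python
# pgvector stores embeddings in a `vector` column and accepts literals like
# '[0.1,0.2,0.3]'. The helper below formats a Python list into that literal.

def to_pgvector(embedding: list) -> str:
    """Format a list of floats as a pgvector literal, e.g. '[1.0,2.0]'."""
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"

# DDL and a nearest-neighbour query you would run through any PostgreSQL
# driver; `<=>` is pgvector's cosine-distance operator (`<->` is L2).
DDL = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3));
"""
QUERY = "SELECT id FROM items ORDER BY embedding <=> %s LIMIT 5;"

print(to_pgvector([0.1, 0.2, 0.3]))  # [0.1,0.2,0.3]
```

Passing the formatted literal as the query parameter keeps semantic search inside the same database that already holds your structured data.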
Chroma
Chroma is an open-source search and retrieval database for AI applications, supporting vector, full-text, regex, and metadata search. You can develop locally and scale to petabytes in the cloud, backed by object storage, with serverless search and retrieval that is fast, cheap, and reliable.
Read more
Can't find the tool you need in the list? Suggest your option and we will find a way to install it on your server.