Cloudflare Containers: Docker at the Edge in 330+ Cities
Cloudflare Containers runs Dockerized workloads across Cloudflare's 330+ city network. This guide covers the architecture, deployment workflow, cold start performance, and the use cases the platform unlocks.
Running Docker containers at the edge has long been a compelling idea that proved difficult in practice. The computational overhead of container isolation, combined with the need to distribute images across hundreds of locations, made edge containers either slow or prohibitively expensive. Cloudflare Containers changes the calculus by building containerized workloads directly into the same global network that already runs Workers, routing requests to the nearest healthy instance across 330+ cities worldwide without any custom infrastructure setup.
The practical significance is substantial. Any Docker image that runs on a Linux x86-64 host can now be deployed to Cloudflare's global network with a single wrangler deploy command. This guide covers the architecture, deployment workflow, cold start performance characteristics, and the networking integrations that make Containers a first-class citizen of the Cloudflare developer platform. For teams evaluating edge infrastructure options as part of a broader web development strategy, understanding where Containers fits relative to Workers is the essential starting point.
What Are Cloudflare Containers
Cloudflare Containers is a container runtime built into the Workers platform that executes arbitrary Docker images at Cloudflare edge locations. Where Workers uses the V8 isolate model to run JavaScript and WebAssembly with microsecond startup and strict resource limits, Containers uses lightweight container virtualization to run any language, framework, or binary that packages into a Docker image.
The architectural distinction matters. Workers isolates are stateless by design: each invocation is independent, there are no persistent TCP connections, and execution time is measured in milliseconds. Containers run as persistent processes with long-lived connections, arbitrary memory, and the ability to load native libraries. The two runtimes coexist on the same platform and can call each other through bindings.
Any Linux x86-64 Docker image works without modification. Bring your existing Node.js, Python, Go, Rust, or Java application and deploy it globally without a rewrite.
Containers are instantiated in the Cloudflare PoP closest to the incoming request. No DNS-based routing, no multi-region configuration, and no custom CDN layer required.
Containers share the same deployment toolchain, billing, and network bindings as Workers. KV, R2, Durable Objects, and Queues are all accessible from inside a container.
The use case Cloudflare targets most directly is the class of workloads that cannot fit in Workers but still benefit from global distribution. AI inference with large model files, legacy web applications with native database drivers, background job processors that need persistent state, and media processing pipelines with platform-specific binaries are all natural fits. For workloads that do fit in Workers, the isolate model remains faster and cheaper.
Containers vs Workers: Choosing the Right Tool
The decision between Containers and Workers is not about preference — it is about the constraints of your workload. Workers optimizes for request-response latency and scale. Containers optimizes for compatibility and persistence. Understanding where those optimization boundaries sit determines which runtime is correct for each part of your application.
Choose Workers for:

- Request routing, authentication, and middleware logic
- Sub-millisecond response requirements with no cold start
- TypeScript or JavaScript with WebAssembly bindings
- Stateless transformation, A/B testing, and edge caching
- Cost-sensitive workloads that fire millions of times per day

Choose Containers for:

- Native dependencies, binary executables, or non-JS runtimes
- Persistent TCP connections to databases or external services
- Long-running background jobs exceeding Workers CPU limits
- Legacy applications you want to globalize without rewriting
- Media processing, AI inference, and compute-heavy workloads
The most powerful architecture combines both runtimes. A Worker handles the public-facing request, applies authentication and rate limiting with near-zero latency overhead, then hands off to a Container binding for the compute-intensive portion of the request. The Worker returns the response to the client while the container handles heavy processing. For teams already familiar with Vercel Fluid Compute and cold start elimination strategies, the Containers model applies similar principles at the Cloudflare layer.
Design principle: Use Workers as the intelligent routing and middleware layer, and Containers as the compute backend. This preserves Workers' near-zero cold start advantage for the latency-sensitive path while giving Containers workloads time to warm up in the background.
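As a concrete sketch of this split, the Worker below authenticates first and only hands off to the container afterwards. The binding name MY_APP and the header-based check are illustrative assumptions; the get/idFromName/fetch calls follow the Durable Object-style binding pattern Cloudflare uses for container bindings.

```javascript
// Sketch: Worker as the middleware layer, container as the compute backend.
const worker = {
  async fetch(request, env) {
    // Latency-sensitive path: reject unauthenticated requests in the Worker,
    // before any container is involved. A rejected request never pays a
    // container cold start.
    if (!request.headers.get("Authorization")) {
      return new Response("Unauthorized", { status: 401 });
    }
    // Hand off the compute-intensive portion to a container instance via the
    // binding; the Worker simply relays the container's response.
    const id = env.MY_APP.idFromName("instance-1");
    return env.MY_APP.get(id).fetch(request);
  },
};
```

Because the auth check runs entirely in the isolate, the hot rejection path keeps Workers' near-zero overhead while only authenticated traffic reaches the container.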
Deployment Workflow: Dockerfile and Wrangler
Deploying a container to Cloudflare follows the same wrangler-based workflow as Workers. The only additions are a Dockerfile in your project root and a [[containers]] section in wrangler.toml. The CLI handles image building, distribution, and instance lifecycle management automatically.
Basic container binding
[[containers]]
name = "my-app"
image = "./Dockerfile"
max_instances = 10
memory = "512MB"
cpu = 1

Pre-built image from registry

[[containers]]
name = "my-app"
image = "registry.example.com/my-app:latest"
min_instances = 2

Deploy command (same as Workers)

wrangler deploy

The min_instances setting keeps a minimum number of containers running at all times, which eliminates cold starts for that portion of your traffic. The max_instances setting caps horizontal scaling to control costs. Cloudflare automatically scales between the minimum and maximum based on incoming traffic, with each instance serving requests from the PoP where it was instantiated.
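Putting the two settings together, a single binding can declare both a pre-warmed floor and a scaling ceiling. This is a sketch using the field names from the examples above; verify exact option names against the current wrangler documentation:

```toml
[[containers]]
name = "my-app"
image = "./Dockerfile"
min_instances = 2    # pre-warmed floor: these instances never cold start
max_instances = 10   # scaling ceiling: caps cost under traffic spikes
memory = "512MB"
cpu = 1
```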
Worker binding to container
// In Worker: env.MY_APP is a Container binding
const container = env.MY_APP.get(
env.MY_APP.idFromName("instance-1")
);
const response = await container.fetch(request);

The idFromName pattern is the same API used by Durable Objects, which means developers already familiar with Durable Object routing can apply the same patterns to Containers. Named instances allow you to route specific tenants or sessions to specific container instances, enabling stateful session affinity without a load balancer configuration.
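The tenant-affinity idea can be sketched as a small routing helper. The MY_APP binding name and the URL-based tenant lookup are illustrative assumptions, not a prescribed API beyond the idFromName/get/fetch pattern:

```javascript
// Pin each tenant to its own container instance: the same name always maps
// to the same instance, giving session affinity without a load balancer.
const routeTenant = (env, request) => {
  const tenant = new URL(request.url).searchParams.get("tenant") ?? "default";
  const id = env.MY_APP.idFromName(tenant);
  return env.MY_APP.get(id).fetch(request);
};
```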
Cold Start Performance and Scaling
Cold start performance is the most common concern when evaluating edge container platforms. Cloudflare Containers measures cold start latency from the moment a request arrives at a PoP with no warm instances to when the container is ready to handle that first request. For typical production images, this is 180–320 milliseconds.
Standard Node.js application images (200–400MB) cold start in approximately 180–240ms. Alpine-based minimal images start closer to 100ms.
Python images with heavy dependencies like NumPy or pandas cold start in 250–320ms. Slim base images without compiled libraries run closer to 150ms.
Statically linked Go and Rust binaries with scratch base images achieve cold starts under 80ms, approaching Workers isolate startup performance.
The practical mitigation strategy is to optimize images for startup time. Use multi-stage builds to reduce image size, prefer Alpine or distroless base images, and defer non-critical initialization to after the server starts accepting requests. For latency-critical production workloads, set min_instances to at least one per major region to keep containers pre-warmed.
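As a sketch of that optimization strategy, a hypothetical multi-stage build for a Node.js service might look like the following (stage names, paths, and scripts are illustrative):

```dockerfile
# Build stage: dev dependencies and source never reach the final image.
FROM node:20 AS build
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build && npm prune --omit=dev

# Runtime stage: Alpine base keeps the compressed size, and therefore the
# cold start time, small.
FROM node:20-alpine
WORKDIR /app
COPY --from=build /app/dist ./dist
COPY --from=build /app/node_modules ./node_modules
CMD ["node", "dist/server.js"]
```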
Cloudflare's global distribution also acts as a natural cold start mitigation for most traffic patterns. With 330+ PoPs, the probability that at least one PoP in a region already has a warm instance is high once traffic volume exceeds a modest threshold. Traffic that would have hit a cold container at a regional provider is instead routed to a pre-warmed instance at a nearby city.
Image optimization tip: Build your production images with --no-cache and analyze layer sizes with docker history. Reducing the compressed image size from 500MB to 200MB typically cuts cold start time by 30–40%.
Networking with Durable Objects, KV, and R2
Containers are fully integrated into the Cloudflare developer platform through the same binding system used by Workers. This means a container has direct, low-latency access to KV namespaces, R2 buckets, Durable Objects, D1 databases, and Queues without making network calls to external services. The bindings are declared in wrangler.toml and injected into the container at runtime through a local metadata endpoint.
Read feature flags, configuration values, and cached data from Workers KV at sub-millisecond latency from within the container. Ideal for runtime configuration without filesystem mounts or environment variable redeployments.
Read and write large objects to Cloudflare R2 from containers with no egress fees. Media processing pipelines that read source files, process them, and write outputs are a natural fit — all within the Cloudflare network.
Use Durable Objects as a coordination layer between container instances. A Durable Object can maintain a connection registry, distribute work across container instances, and aggregate results without an external orchestration service.
A Worker receives a request, enqueues a job to Cloudflare Queues, and returns immediately. A Container worker processes the queue, handling CPU-intensive tasks like document conversion, email rendering, or report generation without blocking the request path.
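The producer side of that queue pattern can be sketched as a small Worker handler. The JOB_QUEUE binding name and the job shape are assumptions for illustration; the send call follows the Cloudflare Queues producer API:

```javascript
// Enqueue-and-return: the Worker accepts the job and responds immediately,
// while a container consumer processes the queue in the background.
const enqueueJob = async (env, request) => {
  const payload = await request.json();
  // Hand the heavy work to the queue; nothing CPU-intensive runs here.
  await env.JOB_QUEUE.send({ kind: "convert-document", payload });
  // 202 Accepted: processing happens asynchronously off the request path.
  return new Response("accepted", { status: 202 });
};
```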
The zero-egress-cost model for communication between Cloudflare services is a significant cost advantage for architectures that move large amounts of data between storage and compute. A media processing pipeline that reads video from R2, transcodes it in a container, and writes the output back to R2 incurs no data transfer charges for that internal movement. For security-focused edge deployments, see also our guide on Cloudflare AI security for autonomous agents at the edge, which covers the security model for edge-hosted AI workloads.
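A minimal sketch of that read-process-write loop, assuming a hypothetical bucket binding named MEDIA and a placeholder transcode function standing in for the container's real processing:

```javascript
// Read a source object from R2, process it, and write the result back.
// Both transfers stay inside the Cloudflare network, so neither incurs
// egress charges.
const processObject = async (env, key, transcode) => {
  const source = await env.MEDIA.get(key);
  if (!source) return null; // nothing to do if the source is missing
  const output = await transcode(await source.arrayBuffer());
  await env.MEDIA.put(`out/${key}`, output);
  return `out/${key}`;
};
```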
Security, Isolation, and Resource Limits
Container isolation on Cloudflare uses hardware virtualization rather than shared kernel namespaces. Each container instance runs in an isolated microVM using technology similar to Firecracker, providing stronger tenant isolation than traditional Docker containers on a shared host. This matters for multi-tenant applications where different customer workloads share the same underlying hardware.
Hardware-level isolation between container instances. A compromised container cannot access the host kernel or other tenants' memory, unlike traditional container runtimes.
Secrets are injected via Cloudflare Workers Secrets, not baked into the image. Rotate credentials without rebuilding or redeploying the container image.
Outbound network access from containers passes through Cloudflare's network, enabling IP allowlisting at external services using Cloudflare's published egress IP ranges.
Resource limits are set per container instance in wrangler.toml. Memory can be configured up to 16GB per instance for memory-intensive workloads like in-memory databases or large model inference. CPU allocation supports fractional values, so a lightweight background job can specify 0.25 CPU while a compute task specifies 4. The container is terminated if it exceeds its configured limits, so sizing your instances correctly for the workload is important.
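As an illustrative sizing sketch, the fractional and large-memory cases might be declared like this (field names follow the earlier binding examples; confirm exact option names against the current wrangler documentation):

```toml
# Lightweight background worker: fractional CPU keeps costs low.
[[containers]]
name = "bg-jobs"
image = "./Dockerfile"
cpu = 0.25
memory = "512MB"

# Memory-intensive inference workload sized near the per-instance ceiling.
[[containers]]
name = "inference"
image = "./Dockerfile.inference"
cpu = 4
memory = "16GB"
```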
Security best practice: Never embed secrets in your Docker image. Use wrangler secret put to store credentials and reference them as environment variables in your container runtime. Run periodic docker scan or trivy image checks in your CI pipeline to catch base image vulnerabilities before deployment.
Real-World Use Cases and Architectures
The clearest signal of a platform's utility is the set of workloads it unlocks that were previously impractical. For Cloudflare Containers, several use case categories stand out as particularly well-matched to the platform's strengths in global distribution, platform integration, and Docker compatibility.
Run small to medium language models, embedding generators, and classification models in containers close to users. Avoids the round-trip to a central inference endpoint while keeping model files out of the client browser.
Take an existing monolithic Node.js, Python Flask, or Java Spring application and deploy it globally without a rewrite. A Worker handles the CDN and edge logic while the container runs the existing application unchanged.
PDF generation with Puppeteer or wkhtmltopdf, image manipulation with ImageMagick, video thumbnail extraction, and Office document conversion — all of these require native binaries that work naturally in containers and benefit from global distribution.
Run PgBouncer, ProxySQL, or a custom database proxy in a container at the edge, maintaining persistent connection pools to regional databases. Workers proxy their database calls through the edge container rather than opening cold connections to the origin.
For digital agencies and web development teams, the legacy app globalization use case deserves particular attention. Many client applications that currently run in a single AWS or GCP region could be served globally with dramatically lower latency by containerizing the application and deploying it to Cloudflare. The migration path is a Dockerfile and a wrangler.toml — no application code changes required. Our web development services include edge architecture assessments for teams evaluating this migration path.
Pricing Compared to Traditional Hosting
Cloudflare Containers pricing uses a consumption model based on CPU-seconds, memory-GiB-seconds, and request count. There are no charges for data transfer within the Cloudflare network, no per-region surcharges, and no load balancer fees. The comparison to traditional container hosting requires accounting for these included services.
The break-even analysis favors Cloudflare Containers most strongly for globally distributed workloads. Running a workload across four AWS regions (US East, EU West, AP Southeast, AP Northeast) with load balancers and cross-region egress quickly exceeds the all-in cost of a single Cloudflare Containers deployment that covers all 330+ PoPs. For single-region workloads with no global distribution requirement, AWS Fargate or Google Cloud Run may offer lower unit costs.
The free tier for Containers (included in the Workers Free and Paid plans) provides a meaningful starting point for development and low-traffic production workloads. The exact free tier limits are subject to change as Cloudflare refines the product, so check the current pricing page before committing to an architecture based on free tier assumptions.
Limitations and Considerations
Cloudflare Containers is a relatively new product and several limitations exist that affect architectural decisions, especially for production workloads with specific requirements around storage, GPU access, and compliance.
No persistent disk storage: Container instances do not have persistent filesystem storage. Data written to the container filesystem is lost when the instance is terminated. Use R2 for object storage, D1 for relational data, or KV for key-value state. This is by design for stateless scalability but requires architecture adjustment for stateful applications.
No GPU support for containers: GPU-accelerated inference is available through Workers AI (Cloudflare's serverless AI inference service) but not through Containers. Large model inference that requires GPU acceleration cannot currently run in a Container — use Workers AI for those workloads instead.
2GB compressed image size limit: Container images are currently capped at 2GB compressed. Large data-science images with full Anaconda distributions or bundled model weights will exceed this limit. Use remote model loading rather than baking weights into the image.
Platform immaturity: Enterprise features available on AWS ECS (service mesh, advanced IAM integration, FIPS endpoints) and Google Cloud Run (VPC Service Controls, Cloud Armor integration) are still on the roadmap. Teams with specific compliance requirements should validate current feature availability before committing to the platform.
Despite these limitations, Cloudflare Containers is production-ready for the broad class of stateless or semi-stateful web applications, API backends, and media processing pipelines. The limitations primarily affect specialized workloads (GPU inference, FIPS compliance, very large images). For most web application use cases the platform is fully capable.
Conclusion
Cloudflare Containers closes the gap between Workers and traditional cloud container hosting by bringing Docker-native workloads to the global edge. The 180–320ms cold start for typical images, zero egress costs within the network, and native integration with the Workers platform make it a compelling option for globally distributed applications that cannot fit within Workers' execution model.
The deployment workflow is the most underrated advantage: one Dockerfile, one wrangler.toml section, and one deploy command produce a globally distributed container without any CDN configuration, multi-region orchestration, or load balancer setup. For development teams already using Workers, adding Containers to the architecture for compute-intensive workloads is a natural next step that requires minimal operational overhead.
Ready to Deploy at the Edge?
Edge container architecture is one component of a modern web development strategy. Our team helps businesses design and implement global deployment architectures that deliver measurable performance improvements.