When you load a web page, your request travels through a sequence of networks – your ISP, a transit provider, an internet exchange, a cloud provider’s backbone, a data centre in another hemisphere. The decisions about which path to take are made by routers running a protocol called BGP, and those decisions are shaped by business relationships, physical geography, and a surprising amount of trust. This is how the internet figures out where to send your packets.
In the networking post, we covered how packets move through a single network – Ethernet frames, IP addresses, switches, and routers. This post picks up where that one left off: how packets move between networks, across the global internet, from your laptop in Perth to a server in Virginia or Sydney or Singapore.
Autonomous systems: the internet’s building blocks
The internet is not one network. It’s a network of networks – tens of thousands of independently operated networks that have agreed to interconnect. Each of these networks is called an autonomous system (AS).
An autonomous system is a collection of IP address ranges (prefixes) under the control of a single organisation, presenting a common routing policy to the outside world. Your ISP is an AS. AWS is an AS (several, actually). Google is an AS. Telstra is an AS. The network at your university was probably its own AS.
Each AS is identified by an ASN (Autonomous System Number), assigned by a regional internet registry. In the Asia-Pacific region, that’s APNIC, based in Brisbane. Telstra’s primary ASN is 1221. Optus is 7474. AWS has multiple ASNs, including 16509 and 14618. Google is 15169. You can look up any ASN using tools like Hurricane Electric’s BGP Toolkit or PeeringDB.
As of 2025, there are approximately 75,000 active autonomous systems on the internet. The relationships between them – who connects to whom, and on what terms – determine how traffic flows across the globe.
BGP: the protocol that holds the internet together
Border Gateway Protocol (BGP), currently at version 4 (RFC 4271, 2006), is the routing protocol that autonomous systems use to exchange routing information with each other. It’s how AS 1221 (Telstra) tells AS 15169 (Google) “I can reach the IP range 203.0.113.0/24 – send me packets for those addresses.”
BGP is a path vector protocol. Each AS advertises the IP prefixes it can reach, along with the AS path – the sequence of autonomous systems a packet would traverse to reach that prefix. When AS A tells AS B “I can reach 203.0.113.0/24 via the path [A, C, D]”, AS B knows that reaching that prefix through A requires traversing three autonomous systems.
BGP routers (called BGP speakers) maintain a table of all the routes they’ve learned from their peers. When multiple routes to the same prefix are available, the router selects the best one by working through a list of criteria in order:
- Local preference (the AS operator’s own routing policy – “prefer this peer over that one” – checked before anything else)
- Shortest AS path (fewer hops is generally preferred)
- Origin type (routes learned from IGP are preferred over those from EGP, which are preferred over incomplete)
- Multi-exit discriminator (MED – a suggestion from the neighbouring AS about which entry point to prefer, compared only between routes from the same neighbour)
- Various tie-breakers (eBGP over iBGP, lowest IGP metric to the next hop, lowest router ID, etc.)
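The selection logic above can be sketched in a few lines. This is a toy model, not a real BGP implementation – the route attributes are simplified, real routers apply many more steps, and a real MED comparison only happens between routes from the same neighbouring AS:

```python
# Toy BGP best-path selection: pick one route per prefix from candidates.

ORIGIN_RANK = {"igp": 0, "egp": 1, "incomplete": 2}  # lower is better

def best_path(routes):
    """Select the best route (a dict of attributes) from a list of candidates."""
    return min(
        routes,
        key=lambda r: (
            -r["local_pref"],          # higher local preference wins first
            len(r["as_path"]),         # then shortest AS path
            ORIGIN_RANK[r["origin"]],  # then IGP < EGP < incomplete
            r["med"],                  # then lowest MED (simplified here)
            r["router_id"],            # tie-break: lowest router ID
        ),
    )

routes = [
    {"as_path": [7474, 2914, 15169], "local_pref": 100, "origin": "igp", "med": 0, "router_id": 2},
    {"as_path": [1221, 15169],       "local_pref": 100, "origin": "igp", "med": 0, "router_id": 1},
    {"as_path": [174, 15169],        "local_pref": 200, "origin": "igp", "med": 0, "router_id": 3},
]

best = best_path(routes)
print(best["as_path"])  # [174, 15169] – local preference beats a shorter AS path
```

Note that the route with the highest local preference wins even though a shorter AS path exists – policy trumps path length, which is exactly the point made below.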
The important thing to understand is that BGP routing decisions are heavily influenced by policy, not just technical metrics. An AS might prefer a longer path through a cheaper transit provider over a shorter path through an expensive one. An AS might prefer to keep traffic on its own network for as long as possible (a practice called cold potato routing) or hand it off to the next network as quickly as possible (hot potato routing). These decisions are business decisions encoded as routing policy.
Peering and transit: the business of interconnection
Autonomous systems connect to each other through two fundamental types of relationship: peering and transit.
Transit is a customer-provider relationship. A smaller AS pays a larger AS to carry its traffic to the rest of the internet. Your ISP probably has a transit agreement with one or more Tier 1 networks (networks that can reach the entire internet without purchasing transit from anyone else). The transit provider announces the customer’s IP prefixes to the rest of the internet and carries traffic between the customer and other networks.
Transit is paid. The customer pays the provider based on bandwidth (typically measured at the 95th percentile of usage, in a model called 95th percentile billing). Prices vary enormously by geography and market – transit in Australia has historically been more expensive than in North America or Europe, though the gap has narrowed as competition has increased and undersea cable capacity has grown.
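The 95th percentile model is simple to compute: sample the link’s throughput every five minutes for the billing month, sort the samples, discard the top 5%, and bill at the highest remaining sample. A minimal sketch (the sample values are made up):

```python
def percentile_95(samples_mbps):
    """Return the 95th-percentile sample: drop the top 5% and take the max of the rest."""
    ordered = sorted(samples_mbps)
    # Index 95% of the way through the sorted samples.
    idx = int(len(ordered) * 0.95) - 1
    return ordered[idx]

# A real month of 5-minute samples is about 8,640 values; 100 keeps the example small.
samples = [100] * 95 + [900] * 5   # mostly 100 Mbps, with five 900 Mbps bursts
print(percentile_95(samples))      # 100 – short bursts above the 95th percentile are free
```

This is why the model is popular with customers: brief traffic spikes (under about 36 hours per month in total) don’t affect the bill.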
Peering is a mutual exchange. Two ASes agree to carry each other’s traffic directly, without paying a transit provider as an intermediary. Peering is typically settlement-free – neither party pays the other – because both benefit from the reduced latency and cost of direct interconnection.
Peering happens in two ways:
Private peering: two ASes run a direct physical link between their routers, usually in the same data centre. This is common between large networks that exchange a lot of traffic. Google and Telstra, for instance, have private peering arrangements in multiple locations.
Public peering: two ASes connect through an Internet Exchange Point (IXP) – a shared switching fabric where multiple ASes can interconnect. Instead of running a separate link to each peer, an AS connects once to the IXP and can peer with any other AS connected to the same exchange.
Internet Exchange Points
An IXP is, physically, a set of high-speed Ethernet switches in one or more data centres. ASes connect to the switch fabric and can exchange traffic directly with any other connected AS. The IXP doesn’t route traffic – it just provides the Layer 2 (Ethernet) infrastructure for ASes to peer with each other.
IXPs are crucial for internet performance, especially in regions like Australia where the alternative – sending traffic via an international transit provider – adds significant latency.
IX Australia operates internet exchanges in Perth, Sydney, Melbourne, Brisbane, Adelaide, and other cities. The Perth IX is located in the Equinix PE1 data centre in Malaga. When a Telstra customer in Perth visits a website hosted by a small hosting company that’s also connected to the Perth IX, the traffic can flow directly between the two networks at the exchange, staying within Perth. Without the IXP, that traffic might need to travel to Sydney and back, adding 40-60 milliseconds of latency.
The world’s largest IXPs handle staggering amounts of traffic. DE-CIX Frankfurt peaks at over 14 Tbps. AMS-IX Amsterdam handles over 10 Tbps. The Equinix exchanges collectively handle even more. These exchanges carry a large share of the internet’s public peering traffic.
The Tier 1 networks
A Tier 1 network is an AS that can reach every other AS on the internet without purchasing transit from anyone. Tier 1 networks peer with each other settlement-free – they have to, because by definition no Tier 1 network purchases transit. If any two Tier 1 networks refused to peer with each other, their customers wouldn’t be able to reach each other, which would fragment the internet.
The exact list of Tier 1 networks is debated (the status isn’t formally assigned), but generally includes:
| Network | ASN | Headquarters |
|---|---|---|
| Lumen (CenturyLink) | 3356 | USA |
| NTT | 2914 | Japan |
| Cogent | 174 | USA |
| Arelion (Telia Carrier) | 1299 | Sweden |
| GTT | 3257 | USA |
| Telxius (Telefonica) | 12956 | Spain |
Telstra (AS 1221) is sometimes considered Tier 1 within the Asia-Pacific region, though its global status depends on how you define the criteria. It has extensive peering arrangements across the Asia-Pacific and significant transit infrastructure.
The 2021 Facebook outage
On 4 October 2021, Facebook, Instagram, WhatsApp, and Messenger went offline for approximately six hours. It was one of the largest internet outages in history, affecting roughly 3.5 billion users. The cause was a BGP withdrawal.
Here’s what happened, based on Facebook’s own post-mortem and Cloudflare’s analysis:
During routine maintenance, a command was issued to assess the capacity of Facebook’s backbone network. A bug in the audit tool that should have blocked the command let it through, and it instead took down all the connections in Facebook’s backbone. Facebook’s DNS servers, unable to reach the data centres, responded by withdrawing the BGP routes for their own prefixes – deliberate behaviour designed to stop an unhealthy site from answering queries. This meant that the servers which resolve facebook.com, instagram.com, and whatsapp.com became unreachable. Not because the servers were down – they were running fine – but because the rest of the internet no longer knew how to reach them.
The withdrawal cascaded. With the name servers unreachable, DNS resolvers around the world got no answers – queries timed out or returned SERVFAIL – and many resolvers cached the failure. Within minutes, facebook.com effectively ceased to exist on the internet.
The recovery was hampered by a cruel irony: Facebook’s internal tools for managing BGP were themselves accessible via the network that BGP had just disconnected. Engineers couldn’t remotely access the routers to fix the configuration because the routers were unreachable. They had to physically travel to Facebook’s data centres and access the routers via out-of-band management consoles. Entry to the data centres was further complicated because the electronic badge systems also depended on the now-disconnected network.
The outage illustrates several critical points about BGP:
- BGP has no authentication by default. Any AS can announce any prefix. Facebook’s routers announced a withdrawal, and the internet believed it immediately. There’s no “are you sure?” mechanism.
- DNS depends on BGP. You can have the most reliable DNS infrastructure in the world, but if the BGP routes to your name servers are withdrawn, your domain disappears.
- Centralisation creates fragility. Facebook runs its own authoritative DNS on its own network. When that network became unreachable, there was no backup. A more distributed architecture – using a third-party DNS provider alongside internal DNS – would have mitigated the impact.
- Out-of-band access matters. The engineers who fixed the problem needed physical console access because every remote access path depended on the network that was down.
BGP security: the trust problem
BGP was designed in the late 1980s for a much smaller internet where ASes were operated by organisations that knew and trusted each other. The protocol has essentially no built-in security. When an AS announces “I can reach 203.0.113.0/24”, other ASes accept that announcement on trust.
This creates a vulnerability called BGP hijacking: an AS (intentionally or accidentally) announces routes for IP prefixes it doesn’t control, causing traffic destined for those prefixes to be routed to the wrong place.
Notable BGP hijacks include:
The Pakistan YouTube hijack (2008): Pakistan Telecom, ordered by the Pakistani government to block YouTube, announced a more specific route for YouTube’s IP prefix. Due to a misconfiguration, this announcement leaked to the global internet, and YouTube became unreachable worldwide for about two hours. Pakistan Telecom claimed to own YouTube’s IP space, and the internet believed it.
The Amazon Route 53 hijack (2018): attackers hijacked an IP prefix used by Amazon’s Route 53 DNS service, redirecting DNS queries for myetherwallet.com to a server under their control. Users who visited the site were served a phishing page that stole cryptocurrency wallet credentials.
Resource Public Key Infrastructure (RPKI, RFC 6480) is the primary mitigation. RPKI allows IP prefix holders to cryptographically sign Route Origin Authorisations (ROAs) that specify which ASes are authorised to originate routes for their prefixes. BGP routers that implement Route Origin Validation (ROV) can reject routes that violate the ROA.
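The validation logic can be sketched with Python’s `ipaddress` module. Under RFC 6811 semantics (simplified here), an announcement is valid if a covering ROA authorises its origin AS and prefix length, invalid if a covering ROA exists but the origin or length doesn’t match, and unknown if no ROA covers it. The prefix and ASN below are documentation examples, not real assignments:

```python
import ipaddress

# Each ROA: (authorised prefix, maximum announced prefix length, authorised origin ASN)
roas = [
    (ipaddress.ip_network("203.0.113.0/24"), 24, 64500),
]

def validate(prefix_str, origin_asn):
    """Classify a BGP announcement as 'valid', 'invalid', or 'unknown' (RFC 6811, simplified)."""
    prefix = ipaddress.ip_network(prefix_str)
    covered = False
    for roa_prefix, max_len, roa_asn in roas:
        if prefix.subnet_of(roa_prefix):  # a ROA covers this announcement
            covered = True
            if origin_asn == roa_asn and prefix.prefixlen <= max_len:
                return "valid"
    return "invalid" if covered else "unknown"

print(validate("203.0.113.0/24", 64500))   # valid – matches the ROA
print(validate("203.0.113.0/25", 64500))   # invalid – more specific than maxLength allows
print(validate("203.0.113.0/24", 64501))   # invalid – wrong origin AS (the hijack case)
print(validate("198.51.100.0/24", 64500))  # unknown – no ROA covers this prefix
```

The “unknown” state is why partial adoption still leaves gaps: routers performing ROV generally accept unknown routes, because rejecting them would break reachability for every network that hasn’t published ROAs yet.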
Adoption has been growing but remains incomplete. As of 2024, roughly 50% of IPv4 prefixes have ROAs published. Cloudflare, Google, Amazon, and most major networks now perform ROV. But many smaller networks don’t, and even with ROV, BGP path validation (verifying that the AS path is legitimate, not just the origin) remains an unsolved problem.
How AWS routes traffic
For anyone building on AWS – which is most of the internet, frankly – understanding how traffic flows within and between AWS resources is essential.
VPCs and route tables
A VPC (Virtual Private Cloud) is an isolated network within an AWS region. It has its own IP address space (a CIDR block), its own subnets, and its own route tables that determine where traffic goes.
Each subnet in a VPC is associated with a route table. The route table contains rules like:
| Destination | Target | Purpose |
|---|---|---|
| 10.0.0.0/16 | local | Traffic within the VPC stays within the VPC |
| 0.0.0.0/0 | igw-abc123 | All other traffic goes to the internet gateway |
The “local” route is implicit – every VPC route table includes it. Traffic between instances in the same VPC is routed by the VPC networking layer without leaving AWS’s network.
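Route selection in a table like this is longest-prefix match: the most specific matching destination wins. A small sketch of the lookup using the example table above (the gateway ID is the hypothetical one from the table):

```python
import ipaddress

# The example VPC route table from above: destination CIDR -> target.
route_table = {
    "10.0.0.0/16": "local",       # traffic within the VPC
    "0.0.0.0/0": "igw-abc123",    # everything else: the internet gateway
}

def lookup(dest_ip):
    """Return the target of the most specific (longest-prefix) matching route."""
    ip = ipaddress.ip_address(dest_ip)
    matches = []
    for cidr, target in route_table.items():
        net = ipaddress.ip_network(cidr)
        if ip in net:
            matches.append((net.prefixlen, target))
    # The longest prefix (most specific route) wins.
    return max(matches)[1]

print(lookup("10.0.42.7"))      # local – stays inside the VPC
print(lookup("93.184.216.34"))  # igw-abc123 – heads for the internet gateway
```

Both routes match an in-VPC address (everything matches 0.0.0.0/0), but the /16 is more specific, so intra-VPC traffic never touches the internet gateway.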
Internet gateways and NAT gateways
An internet gateway (IGW) is the connection point between a VPC and the public internet. Resources in a public subnet (a subnet whose route table has a route to the IGW) can have public IP addresses and communicate directly with the internet.
A NAT gateway allows resources in a private subnet (no route to the IGW) to initiate connections to the internet (for software updates, API calls, etc.) without being directly reachable from the internet. The NAT gateway sits in a public subnet and performs network address translation, similar to your home router.
Transit Gateway
AWS Transit Gateway is a regional hub that connects multiple VPCs and on-premises networks. Instead of creating VPC peering connections between every pair of VPCs (which scales quadratically – 10 VPCs need 45 peering connections), you connect each VPC to the Transit Gateway and manage routing centrally.
For an organisation with production, staging, and development VPCs, plus connections to corporate offices, Transit Gateway simplifies the network topology from a mesh to a star.
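The quadratic scaling is just the handshake formula: a full mesh of n VPCs needs n(n−1)/2 peering connections, while a Transit Gateway hub needs only n attachments. A quick check:

```python
def mesh_connections(n):
    """Peering connections needed for a full mesh of n VPCs: n choose 2."""
    return n * (n - 1) // 2

for n in (3, 10, 50):
    # VPC count, mesh peering connections, Transit Gateway attachments
    print(n, mesh_connections(n), n)
```

At 10 VPCs the mesh needs 45 connections; at 50 it needs 1,225, while the hub still needs one attachment per VPC.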
Direct Connect
AWS Direct Connect provides a dedicated, private network connection between your data centre (or co-location facility) and AWS. Instead of sending traffic over the public internet, you establish a physical connection (1 Gbps, 10 Gbps, or 100 Gbps for dedicated connections) at an AWS Direct Connect location.
In Australia, Direct Connect locations include facilities in Sydney (Equinix SY1-SY5, Global Switch, NextDC S1/S2), Melbourne (Equinix ME1, NextDC M1/M2), and Perth (NextDC P1). A Perth-based company using Direct Connect would typically connect to a NextDC facility in Malaga and establish a dedicated link to AWS’s ap-southeast-2 region in Sydney.
Direct Connect provides consistent network performance (no contention with public internet traffic), reduced data transfer costs (Direct Connect data transfer is cheaper than internet data transfer), and privacy (traffic doesn’t traverse the public internet).
CloudFront and edge locations
CloudFront is AWS’s content delivery network (CDN). It caches content at edge locations around the world, so users receive content from a nearby location rather than from the origin server in a distant region.
As of 2025, CloudFront has over 450 edge locations in more than 90 cities. In Australia, there are edge locations in Sydney, Melbourne, Brisbane, Perth, and Adelaide. When a user in Perth requests a resource from a CloudFront distribution, the request is served from the Perth edge location (if cached) or routed to the origin via AWS’s backbone network.
To direct each user to a nearby edge location, CloudFront relies primarily on DNS: resolving a distribution’s domain name returns IP addresses for an edge close to the resolver. Other CDNs, such as Cloudflare, instead use anycast – announcing the same IP addresses from every edge location and letting BGP routing deliver each user to the nearest one. Anycast is also the technique used by the DNS root servers (which we covered in the DNS post).
Anycast: one address, many locations
Anycast is a networking technique where the same IP address is assigned to multiple servers in different locations. When a client sends a packet to an anycast address, BGP routing delivers it to the “nearest” server – nearest in the BGP sense, which usually means fewest AS hops, not necessarily geographically closest.
Anycast is used extensively for:
- DNS root servers: the thirteen root server addresses are anycast, with over 1,700 instances worldwide
- CDNs: Cloudflare, Fastly, and other CDNs use anycast to direct users to the nearest edge location
- DDoS mitigation: anycast distributes attack traffic across many locations, preventing any single location from being overwhelmed
- Public DNS resolvers: Cloudflare’s 1.1.1.1 and Google’s 8.8.8.8 are anycast addresses
The benefit is transparent failover. If one anycast instance goes down, BGP routing automatically directs traffic to the next nearest instance. The client doesn’t need to know about the failure – the IP address doesn’t change, and the routing adapts.
The limitation is that anycast works best for stateless or short-lived protocols (DNS, HTTP requests to a CDN). For long-lived TCP connections, if routing changes mid-connection (because a closer anycast node comes online or goes offline), the connection can break because the packets start arriving at a different server that doesn’t have the TCP session state.
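Failover can be modelled with the same best-path idea: every instance announces the same prefix, routing picks the one with the shortest AS path, and if that instance withdraws, the next nearest takes over with no change visible to the client. The locations and path lengths here are made up:

```python
# Anycast instances announcing the same prefix, with the AS-path length
# as seen from one particular client network (hypothetical values).
announcements = {"perth": 1, "singapore": 3, "sydney": 2}

def nearest(live):
    """Pick the instance with the shortest AS path among those still announcing."""
    return min(live, key=live.get)

print(nearest(announcements))  # perth – shortest AS path from this client

# The Perth instance withdraws its announcement (failure, maintenance, ...).
del announcements["perth"]
print(nearest(announcements))  # sydney – traffic shifts, the IP address never changes
```

The client keeps sending packets to the same address throughout; only the BGP routing underneath changes.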
The undersea cables
The global internet is, fundamentally, a network of optical fibre cables – many of them running along the ocean floor. These submarine cables carry over 99% of intercontinental data traffic. Satellites, despite their visibility, handle only a tiny fraction.
Australia’s international connectivity depends on a handful of submarine cable systems:
| Cable | Route | Capacity | Year |
|---|---|---|---|
| Australia Singapore Cable (ASC) | Perth – Singapore | 40 Tbps | 2018 |
| Indigo | Perth – Singapore, via Indonesia | 36 Tbps (design) | 2019 |
| Southern Cross NEXT | Sydney – Auckland – Los Angeles, branch to Fiji | 72 Tbps (design) | 2022 |
| Japan-Guam-Australia South (JGA-S) | Sydney – Guam – Japan | 36 Tbps | 2020 |
| SEA-ME-WE 3 | Perth – SE Asia – Middle East – Europe | 0.96 Tbps | 2000 |
| Oman Australia Cable (OAC) | Perth – Oman, via Indian Ocean | 100+ Tbps (design) | 2022 |
Perth is Australia’s primary gateway to Asia and increasingly to Europe. The Australia Singapore Cable and the Indigo cable both land in Perth, connecting Australia directly to Singapore’s major internet exchange ecosystem. From Singapore, traffic can reach the rest of Asia, Europe (via overland cables through the Middle East), and anywhere else.
Sydney is the primary gateway to the United States (via the Pacific) and New Zealand. Most of Australia’s east coast traffic to the US traverses cables like Southern Cross NEXT.
The TeleGeography Submarine Cable Map is an excellent interactive visualisation of the global submarine cable network. It makes visible something that’s easy to forget: the internet is, at its core, a physical thing – light pulses travelling through glass fibres at the bottom of the ocean.
These cables are surprisingly vulnerable. They’re about the diameter of a garden hose in deep water (the fibre is tiny; the rest is steel armouring and insulation). They’re laid on the ocean floor, sometimes buried in shallow water near the shore. They’re damaged regularly by ship anchors, fishing trawlers, earthquakes, and (rarely) shark bites. The repair process involves a specialised cable ship sailing to the break point, grappling the cable from the ocean floor, splicing in a new section, and re-laying it. It can take weeks.
When a cable breaks, traffic is rerouted via other cables, but capacity is reduced and latency increases. In 2022, the Tonga volcanic eruption severed the only submarine cable connecting Tonga to the rest of the internet, taking the country offline for over a month until repairs were completed.
Traceroute: seeing the path
Traceroute (called tracert on Windows) is a diagnostic tool that reveals the path packets take between your computer and a destination. It works by exploiting a feature of IP called the Time to Live (TTL) field.
Every IP packet has a TTL field – a counter that starts at some value (typically 64 or 128) and is decremented by each router that forwards the packet. When the TTL reaches 0, the router discards the packet and sends an ICMP Time Exceeded message back to the sender, including its own IP address.
Traceroute sends packets with incrementally increasing TTL values:
- TTL=1: the first router decrements it to 0 and sends back an ICMP message, revealing its identity
- TTL=2: the first router passes it along, the second router decrements it to 0 and responds
- TTL=3: reaches the third router
- And so on, until the packet reaches the destination
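The mechanism can be modelled in a few lines. Real traceroute needs raw sockets and elevated privileges, so this is a pure simulation: each router decrements the TTL, and whichever router sees it reach zero “responds” with its own address:

```python
def probe(path, ttl):
    """Simulate one probe with the given TTL along a list of router addresses.

    Returns the address of the router that would send ICMP Time Exceeded,
    or the final hop if the packet reaches the end of the path.
    """
    for router in path:
        ttl -= 1          # each router decrements TTL before forwarding
        if ttl == 0:
            return router  # TTL expired here: this router reports back
    return path[-1]        # destination reached

# Hypothetical path of router addresses (documentation ranges).
path = ["192.168.1.1", "10.201.0.1", "203.0.113.1", "198.51.100.1"]

# Traceroute: send probes with TTL 1, 2, 3, ... and collect the responders.
for ttl in range(1, len(path) + 1):
    print(ttl, probe(path, ttl))
```

Each successive probe reveals one more hop, which is exactly the hop-by-hop listing traceroute prints.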
Here’s a simplified traceroute from Perth to a server in the US:
1 192.168.1.1 1 ms Home router
2 10.201.0.1 8 ms ISP's local router (Perth)
3 203.0.113.1 10 ms ISP's backbone (Perth)
4 198.51.100.1 12 ms IX Australia Perth
5 192.0.2.1 55 ms Singapore (undersea cable hop)
6 192.0.2.5 56 ms Transit provider (Singapore)
7 192.0.2.9 180 ms Pacific crossing (Singapore to LA)
8 192.0.2.13 182 ms US transit provider (LA)
9 192.0.2.17 210 ms Destination data centre
10 93.184.216.34 212 ms Destination server
The latency jumps tell the story. Hops 1-4 are within Perth – low latency, under 15 ms. The jump from hop 4 to hop 5 (12 ms to 55 ms) is the undersea cable from Perth to Singapore. The jump from hop 6 to hop 7 (56 ms to 180 ms) is the Pacific crossing from Singapore to Los Angeles. Physics imposes a hard floor on these latencies: light travels through optical fibre at roughly two-thirds the speed of light in a vacuum, and Perth to Singapore is about 3,900 km of cable, giving a minimum one-way latency of around 19 ms. The actual latency is higher because of routing, switching, and amplification.
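The physics floor is easy to verify: light in fibre travels at roughly two-thirds of c (the refractive index of glass is about 1.5), so the minimum one-way latency is just cable length divided by that speed. The cable distances below are approximations:

```python
C_VACUUM_KM_S = 299_792              # speed of light in vacuum, km/s
FIBRE_SPEED = C_VACUUM_KM_S / 1.5    # ~200,000 km/s through glass fibre

def min_one_way_ms(cable_km):
    """Minimum one-way latency in milliseconds over a fibre of the given length."""
    return cable_km / FIBRE_SPEED * 1000

print(round(min_one_way_ms(3_900), 1))   # Perth–Singapore: ~19.5 ms
print(round(min_one_way_ms(14_000), 1))  # Singapore–Los Angeles (approx.): ~70 ms
```

Real latencies run higher than these floors because cables don’t follow great circles and every router, switch, and repeater adds time, but no engineering can get under them.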
Traceroute is useful for diagnosing where problems occur:
- High latency at a specific hop: that hop (or the link to it) may be congested
- Packet loss at a specific hop: that router may be overloaded or the link may be degraded
- A hop that doesn’t respond: some routers are configured not to answer traceroute probes (they never generate, or rate-limit, ICMP Time Exceeded messages), which shows as `* * *` in the output – this isn’t necessarily a problem
- Unexpected routing: your packets might be taking a geographically circuitous route (Perth to Sydney to Singapore instead of Perth directly to Singapore), indicating a routing inefficiency
MTR (Matt’s Traceroute) combines traceroute with continuous ping, showing real-time latency and packet loss at each hop. It’s the preferred tool for ongoing network diagnostics.
Traffic engineering: shaping the flow
Large networks don’t just let BGP route traffic wherever it wants. They actively shape traffic flows using traffic engineering – manipulating BGP attributes, adjusting link weights, and selecting routes to optimise for cost, performance, or reliability.
Common techniques include:
AS path prepending: an AS artificially lengthens its AS path in announcements to certain peers, making those paths less attractive and shifting traffic to preferred routes. If an Australian network wants traffic from Europe to arrive via Singapore rather than the US, it can prepend its AS path in the announcement sent to US transit providers, making the US path look longer.
BGP communities: tags attached to route announcements that communicate policy preferences. A transit provider might define communities that let customers control how their routes are announced: “don’t announce this route to peers in Europe” or “announce this route with lower preference to AS 174.”
ECMP (Equal-Cost Multi-Path): when multiple paths to a destination have equal BGP metrics, traffic is distributed across all of them. This provides both load balancing and redundancy.
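ECMP implementations typically hash each flow’s 5-tuple (source and destination address, ports, protocol) so that every packet of one TCP connection takes the same path – avoiding reordering – while different flows spread across the links. A sketch of the idea, with hypothetical link names:

```python
import hashlib

paths = ["link-A", "link-B", "link-C"]  # equal-cost next hops (made-up names)

def pick_path(src_ip, dst_ip, src_port, dst_port, proto="tcp"):
    """Hash the 5-tuple so one flow always uses one path, but flows spread out."""
    flow = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()
    digest = int(hashlib.sha256(flow).hexdigest(), 16)
    return paths[digest % len(paths)]

# The same flow always hashes to the same link (no packet reordering) ...
a = pick_path("10.0.0.1", "93.184.216.34", 51000, 443)
print(a == pick_path("10.0.0.1", "93.184.216.34", 51000, 443))  # True

# ... while a different flow (new source port) may land on a different link.
print(pick_path("10.0.0.1", "93.184.216.34", 51001, 443))
```

Hardware routers use much cheaper hash functions than SHA-256, but the property is the same: per-flow stickiness with statistical load balancing across flows.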
The internet is physical
It’s easy to think of the internet as ethereal – data floating through the cloud, weightless and instant. But the internet is profoundly physical. Your data travels as light pulses through glass fibres, amplified every 80 km by erbium-doped amplifiers, carried across oceans in cables that took years to plan and months to lay.
The routing decisions that determine your packet’s path are made by software – BGP, running on specialised routers – but they’re constrained by geography, physics, and economics. You can’t route around the speed of light. You can’t avoid the undersea cable if you’re sending data between continents. You can peer with a network at an IXP in Perth, but only if that network has a physical presence in Perth.
When you load a web page from Perth and it takes 250 milliseconds, that’s not a sign of a slow internet connection. That’s the minimum time for light to travel from Perth to Virginia and back, through fibre optic cables, across the Indian Ocean and the Pacific, through a dozen routers each making BGP-informed forwarding decisions, through firewalls and load balancers and TLS handshakes, arriving at a server that processes your request and sends the response back along a similarly circuitous path.
250 milliseconds. A quarter of a second. For a round trip of well over 30,000 kilometres of fibre.
The internet routes traffic through a combination of protocol (BGP), economics (peering and transit agreements), physics (undersea cables and the speed of light), and trust (AS operators believing each other’s route announcements). It works remarkably well, most of the time. When it fails – when a BGP route is withdrawn by accident, when a cable is severed by an anchor, when a hijack redirects traffic through a hostile network – the consequences are immediate, global, and a reminder that the seemingly abstract internet is built on very concrete infrastructure.