The Internet and how data travels
The Internet
- The Internet is a huge network of computers that are joined together.
- No single computer is in charge. Many computers share the work.
- When you open a web page, your data travels across many of these computers.
Data is split into packets
- Big data does not travel in one piece.
- It is cut into small pieces called packets.
- Each packet carries a small part of the data, plus an address (where it goes) and a number (its order).
Your message: "HELLO FRIEND"
Split into packets:
[1] "HELL" to: 203.0.113.7
[2] "O FR" to: 203.0.113.7
[3] "IEND" to: 203.0.113.7
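A packet can be pictured as a small record with three fields. The sketch below is only a model for this handout: the Packet class and its field names (order, data, to) are made up here, and real packets carry more information than this.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    """A simplified packet: real packets have more fields than these three."""
    order: int   # position of this piece in the message
    data: str    # the piece of the message itself
    to: str      # destination address


# The three packets from the example above, written out by hand.
packets = [
    Packet(1, "HELL", "203.0.113.7"),
    Packet(2, "O FR", "203.0.113.7"),
    Packet(3, "IEND", "203.0.113.7"),
]

for p in packets:
    print(f"[{p.order}] {p.data} to: {p.to}")
```

Every packet has the same address, because all three pieces are going to the same place. Only the order number and the data change.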
Routing: many paths
- Routing means choosing a path for each packet through the network.
- Packets pass through many computers called routers on the way.
- Different packets can take different paths to the same place.
[You] --- router A --- router B --- [Website]
  \                                   /
   --- router C --- router D ---------
Packet 1 may go A -> B.
Packet 2 may go C -> D.
Both still arrive.
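The idea that each packet can take its own path can be sketched in a few lines. This is a toy model, not how real routers work: real routers decide the next step one hop at a time. The names PATHS and choose_path are invented for this example, and taking turns between paths is an assumption made to keep it simple.

```python
# The two paths from the diagram above, written as lists of router names.
PATHS = [
    ["A", "B"],
    ["C", "D"],
]


def choose_path(order):
    """Pick a path for a packet by taking turns between the known paths.

    Taking turns is an assumption for this sketch, not a real routing rule.
    """
    return PATHS[(order - 1) % len(PATHS)]


for order in [1, 2, 3]:
    path = " -> ".join(choose_path(order))
    print(f"Packet {order} goes {path}")
```

Packet 1 and packet 3 go through A and B, and packet 2 goes through C and D. All three are still heading for the same website.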
Redundancy and fault tolerance
- Redundancy means there is more than one path between two points.
- Because of this, the Internet has fault tolerance: it keeps working even if one link or router fails.
- If router B breaks, packets just use another route, like C -> D.
[You] --- router A --- (router B BROKEN)
  \
   --- router C --- router D --- [Website]
The path through B fails.
Packets still arrive through C -> D.
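Finding a route around a broken router can be sketched as a search through the network. The example below is a simplified model: the LINKS table copies the diagram above, and find_path is a hypothetical helper that uses a breadth-first search. Real routers do not hold a map like this in one place.

```python
from collections import deque

# Links from the diagram above: each name maps to its direct neighbours.
LINKS = {
    "You": ["A", "C"],
    "A": ["You", "B"],
    "B": ["A", "Website"],
    "C": ["You", "D"],
    "D": ["C", "Website"],
    "Website": ["B", "D"],
}


def find_path(start, goal, broken=None):
    """Return one path from start to goal that avoids the broken router.

    Returns None if no such path exists. Uses a breadth-first search.
    """
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        here = path[-1]
        if here == goal:
            return path
        for nxt in LINKS[here]:
            if nxt != broken and nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


print(find_path("You", "Website"))              # all routers working
print(find_path("You", "Website", broken="B"))  # router B has failed
```

With every router working, the search returns the path through A and B. With B broken, it returns the path through C and D instead, which is the redundancy from the diagram at work.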
TCP/IP: the rules
- Computers must agree on rules so they can talk. We call a set of rules a protocol.
- IP (Internet Protocol) gives every computer an address and helps send each packet.
- TCP (Transmission Control Protocol) checks that all packets arrive and puts them back in order.
Packets arrive (maybe out of order):
[3] "IEND" [1] "HELL" [2] "O FR"
TCP reorders them by number:
[1] "HELL" + [2] "O FR" + [3] "IEND"
= "HELLO FRIEND"
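TCP also checks that every packet has arrived. The sketch below shows that check in a very simplified form. The function name missing_orders is invented here, and it assumes the receiver already knows how many packets to expect. Real TCP works differently in detail: it numbers bytes rather than whole packets, and the receiver sends acknowledgements so the sender knows what to send again.

```python
def missing_orders(received, total):
    """Return the order numbers (1..total) that have not arrived yet."""
    arrived = {order for order, _text in received}
    return [n for n in range(1, total + 1) if n not in arrived]


# Packet 2 got lost on the way; packets 3 and 1 arrived.
received = [[3, "IEND"], [1, "HELL"]]
print(missing_orders(received, 3))   # [2]
```

Once the receiver knows that packet 2 is missing, it can be sent again, and the message can then be completed.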
HTTP: the web
- The web (World Wide Web) is the pages and links you open in a browser.
- The web uses a protocol called HTTP (HyperText Transfer Protocol).
- Your browser sends an HTTP request ("please give me this page"); the server sends an HTTP response (the page).
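An HTTP request and response are plain text, so they can be built and read with ordinary string code. The sketch below does not use the network at all. The helper names build_request and read_status, the page /index.html, the host example.com, and the sample response are all made up for illustration.

```python
def build_request(path, host):
    """Build the text of a minimal HTTP/1.1 GET request."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "\r\n"
    )


def read_status(response):
    """Return the status code and reason from the first line of a response."""
    status_line = response.split("\r\n", 1)[0]
    _version, code, reason = status_line.split(" ", 2)
    return int(code), reason


request = build_request("/index.html", "example.com")
print(request)

# A made-up response, shortened for this handout.
sample_response = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
    "<h1>Hello</h1>"
)
print(read_status(sample_response))   # (200, 'OK')
```

The first line of the request says which page is wanted. The first line of the response carries a status code, where 200 means the request succeeded.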
Why this design is strong
- No single computer controls everything, so no single failure can break the whole Internet.
- Many paths + packets = the network can route around problems.
- Shared rules (TCP/IP, HTTP) let very different computers work together.
Key words
- Packet: a small piece of data with an address and an order number.
- Routing: choosing a path for packets through routers.
- Redundancy / fault tolerance: many paths, so one failure does not stop the network.
- TCP/IP: rules to send packets and reassemble them in order.
- HTTP: the rules the web uses to request and send pages.
Now you try
- These tasks model how data really travels: split it, reassemble it, and route around a failure.
- Press Check answer to test your code.
TCP puts packets back in order by their number. Here, packets is a list of [order, text] pairs that arrived shuffled. Write reassemble(packets) that sorts them by order and returns the joined text. Example: reassemble([[2, "O FR"], [1, "HELL"], [3, "IEND"]]) is "HELLO FRIEND".
Big data is cut into fixed-size packets. Write split(text, size) that returns a list of [order, chunk] pairs, with order starting at 1 and each chunk at most size characters long (the last chunk may be shorter). Example: split("HELLO FRIEND", 4) is [[1, "HELL"], [2, "O FR"], [3, "IEND"]].
Redundancy gives more than one path, so the network can route around a failure. Here, paths is a list of routes (each a list of router names), and broken is the name of the router that has failed. A route works only if it avoids the broken router. Write working_routes(paths, broken) that returns the routes that do not contain broken. Example: working_routes([["A", "B"], ["C", "D"]], "B") is [["C", "D"]].