COMP3322 notes P1 - Internet & WWW Basic

Published: 2023-09-22 12:00:27 | Author: 四季夏目天下第一

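I picked this course purely to push forward the grand project of beautifying my blog! Hopefully, once I finish it, I can fix some of the issues in my update logs with my own hands.

First up is Internet and WWW basics: this foundational networking knowledge will help a lot with the front-end framework material that follows. The Internet, the Web, DNS, and HTTP, those "most familiar strangers", get demystified in this section.

Starting with this course I plan to develop a new note-taking style. The styles I have used so far fall roughly into:

  • Fragment. E.g., ENGG1340/COMP2113.
  • Essence. E.g., ENGG3357.
  • Comprehensive. E.g., Algorithms I & II.

For a course like COMP3322, with many loosely related concepts, the fragment style lacks structure, the essence style tends to miss things, and the comprehensive style takes too long to compile. So I am going to try a revision-oriented style built around the course slides: everything is geared towards convenient revision for the final!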


{% note warning %}

  This article is a self-administered course note.

  It will NOT cover any exam or assignment related content.

{% endnote %}


What is the Internet?

The Internet is the backbone of the Web, the technical infrastructure that makes the Web possible. At its most basic, the Internet is a large network of computers which communicate all together.

Keywords. [see slides 01-Internet-Basic]

  • Internet is a technical infrastructure.
  • hosts.
  • IP routers: connecting every pair of \(n\) hosts directly needs \(\sim n^2\) links; routing through a shared router needs only \(\sim n\) links.
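The link-count remark can be checked with a quick calculation. This is a minimal sketch, assuming the comparison is between a full mesh (every pair of hosts wired directly) and a layout where each host has a single link to one shared router; the function names are made up for illustration.

```python
def full_mesh_links(n: int) -> int:
    """Links needed when every pair of n hosts is connected directly."""
    return n * (n - 1) // 2


def via_router_links(n: int) -> int:
    """Links needed when each of n hosts has one link to a shared router."""
    return n


# Compare the two layouts for a few network sizes.
for n in (4, 10, 100):
    print(n, full_mesh_links(n), via_router_links(n))
```

The full-mesh count grows quadratically, which is why the notes summarize it as roughly \(n^2\) versus \(n\).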

What is World Wide Web (WWW)?

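The Chinese translation of "Internet" is 互联网, while WWW (World Wide Web) is 万维网. I used to think the two referred to the same thing; however, the 网 in 互联网 stresses "network", whereas the 网 in 万维网 means "web". The distinction is worth pondering.

Also, the Taiwanese translation of WWW is 全球资讯网 (literally "global information network"). I think it is less elegant than 万维网, but it does better on faithfulness and clarity.

Keywords. [see slides 01-Internet-Basic]

  • hypertext links.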
  • uniform resource locator (URL): protocol + domain name + network path.
    • uniquely identify a resource on the WWW.
  • client-server communication: by means of HTTP protocol.
  • publishing documents with HTML & CSS.
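To make the "protocol + domain name + network path" breakdown concrete, here is a small sketch using Python's standard `urllib.parse.urlsplit`. The URL is the course page that appears as the request example later in these notes; the mapping of `scheme`/`netloc`/`path` onto the three parts is my own reading of the slide wording.

```python
from urllib.parse import urlsplit

url = "https://i.cs.hku.hk/~atctam/view/Simple.html"
parts = urlsplit(url)

print(parts.scheme)  # protocol
print(parts.netloc)  # domain name
print(parts.path)    # network path on that server
```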

Client-Server Communication

Keywords. [see slides 01-Internet-Basic]

  • End-to-End and port numbers. [a]
  • [IMPORTANT] Illustration of steps taken by client and server.
  • IP address and IPv4 addressing. [b]
  • localhost: this computer. IP: 127.0.0.1.

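[a]: Suppose the client runs Google Chrome and Firefox while the server runs Nginx and MongoDB. By assigning different port numbers, the processes can pair up independently (Chrome \(\to^{80}\) Nginx, Firefox \(\to^{443}\) MongoDB; the port numbers are only illustrative, since MongoDB listens on 27017 by default). A client process sends its request to the corresponding port, and a server process listens on a specific port and responds to the requests arriving there.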
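The listen/request/respond steps in [a] can be sketched with plain TCP sockets on localhost. This is only an illustration under my own assumptions: the "server" upper-cases whatever it receives, port 0 lets the OS pick a free port, and `serve_once` and `round_trip` are hypothetical helper names.

```python
import socket
import threading


def serve_once(server_sock: socket.socket) -> None:
    """Server process: accept one connection and reply in upper case."""
    conn, _addr = server_sock.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data.upper())


def round_trip(message: bytes) -> bytes:
    """Start a one-shot server on localhost, send it a message, return the reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server_sock:
        # The server listens on a specific port; 0 means "any free port".
        server_sock.bind(("127.0.0.1", 0))
        server_sock.listen(1)
        port = server_sock.getsockname()[1]

        worker = threading.Thread(target=serve_once, args=(server_sock,))
        worker.start()

        # The client sends its request to the port the server listens on.
        with socket.create_connection(("127.0.0.1", port)) as client_sock:
            client_sock.sendall(message)
            reply = client_sock.recv(1024)

        worker.join()
    return reply


print(round_trip(b"hello"))
```

A second server bound to a different port on the same machine would be reachable in exactly the same way, which is the point of port numbers.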

[b]: Prefixes and suffixes in the IPv4 addressing scheme. Outside HKU, the prefix 147.8 serves as the (HKU) physical network locator; inside HKU, the prefix 147.8.175 serves as the (HKU CS) physical network locator.
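The prefix idea in [b] can be tried out with Python's standard `ipaddress` module. A hedged sketch: I assume the HKU prefix is 147.8 (consistent with the 147.8.175 prefix), that these correspond to a /16 and a /24 network, and the host address 147.8.175.42 is purely hypothetical.

```python
import ipaddress

hku = ipaddress.ip_network("147.8.0.0/16")       # assumed /16 for the 147.8 prefix
hku_cs = ipaddress.ip_network("147.8.175.0/24")  # assumed /24 for the 147.8.175 prefix
host = ipaddress.ip_address("147.8.175.42")      # hypothetical host address

# The same host matches both prefixes; the longer one is more specific.
print(host in hku, host in hku_cs)
print(hku_cs.subnet_of(hku))

# localhost is a loopback address.
print(ipaddress.ip_address("127.0.0.1").is_loopback)
```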


Domain Name System (DNS)

DNS maps human-readable domain names to the numeric IP addresses used by the Internet Protocol.

Keywords. [see slides 01-Internet-Basic]

  • hierarchical domain names: read from right to left.
    • nameless DNS root node connecting to all TLDs.
    • top-level domains (TLD). e.g., com, org, hk, cn...
    • second-level domains (SLD). e.g., thisisxxz...
    • 3rd, 4th...
  • distributed hierarchical database.
    • internal machines \(\to\) local DNS servers (provided by ISP) \(\to\) root name servers.
  • [IMPORTANT] domain name address resolution process.
  • DNS caching: browser caches/local DNS (name) server caches.
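Two of the ideas above, reading a name from right to left and caching answers, can be sketched in a few lines. This is a toy model, not a real DNS client: `CachingResolver` and `fake_upstream` are made-up names, and the address 192.0.2.10 is a documentation placeholder rather than a real mapping.

```python
def hierarchy(domain: str) -> list[str]:
    """Return the labels of a domain name from the TLD downwards."""
    return domain.rstrip(".").split(".")[::-1]


class CachingResolver:
    """Toy resolver that asks an upstream function only on a cache miss."""

    def __init__(self, upstream):
        self.upstream = upstream      # callable: name -> address
        self.cache = {}
        self.upstream_queries = 0

    def resolve(self, name: str) -> str:
        if name not in self.cache:
            self.upstream_queries += 1
            self.cache[name] = self.upstream(name)
        return self.cache[name]


def fake_upstream(name: str) -> str:
    """Stand-in for the real resolution process (placeholder data)."""
    return {"www.example.com": "192.0.2.10"}[name]


print(hierarchy("i.cs.hku.hk"))

resolver = CachingResolver(fake_upstream)
resolver.resolve("www.example.com")
resolver.resolve("www.example.com")   # served from the cache
print(resolver.upstream_queries)
```

A real cache would also expire entries after their time-to-live, which this sketch leaves out.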

Hypertext Transfer Protocol (HTTP)

HTTP is an extensible protocol that is easy to use. The client-server structure, combined with the ability to add headers, allows HTTP to advance along with the extended capabilities of the Web.

Keywords. [see slides 01-Internet-Basic]

  • application-layer protocol: sent over TCP.
  • stateless but not sessionless (e.g., cookies).
  • HTTP/1.0: HTTP headers, status code info.
  • HTTP/1.1 (still widely used): request pipelining, caching, content negotiation.

HTTP Messages

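Keywords. [see slides 01-Internet-Basic]

  • HTTP request (request line): [request type] [URL path] [HTTP version].
    • e.g., GET /~atctam/view/Simple.html HTTP/1.1 for the page https://i.cs.hku.hk/~atctam/view/Simple.html; the server name goes in the Host header (Host: i.cs.hku.hk).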
  • HTTP response (status line): [HTTP version] [status code] [status phrase].
    • e.g., HTTP/1.1 200 OK.
  • HTTP headers. A series of key-value pairs [case_insensitive_name]: [value].
    • provide additional info. 4 kinds: General/Entity/Request/Response.
  • Message body.
    • requests: no body for GET, HEAD...; a body for POST...
    • responses: carry the resource (e.g., an HTML page) requested by the client.
  • Web requests. The browser makes subsequent requests for each resource (CSS, scripts, external pictures...) referenced in the HTML.
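The message layout above (start line, headers, blank line, body) can be sketched with plain string handling. This is a minimal illustration rather than a full HTTP implementation; `build_request` and `parse_response` are hypothetical helpers, and the sample response is made up.

```python
def build_request(method: str, host: str, path: str) -> str:
    """Build a bodiless HTTP/1.1 request: request line, Host header, blank line."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}", "", ""]
    return "\r\n".join(lines)


def parse_response(raw: str):
    """Split a response into version, status code, phrase, headers, and body."""
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    version, code, phrase = status_line.split(" ", 2)

    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lower case.
        headers[name.strip().lower()] = value.strip()
    return version, int(code), phrase, headers, body


request = build_request("GET", "i.cs.hku.hk", "/~atctam/view/Simple.html")
print(request)

sample = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<h1>Hi</h1>"
print(parse_response(sample))
```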

Caching & Validation

Keywords. [see slides 01-Internet-Basic]

  • web caching: private browser caches/proxy server caches.
    • advantages: (1) cache close to client \(\to\) reduce response time. (2) decrease network traffic to distant servers.
  • staleness: controlled by the Cache-Control & Expires headers.
  • validation: two response headers & a request header.
    • web server responds with ETag and Last-Modified headers.
    • browser/proxy server requests with If-None-Match and/or If-Modified-Since header.
    • If not modified, web server responds with HTTP/1.1 304 Not Modified.
  • cookies. The server sends cookies to the browser with the Set-Cookie header. The browser stores them, and its next request to the same server includes the Cookie header.
    • responses: Set-Cookie: user_id=1678.
    • (next) requests: Cookie: user_id=1678.
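The validation and cookie exchanges above can be simulated without a network. A rough sketch under my own assumptions: `respond` stands in for a web server that compares entity tags only (no Last-Modified handling), `cookie_header_from` stands in for the browser's cookie store, and both names are hypothetical. Cookie parsing uses the standard `http.cookies.SimpleCookie`.

```python
from http.cookies import SimpleCookie


def respond(current_etag: str, body: str, request_headers: dict):
    """Toy server: answer 304 if the client's cached copy is still valid."""
    headers = {name.lower(): value for name, value in request_headers.items()}
    if headers.get("if-none-match") == current_etag:
        return 304, {"ETag": current_etag}, ""      # Not Modified, no body
    return 200, {"ETag": current_etag}, body        # full response


def cookie_header_from(set_cookie_value: str) -> str:
    """Toy browser: turn a Set-Cookie value into the next Cookie header value."""
    jar = SimpleCookie()
    jar.load(set_cookie_value)
    return "; ".join(f"{name}={morsel.value}" for name, morsel in jar.items())


# First request: nothing cached yet, so the full page comes back.
status, headers, body = respond('"v1"', "<html>...</html>", {})
print(status)

# Revalidation: send the stored ETag back in If-None-Match.
status, _, body = respond('"v1"', "<html>...</html>", {"If-None-Match": headers["ETag"]})
print(status)

# Cookie round trip using the value from the notes.
print(cookie_header_from("user_id=1678"))
```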

Reference

{% note warning %}

  References in the article are from corresponding course materials if not specified.

{% endnote %}

Course info:

Code: COMP3322, Lecturer: Dr. Anthony Tam.