@rob_rich

Anatomy of a Web Request

by Rob Richardson

https://robrich.org/

About Me

Rob Richardson is a software craftsman building web properties in ASP.NET and Node, Angular and React. He's a frequent speaker at conferences, user groups, and community events, and a diligent teacher and student of high-quality software development. You can find this and other talks at https://robrich.org/presentations and follow him on Twitter at @rob_rich.

Typically we look at the server pipeline

http://karthikk.com/blog/wp-content/uploads/2011/01/asp_lifecycle-e1296387796559.jpg

https://dotnettrickscloud.blob.core.windows.net/img/mvc/mvcrequestcycle.png

We're not going to do that

What happens when
I type an ip
into the browser?

  • opens a socket
  • sends data
  • server replies
  • browser renders it
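
The first three steps can be sketched with Python's standard library. This is a minimal illustration, not how a browser is built. The names `fetch_raw` and `demo_local` are made up for this sketch, and the demo talks to a throwaway server on 127.0.0.1 so it runs without internet access. Rendering is left out.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def fetch_raw(ip, port, path="/"):
    """Open a socket, send a request, and return the raw reply bytes."""
    request = f"GET {path} HTTP/1.0\r\nHost: {ip}\r\n\r\n".encode("ascii")
    with socket.create_connection((ip, port), timeout=5) as sock:  # opens a socket
        sock.sendall(request)                                     # sends data
        chunks = []
        while True:                                               # server replies
            chunk = sock.recv(4096)
            if not chunk:  # empty read: the server closed the connection
                break
            chunks.append(chunk)
    return b"".join(chunks)


class _Hello(BaseHTTPRequestHandler):
    """Tiny stand-in server so the sketch works offline."""

    def do_GET(self):
        body = b"<h1>hello</h1>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def demo_local():
    """Serve exactly one request on a free local port and fetch it."""
    server = HTTPServer(("127.0.0.1", 0), _Hello)  # port 0: pick any free port
    thread = threading.Thread(target=server.handle_request, daemon=True)
    thread.start()
    try:
        return fetch_raw("127.0.0.1", server.server_port)
    finally:
        thread.join(timeout=5)
        server.server_close()
```

Calling `demo_local()` returns the status line, the headers, a blank line, and then the HTML body, all as one byte string.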

What data is sent?

telerik.com/fiddler

  • http method
  • url
  • http headers
  • http body
  • accept header
    (content negotiation)
  • user agent

What data is returned?

telerik.com/fiddler

  • status code
  • headers
  • body
  • content length
  • content type
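
A minimal sketch of pulling those parts out of a raw response. `parse_response` is a made-up helper, and it ignores details a real client must handle (chunked bodies, repeated headers, compression).

```python
def parse_response(raw):
    """Split raw HTTP response bytes into (status_code, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")  # blank line separates head and body
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    _version, code, *_reason = status_line.split(" ")  # e.g. HTTP/1.1 200 OK
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # header names are case-insensitive
    return int(code), headers, body


raw_reply = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: 5\r\n"
    b"\r\n"
    b"hello"
)
status, headers, body = parse_response(raw_reply)
print(status, headers["content-type"], len(body))
```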

HTTP Status Codes

  • 2xx: it worked
  • 3xx: go look over there
  • 4xx: error, user / client's fault
  • 5xx: error, server's fault
  • 1xx: not done yet, but still here

 

https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html
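
The first digit carries the meaning, which a small sketch makes concrete. `status_class` is a hypothetical helper, and the wording of each class is taken from the list above.

```python
STATUS_CLASSES = {
    1: "not done yet, but still here",
    2: "it worked",
    3: "go look over there",
    4: "error, user / client's fault",
    5: "error, server's fault",
}


def status_class(code):
    """Map a status code to its class using only the hundreds digit."""
    return STATUS_CLASSES.get(code // 100, "unknown")


for code in (200, 301, 404, 503):
    print(code, status_class(code))
```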

HTTP Status Codes

302: "found" (moved temporarily)

  • browser redirects to location header url
  • many libraries and command line tools automatically redirect
  • next request, they'll come back to the original url


301: moved permanently

  • browser redirects to location header url
  • browser will cache the result
  • search engines will forward link love
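
The difference between the two shows up on the second visit. Here is a toy sketch with an in-memory "site" instead of a real server. `SITE`, `Client`, and the urls are all invented for illustration, and real browsers apply more caching rules than this.

```python
# Hypothetical site: url -> (status, Location header or body)
SITE = {
    "/old": (301, "/new"),
    "/promo": (302, "/summer-sale"),
    "/new": (200, "new page"),
    "/summer-sale": (200, "sale page"),
}


class Client:
    def __init__(self, site):
        self.site = site
        self.permanent = {}  # remembered 301 redirects
        self.requests = []   # every url actually requested from the server

    def get(self, url, max_hops=5):
        for _ in range(max_hops):
            url = self.permanent.get(url, url)  # a cached 301 skips the old url
            self.requests.append(url)
            status, value = self.site[url]
            if status == 301:
                self.permanent[url] = value     # moved permanently: cache it
            if status in (301, 302):
                url = value                     # follow the Location header
                continue
            return status, value
        raise RuntimeError("too many redirects")
```

After `get("/old")` twice, the client has requested `/old` only once. After `get("/promo")` twice, it has requested `/promo` both times.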

Let's look deeper

OSI Model

Open Systems Interconnection model

  • 7. Application layer
  • 6. Presentation layer
  • 5. Session layer
  • 4. Transport layer
  • 3. Network layer
  • 2. Data link layer
  • 1. Physical layer

 

https://en.wikipedia.org/wiki/OSI_model

OSI Model

Open Systems Interconnection model

  • 7. HTTP
  • 6.
  • 5.
  • 4. TCP
  • 3. IP
  • 2.
  • 1. Wires

 

https://en.wikipedia.org/wiki/OSI_model

Packets

http://books.msspace.net/mirrorbooks/snortids/0596006616/snortids-CHP-2-SECT-2.html

TCP Packet

TCP packet max size is 64 KB (the IP packet limit of 65,535 bytes)

TCP packets are usually under 1500 bytes

A standard Ethernet frame carries at most 1500 bytes of payload (the MTU)

stackoverflow.com/questions/2613734/maximum-packet-size-for-a-tcp-connection
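
The arithmetic behind "how many packets does my page take" can be sketched as follows. It assumes IPv4 and TCP headers of 20 bytes each with no options, which is the common minimum but not guaranteed on every link. `segments_needed` is a made-up helper.

```python
import math

MTU = 1500        # typical Ethernet payload size in bytes
IP_HEADER = 20    # IPv4 header without options (assumption)
TCP_HEADER = 20   # TCP header without options (assumption)
MSS = MTU - IP_HEADER - TCP_HEADER  # bytes of application data per packet


def segments_needed(payload_bytes, mss=MSS):
    """Number of TCP packets needed to carry payload_bytes of data."""
    return math.ceil(payload_bytes / mss)


print(MSS)                       # data bytes that fit in one packet
print(segments_needed(100_000))  # packets for a 100 KB response
```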

TCP connection handshake

https://static.lwn.net/images/2012/tfo/foc_use.png

TCP slow-start

  • Purpose: avoid over-saturating the connection
  • Start by sending a few packets
  • Wait for the responses (acknowledgements)
  • Assume lack of response is due to congestion
  • If everything arrived, send more packets next time
  • Infer how fast it can go

 

See also https://en.wikipedia.org/wiki/Slow-start

TCP slow-start

Result:
Connection starts slow
and speeds up as you use it

 

See also https://en.wikipedia.org/wiki/Slow-start
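
A simplified model of that ramp-up: the window doubles every round trip until the data is delivered. This sketch assumes no packet loss, no upper threshold, and an initial window of 10 packets, which is a common default but still an assumption here. `round_trips` is a made-up helper.

```python
def round_trips(total_packets, initial_window=10):
    """Round trips needed to deliver total_packets when the window doubles each trip."""
    window, sent, trips = initial_window, 0, 0
    while sent < total_packets:
        sent += window  # send a full window, then wait for acknowledgements
        window *= 2     # everything arrived, so try twice as many next time
        trips += 1
    return trips


for packets in (10, 11, 69, 1000):
    print(packets, "packets ->", round_trips(packets), "round trips")
```

The takeaway matches the slide: a small response fits in the first round trip, and every doubling of size beyond that costs roughly one more.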

Resolving URLs

What happens when
I type a url?

  • resolve url to ip
  • opens a socket to ip
  • sends http packet
  • over tcp over ethernet
  • server replies
  • browser renders it

DNS (Domain Name System)

  • NS record: "that guy is responsible for the domain"
  • A record: "here's the IP"
  • CNAME record: "go ask that guy"
  • MX record: mail server
  • TXT record: free-form text
  • SPF record: which servers may send mail for the domain
  • other: AAAA, SRV, PTR, etc
  • TTL: how long DNS servers should cache this

How does the
browser resolve DNS?

http://www.hill2dot0.com/wiki/index.php?title=DNS

  • browser asks its DNS server
    > ipconfig
    
    IPv4 Address. . . . . . . : 192.168.1.203
    DNS Servers . . . . . . . : 192.168.1.1
  • it asks its DNS server
  • recursively until someone knows or ...
  • until one of the 13 root name servers answers
    https://en.wikipedia.org/wiki/DNS_root_zone

How does the
DNS server answer?

  • A: "here's the IP"
  • CNAME: "go ask that guy"

the asking server caches the answer
so it can answer the next request faster

DNS Example

  • browser asks my DNS server for foo.com
  • my DNS server doesn't know
  • asks its DNS server
  • that server answers "CNAME: ask bar.com"
  • my DNS server caches the answer then passes it on
  • browser caches the answer
  • browser asks for bar.com
  • my DNS server knows
  • answers with "A: 123.45.67.89"
  • browser opens socket to 123.45.67.89
  • browser sends http packet to http://123.45.67.89/
  • includes http header: "Host: foo.com"
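
The walk-through above can be modelled with a dictionary standing in for the upstream DNS server. This is a toy sketch: `UPSTREAM`, `CachingResolver`, and the TTL values are invented, and a real resolver speaks the DNS protocol over the network.

```python
# Hypothetical records mirroring the example: name -> (type, value, ttl seconds)
UPSTREAM = {
    "foo.com": ("CNAME", "bar.com", 300),
    "bar.com": ("A", "123.45.67.89", 60),
}


class CachingResolver:
    def __init__(self, upstream, clock):
        self.upstream = upstream
        self.clock = clock         # function returning the current time in seconds
        self.cache = {}            # name -> (record, expires_at)
        self.upstream_queries = 0

    def lookup(self, name):
        cached = self.cache.get(name)
        if cached and cached[1] > self.clock():
            return cached[0]       # still fresh: answer from cache
        self.upstream_queries += 1
        record = self.upstream[name]
        self.cache[name] = (record, self.clock() + record[2])  # honor the TTL
        return record

    def resolve(self, name, max_hops=8):
        for _ in range(max_hops):
            rtype, value, _ttl = self.lookup(name)
            if rtype == "A":
                return value       # "here's the IP"
            name = value           # CNAME: go ask about that name instead
        raise RuntimeError("CNAME chain too long")
```

Passing the clock in as a function makes it easy to see TTL expiry without waiting: the first `resolve("foo.com")` costs two upstream queries, a repeat costs none, and once the 60 second TTL on the A record has passed only that record is fetched again.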

secure transmission

What happens when
I type a url?

  • resolve url to ip
  • opens a socket to ip
  • security stuff
  • sends http packet
  • over tcp over ethernet
  • server replies
  • browser renders it

https (TLS) handshake

http://technet.microsoft.com/en-us/library/Cc767139.f14-6_big(l=en-us).gif

Diffie-Hellman key exchange

  1. Alice and Bob agree: mod p = 23, base g = 5
  2. Alice chose: a = 4, sends Bob A = g^a mod p
    • A = 5^4 mod 23 = 4
  3. Bob chose: b = 3, sends Alice B = g^b mod p
    • B = 5^3 mod 23 = 10
  4. Alice computes s = B^a mod p
    • s = 10^4 mod 23 = 18
  5. Bob computes s = A^b mod p
    • s = 4^3 mod 23 = 18
  6. Alice and Bob now share a secret: 18

https://en.wikipedia.org/wiki/Diffie%E2%80%93Hellman_key_exchange
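
The same exchange in a few lines, using the toy numbers from the slide. Python's built-in three-argument `pow` does modular exponentiation. Real key exchanges use far larger values, so this is only a check of the arithmetic.

```python
p, g = 23, 5   # public: modulus and base
a, b = 4, 3    # private: Alice's and Bob's secret numbers

A = pow(g, a, p)  # Alice sends g^a mod p
B = pow(g, b, p)  # Bob sends g^b mod p

alice_secret = pow(B, a, p)  # Alice computes B^a mod p
bob_secret = pow(A, b, p)    # Bob computes A^b mod p

print(A, B, alice_secret, bob_secret)
```

Only `A` and `B` cross the wire. An eavesdropper sees `p`, `g`, `A`, and `B`, but never `a`, `b`, or the shared secret.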

What's the purpose
of https urls?

  • encrypt data in transit
  • client verifies server
  • server doesn't verify client
    (that's client certificates)

How does
client verify server?

  • client downloads the certificate
  • client walks the trust chain
  • until it gets to something it pre-trusts

https://superuser.com/questions/347588/how-do-ssl-chains-work
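
Walking the chain is a loop over "who issued this?". The sketch below models only that loop. The certificate names are invented, and everything that makes a real check secure (signature verification, expiry, revocation, hostname matching) is deliberately left out.

```python
# Hypothetical certificates: each subject names who issued (signed) it.
CHAIN = {
    "www.example.com": {"issuer": "Example Intermediate CA"},
    "Example Intermediate CA": {"issuer": "Example Root CA"},
    "Example Root CA": {"issuer": "Example Root CA"},  # self-signed root
}
TRUSTED_ROOTS = {"Example Root CA"}  # what the client pre-trusts


def walk_trust_chain(subject, certs, trusted, max_depth=10):
    """Return the path from subject to a pre-trusted certificate, or None."""
    path = []
    for _ in range(max_depth):
        path.append(subject)
        if subject in trusted:
            return path  # reached something the client already trusts
        cert = certs.get(subject)
        if cert is None or cert["issuer"] == subject:
            return None  # missing link, or a self-signed cert we don't trust
        subject = cert["issuer"]
    return None


print(walk_trust_chain("www.example.com", CHAIN, TRUSTED_ROOTS))
```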

Server certificate trust chain

Client's trusted certificates

What is encrypted?

Everything apart from the hostname is encrypted.
The hostname is sent in the clear
so a server hosting multiple domains
can respond with the correct certificate.

This uses a TLS extension called Server Name Indication or SNI

https://stackoverflow.com/questions/8277323/what-information-is-visible-to-a-packet-sniffer-which-intercepted-a-https-packet
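
In Python's standard `ssl` module the hostname for SNI is the `server_hostname` argument. A minimal sketch, where `open_tls` is a made-up helper and calling it needs network access:

```python
import socket
import ssl


def open_tls(hostname, port=443):
    """Open a TCP socket, then run the TLS handshake on top of it."""
    context = ssl.create_default_context()  # loads the system's trusted roots
    sock = socket.create_connection((hostname, port), timeout=5)
    try:
        # server_hostname is sent as SNI and also checked against the certificate.
        return context.wrap_socket(sock, server_hostname=hostname)
    except Exception:
        sock.close()
        raise
```

With the default context the client verifies the server's certificate chain and hostname, and the handshake fails if either check does not pass.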

https startup

  • TCP slow start
  • TLS handshake
  • certificate download
  • including trust chain

https startup
best practices

  • Minimize certificate size
  • 2048 bit, sha256 ssl certificate is fine
  • Don't include a lot of extra domains

browser suggest

What happens when
I type a url in chrome?

  • resolve Google url to ip
  • opens a socket to Google
  • sends "is this a url or a query"
  • Google answers and returns suggestions
  • resolve url to ip
  • opens a socket to ip
  • sends http packet
  • over tcp over ethernet
  • server replies
  • browser renders it

client page lifecycle

Client page lifecycle

https://dvcs.w3.org/hg/webperf/raw-file/tip/specs/NavigationTiming/timing-overview.png

Client concerns

  • stops at <script>, <link>, <style>
  • downloads linked resource
  • parses resource
  • applies changes (re-flow page if necessary)
  • ... need to get the cascade order right

Client concerns

only six simultaneous connections to the same subdomain (it used to be two)

passes cookies automatically
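
A rough way to see what the connection limit costs. This sketch assumes HTTP/1 with one resource per connection at a time and every download taking about the same time, which is a big simplification. `download_waves` and the urls are invented for illustration.

```python
import math
from collections import Counter
from urllib.parse import urlsplit

MAX_PER_HOST = 6  # the per-host limit from the slide above


def download_waves(urls, max_per_host=MAX_PER_HOST):
    """Estimate how many back-to-back batches the busiest host needs."""
    per_host = Counter(urlsplit(url).hostname for url in urls)
    return max((math.ceil(n / max_per_host) for n in per_host.values()), default=0)


one_host = [f"https://www.example.com/img{i}.png" for i in range(13)]
two_hosts = one_host[:7] + [f"https://static.example.com/img{i}.png" for i in range(6)]
print(download_waves(one_host), download_waves(two_hosts))
```

Thirteen images from one host need three batches. Moving six of them to a second subdomain cuts that to two, and fewer requests cuts it further.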

Client best practices for HTTP/1

  • styles at the top
  • scripts at the bottom
  • use a CDN for static content
    (at least a different subdomain)
  • minimize web requests
  • https for anything even remotely sensitive

 

https://developer.yahoo.com/performance/rules.html

Client best practices for HTTP/2

  • styles at the top
  • scripts at the bottom
  • minimize web requests
  • use a CDN for static content
    (at least a different subdomain)
  • https all the time

 

https://developer.yahoo.com/performance/rules.html

browser runs on windows

WIN32 message pump

http://gemsres.com/story/nov06/299077/arthur-fig-1.gif

WIN32 message pump

  • things write messages into the queue
  • the message pump runs on the GUI thread
  • the message pump reads the queue and takes action
  • raises events in listening applications
  • paints on the screen with GDI, OpenGL, DirectX, etc

WIN32 message
best practices

If you compute on the GUI thread
you'll freeze the process for a time.

Get to a background thread as soon as possible.
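
The pattern can be imitated in Python with a queue and a worker thread. This is an analogy, not the Win32 API: `messages`, `run_pump`, `slow_compute`, and the message names are all invented for this sketch.

```python
import queue
import threading

messages = queue.Queue()  # things write messages into the queue


def slow_compute(n):
    """Runs on a background thread and posts its result back as a message."""
    total = sum(i * i for i in range(n))
    messages.put(("compute_done", total))


def run_pump(handlers):
    """Read messages one at a time and dispatch them until 'quit' arrives."""
    handled = []
    while True:
        name, payload = messages.get()  # blocks until a message is available
        if name == "quit":
            return handled
        handler = handlers.get(name)
        if handler is not None:
            handler(payload)
        handled.append(name)


def demo():
    results = {}

    def on_click(_payload):
        # Hand the heavy work to a background thread so the pump keeps running.
        threading.Thread(target=slow_compute, args=(1000,)).start()

    def on_compute_done(total):
        results["total"] = total
        messages.put(("quit", None))

    messages.put(("click", None))
    handled = run_pump({"click": on_click, "compute_done": on_compute_done})
    return handled, results["total"]
```

The click handler returns immediately, so the pump is free to process other messages while the computation runs. If `on_click` did the sum itself, nothing else in the queue would be handled until it finished.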

What happens?

So what happens when you type
"https://www.somesite.com/"
in Chrome on Windows?

  • key down, key up, key press events, message pump reads them
  • DNS lookup of Google suggest servers
  • Google suggestions says "go there"
  • DNS lookup of the url, resolve CNAME after CNAME until you get to an A record
  • open socket to ip
  • ethernet wraps tcp wraps http
  • tcp handshake, then slow-start
  • https handshake
  • client verifies trust chain
  • http method, headers, accept, host
  • server parses request and forms response
  • how many tcp packets does it take to arrive?
  • browser parses response, loads DOM
  • browser stops at <script>, <link>, <style>
  • onload, render
  • writes win32 messages to paint the screen

The internet is not a black box.

Open the lid and look inside.