How the Internet Works 9: HTML

In today’s post I want to demystify the term HTML. HTML stands for HyperText Markup Language, and it is the way that you write web pages. A markup language, according to Wikipedia, is a way to “annotate the text” to provide some additional information.

Every webpage has HTML, which includes a set of “tags” to describe different parts of the page. A tag can be something like <a>, <img>, <html>, <p> and many more.

Here is a simple HTML page:

<HTML>
  <HEAD>
    <TITLE>My Page</TITLE>
  </HEAD>
  <BODY>
       <H1>Welcome to my page!</H1>
  </BODY>
</HTML>

There are a few things you can notice here. Tags start like this: <NAME>, and end like this: </NAME>. Tags can go inside other tags. The outermost tag is the <HTML> tag, which says this is an HTML page. There is a <TITLE> tag, which says the title of this page is “My Page”. There is an <H1> tag, which says that there is a big header on the page that reads “Welcome to my page!”

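The indenting shows how tags are related to each other. A tag that is nested inside another tag is said to be a “child” of the tag that contains it. The browser ignores the indenting itself; it is the nesting of the tags that matters. The structure of an HTML page is said to form a “tree.”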
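To make the “tree” idea concrete, here is a small sketch in Python that reads the example page and prints each tag indented by how deeply it is nested. It uses the html.parser module from Python's standard library; the names PAGE, TreePrinter, and tag_tree are made up for this illustration.

```python
from html.parser import HTMLParser

# The example page from above, stored as a string.
PAGE = """<HTML>
  <HEAD>
    <TITLE>My Page</TITLE>
  </HEAD>
  <BODY>
    <H1>Welcome to my page!</H1>
  </BODY>
</HTML>"""


class TreePrinter(HTMLParser):
    """Records each opening tag together with how deeply it is nested."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.lines = []

    def handle_starttag(self, tag, attrs):
        # The parser reports tag names in lowercase.
        self.lines.append("  " * self.depth + tag)
        self.depth += 1

    def handle_endtag(self, tag):
        # A closing tag means we step back up to the parent.
        self.depth -= 1


def tag_tree(page):
    """Return one line per tag, indented to show parent and child."""
    printer = TreePrinter()
    printer.feed(page)
    printer.close()
    return printer.lines


tree_lines = tag_tree(PAGE)
print("\n".join(tree_lines))
```

Running it prints html at the top, head and body one level in, and title and h1 one level further in, which is the same shape as the indenting in the example.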

If I want to expand my current example to include an image from a file named “dog.jpg”, I would write

<img src="dog.jpg" />

This image tag has a source attribute (src), which tells us the filename is “dog.jpg.” Different tags tell us whether the content should be a list, a header, a paragraph, or some other section of text. Tags can also carry other attributes that determine how they are styled (CSS) or what to do when certain events happen (JavaScript), but these are topics for a later date.
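As a rough illustration of what an attribute looks like to a program, this Python sketch pulls the src value out of an image tag. It again uses the standard html.parser module; the class name ImageSourceFinder and the little snippet of HTML fed to it are invented for the example.

```python
from html.parser import HTMLParser


class ImageSourceFinder(HTMLParser):
    """Collects the src attribute of every <img> tag it sees."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs, e.g. [("src", "dog.jpg")].
        if tag == "img":
            for name, value in attrs:
                if name == "src":
                    self.sources.append(value)


finder = ImageSourceFinder()
finder.feed('<p>My dog:</p><img src="dog.jpg" />')
finder.close()
print(finder.sources)  # ['dog.jpg']
```

The tag name says what kind of thing this is (an image), and the attribute fills in a detail about it (which file to show).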

The main point is that webpages are made up of HTML, which is a series of tags that add information to plain text. If you want to write HTML, just open a text editor, copy and paste that first example, and save it as a file called “index.html”. You can then open that file in a web browser.

How the Internet Works 8: Bytes, Megabytes, and More

I wrote recently about how everything in computers is stored as 0s and 1s, and the language of computers is binary. You can read about that here. However, one bit or a few bits don’t really contain that much information. In general you will be dealing with thousands, or millions, or billions, or even more binary digits!

There is a set of prefixes used with binary data, and here is what they all mean.

First, a bit is just a single 1 or 0.

If you take 8 bits and put them together we call that a byte.
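Here is a quick sketch in Python of what putting 8 bits together looks like. The particular bit pattern is just one I picked for the example.

```python
bits = "01000001"        # eight bits in a row: one byte
value = int(bits, 2)     # read the digits as a base-2 number
print(value)             # 65

# Eight positions with two choices each gives 2**8 different bytes.
print(2 ** 8)            # 256

# Going the other way: the number 255 written with eight binary digits.
print(format(255, "08b"))  # 11111111
```

So a single byte can hold any one of 256 different values, from 0 up to 255.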

One thousand bytes is called a kilobyte (shorthand kB). The prefix kilo means 1,000, as in kilogram. One wrinkle is that computers work in powers of 2, so on a computer a kilobyte often means 1,024 bytes instead, because 1,024 = 2^10 is the closest power of two to 1,000. A small text file on your computer is probably somewhere between a few kilobytes and a hundred kilobytes.

One million bytes is called a megabyte (shorthand MB). The prefix mega means 1,000,000. On a computer the number is often the closest power of 2 instead, which is 2^20 = 1,048,576. A standard mp3 file on your computer is probably somewhere from 3 to 10 megabytes (that is 3 to 10 MILLION bytes, or 24 to 80 MILLION 1s and 0s!).

One billion bytes is called a gigabyte (shorthand GB). The prefix giga means 1,000,000,000. On a computer the closest power of 2 is 2^30 = 1,073,741,824. A full-length video of medium quality is probably around a gigabyte in size.
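If you want to check these numbers yourself, here is a small Python sketch that puts the metric value and the power-of-2 value side by side for each prefix. The names PREFIXES and compare are made up for this example.

```python
PREFIXES = ["kilo", "mega", "giga"]


def compare(prefix_index):
    """Return (metric, power_of_two) byte counts for the 1st, 2nd, 3rd prefix."""
    metric = 1000 ** prefix_index
    power_of_two = 2 ** (10 * prefix_index)
    return metric, power_of_two


for i, prefix in enumerate(PREFIXES, start=1):
    metric, power_of_two = compare(i)
    print(f"{prefix}byte: {metric:,} bytes (metric) or {power_of_two:,} bytes (power of 2)")

# The mp3 arithmetic from above: 3 million bytes, at 8 bits each.
mp3_bits = 3 * 1000 ** 2 * 8
print(f"{mp3_bits:,} bits")  # 24,000,000 bits
```

Notice that the gap between the two values grows with each prefix: it is 24 bytes for a kilobyte but over 73 million bytes for a gigabyte.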

Just for some more reference, loading the Google homepage just now sent 32.1 kB of data. Loading the home page of thekeesh.com took 440.98 kB, but loading it a second time took only 12.36 kB. An image file is generally anywhere from tens of kilobytes to a few megabytes.

There are more prefixes bigger than giga:

A terabyte is one thousand gigabytes.
A petabyte is one thousand terabytes.
An exabyte is one thousand petabytes.
A zettabyte is one thousand exabytes.
A yottabyte is one thousand zettabytes.
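To see how the whole ladder of prefixes fits together, here is a sketch of a Python function that takes a number of bytes and picks the right prefix, stepping up by 1,000 each time (the metric meaning, not the power-of-2 one). The function name describe and the list ALL_PREFIXES are made up for this example.

```python
ALL_PREFIXES = ["", "kilo", "mega", "giga", "tera", "peta", "exa", "zetta", "yotta"]


def describe(num_bytes):
    """Return a size such as '440.98 kilobytes', using steps of 1,000."""
    size = float(num_bytes)
    index = 0
    # Keep dividing by 1,000 until the number is small or we run out of prefixes.
    while size >= 1000 and index < len(ALL_PREFIXES) - 1:
        size /= 1000
        index += 1
    return f"{size:.2f} {ALL_PREFIXES[index]}bytes"


print(describe(440_980))          # 440.98 kilobytes
print(describe(1000 ** 4))        # 1.00 terabytes
print(describe(500 * 1000 ** 6))  # 500.00 exabytes
```

Each step up the list is just another division by 1,000, which is all the prefixes really mean.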

You can buy a terabyte hard drive these days for about $100 or less, according to a quick Google search, which is pretty crazy.

From the Wikipedia page on the zettabyte:

As of February 2012, no storage system has achieved one zettabyte of information. The combined space of all computer hard drives in the world was estimated at approximately 160 exabytes in 2006… As of 2009, the entire Internet was estimated to contain close to 500 exabytes. This is a half zettabyte.

So that is a basic introduction to some of the metric prefixes as applied to bytes.

How the Internet Works 7 – Clients and Servers

Part of my goal with this series of posts is to demystify a lot of the jargon that you find when people talk about the internet. Each of these topics has a whole literature and special part of the internet dedicated to it, but just knowing the general idea is a step in the right direction.

Today, I want to write about “clients” and “servers.” These words are used a lot in describing basic parts of the internet. The two definitions go hand in hand, which is why I will introduce them together.

The client is the one that requests information from a server. The server is the one that responds to the requests.

Think about it like a restaurant: when you go, you sit down at your table. You are the “client,” and you can order whatever food you like. The waiter or waitress is your “server,” and they respond to your requests.

In terms of the internet, you (or really your web browser) are the client. When you want to visit a webpage, say google.com, you make a request to get their webpage, and one of Google’s servers responds with the webpage.

When people talk about servers they can mean a million different things, but usually they are referring to some computer (or part of a computer) that is responsible for “serving” up their website, or responding to the different clients who are requesting it.
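To show the request and response in action, here is a small sketch in Python that plays both roles on one computer: it starts a tiny server, then acts as a client and asks that server for a page. It only uses Python's standard library (http.server and http.client). The names HelloHandler and fetch_from_local_server, and the page the server sends back, are made up for the example, and a real website's server is of course far more involved.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """The server side: answers every GET request with a tiny web page."""

    def do_GET(self):
        body = b"<H1>Welcome to my page!</H1>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example's output quiet


def fetch_from_local_server():
    """Start a server on this machine, request one page from it, and stop it."""
    # Port 0 asks the operating system for any free port.
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)
    port = server.server_address[1]
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        # The client side: send a request and read the response.
        connection = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        connection.request("GET", "/")
        response = connection.getresponse()
        result = response.status, response.read().decode("utf-8")
        connection.close()
        return result
    finally:
        server.shutdown()
        server.server_close()


status, page = fetch_from_local_server()
print(status)  # 200
print(page)    # <H1>Welcome to my page!</H1>
```

The 200 is the server's way of saying the request went fine, and the text after it is the page the client asked for.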

“Client” and “server” go hand in hand with another pair of words: “front-end” and “back-end.” Front-end is the client-side, or the place where you, the end-user, are. Back-end is the server-side, where the website’s big computers and data-centers are. Now if I told you that “PHP is a server-side programming language,” you can start to decode that phrase.

How the Internet Works 6 – Programming Languages

This is going to be an extremely brief and high level overview of what a programming language is. My goal here is that if you don’t know what a programming language is before reading this, you have an idea of what it is after.

A programming language is just the way that you tell a computer what you want it to do. You run programs on your computer, like a web browser, or maybe iTunes. These are programs. Someone or a group of people built these programs, and the way they did it was by using a programming language.

There is a famous first program that people write, called “Hello World” where you just try and get the program to print out “Hello World.” Here is an example in a programming language called C:

#include <stdio.h>

int main(void) {
    /* Print the greeting, followed by a newline. */
    printf("Hello World\n");
    return 0;
}

There’s a lot of extra syntax here just to get the program to run, but the basic idea is that this will output “Hello World” when you run it.

In another programming language called Python, if you want to print hello world you just write:

print("Hello World")

Different programming languages have different ways of telling the computer to do things. Many math operations also carry over, so for example you could add two numbers in Python with:

print(5 + 10)

There is wayyyyyyyyyyy more to programming languages. There is a specific legal way to write things in programming languages, and that is called the syntax. There are lots of other ways to classify and talk about programming languages, but that is for another time.
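As a small taste of what “legal” means here, this Python sketch asks Python itself whether a piece of text follows its syntax rules. It uses the ast module from the standard library; the function name is_valid_python is made up for the example.

```python
import ast


def is_valid_python(source):
    """Return True if the text follows Python's syntax rules."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


print(is_valid_python('print("Hello World")'))  # True
print(is_valid_python('print("Hello World"'))   # False: the closing ) is missing
print(is_valid_python("5 + 10"))                # True
```

Leaving off a single parenthesis is enough for the computer to refuse the whole line, which is why getting the syntax exactly right matters so much.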