IRL Chicago


web crawlers

We find stuff on the “surface web” using search engines. Search engines use algorithms and “web crawlers” ( also known as spiders ) to regularly index the surface web and make it searchable.

A web crawler is a “bot” that starts on a link, follows all the links on that page, then follows all the links on those pages, and so on. Remember that when we browse the web we’re essentially downloading a copy of each page ( at least momentarily ) onto our computer so that our browser can render it for us. A web crawler does the same thing, except a lot faster, which makes it capable of essentially downloading a snapshot of the entire world wide web… and once you have a copy of the web you can write algorithms to index and rank its pages and thus make it efficiently searchable.
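To make that loop concrete, here is a minimal sketch in Python of a crawler plus a very simple index. It is only an illustration, not how any real search engine works: the names `crawl`, `build_index`, `PageParser` and the three-page `TINY_WEB` are made up for this example, and the `fetch` argument stands in for a real download so the sketch runs without touching the network. There is no ranking here, only a word-to-pages lookup.

```python
import re
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class PageParser(HTMLParser):
    """Collects the links and the visible words found on one page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.words = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        self.words.extend(re.findall(r"[a-z0-9]+", data.lower()))


def crawl(start_url, fetch, max_pages=100):
    """Start on one link, download the page, queue its links, repeat.

    `fetch` is a hypothetical callable: it takes a URL and returns the
    page's HTML as a string, or None if the page can't be downloaded.
    """
    snapshot = {}                  # url -> HTML we downloaded
    queue = deque([start_url])     # pages waiting to be visited
    seen = {start_url}             # never queue the same page twice
    while queue and len(snapshot) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        snapshot[url] = html
        parser = PageParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop any "#fragment" part.
            link, _fragment = urldefrag(urljoin(url, href))
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return snapshot


def build_index(snapshot):
    """Map each word to the set of pages that contain it."""
    index = {}
    for url, html in snapshot.items():
        parser = PageParser()
        parser.feed(html)
        for word in parser.words:
            index.setdefault(word, set()).add(url)
    return index


# A made-up three-page "web" so the example needs no network access.
TINY_WEB = {
    "http://a.example/": '<a href="/about">about</a> <a href="http://b.example/">b</a> hello spiders',
    "http://a.example/about": '<a href="/">home</a> about crawlers',
    "http://b.example/": '<a href="http://a.example/#top">back</a> hello again',
}

snapshot = crawl("http://a.example/", TINY_WEB.get)
index = build_index(snapshot)
print(sorted(index["hello"]))
```

Searching is then just a lookup: `index["hello"]` gives back every page in the snapshot that contains the word "hello". To crawl real pages you would swap `TINY_WEB.get` for a function that actually downloads a URL.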

You can try this out yourself: use a free web crawler like SiteSucker to download your own copy of a website ( or even the entire world wide web... if you have enough room on your computer ).

Search engines aren’t the only folks who use web crawlers. The Internet Archive also uses crawlers to maintain an archive of the web, and you can use the Wayback Machine to search that archive and see what different websites looked like years ago!
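If you'd rather ask the archive from a script than from the website, the Internet Archive offers a small "availability" endpoint that reports the archived copy closest to a given date. The sketch below assumes that endpoint lives at `https://archive.org/wayback/available` and answers with JSON containing an `archived_snapshots` / `closest` entry; the function names are made up for this example, and the sample reply is hand-written for illustration, not real archive data.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint of the Wayback Machine availability API.
WAYBACK_API = "https://archive.org/wayback/available"


def availability_url(site, timestamp=None):
    """Build the query URL; timestamp looks like YYYYMMDD, e.g. "20060101"."""
    params = {"url": site}
    if timestamp:
        params["timestamp"] = timestamp
    return WAYBACK_API + "?" + urlencode(params)


def closest_snapshot(response):
    """Pull the archived page's URL out of a decoded JSON reply, if any."""
    closest = response.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def lookup(site, timestamp=None):
    """Ask the archive over the network (needs an internet connection)."""
    with urlopen(availability_url(site, timestamp)) as reply:
        return closest_snapshot(json.load(reply))


# Hand-written sample reply, only to show the shape we assume.
sample_reply = {
    "archived_snapshots": {
        "closest": {
            "available": True,
            "url": "http://web.archive.org/web/20060101000000/http://example.com/",
            "timestamp": "20060101000000",
            "status": "200",
        }
    }
}

print(availability_url("example.com", "20060101"))
print(closest_snapshot(sample_reply))
```

Calling `lookup("example.com", "20060101")` would send the real request; the two `print` lines only show the query URL and how a reply is unpacked.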

Tor ( network && browser )

Tor "is free software for enabling online anonymity and resisting censorship. It is designed to make it possible for users to surf the Internet anonymously, so their activities and location cannot be discovered by government agencies, corporations, or anyone else." --(wikipedia)

Every time you go online, regardless of whether you’re logged into Google or Facebook, you’re never anonymous: you can be identified via your IP address. Tor routes your traffic “through a free, worldwide, volunteer network consisting of more than five thousand relays” in order to conceal that identity. You can use Tor by downloading the Tor browser and navigating the web anonymously.

Tor can also be used to create “hidden services.” These are websites with addresses like http://djfa84aof8398gjf.onion: a random-looking string followed by .onion. Websites created with Tor this way become part of the “deep web.” They can’t be viewed in a normal browser, because .onion domains only work through Tor, and for the same reason they can’t be crawled and indexed by popular search engines.
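As a small illustration of that address format, here is a sketch of how a script might tell a .onion address apart from a regular one. The function name `is_onion_url` is made up for this example, and the check only looks at how the address is spelled; it says nothing about whether the site exists or how Tor actually reaches it.

```python
from urllib.parse import urlsplit


def is_onion_url(url):
    """Return True if the URL's host name ends in ".onion"."""
    host = urlsplit(url).hostname or ""
    return host.endswith(".onion")


print(is_onion_url("http://djfa84aof8398gjf.onion"))   # address from the text above
print(is_onion_url("https://example.com/onion"))       # ordinary site, "onion" is only in the path
```

An ordinary crawler that isn't connected to Tor could use a check like this to skip links it has no way of following.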

The vast majority of files and sites on the web aren’t on the surface web. They’re on the deep web, and you can’t find them by searching Google in Chrome or Firefox.

what's on the deep web

Lots of the folks working on the Tor project are political activists. They leverage the anonymity provided by Tor for political advocacy and whistle-blowing ( like WikiLeaks ), and to create dissident blogs, essays, forums, etc. That anonymity protects activists living under oppressive regimes, letting them freely exchange ideas and organize movements.

But there’s also a dark side to the deep web. The anonymity also allows folks to break the law: to host child pornography sites, sell drugs and even contract hit-men.

net neutrality



watch this documentary on the Deep Web

watch this mock-umentary on Net Neutrality