WORDSTELLATIONS

Wordstellations is a Twitter bot that draws word constellations using frequent words on astronomy-related Wikipedia pages.

This experiment is no longer active. The "server" that was running this bot tried to burn my apartment down.

How It Works (TL;DR edition)

The entire system is a Selenium script (coordinated by Jenkins) that injects JavaScript and CSS into a randomly selected Wikipedia page. Selenium then takes a screenshot, and the result is uploaded to Twitter.
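
As a rough illustration of that flow, here is a minimal sketch in Python. It is not the bot's actual code: the function name `run_once`, the `Driver` protocol, and the `upload` callback are all assumptions made for this example. The three driver methods match the ones Selenium's Python WebDriver exposes (`get`, `execute_script`, `save_screenshot`), and the Twitter upload is left as a caller-supplied function because the original does not say how it was done.

```python
from typing import Callable, Protocol, Sequence


class Driver(Protocol):
    """The subset of a Selenium-style WebDriver used by this sketch."""

    def get(self, url: str) -> None: ...
    def execute_script(self, script: str, *args): ...
    def save_screenshot(self, filename: str) -> bool: ...


def run_once(
    driver: Driver,
    url: str,
    scripts: Sequence[str],          # JavaScript sources, injected in order
    screenshot_path: str,
    upload: Callable[[str], None],   # hypothetical: posts the image somewhere
) -> None:
    """Open a page, inject each script, take a screenshot, hand it to upload."""
    driver.get(url)
    for script in scripts:
        driver.execute_script(script)
    driver.save_screenshot(screenshot_path)
    upload(screenshot_path)
```

With Selenium installed, an object returned by `webdriver.Chrome()` should satisfy this protocol, so it could be passed in as `driver` unchanged.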

How It Actually Works

Find an astronomy-related page to use:

Using a script, I scraped the list of the 88 modern constellations and added a few manual selections. Here's the final list of 96 pages that the bot has to choose from:

Andromeda, Antlia, Apus, Aquarius, Aquila, Ara, Aries, Asterism (astronomy), Astrology, Auriga, Boötes, Caelum, Camelopardalis, Cancer, Canes Venatici, Canis Major, Canis Minor, Capricornus, Carina, Cassiopeia, Centaurus, Cepheus, Cetus, Chamaeleon, Circinus, Columba, Coma Berenices, Constellation, Corona Australis, Corona Borealis, Corvus, Crater, Crux, Cygnus, Delphinus, Dorado, Draco, Equuleus, Eridanus, Fornax, Gemini, Grus, Hercules, Horologium, Hydra, Hydrus, Indus, Lacerta, Leo, Leo Minor, Lepus, Libra, Lupus, Lynx, Lyra, Mensa, Meteorology, Microscopium, Milky Way, Monoceros, Musca, Norma, Octans, Ophiuchus, Orion's Belt, Orion, Pavo, Pegasus, Perseus, Phoenix, Pictor, Pisces, Piscis Austrinus, Puppis, Pyxis, Reticulum, Sagitta, Sagittarius, Scorpius, Sculptor, Scutum, Serpens, Sextans, Star formation, Summer Triangle, Taurus, Telescopium, Triangulum, Triangulum Australe, Tucana, Ursa Major, Ursa Minor, Vela, Virgo, Volans, Vulpecula
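
Picking a page from this list could look something like the sketch below. It assumes English Wikipedia and that each entry is the exact article title, which may not hold for ambiguous names such as Cancer or Crater. `PAGES` is only an excerpt of the list above, and `page_url` and `pick_page` are names invented for this example.

```python
import random
from urllib.parse import quote

# Excerpt only; the real list has 96 entries.
PAGES = ["Andromeda", "Boötes", "Milky Way", "Orion's Belt"]


def page_url(title: str) -> str:
    """Build an English Wikipedia URL, assuming the title is the article name."""
    # Wikipedia uses underscores for spaces; quote() percent-encodes the rest.
    return "https://en.wikipedia.org/wiki/" + quote(title.replace(" ", "_"))


def pick_page(pages=PAGES, rng=random) -> str:
    """Return the URL of one randomly chosen page."""
    return page_url(rng.choice(pages))
```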

Figure out which words to draw

Once a random page is selected, Selenium opens a Chrome instance and injects a series of scripts into the page. The first script determines which words would make a good constellation.

  1. Determine word frequencies (excluding common words like 'the' and 'and', and any words with fewer than 3 characters).
  2. Choose the most frequent words, ensuring that for the entire page we have at least 250 word instances to work with.

With fewer than 250 word instances, the words would likely be spaced too far apart for the constellations to look pretty.
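
The selection rule can be sketched as follows. This is written in Python for brevity, whereas the real script ran as JavaScript inside the page. The stop-word list is a made-up stand-in, since the actual list is not given here, and `choose_words` is a hypothetical name. Digits count as word characters in this sketch, so a token like '2013' can be picked.

```python
import re
from collections import Counter

STOP_WORDS = {"the", "and"}   # hypothetical; the real list is not shown
MIN_LENGTH = 3                # words shorter than this are ignored
MIN_INSTANCES = 250           # total occurrences needed across chosen words


def choose_words(text, min_instances=MIN_INSTANCES, stop_words=STOP_WORDS):
    """Take the most frequent words until they cover min_instances occurrences."""
    tokens = re.findall(r"\w+", text.lower())
    counts = Counter(
        t for t in tokens if len(t) >= MIN_LENGTH and t not in stop_words
    )
    chosen, total = [], 0
    for word, n in counts.most_common():
        if total >= min_instances:
            break
        chosen.append(word)
        total += n
    return chosen
```

If the page is too short to reach the threshold, this sketch simply returns every eligible word. How the bot handled that case is not stated.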

Drawing the selected words

Next, Selenium injects another script that, given a list of words:

  1. Scrolls to a random location on the page.
  2. Finds the text nodes that are currently visible in the viewport.
  3. Finds the location and style information of the provided words within the text nodes from the previous step.

From here on, the script actually starts injecting HTML and CSS into the page.

  1. Inject a div at the bottom of the page with a super high z-index and a pretty background gradient.
  2. For each word instance within the viewport, inject a div absolutely positioned on top of the word's actual location within the page, making sure to mirror font family, style, size, weight, and line-height.
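
To make the mirroring concrete, here is a small sketch that builds the markup for one overlay div from a word's measured position and font properties. The real script created these elements through the DOM in JavaScript. The `WordBox` fields, the `overlay_div` name, and the z-index value are assumptions for illustration only.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class WordBox:
    """Position and font properties measured for one word (hypothetical shape)."""
    text: str
    left: float
    top: float
    font_family: str
    font_style: str
    font_size: str
    font_weight: str
    line_height: str


def overlay_div(box: WordBox, z_index: int = 10001) -> str:
    """Return a div that sits exactly on top of the original word."""
    style = (
        f"position:absolute;left:{box.left}px;top:{box.top}px;"
        f"z-index:{z_index};"          # must sit above the background div
        f"font-family:{box.font_family};font-style:{box.font_style};"
        f"font-size:{box.font_size};font-weight:{box.font_weight};"
        f"line-height:{box.line_height};"
    )
    # Escape quotes so font names like "Helvetica Neue" survive the attribute.
    return f'<div style="{escape(style, quote=True)}">{escape(box.text)}</div>'
```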

Connecting the words

  1. Inject a root SVG element above all the words drawn in the previous steps.
  2. Generate a distance matrix using the word locations.
  3. For each word, generate an SVG line element to each of its nearest 1-4 other words, but don't insert them yet.
  4. Shuffle the list of lines that could be drawn.
  5. Remove any lines that overlap with other lines.
  6. Actually inject the SVG line elements.
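
The line-selection logic in steps 2 to 5 can be sketched like this, again in Python rather than the bot's JavaScript. Several details are assumptions: "nearest 1-4" is read as a random count between 1 and 4 per word, "overlapping" is read as two segments strictly crossing (lines that merely share an endpoint are kept, and collinear overlaps are not detected), and all function names are invented for the example.

```python
import math
import random


def distance_matrix(points):
    """Pairwise Euclidean distances between word positions."""
    return [[math.dist(p, q) for q in points] for p in points]


def candidate_lines(points, rng):
    """Link each point to its 1-4 nearest neighbours (count chosen at random)."""
    dist = distance_matrix(points)
    lines = set()
    for i in range(len(points)):
        others = sorted(
            (j for j in range(len(points)) if j != i), key=lambda j: dist[i][j]
        )
        for j in others[: rng.randint(1, 4)]:
            lines.add((min(i, j), max(i, j)))  # store undirected, no duplicates
    return sorted(lines)


def _orient(a, b, c):
    """Sign tells which side of line a->b the point c lies on."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def crosses(p1, p2, p3, p4):
    """True when segments p1-p2 and p3-p4 strictly cross each other."""
    return (
        _orient(p3, p4, p1) * _orient(p3, p4, p2) < 0
        and _orient(p1, p2, p3) * _orient(p1, p2, p4) < 0
    )


def connect(points, rng=None):
    """Shuffle candidate lines, then keep each one that crosses no kept line."""
    rng = rng or random.Random()
    lines = candidate_lines(points, rng)
    rng.shuffle(lines)
    kept = []
    for i, j in lines:
        if not any(
            crosses(points[i], points[j], points[a], points[b]) for a, b in kept
        ):
            kept.append((i, j))
    return kept
```

Each pair returned by `connect` holds the indices of two words to join. Shuffling before the crossing check means a different subset of lines survives on each run, which is presumably why the shuffle step exists.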

Upload the result to Twitter:

The last step is to tell Selenium to take a screenshot and then upload the result to Twitter.

‘trianguli’, ‘that’, ‘triangulum’, ‘jump’, ‘with’, ‘constellation’, ‘2013’… http://t.co/H72XNuSnz7 pic.twitter.com/zg2ZOVA82U

— wordstellations (@wordstellations) April 19, 2015