Multiplayer Snake – Part II – Learning hard things the hard way

Holy crap. The first draft of snake.js was finished quickly, and is available for play here.  The mechanics of the game are super simple: start it up and a board is presented to you with a pale pink snake racing down towards a lime green dot.  Use the arrow keys to move the snake in the direction you want it to turn.  Every time you hit a green dot your snake grows one block longer.  Hit a wall and you die.  Hit your own body and you die.  The game has just about zero polish for the user, which is to say, it’s not built to be user-friendly.  Readers of this blog will know by now that I tend to abhor this kind of design.  In this case, I have built the game as an intermediate step in pursuit of a very different end product, and so I do not expect it to be played by anyone.  It is not designed to stand on its own; I have included the URL here to give the reader some background for the rest of the post.


Immediate challenges: 

I ran into trouble early on in development, when I asked a friend to test the game for me.  She started it up in Firefox and reported that her inputs were not registering.  Whoops! After a little rooting around, I tracked down the problem to a difference in how browsers implement keyboard events: when the browser registers a keydown event, a draft of the DOM Level 3 Events spec called for a “keyIdentifier” attribute to come with the event object, detailing which key was hit.  Chrome provided it, Firefox did not. Not the end of the world, I just added a translation layer into the code, so that now the lower-level “which” attribute is read, and its counterpart is looked up in a dictionary to find the right input, which then feeds into the game.  Super! A bug squashed!
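Here is a minimal sketch of what a translation layer like that might look like. It is not the actual code from snake.js: the names `KEY_CODES` and `directionFromEvent` are made up for illustration, and the only thing assumed about the event is that it carries the legacy `which` (or `keyCode`) number for the key.

```javascript
// Hypothetical lookup table: legacy `which`/`keyCode` values for the arrow keys.
const KEY_CODES = { 37: 'left', 38: 'up', 39: 'right', 40: 'down' };

// Translate a keydown event into a direction name, or null for any other key.
function directionFromEvent(event) {
  const code = event.which || event.keyCode;
  return KEY_CODES[code] || null;
}

// In the browser this would be wired up roughly like so
// (`game.turn` is a hypothetical method, not part of the original game):
// document.addEventListener('keydown', function (e) {
//   const dir = directionFromEvent(e);
//   if (dir) game.turn(dir);
// });
```

The point of the dictionary is that the rest of the game only ever sees names like 'left' and 'up', so whichever attribute a given browser happens to support stays a detail of this one function.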


Now for the fun part: porting it to a multiplayer game.

Oops.  Turns out that making a game multiplayer is hard. Not hard like writing snake was hard, which (at least for me, and for the first time) it was.  But hard like a hard problem in computer science. Turns out that making a game multiplayer requires coordinating state across several systems in real time, and fast.  What does that mean?

Recalling last post’s digression into the MVC pattern, remember that the way the snake game works is to have a Controller object take inputs from a user, modify a Model of our data, and build a View of the information for the user.  This works fine when there is one user for whom this information must be coordinated, on one system.  But once one of those conditions changes, everything goes to hell.  

Backing up a second, think about how we might make a turn-based game like tic-tac-toe multiplayer.  There are approximately two options we could try.  The first is to have each of the two systems (each player’s browser) hold its own copy of the model, and update it each time the other’s controller object broadcasts a user’s move.  Using TCP (or a TCP-based protocol like WebSockets), this is pretty easy, and pretty reliable; as long as the connection stays up, TCP ensures that every message sent arrives at its destination, and that messages arrive in the order in which they were sent.  That’s pretty cool.  Trust me, it is.  But there’s something fundamentally broken about each player having the model locally, because it violates the idea of a model, which calls for there to be one, and only one True state of things.  Using TCP just masks this violation, by ensuring that nothing happens until all the controllers have coordinated what the One True Model is, and updated their own accordingly.  For two player tic-tac-toe then, imperfect MVC is fine, or at least, it’s acceptable.

Ok, now let’s try that with snake.  If each player has its own model of the game locally, and that model is considered canonical (again, a fundamentally broken concept) then no player input may be processed into the model, or from the model into the view (remember, data goes User->Controller->Model->Controller->View->User) until the clients have had a chance to bring everyone else up to speed.  In a turn-based game that’s not the end of the world.  Browsers are serper derper fast, and WebSockets are pretty fast, and together they can coordinate models between turns quickly enough to not be a pain in the ass to players in something like Chess, Checkers, or Tic-Tac-Toe.  But Snake is not a turn-based game.  Snake is in real time.  If we force the browsers to coordinate their two models every time the game tries to cycle, it will slow down.  It will suck.

That would be annoying, and with a game as twitchy as Snake (you need to move rill fast to avoid hitting things) the game is pretty much ruined if you force it to be coordinated between multiple endpoints across a network.  

In fact, it’s a little worse than it sounds.  Remember what we said earlier: the game as we’ve been considering it is actually a corrupted form of MVC, in which there are multiple, potentially conflicting sources of truth.  Making snake step through turns ruins the gamer’s experience, but doesn’t fundamentally destroy the concept of the game.  (Although one could fairly make the case that a game like Snake is no longer itself when it ceases to be a quick twitch-reflex testing game, I’ll leave that decision to the reader for now.)  In other words, you could still play multiplayer web-based snake, it would just be kind of crappy.  But now what happens when state somehow changes in one model, and not the other?  Impossible! Our models are coordinating with each other each step, how could they fall truly out of sync?  Well, the trivial case here is a cheater.  Our code is in Javascript, in the browser, and it would be relatively simple for a user to edit the running game so that his computer’s understanding of what was happening no longer matched his opponent’s.  That kind of mismatch would get ironed out eventually (as long as the game protocol was built robustly enough), but the game states only have to be broken once for the game to be altered entirely.  In other words, it would only take a user one malicious edit to cause his computer to believe that the other player was dead, and for it to report that he had won.  Sorting that out would be basically impossible.  Some answer could be arrived at by the server, but it would be extraordinarily difficult for that answer to be reliable.  That is to say, even if we could make a final ruling, it would be arbitrary, since each browser’s model is considered “TRUE”.

The solution to this problem is to keep the model the way God intended it to be, singular and unified and held in a third party server.  This is the second of the two options for building the game that I mentioned waaay at the beginning of this article.  The only way to make really truly sure that no player irrevocably cheats the other is to make the server the sole and final arbiter of reality.  From the perspective of the game on either user’s end, the server is the alpha and the omega.  It instantiates a game model, and then accepts inputs from the users, modifies the game accordingly, and reports back to them what reality looks like now.  Sound familiar?  This is the way MVC code is actually really truly supposed to be organized: One Model – I can’t stress that enough – ONE MODEL – is handled by one controller, and fed into however many views you want.  The number of views does not matter.  The Super Bowl happens the same way whether you have one tv displaying a view of reality, or ten, or ten million.  But what if you tried to have ten Super Bowls, each of which purported to be simultaneously the same thing as all the others, as well as the gospel truth of what “The Super Bowl” was?  Well basically then you’d have politics.  Depending on how you feel about politics, they’re fine in the political realm, but we don’t want that crap in our computer programs.
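To make that concrete, here is a rough sketch of a server-held model, with the networking left out entirely. Everything in it is an assumption for illustration (the function names, the shape of the state, and the simplification that each player is just a single head block with no body): clients may only request a change of direction, the server advances the one model on its own clock, and every client receives the same snapshot to draw.

```javascript
// Hypothetical server-side model: the only copy of the game state that counts.
const DELTAS = { up: [0, -1], down: [0, 1], left: [-1, 0], right: [1, 0] };

function createGame(width, height) {
  return { width: width, height: height, tick: 0, players: {} };
}

function addPlayer(game, id, x, y, dir) {
  game.players[id] = { x: x, y: y, dir: dir, alive: true };
}

// Clients never write to the model; they ask, and the server decides.
function applyInput(game, id, dir) {
  const player = game.players[id];
  if (!player || !player.alive || !DELTAS[dir]) return false;
  player.dir = dir;
  return true;
}

// A deep copy of the state, which is all a client ever gets to see.
function snapshot(game) {
  return JSON.parse(JSON.stringify(game));
}

// Advance the single model by one cycle and return what reality looks like now.
function step(game) {
  game.tick += 1;
  Object.keys(game.players).forEach(function (id) {
    const player = game.players[id];
    if (!player.alive) return;
    const delta = DELTAS[player.dir];
    player.x += delta[0];
    player.y += delta[1];
    const outside = player.x < 0 || player.y < 0 ||
      player.x >= game.width || player.y >= game.height;
    if (outside) player.alive = false; // only the server can declare a death
  });
  return snapshot(game);
}
```

Because `step` hands back a copy, a cheater who edits the state in his own browser changes nothing that matters: the next snapshot from the server simply overwrites it.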

What does all of this mean for you – dear reader?  Well first it means that with apologies, I must withhold from you a multiplayer snake game.  Some reasonable approximation of such a game is still very much possible, but some serious design decisions will have to be made first if I want to make one worth playing.  At this point I’m putting the odds of that at around 30%.  Not the end of the world, but probably not going to happen in the short-to-medium term.

Our next post will take us away from Javascript for a little while, and into the world of Python.  Stay tuned. 



Multiplayer Snake – Part I

For the next few days I will be working on a multiplayer snake game for the browser.

I love a select few simple arcade games.  Tetris and Snake are my favorites (in that order); but as much as I like to play them I have a surprisingly hard time reasoning about their design.  I understand how to play the games, and in fact how to optimize my game play, which is to say, how to play them well.  (I mostly stopped playing snake when I gave up my last Nokia phone, but I continue to play Tetris, a lot.  I’m surprisingly good at it.)  But as much as I like and understand the games themselves I have never had a good handle on how their engines work.  Because they’re so opaque to me, I have decided to take some time over the next few days to implement both games in Javascript, using the HTML5 canvas, and no external libraries.  For the record, that means no game libraries, no Underscore, no jQuery – nothing but the browser the way god made it. 

Snake seems the easier of the two to design, so I’ve started on that game first.  I’m hosting the game on GitHub, here.  

Elevator Pitch: 

Snake is a game in which a user directs a string of blocks around the screen and attempts to “eat” a randomly placed food object, which will make the string grow longer, while avoiding crashing the snake’s head into its body or one of the four walls. 


High Level Implementation:

The game is implemented in an HTML5 Canvas node, which is a “dumb” canvas which can be colored in by Javascript using arbitrarily complex logic.  At the beginning of each game, a string of blocks will be randomly instantiated on the screen, moving at a constant speed in a randomly selected direction (right, left, up, down).  Since the canvas is a dumb element, it cannot reason about the objects drawn on top of it. All the canvas knows is that it exists, and certain transformations it is supposed to make to any additional drawing done on top of it (e.g. flip the cursor upside down and then keep painting black on to the screen).  Therefore, the developer has the job of keeping the computer’s understanding of the game synchronized with the viewer’s understanding of the game.  In practice, this means the developer has to update the “view” of the game so that it matches the “model” of the game.  These two elements are in turn coordinated by a “controller” element (creative huh?), in what is known in computer science as the “Model-View-Controller” (MVC) architectural pattern.  MVC is widely considered a good way to architect this kind of code, because it makes certain good things easy to do, and certain bad things hard to do.  MVC will get its own blog post in the future.


Detailed Implementation:

The MVC pattern described above means that the code is split into three main components.  (I say components because even though they are basically Classes, Javascript as of this writing does not have classes – it is a classless language.  Slurps soup at restaurants, doesn’t shut up when you kick it under the table, really really classless.)

  1. Models:
    1. Blocks: Rectangular blocks form the basis for most of this game; they are given a color, size, and location coordinates, and then empowered to show or hide themselves to the viewer, depending on certain circumstances.  The snake, board, and food items will all be made up of these blocks.
    2. Snake: the snake is composed of an array of blocks, with a few extra methods to make my life easier, including the ability to move itself around the board and grow when it needs to (i.e. when it “eats” something).  Every time the game cycles, the snake’s tail disappears, and a new block is added at the snake’s head.  The old block is deleted from memory, and its visual counterpart is erased from the screen, and the new one is added to the snake’s “model”, and painted onto the screen.  This makes moving the snake an absolute pleasure.  The idea of figuring out where each new block would go each time the snake moved had scared the crap out of me previously, and this way all I have to figure out is where two blocks were (the head and the tail) and where one new one needs to be relative to one of those.  It’s insanely simple, and it works nicely.
    3. Grid: the board on which the snake moves is composed of a grid, which is held in memory as an array of arrays of objects with location and occupancy status information embedded in them.  This makes collision testing relatively straightforward: to check whether the snake hit something, all we have to check is whether its head is in the same x and y coordinates as our other occupied blocks; these are for the most part simply the outer edge of grid-squares, the snake’s body itself, and whatever food square we place on the map.  This makes collision detection easier and more efficient than scanning the entire board and checking for overlapping spaces.  In fairness, this only works because we’re using a grid; objects of arbitrary shape and size would have to be scanned for using a more sophisticated system.
  2. View: the board view is an object with extremely simple behavior.  Its sole purpose in life is refreshing the canvas for our viewers, to make sure it reflects the models we’ve established earlier.  It’s really good at doing that.  Like, really good at it.  Nice job Board View.
  3. Controller:  This is where things get hairy.  The game’s controller should be relatively straightforward to reason about, and eventually it will be implemented correctly.  As of now however the implementation is imperfect, and so I will first describe the Controller as it should be, and then as it currently is.
    1. Controller (proper): the controller’s job in life is to take user inputs and modify the game’s model, and then alert the view that it needs to update the display, which the view will then take care of doing.  Imagine the controller as a middle manager at a newspaper, whose job is to take the pictures and stories his journalists created and feed them to the people who will handle their layout and printing.  The controller is also responsible for taking in information from the newspaper’s owners, so he can tell his journalists what stories to create.  The controller’s job then is to control the intake, processing, and response flow of information from and back to users.  That’s quite a mouthful, but a simple idea.  The Controller takes things in, and tells different parts of the application what to do with them.  Cool.
    2. Controller (as it is now): The controller as of now is blown all to hell.  I’m racing to finish a first draft of the game, and so I’ve decided to keep the deep internals of the game working correctly while fudging the implementation of its high-level machinery to just get things working.  As of now, that means basically that I’m the controller.  Kind of makes sense, no?  Until I have time to properly think through how the game should flow, I’m building those pieces piecemeal and as needed.  This is not the end of the world, and from the user’s perspective, if my coding is good enough, there should be no difference.  But it is problematic for several reasons (one being my coding may not be good enough – we’ll see), which will be discussed in depth in my MVC post – if it ever gets written.  So for now, think of the game as being architected on the MVM pattern – Model-View-Moshe.  Awesome, now you know how the program works! What fun!!!
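The Snake and Grid models described above can be sketched in a few lines. This is an illustration under my own assumptions rather than the code in the repository: the names `makeGrid`, `makeSnake`, and `moveSnake` are invented, drawing is left out, and the food is passed in as a plain coordinate pair instead of being marked as an occupied cell. It also frees the tail's cell before checking the head, which is a choice this sketch makes and not something described above.

```javascript
// Build the grid as an array of arrays of cells. Border cells start out
// occupied, so hitting a wall is just another collision.
function makeGrid(width, height) {
  const grid = [];
  for (let y = 0; y < height; y++) {
    const row = [];
    for (let x = 0; x < width; x++) {
      const wall = x === 0 || y === 0 || x === width - 1 || y === height - 1;
      row.push({ x: x, y: y, occupied: wall });
    }
    grid.push(row);
  }
  return grid;
}

const STEP = { up: [0, -1], down: [0, 1], left: [-1, 0], right: [1, 0] };

// cells is a list of [x, y] pairs, head first, tail last.
function makeSnake(grid, cells, dir) {
  cells.forEach(function (c) { grid[c[1]][c[0]].occupied = true; });
  const blocks = cells.map(function (c) { return { x: c[0], y: c[1] }; });
  return { blocks: blocks, dir: dir, alive: true };
}

// One game cycle: drop the tail, add a new head. Nothing in between moves.
function moveSnake(grid, snake, food) {
  if (!snake.alive) return { ate: false };
  const head = snake.blocks[0];
  const next = { x: head.x + STEP[snake.dir][0], y: head.y + STEP[snake.dir][1] };
  const ate = Boolean(food) && next.x === food.x && next.y === food.y;

  if (!ate) {
    // Not growing this cycle, so the tail block goes away.
    const tail = snake.blocks.pop();
    grid[tail.y][tail.x].occupied = false;
  }
  // Collision test: a single lookup at the head's new coordinates.
  if (grid[next.y][next.x].occupied) {
    snake.alive = false;
    return { ate: false };
  }
  snake.blocks.unshift(next);
  grid[next.y][next.x].occupied = true;
  return { ate: ate };
}
```

Growing is just a cycle in which the tail is not removed, and collision testing never has to look at more than one cell, which is the whole appeal of keeping occupancy in the grid.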


The game is not yet published, since the first prototype isn’t yet complete.  It will be by tomorrow though, and I will publish a working link to a basic version of the game, probably hosted through Heroku.

Not enough for you? Fine, here’s a teaser: I also spent some of last week building a realtime chat server using Python, the code for which is also up on GitHub here.  My plan is to finish implementing the Snake game and then, before I get to Tetris, to port Snake to a multiplayer version, coordinated in real time through the server.  You’ll know when it’s ready.  Especially if you’re sitting next to me. But also if you continue reading this blog.  If you’ve read this far in this post, I’m guessing you’ll see that one when it comes out too.

Thanks for listening!




Some interesting Javascript CRUD

For my first week at Hacker School, I built a small Javascript library to help manage (relatively) large data sets in the browser. Most web apps use relatively primitive tools to manage data. Collections of objects are usually dumped into global scope, and either looped through or called using human-usable namespaces as needed. It’s the type of data management we all do in our first programs, and it’s generally good enough for the browser, where most data doesn’t persist between sessions.

Partly, this comes from the insanely (full presentation here) high overhead associated with network IO. Pulling information off of a server and into your browser takes an unbelievably long time, in computer terms, and so developers try to minimize the amount of information they move for any given session. This means first that most webapps have relatively little data to work with at any given time, and second that even if they did, their developers are usually wise to spend most of their efforts speeding up other areas of their website. For most websites, that is, managing local data is not the bottleneck.

Fine, bad data management practices are reasonable to expect in this context, and they’re not so harmful. But there are still very good reasons to avoid them. To begin with, the amount of data handled by the browser has been growing over time, and is likely to continue to grow. Manually calling objects as you need them may be possible today, but there is no guarantee that it will be tomorrow. Second, using consistent, predictable patterns in your code makes it easier to collaborate with other developers, and to extend the code yourself later. This isn’t rocket science. (But it is computer science!) The less hard-coding you do in your code the better, and wrapping your data up with abstraction layers is a great way to make your life easier.

With this in mind, I wrote JsCRUD.js. It’s a small library (currently ~120 sloc) that’s designed to wrap up the browser’s localStorage (or sessionStorage) object, to make it possible to Create, Read, Update, and Delete records more easily and faster than it is now.

The library works by creating a set of indices, which track which objects contain which types of values, and then defining a set of functions which serialize and store (and later, retrieve) your objects in one of the browser’s key-value stores. The library is not yet finished, but it will be shortly.

The project led to a few fascinating discoveries. First, when building the library’s querying tool (readRecord), it became immediately intuitive that it would be easiest to find objects based on a model that the user passes in. I realized partway through writing the feature that this is the same exact paradigm used by MongoDB. This makes a ton of sense, since they’re doing almost exactly the same thing as JsCRUD, at least at the user-interface layer.
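As a rough illustration of the idea (this is not JsCRUD's actual code or API), here is a sketch of an index-backed store with query-by-example. Only the name `readRecord` comes from the library as described above; `createStore`, `createRecord`, `deleteRecord`, and `memoryStorage` are assumptions made for the example, as is the decision to keep the index in memory instead of persisting it. The in-memory stand-in exists so the sketch runs outside a browser; in a browser you would presumably pass in `localStorage` or `sessionStorage`.

```javascript
// Stand-in exposing the getItem/setItem/removeItem methods of the browser stores.
function memoryStorage() {
  const data = new Map();
  return {
    getItem: function (k) { return data.has(k) ? data.get(k) : null; },
    setItem: function (k, v) { data.set(k, String(v)); },
    removeItem: function (k) { data.delete(k); }
  };
}

function createStore(storage) {
  const index = {}; // "field=value" -> Set of record ids holding that value
  let nextId = 1;

  function keyFor(field, value) {
    return field + '=' + JSON.stringify(value);
  }

  // Serialize the object into the key-value store and index each field.
  function createRecord(obj) {
    const id = 'rec:' + nextId++;
    storage.setItem(id, JSON.stringify(obj));
    Object.keys(obj).forEach(function (field) {
      const k = keyFor(field, obj[field]);
      if (!index[k]) index[k] = new Set();
      index[k].add(id);
    });
    return id;
  }

  // Query by example: return every record matching all fields of `model`.
  function readRecord(model) {
    const fields = Object.keys(model);
    if (fields.length === 0) return [];
    let ids = null;
    fields.forEach(function (field) {
      const hits = index[keyFor(field, model[field])] || new Set();
      ids = ids === null
        ? Array.from(hits)
        : ids.filter(function (id) { return hits.has(id); });
    });
    return ids.map(function (id) { return JSON.parse(storage.getItem(id)); });
  }

  // Remove the record and every index entry that points at it.
  function deleteRecord(id) {
    const raw = storage.getItem(id);
    if (raw === null) return false;
    const obj = JSON.parse(raw);
    Object.keys(obj).forEach(function (field) {
      const k = keyFor(field, obj[field]);
      if (index[k]) index[k].delete(id);
    });
    storage.removeItem(id);
    return true;
  }

  return { createRecord: createRecord, readRecord: readRecord, deleteRecord: deleteRecord };
}
```

A call like `store.readRecord({ role: 'dev' })` reads a lot like a MongoDB `find` with a query document, which is the resemblance noted above. The index means a query only touches the records that can match, instead of deserializing everything in the store.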

The second thing the project seems to be showing me is that web developers don’t seem to be very enthusiastic about their data. Specifically, most of the people I’ve spoken to about the project are either hostile or indifferent to the idea that they should manage their data actively, instead of keeping everything reasonably well scoped and calling things manually. For what it’s worth, I’m not sure they’re wrong. I made my case above for why I think data should be more carefully managed (to be clear: data should be managed, not memory) by the developer. But I also think that the web as it’s built now works pretty well, and that it makes sense for developers to focus their energies on the core of their work, relying on faster browser engines to compensate for imperfect design. I happen to really like databases, and I think there’s real value in making your data more robustly queryable than it often is.

In fact, the HTML5 specification agrees with me on this point, which is why the IndexedDB API exists. It’s a pretty cool tool. There is also a legacy API in most browsers, WebSQL, which exposes an SQLite database to Javascript in the browser. The latter system is no longer included in the HTML spec, and probably for good reason. SQL is an amazing language that gives developers incredibly good control over their data, but at the cost of forcing them to switch languages and paradigms for various functionality within their system. That’s fine when users have a ton of data and complicated data needs. But in the browser most data is still orders of magnitude smaller than on the desktop, let alone big enterprise systems, meaning that the friction incurred by switching back and forth between languages makes less sense.

What does all of this mean to you? First, keep an eye on your data in web applications. If you’re managing any large number of records, there may be value in using some system that permits you to manage them in a logical way. You may be able to get away with brute-forcing it for now, but that approach doesn’t always scale, and besides, it’s kind of ugly. Second, get to know your browser’s APIs. HTML5 and Javascript have a surprisingly large and powerful ecosystem of tools built right into the browser, and if you understand how they work you can get a lot more power out of your code.

Feel free to check out my project on GitHub. Fork it if you like, and by all means feel free to mess around with the code, and offer changes if you think you can improve it.


I Start Hacker School Tomorrow

Tomorrow, at 10:15 a.m. I will be starting my first day of Hacker School. I will be using this blog as a tool to help me learn over the next three months. I invite you all to come, watch, and see what happens.

To be clear from the start: this is a personal blog, the opinions stated herein are mine alone, and do not reflect the official position of any organization, including Hacker School.

Hacker School will run until the end of December.  Until that time, I will update this blog at least once a week with some new information regarding how I’m spending my time and what I’m hoping to accomplish with it.  The way I see it, this blogging is designed to serve two primary functions:

  1. Blogging will force me to work incoherent and abstract thoughts into some measure of concretion and clarity.  Teaching myself to code these past few months has been fun, but I have constantly lost ground by permitting half-formed understandings to slip away from me as I moved on to new problems or topics.  The only way I can write about a topic is if I understand it at least well enough to express it in whole sentences. In addition, writing on a blog is inherently public, at least in the sense that it may be seen by others. I will never be guaranteed that someone will see my writing, but I can also never be completely sure that someone will not.  That’s a great thing, for a few reasons, not the least of which is how much I enjoy the attention.  But writing for an audience, especially for someone with relatively thin skin, means that I will have assistance in holding myself to a higher standard of comprehension than if I were answerable only to myself.  I’m counting on you.
  2. The second reason I’ll be blogging is to build a concrete metric of my progress during Hacker School, and as a programmer in general.  Since I began coding, I’ve made tremendous strides in some areas, and very little in others.  My workflow and toolchain could have been designed by my computer-illiterate grandmother (love you!).  I can get some natural measure of my progress just by comparing code that I’ve written recently to old code, but that kind of raw comparison only goes so far.  I have a ton of old projects that relied in crucial parts on borrowed code-ideas and patterns, and which therefore don’t fully reflect my abilities at that point in time.   Not to belabor the point then – having a record of my progress would have been immensely satisfying, and will be extremely useful as a tool with which I may measure my own progress.

That’s why I’m planning on blogging about my journey through Hacker School.  

With that said, I do plan on continuing with this blog’s original mission, which is to provide a place for me to publicly muse about public data and data analytics.  The main difference is that there will be, for a while at least, a new focus on method.  I will be spending more time considering how I learn the technical skills I hope to use to understand these topics, as opposed simply to what I get out of it, or why I think these things are important in the first place. 

It is my hope that adding this component will only enhance my original mission.  Understanding the underlying techniques always helps users evaluate content.  Specifically, knowing what tools I use and what my skills are, as well as how they have been derived and what other techniques inform those skills, will all give users a richer understanding of my writing.  One natural – and unnerving – consequence is that users will be additionally empowered to criticize my writing.  For you, knowing which things I know about, and how much I know about those things, will go a long way to pointing out where there are holes in my writing (although it is certainly not necessary!).  As scary as that prospect is, it’s also a great challenge for me: as I said earlier, writing for an audience is an inherently fraught enterprise.  It comes with a bit of exposure and the potential for a lot of embarrassment.  But knowing that someone is out there, maybe watching, and maybe getting something useful from my time and my effort more than makes up for that fear.  

So tonight, on the eve of Hacker School, I bid you all welcome.  Pull up a chair, grab some popcorn, and fire up your favorite IDE.

It’s time to learn.

Today I Deleted My LinkedIn Account; You Probably Should Too

(Please note corrections after the post)

Today I will delete my LinkedIn account.  I say I will instead of I have because I do not know how LinkedIn’s account settings work; deleting my account may require substantial effort on my part, it may not even be possible at all.  The fact that I have no idea how my account works on LinkedIn should tell you two things: first, that the web services’ account persistence schemas are incredibly dense and durable, and second, that I have never so much as poked around my LinkedIn account.  The first part is generally interesting, and extremely important on a blog like this.  Why and how web apps choose to persist user data is in many ways the essence, or at least an exemplar, of big data and analytics.  It deserves its own blog post, and it will only be addressed here as it connects to the second item: my particular experience with LinkedIn, and why it convinced me to delete my account.

To begin, some statistics:
As of this post, LinkedIn claimed to have over 225 million registered members.  That’s a lot of people. For context, Instagram claims around 100 million users, and world-heavyweight Facebook tops the chart at over 1 billion users.  For those of you keeping score at home:

  • Facebook > 1,000,000,000
  • LinkedIn > 225,000,000
  • Instagram > 100,000,000

That kind of feels right, but something is off.  Facebook is so massive that it distorts just about every metric it touches. It just does.  But the amount of email spam I get from LinkedIn feels MUCH higher than the Facebook flood.  Without thinking too hard about it, there are a few obvious reasons for the disparity.

First, I hate LinkedIn emails.  Seriously, they are by far the most annoying spam I get from a serious organization.  Why is that? Well, for starters, LinkedIn had the terrible idea to route their spam through user email addresses.  Seriously, go check your inbox for an “Invitation to connect on LinkedIn”.  They don’t come from *, which THEY ABSOLUTELY SHOULD.  To be clear, LinkedIn asked for – and received – my permission to use my email address this way.  Users – myself included – SUCK at managing 3rd party login permissions.  A quick scan of my Google account reveals that I have granted access to 49 different websites.  AND I HAVE NO IDEA WHAT MOST OF THEM ARE, OR WHAT PERMISSIONS I GAVE THEM.  Worse, I am a (new) web developer, so I have at least a basic understanding of how these authentication systems work.  If you’re like most people, the idea that you would voluntarily give a third party control of your Gmail makes no intuitive sense.  It’s your Gmail, why would LinkedIn be sending emails from it?  Even worse, if you are a normal person, you probably don’t even have the vocabulary available to ask that question.  Go ahead, try to find your Google account’s permissions. I’ll wait here.  When you give up, you can click here for instructions.

The point isn’t to beat up on Google, or make fun of the average user.  Facebook’s APIs are in some ways more invasive than Google’s, or at least, they hold the potential for equally bad abuse by malicious users, and I think software should be targeted to the average “normal” user.  (In fact, I so strongly believe this that I was apparently the only person on Facebook or HackerNews who thought this article was 100% dead-wrong.  I am in fact so opposed to this thinking that I will address it in a dedicated blog post, but let me be clear: abstraction and specialization are the CORRECT ways to design complex systems for common use, and I think the average user should learn as much programming as they do plumbing to fix their sink, which is to say exactly the minimum needed to keep things working for them and nothing more.)

Software that traps standard users, or invites crappy 3rd party developers to trick them, is bad software.  Complexity should be abstracted away from users until normal users can reach something like the 80/20 balance – where they can get 80% of the software’s utility with 20% of a full understanding of its functioning.  For what it’s worth, that’s a really hard UX goal to reach, and I respect engineers on the front and back end enormously for the challenge they face.  But that does not mean that it is ok for Google to make it that easy for LinkedIn to email people from your email address, and it certainly does not make it ok for LinkedIn to do it!  Allocating blame is tricky here: Google is the holder of your information and manager of your identity, and so it has the final responsibility not to let spammers get access to it.  But they’ve made a tremendously powerful tool for developers available in their account APIs, and I am more upset with a social network mammoth like LinkedIn for abusing a tool like that than I am at its makers for making it available.  Exactly how you judge everyone involved is up to you, but the point is there’s something wrong.

Nailing it down: the first reason I hate LinkedIn emails so much is that they are delivered in an inherently abusive way.  They are sent through the personal email addresses of people I know, despite the fact that they are marketing emails sent from a large company.

The second reason I hate LinkedIn emails so much is that they are marketing emails for a service that I don’t use, and neither do you.  When I get a marketing email from Facebook, the odds are very good that I will intuitively understand the context surrounding it.  The emails describe an action that happened, and the people involved in that action, so that I am brought up to speed before I even click through to the website.  THIS IS A GOOD DESIGN.

Moreover, the average Facebook user is way more likely to be active than a LinkedIn user.  That means that people tend not to lurk as hard on Facebook as they do on LinkedIn.  Think about that for a second.  Sounds crazy right?  Since both companies are publicly traded, you don’t have to take my word for it:

For the quarter ending June 2013, Facebook reported 1,155,000,000 monthly active users.  Calling their original registration numbers ~1,300,000,000 (which is generous), that means that roughly 89% of Facebook’s users actually use the site regularly.

Compare that to LinkedIn, which claims that 170,000,000 of its 218,000,000 users logged in during the quarter ending March 2013, for a total of closer to 78%.  That number actually understates the disparity, because it just measures unique visitors.
While LinkedIn users spend an average of 8 minutes on the site daily, Facebook users hang around for over 33 minutes, or OVER HALF AN HOUR each.  In fact, LinkedIn puts this problem much better than I can:

“The number of our registered members is higher than the number of actual members and a substantial majority of our page views are generated by a minority of our members. Our business may be adversely impacted if we are unable to attract and retain additional members who actively use our services.” (source)

(traffic stats: Facebook, LinkedIn, SEC data: LinkedIn, Facebook).
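
For anyone who wants to check the arithmetic, here is a minimal sketch that recomputes the two engagement ratios from the figures quoted above. The constant and function names are my own, and the Facebook registration count is the rough estimate from this post, not a reported number.

```python
# Figures as quoted in the post: active users vs. registered users.
FACEBOOK_ACTIVE = 1_155_000_000
FACEBOOK_REGISTERED = 1_300_000_000  # the post's own rough estimate, not a reported figure
LINKEDIN_ACTIVE = 170_000_000
LINKEDIN_REGISTERED = 218_000_000


def active_share(active: int, registered: int) -> float:
    """Return the percentage of registered users who were active."""
    return 100.0 * active / registered


facebook_share = active_share(FACEBOOK_ACTIVE, FACEBOOK_REGISTERED)
linkedin_share = active_share(LINKEDIN_ACTIVE, LINKEDIN_REGISTERED)

print(f"Facebook: {facebook_share:.1f}% active")
print(f"LinkedIn: {linkedin_share:.1f}% active")
print(f"Gap: {facebook_share - linkedin_share:.1f} percentage points")
```

Bear in mind that the two companies do not necessarily define "active" the same way, so the gap is a rough comparison rather than a like-for-like measurement.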

The point of all this isn’t to dump on LinkedIn.  If nothing else their engineering team is absolutely amazing. The point is that they’re a company that is already starting with an unengaged userbase, which means they face a higher bar for unsolicited emails they send their users.  When LinkedIn emails me something – let alone by hijacking a user’s email address (see above) – it is not going to trigger the same easy context recall that Facebook’s or Google’s will.

People tend to intuitively sense LinkedIn’s broad-but-shallow userbase problem.  Everyone knows that everyone has a LinkedIn profile, but I challenge you to find three friends who use theirs actively.  Now try it with Facebook.  Until today I had never read the statistics I linked to above, but it just feels obvious when you read your LinkedIn mail that it isn’t being generated by eager friends trying to network.  LinkedIn should not be sending annoying emails like that.  The company is facing pressure because the average user is turned off from deep engagement.  But the way to fix that absolutely IS NOT to spam them, which makes people even more leery, and irritated with your service.  

What all of this means is that LinkedIn faces a serious challenge in a crappy environment.  I don’t envy them.  Overall, there are a small number of very good reasons for me to get rid of my account, which I’ve discussed above.  They more or less boil down to this: I find the user experience annoying and intrusive.  But the real problem with LinkedIn is not that it’s kind of annoying.  There are lots of kind of annoying services that I continue to use, and will continue to use as long as they provide me with something of value.  The real problem with LinkedIn is that it does nothing useful for me.  Nothing.  In fact, aside from generating a boatload of spam, I can’t tell how exactly LinkedIn is even supposed to impact my life.  I know I’m supposed to “network” with it, but I already “network” with Facebook, and Twitter, and beer.

Maybe some people are finding hot job leads through LinkedIn and I’m just missing the party.  That sounds facetious, but the truth is I am willing to believe that I’m just not getting the full value out of this tool.  The problem is I am not willing to put in a substantial investment in learning how to use it (like I said, 80/20) without a decent value proposition ahead of time.  I’d rather spend my time learning Javascript, or blogging.  Useful career tips or leads tend to come from real friends of mine, who I tend to interact with in person or on real social media.  LinkedIn seems just a touch too tone-deaf to be useful for building real career networks for me for now.  The fact that they operate as a borderline spam factory would be bad for any service that I wasn’t completely sold on.  The fact that it’s one struggling with bored users and a weird image makes it downright toxic.  So for now, goodbye LinkedIn, maybe we’ll connect again in the future.


I have officially shut down my LinkedIn account.  To their credit, “closing” my account was relatively simple.  An odd coda for a peculiar performance.  If I ever make a new account with LinkedIn, I will be sure to post my experiences.



After publishing this post on Monday a few readers took me up on my advice to:

go check your inbox for an “Invitation to connect on LinkedIn”

and called me out because I MADE A MISTAKE.  It turns out I made a technical error, and I’m extremely grateful some readers took the time to point it out, so here are corrections:

I had contended

  1) that LinkedIn is engaged in a spammy practice of sending messages pretending to be from a user when in fact they are from the service itself and
  2) that Google had enabled this bad behavior by making it possible to send email through users’ Gmail accounts.

The first point IS TRUE.  LinkedIn has engaged in the practice of sending users email that purports to be from other users, instead of from LinkedIn itself.  Here is a screenshot of one such email from as recently as 2012.  But it looks like LinkedIn has made the decision to move away from this route, at least for some of their emails.  This is a good thing.

As some readers pointed out, what they were doing was probably meant with good intentions, but the fact is that they were spoofing, using the same technique that spammers and phishing attacks use to trick users.  The good news is that LinkedIn is (I assume) not trying to steal anything from users.  Instead, they’re trying to sneak past your mental spam filters: when you get an email from someone you know, you’re more likely to read it than if it were from a big impersonal organization, like LinkedIn.  As far as I’m concerned then, the way they send (or apparently – sent) their emails is spammy, and should not be used by a serious organization.  This goes double for any company that, like LinkedIn, is literally built on the notions of professionalism and professional-communication.  It’s just wrong.  Thus, my main point remains, and I stand behind it.  With that said, I absolutely did get the technical point wrong there, and for that I apologize to anyone who I accidentally misled.

As to the second point: I have done a little more digging, and it appears that Google does not offer programmatic access to users’ email, and so as I said above, I was wrong.  But again, and this part is actually a bit more worrying, it looks like Google is still at least somewhat at fault here after all. When you spoof an email in Gmail, it usually warns the recipient about what’s happening.  So for example if I did what LinkedIn does, and sent you an email pretending to be from your own account, you’d get a big flag when you read the message, alerting you to the fact that it was not from you, that it was fraudulent.

Gmail does not raise these alerts for LinkedIn messages, which is presumably a choice Google made, to permit them to pass through as if they were real.  (If I am wrong on this point I invite anyone from Google to comment, or send me an email.)  If this is true, then it’s actually more of a problem than my original accusation.  In the post, I argued that Google had exposed users to abuse by spammy organizations like LinkedIn, but that it was an open question to me whether that was an acceptable trade-off for the amazing flexibility it gave developers.  It turns out that Google DOES NOT expose users in this particular way, and so I was wrong on the technical aspect.  But the deeper point remains, and is kind of crappy – that Google permits certain users to abuse their email system to the detriment of users.
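
To make the idea of a spoofing check concrete, here is a toy sketch of one very simple test: compare the domain in the visible From header with the domain in the Sender header. This is an illustration only. I do not know which headers LinkedIn's messages carry or how Gmail actually decides when to warn, and the addresses and function names below are hypothetical.

```python
from email.message import EmailMessage
from email.utils import parseaddr


def domain_of(header_value: str) -> str:
    """Extract the lower-cased domain from a header such as 'Name <user@host>'."""
    _, address = parseaddr(str(header_value))
    return address.rpartition("@")[2].lower()


def looks_spoofed(message: EmailMessage) -> bool:
    """Flag a message whose visible From domain differs from its Sender domain.

    Real mail providers use far richer signals; this compares two headers only.
    """
    sender = message.get("Sender")
    if sender is None:
        return False  # no separate sender declared, so nothing to compare
    return domain_of(message["From"]) != domain_of(sender)


# Hypothetical invitation: displayed as coming from a friend, sent by a service.
invite = EmailMessage()
invite["From"] = "A Friend <friend@example.com>"
invite["Sender"] = "Invitations <invites@service.example>"
invite["To"] = "me@example.org"
invite["Subject"] = "Invitation to connect"
invite.set_content("I'd like to add you to my professional network.")

# An ordinary personal message with no separate Sender header.
personal = EmailMessage()
personal["From"] = "A Friend <friend@example.com>"
personal["To"] = "me@example.org"
personal.set_content("Beer on Friday?")

print("invite flagged:", looks_spoofed(invite))
print("personal flagged:", looks_spoofed(personal))
```

A check this naive would also flag plenty of legitimate mail (mailing lists, for instance), which is part of why deciding when to warn the recipient is a judgment call for the mail provider.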

Anyhow, read it and make a decision for yourself about whether LinkedIn is a good system to be plugged into.  I said my piece, and I stand behind it.  Next time I’ll work on getting all the moving parts more exactly correct.

Predictive Analytics and The End of “Just Looking”

Anyone who’s spoken with me about it knows I’m a huge fan of good predictive analytics and the services built on them.  I use a machine-learning-algorithm based news reader (Prismatic), I listen to computer generated radio stations (on Spotify and Grooveshark, among others), and I let Google Now generate most of my driving directions for me.  These services yield two major benefits: first, they save me a TON of time, and second, they expose me to more content and with way greater variety than I would find on my own.  The obscure blog posts I find through Prismatic about database engines are amazing, and there isn’t a snowball’s chance in hell I would be spending my own time rooting through the dark corners of the internet for them.

The problem with these services, however, is that they are extremely efficient learners. People love to hate Apple autocorrect – and I don’t blame them.  Their system is imperfect, and can be annoying or even harmful if you use it without paying attention to the output.  Don’t hit the send button immediately after typing a message to your crush or ex.  But by far my bigger worry with these systems is how quickly and how well they do learn to conform to my demands.

I find my iphone typing to be extremely accurate, given how much I write on that device and how small its virtual keyboard is.  People make typos using full sized keyboards (at least I do) that refuse to autocorrect their users, so some degree of inaccuracy is to be expected.  But my iphone comes as close to getting it right as my laptop does, despite having a keyboard that A) doesn’t physically exist and B) is about 1/50th as big as the laptop’s.  Unfortunately, it also picks up my idiotic quirks and jokes, and assumes they are to be repeated in the future.  My repertoire is littered with garbage in real life, and my iphone reflects that.  Worse, to give me maximal leverage, it offers me all these words in all situations, including MANY where they are either too casual or too awful to be used.

The problem gets far worse with services like Prismatic, which are tasked with figuring out what kinds of things I might like to read based on what I tell it I like and what it’s seen me read before.  In case I haven’t made it obvious, I love this service. LOVE it, go download it right now.  The problem in this case is not that they can’t figure out what I will look at if offered, but that they know EXACTLY what I’ll click if offered, no matter how little I actually want to read it.  Because of this, Prismatic has begun littering my stream with incendiary garbage from right wing nutjobs.

I have filters set up to catch news about Congress, the Democratic party, and all sorts of economic and political topics.  At first, these created a small trickle of crap from right wing sources, which I clicked through and read out of a mixture of fascination and curiosity.  (I am sad but not surprised to say I am no more enlightened after having read them.)  The problem is that as soon as Prismatic caught me reading a few of these, it figured out it had found my catnip, and started pushing it into my news stream at increasing rates.  While I want lots of varied sources for my news, I have a finite amount of time, and I am unwilling to spend a lot of it filtering my already FILTERED content.  The twin joys of Prismatic are that it will search anywhere for a story I might read – and that it will discard everything I am not going to click on.  But there’s a tricky space between the two categories, and now I am stuck in the swamp of stuff I will click on but don’t want to read.
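
The feedback loop described above can be sketched in a few lines. This is a toy model under assumptions I made up for illustration (two topics, fixed click probabilities, a simple multiplicative update). It is not how Prismatic actually works, which I have no inside knowledge of.

```python
# Hypothetical topics and click probabilities, chosen only for illustration.
CLICK_RATE = {"database engines": 0.3, "outrage bait": 0.9}
WANT_TO_READ = {"database engines": True, "outrage bait": False}


def feed_shares(weights):
    """Return each topic's share of the feed, proportional to its weight."""
    total = sum(weights.values())
    return {topic: weight / total for topic, weight in weights.items()}


def learn_from_clicks(weights, learning_rate=0.5):
    """Boost each topic by how often it gets clicked.

    WANT_TO_READ is never consulted: the recommender only sees clicks.
    """
    return {
        topic: weight * (1 + learning_rate * CLICK_RATE[topic])
        for topic, weight in weights.items()
    }


def simulate(rounds=10):
    """Track the feed share of the unwanted topic over several update rounds."""
    weights = {"database engines": 9.0, "outrage bait": 1.0}
    history = [feed_shares(weights)["outrage bait"]]
    for _ in range(rounds):
        weights = learn_from_clicks(weights)
        history.append(feed_shares(weights)["outrage bait"])
    return history


history = simulate()
print(f"Outrage bait share at start: {history[0]:.0%}")
print(f"Outrage bait share after {len(history) - 1} rounds: {history[-1]:.0%}")
```

Because the update rewards clicks and nothing else, the topic with the higher click rate gains share every round, whether or not the reader wanted it there.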

And it’s not just Prismatic, or for that matter only niche startups.  I hate buying anything on behalf of a family member on Amazon because their shopping preferences get woven into my recommended purchases pretty quickly.  This problem got even worse after a night browsing exotic sex toys on Amazon, for a practical joke that was never fulfilled (for a college friend – NOT a family member).  Problem is that although I abandoned the joke, Amazon still thinks I might be interested in buying something raunchy along with my Javascript textbooks (see? coders CAN have sex lives!).

I see the same annoying problem in my Facebook feed – clicking on an annoying friend’s post to see what they’re talking about tells Facebook not that I want to see it but that I WILL engage the content if it’s put in front of me, and so they start shoveling more of it my way. This process can be managed, at least to some extent,  by pruning my newsfeed, and selectively telling Facebook what I don’t want to see.  The same is true of my Amazon account, Prismatic, and I’m guessing my iphone suggestions dictionary too.  But doing so robs these services of the advantages I wanted from them in the first place, which sucks.

I *could* spend time managing my iphone dictionary, but then I wouldn’t end up saving time typing.

I *could* trim everything I don’t want to see in my Prismatic, but if I force the feed to conform to my expectations it will stop giving me the good surprises (obscure coding posts etc).

I *could* tell Facebook every time I see something I don’t want to read, but that would make my social network a CHORE rather than an entertainment, and I don’t want more work from a hobby.

The problem with super sensitive learning tools like these systems is that they eventually push me to censor my electronic activities, because I know that they will be added to my suggestion algorithms, and that I will either have to put in lots of work later to fix them or deal with degraded services. Instead, I end up limiting my engagement ahead of time, to make sure that my electronic servants know a consistent, vanilla version of myself, and that they focus their efforts on serving him. Because my computers know me so well, and because they are by definition shameless (gmail has a fascinating sort-of counterexample) I can’t go around “just looking” like I can at a human run establishment, for example.  Imagine if every time you went to Old Navy someone came up to you with a pink thong and said “you looked at the ugly polka dot shirt last time you were here, maybe you’ll like this too!”.  Or if at a restaurant your waiter greeted you and a date with “Hey fatty – you want me to get you the quadruple cheeseburger with extra lard again?”.

Humans have in this case a useful combination of incompetence and emotional sensitivity that mostly keeps them from embarrassing us that way.  A mall employee can’t usefully track everyone in the building and every store they’ve gone to, and everything they’ve looked at there.  Google can.  Facebook can.  And it’s not just on their sites.  When you browse the internet logged into Facebook, every time you land on a page with a “Like us on Facebook!” button or the option to log into a service through Facebook (which – for the record – I use all the time) those trigger little programs that report your presence there to Facebook.  Keep an eye out for social icons next time you’re looking for porn, er, frying pans online and you don’t want to be noticed.
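
Here is a simplified model of how an embedded social button can report your presence, assuming the request for the button carries a cookie identifying you and the address of the page that embedded it. The class names, cookie value, and URLs are hypothetical, and real widgets are considerably more involved.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class WidgetRequest:
    """What a request for an embedded button might carry (simplified)."""
    user_cookie: str     # identifies the logged-in user to the social network
    referring_page: str  # the page that embedded the button


class WidgetHost:
    """Stands in for the social network's server that serves the button."""

    def __init__(self):
        self.visits = defaultdict(list)

    def serve_button(self, request: WidgetRequest) -> str:
        # Serving the button is enough to learn who was on which page.
        self.visits[request.user_cookie].append(request.referring_page)
        return "<button>Like</button>"


def browse(host, cookie, pages):
    """Visit pages in order; only pages embedding the button contact the host."""
    for url, has_button in pages:
        if has_button:
            host.serve_button(WidgetRequest(cookie, url))


host = WidgetHost()
browse(
    host,
    cookie="user-123",
    pages=[
        ("https://shop.example/frying-pans", True),
        ("https://blog.example/no-widgets-here", False),
        ("https://news.example/politics", True),
    ],
)
print(host.visits["user-123"])
```

Note that the user never clicked anything: in this model, loading a page that contains the button is all it takes for the visit to be recorded.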

The good news is that there’s a decent chance that as analytics engines get even better, services will begin including sensitivity training in them, to avoid embarrassing or inconveniencing us this way.  The really bad news is that all this really means is that invasive web services will be made less noticeable and annoying, NOT that they will stop watching us.  I’m not Amazon, but I’d bet that if I were I would be giddy every time a user told me to ignore something they had been looking at before, because that’s a far richer source of information than the fact that they landed on that page.  Now Amazon knows not just what I looked at but how I want to be seen, or at least treated.  Same for Facebook, Google, etc.

To be clear, this isn’t a condemnation per se.  Web services watch and collect this information because it does make them more attractive for users like me, and I am generally pleased by the level of service I get in return (for free usually).  The problem is it means they are strongly disincentivized from ignoring any useful information about me that they find, and doubly so for information I tell them not to pay attention to.

One sort of exception to this is Apple, which seems to have made their enhanced privacy offerings a point of differentiation from their competitors.  So for instance Apple is fighting to protect various privacy standards (as is Mozilla – to their credit).  But Apple is fundamentally a premium company, and they don’t compete in most areas with for-free services like Facebook and Google.  Moreover, the model of paying for expensive, high quality web products makes no sense in 2013, when the vast majority of dominant web technologies are available to users at zero cost and with insanely high quality.  There already exist for-pay social networks, and you’d have to threaten me with physical harm to get me to switch from Facebook to one of those.

What this means is that anyone wanting to use certain features of the modern tech world without undue attention will have to do so conscientiously and with some effort.  There simply is no replacement for the services I want and expect as part of the modern web that will give me free access without the annoying data-mining.  For me, being able to use Gmail, Facebook, Amazon, Prismatic etc is worth the trade-off.  No questions asked, I’d rather figure out how to get along with free tech than live without it.  The downside of all this is that for me, and for anyone else who wants to do the same, for now and for the foreseeable future there is no such thing as “I’m just looking”.