iCalendar – Quirks From the Trenches

While working on Trello2iCal, an awesome webapp that schedules Trello cards on your calendar by their due dates, I worked a lot with the iCalendar protocol (or iCal). It was actually the biggest source of bugs and misbehaviors, me taking a very close second. It didn’t help that Apple decided to name their calendar app iCal adding noise to search results, which were scarce to begin with.

iCalendar itself has two versions defined by two RFCs: RFC 2445 from 1998 and RFC 5545 from 2009. It’s a text-based protocol, which contains some metadata and a list of “components” that represent events, todos, alarms, and other calendar-related structures. Each of these components is a list of “content lines”, which are mostly “key: value” lines. When you put these calendars online and constantly update them, like Trello2iCal does, it’s called a feed (much like an RSS feed). Here is an iCal feed with two events:

DESCRIPTION:Card URL -\nsome_url
DESCRIPTION:Card URL -\nsome_url
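
To make the structure concrete, here is a minimal Python sketch of how a feed with two events could be assembled by hand. The PRODID, UIDs, summaries and times are made up for illustration, and the sketch assumes values that need no escaping or line folding; it is not how Trello2iCal builds its feed.

```python
from datetime import datetime, timezone

CRLF = "\r\n"  # RFC 5545 content lines end with CRLF
UTC_FORMAT = "%Y%m%dT%H%M%SZ"  # UTC date-time form; callers must pass UTC datetimes


def build_event(uid, summary, start, end, stamp):
    """Return the content lines for one VEVENT component."""
    return [
        "BEGIN:VEVENT",
        f"UID:{uid}",
        f"DTSTAMP:{stamp.strftime(UTC_FORMAT)}",
        f"DTSTART:{start.strftime(UTC_FORMAT)}",
        f"DTEND:{end.strftime(UTC_FORMAT)}",
        f"SUMMARY:{summary}",
        "END:VEVENT",
    ]


def build_feed(events):
    """Wrap the event lines in a VCALENDAR with its metadata lines."""
    lines = ["BEGIN:VCALENDAR", "VERSION:2.0", "PRODID:-//example//sketch//EN"]
    for event in events:
        lines.extend(event)
    lines.append("END:VCALENDAR")
    return CRLF.join(lines) + CRLF


stamp = datetime(2012, 8, 1, 9, 0, tzinfo=timezone.utc)
feed = build_feed([
    build_event("card-1@example.invalid", "First card",
                datetime(2012, 8, 2, 10, 0, tzinfo=timezone.utc),
                datetime(2012, 8, 2, 11, 0, tzinfo=timezone.utc), stamp),
    build_event("card-2@example.invalid", "Second card",
                datetime(2012, 8, 3, 14, 0, tzinfo=timezone.utc),
                datetime(2012, 8, 3, 15, 0, tzinfo=timezone.utc), stamp),
])
print(feed)
```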

If I was a smartass, I would really question the decision to invent yet another textual protocol instead of using XML which is quite suitable for this. Maybe if the protocol was XML based it would have been easier to implement, thus making it more popular, which would have resulted in a bigger knowledge base and better tools. One tool I do have to point out that saved me a lot of time is the iCal validator (thanks Ben Fortuna!), but it only checks whether your iCal feed is valid according to the spec, not with the real world quirks below.

Below, I talk about the issues and quirks I ran into and their solutions or workarounds, in no particular order.

  • iCalendar clients, especially Google Calendar, are very bad at refreshing their data. I’ve had reports of users waiting for over 24 hours for their calendars to sync up. There is a workaround but if you’re doing active planning, this is unacceptable. It’s a pretty easy issue to solve for Google Calendar – just add a button that a user can click to refresh, but I guess it’s not a priority. Outlook and Apple Calendar are a bit better, syncing a few times an hour, but AFAIK they lack a refresh option as well. I wish the protocol had a way to tell clients the maximal time between syncs.
  • Google Calendar refused to load iCalendar feeds until I put up a robots.txt file allowing Googlebot access. I don’t think that makes much sense as this isn’t a bot but a protocol client (which is probably best called an “agent”).
  • While developing, I put version 0.1 as my iCalendar version, mistaking it for the version of my app. Every iCalendar client accepted my feed except Apple Calendar, which silently rejected it. Apparently, the only value it considers valid is version 2.0.
  • Apple Calendar was also the only one that wouldn’t parse a feed without a DTSTAMP field.
  • All Day Events – There are many methods that don’t work or are wrong:
    • WRONG: Adding one day to a date-time object is not right – it just makes a 24-hour meeting. It doesn’t matter if you start at midnight and end at midnight either.
    • WRONG: Having the start date time the same as end doesn’t work either as it produces undefined behavior.
    • WRONG: Removing the time component and setting the end day and the start day as the same day – this is an “event on the day”, not a full day event. This should be used for birthdays for example.

    The RIGHT way to do it is to remove the time component and set the end day for a day later. If not for the first point, this would have been easier to test, but when refresh is a 5-step process, it becomes tedious.

    A mess of iCal Calendars and Events.

  • New lines in the description – in this case Outlook is the outlier, where encoding the line ends as “=0d=0a” worked (this encoding is probably a relic from Outlook Express, in which this was the way to encode ASCII characters in emails). The RIGHT way to do it is using “\n”. No, not the ASCII newline character, which is 0x0a, but the actual ASCII backslash character “\” followed by the letter “n”. Fun.
  • I was using the iCalendar Python library. It is well maintained but looking at the code, it is pretty gnarly. It also doesn’t support Unicode in a sane way, which made me modify it in a few places that I’m now retroactively opening bugs for in their GitHub repo.
    A current issue I’m having – both Apple Calendar and Outlook have no problem parsing the Unicode I’m serving, but Google Calendar insists on turning it into question marks. I’m pretty sure the problem is somewhere in my HTTP headers, but I still haven’t solved it.
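
The all-day rule and the newline escaping from the list above can be sketched together in Python. This is a minimal illustration under stated assumptions: the function names and the sample card text are made up, the escaping follows the TEXT rules in RFC 5545, and folding of long lines is left out.

```python
from datetime import date, timedelta


def escape_text(value):
    """Escape a TEXT value: backslash, semicolon, comma and newlines."""
    return (
        value.replace("\\", "\\\\")  # backslash first, so later escapes are not doubled
        .replace(";", "\\;")
        .replace(",", "\\,")
        .replace("\r\n", "\\n")
        .replace("\n", "\\n")  # a real newline becomes a backslash followed by "n"
    )


def all_day_event_lines(day, summary, description):
    """Content lines for a full-day event: DATE values, end one day after start."""
    next_day = day + timedelta(days=1)
    return [
        f"DTSTART;VALUE=DATE:{day.strftime('%Y%m%d')}",
        f"DTEND;VALUE=DATE:{next_day.strftime('%Y%m%d')}",
        f"SUMMARY:{escape_text(summary)}",
        f"DESCRIPTION:{escape_text(description)}",
    ]


lines = all_day_event_lines(date(2012, 8, 31), "Ship v1", "Card URL -\nsome_url")
for line in lines:
    print(line)
```

Using `timedelta` for the end date means month and year boundaries are handled without special cases.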

To be honest, I hope I didn’t come off as too critical of the protocol. Passing complex data like calendar events is a very hard problem – you need to take into account time zones, multiple people, multiple fields, recurring events, whole day events, free-busy times, event updates, event deletion, and a myriad of other scenarios. iCalendar is a solution, which is better than none, and a lot of the quirks can be blamed on calendar clients, but I still find this whole environment very unfriendly and mediocre at best.

Google Drive and Android Hackathon

Last week I attended a hackathon organized by Google Israel with some special guests – Google Drive’s Nicolas Garnier and Rich Hyndman and Nick Butcher from the Android team. The event started with a series of lectures on the first day and continued with a whole day of hacking on the second day and ended with the demos and the winner announcements.

The lectures were nice, but the two I caught were pretty lackluster, containing mostly the same “themes” Google is pushing on Android developers this year. If you’ve seen some of the I/O lectures or have been to an I/O reloaded event you probably already know most of them (Project Butter, new notification features, compatibility with various screen sizes etc.). On the other hand the Hackathon part was amazing. About 30 developers showed up on the second day to code some Drive and Android apps, which is quite a lot considering it was a workday and a day before another major conference in Israel (Geekcon – the subject of my next two posts).

The work (hack?) room at the start of the day.

Before I start talking about the project I contributed to I want to highlight a few other teams and their innovative projects. There were plenty of teams to demo, so much so that it took more than an hour, and there were three winning teams that each won a goody bag with awesome Google swag. Unfortunately I don’t remember all of them but I’ll try to reach out to the guys that were there and create a list of people and projects.

Notable Projects

Password manager over Drive – A team of guys from onavo (one of Israel’s most successful startups) were probably the most impressive team. They really know how to work with each other and they coded furiously to create a secure password manager Android application that keeps your passwords safely encrypted on Drive. There is nothing more to say besides that they were one of the winners.

Onavo winning one of the swag bags

gdg-booktrunk – An interesting project that keeps track of the progress you made reading ebooks in your Drive folder, provides statistics and motivates you to read more. This project was started by Roman, who I considered joining just for his sheer enthusiasm.


I arrived right on time for the team pitches on the second day so you can say I had the luxury of choosing the project I wanted to work on. I thought I’d hop between a few of them, but I ended up being stuck on some authentication issues for way too long and only worked on one project – I helped my fellow GTUG Organizer Ran Nachmany on a great idea named DogFooder. It is basically an artifact repository with great Android integration including push notifications. The basic use case is having a web service that you send your APKs and metadata (like versions and release notes) to. The web service saves this to Drive and issues a push notification to subscribed devices, like QA’s devices or beta testers’ devices. They can then choose to download the APK and install it easily and quickly. Propagating new APKs to test devices is a major pain point for mobile developers, and big kudos to Ran for thinking of this solution. At Zimperium we usually send the APKs by email, but we have to send them to Gmail accounts because our mail server doesn’t allow attachments as big as zAnti (which is pretty small at 7 MB). Only recently we started using a build server, but that still doesn’t solve the notifications and propagation problems. You can see how this can be very useful to mobile developers.

I joined Ran and wrote the web service and Drive integration but due to some Drive API quirks I only finished the code about an hour after the demos started, so we basically integrated it and jumped on stage two minutes later. Only by some incredible miracle did the demo work perfectly (typically – when we tried it again off stage something went wrong). We were also one of the winning teams, and we split the swag.

My half of the SWAG for being one of the winning teams – a coffee mug and water bottle

I’ll save the technical details for another post because we still have some work to do before it is fit to be online, but hopefully we’ll get it into some alpha shape and put it on github soon because the guys at onavo really want to have this product, and so does Zimperium.

StartupOnomics Summit – Behavioral Economics for Startups with Dan Ariely

This weekend I attended the Israeli extension of the StartupOnomics summit, an entrepreneur centric behavioral economics summit. It had some great speakers headlined by Dan Ariely the famous professor of psychology and behavioral economics. In Israel we didn’t have all the lectures but we did see most of Dan Ariely’s sessions and even better – we got two full hours of his time for Q&A. The crowd in Israel included entrepreneurs from airba.se, Logicalls, xplace, practi.is, livechar.com and many many more. It was fun and stimulating to talk with the people and hear their stories.

The crowd in the Israeli extension of the summit.

Here are the notes I took while watching the sessions. I was familiar with a lot of Dan Ariely’s work, especially the various experiments and his keynote which was based on this TED talk. These are my notes from the various sessions and shouldn’t be seen as an exhaustive summary.

Labor and Love / Michael Norton (Here is a similar TED talk)

The concept behind this talk was that adding labor to a process or product will make the customer more likely to pay attention and take action.

  • People like what they invested time into, even if it’s a trivial amount of effort.
  • Showing progress and time savings also has a positive effect, for example Kayak.com’s search function that animates flights flying into the result list as it finds them instead of just showing a progress bar and the results when they are available. Doing it for more than 30-60 seconds might be annoying so use wisely.
  • Another interesting finding is that showing people what they like is easy but if you have the data to remove things they disliked it will leave a strong impression. This is because while a lot of people might know what you like, only people really close to you will know what you dislike.

Session with Dan Ariely

A session about irrational behavior.

  • A reward in the future is less valuable than a reward now, even if the future reward is better.
  • Taking away has a bigger effect than giving something.
  • A mobile phone is an excellent way to control and condition the behavior of people. It is frequently used and almost always around.

Israeli Q&A with Dan

Dan really shined answering questions from the Israeli audience. He was amazing at giving out valuable advice on the spot. Some of the highlights:

  • One company asked how to incentivize people to carpool and he quickly came up with mandating meetings which can be done either on your own time or while commuting. I love this idea as it reframes the commute as a time for communication and idea sharing and not a boring ride where you are half asleep.
  • The same startup wanted to award the top five carpoolers. Dan pointed out that not everyone has an equal chance to get that award so it will be a bit unfair, and they should think of other metrics like improvement.
  • I didn’t write down the question, but he suggested that a startup that wants to gain credibility promise a reward for finding an inaccuracy, which acts like social proof – if no one has claimed the prize then you must be right (that’s a fallacy because people just might not care enough to find problems, but it works).

Dan Ariely answering my question from San Francisco.

Another Dan Ariely Session

I only caught the end of this one so I’m not sure what the main topic was.

  • People with multiple debts will not pay the debt with the biggest interest first but rather the one that is smallest and easiest to pay because they want the number of debts to go down.
  • When you’re experimenting make sure to get people without prejudices. The example was of a campaign ad where the campaign workers overwhelmingly chose a video ad but actual voters that were tested chose an image ad. This happened because campaign workers put the most work into the video thus valuing it more.
  • Run more experiments.

Social Proof / Noah Goldstein

The main study described in this talk is about signs that hotel rooms use to persuade guests to reuse towels. This saves the hotel money but is presented as an environmental issue.

  • Social proof works best when you use a group your customer is in or will like. The shocking example is that copy about recycling worked better when the hotel room number was written although rationally that detail is irrelevant.
  • The counter-point is true too – if you want to prevent behavior don’t use social proof that will make people want to be on the wrong side because it is more popular. The example is a sign saying many people are stealing something.
  • They also experimented with giving away some of the savings to charity. They found out that just saying that they’ll donate part of the savings sounds like tit for tat and doesn’t really improve on the social proof version of the sign.
  • The version that worked best is one saying a donation was already given on your behalf for recycling the towel.

To summarize, I learnt a lot and it helped me put myself in the mindset for marketing my new project. I got some valuable advice from Dan and the local attendees. It was a great event and if you are interested in behavioral economics you should make sure to attend the summit next year.

Announcing a Trello to iCal Feed Web Application

tl;dr – I wrote an app that creates a feed that you can plug into your calendars out of Trello cards with due date. Update: That was two years ago and since then Trello finally added that feature, and users migrated to using Trello, so I took it down. RIP :)

Almost a year ago Fog Creek Software released Trello, a flexible web-based to-do list tool that can also be used as a project management tool. Because of its flexibility Trello has been used for many use cases – software release processes, customer relations management, book authoring and much more. I started using it at launch because I hate every other to-do list tool, and I tried many, but I instantly fell in love with Trello. It was easy to use and got out of my way, which is something other to-do lists just didn’t get. I’ve been hooked ever since and started getting others to use it to plan things with me. I use it to manage pretty much everything, including the backlog of features for this blog, development tasks for zAnti (the app I’m responsible for at Zimperium) and more. The only thing that was really missing from Trello was integration with my calendar. It was a major pain point. Trello cards (the basic to-do items) have only one way of connecting them to time and date – a “due date” value that changes colors when the date passes. One of the most requested features was to have calendar integration.

Debugging the application

When the Trello API came out I immediately knew that my first priority would be integration with my Google Calendar. I started playing with the API and making the basic PoC (Proof of Concept) for an app that created an iCal feed from Trello cards with a due date. Using iCal was the best option as it is supported by every calendar program worth its salt. I then left the code alone for a few months for lack of time and a general tendency not to finish things. At the end of July I decided it was time to productize this and I wrote the first version of the “Trello to iCal feed app” which is up now. I am still adding features to it and changing stuff, but it works and seems stable. The source code can be found here. I will make a separate post about the design and code of the app sometime this week when I finish updating to a new version. Definitely learned a few interesting things.
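
The core of such a PoC is picking out the cards that have a due date and turning each due date into an event start time. Here is a rough Python sketch of that step. The card records are hypothetical plain dicts, and the field names (`id`, `name`, `due`) and the timestamp format are assumptions for illustration, not a transcript of the real API response or of the app’s code.

```python
from datetime import datetime, timezone

# Hypothetical card records; the field names are an assumption.
cards = [
    {"id": "c1", "name": "Write blog post", "due": "2012-08-05T09:00:00.000Z"},
    {"id": "c2", "name": "Someday maybe", "due": None},
    {"id": "c3", "name": "Release zAnti build", "due": "2012-08-03T16:30:00.000Z"},
]


def parse_due(due):
    """Parse an ISO 8601 UTC timestamp such as 2012-08-05T09:00:00.000Z."""
    parsed = datetime.strptime(due, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def due_events(cards):
    """Return (start, uid, summary) tuples for cards with a due date, earliest first."""
    events = [
        (parse_due(card["due"]), card["id"], card["name"])
        for card in cards
        if card.get("due")  # cards without a due date never reach the calendar
    ]
    return sorted(events)


events = due_events(cards)
for start, uid, summary in events:
    print(start.strftime("%Y%m%dT%H%M%SZ"), uid, summary)
```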

One amusing thing that happened is that between the PoC and the productization another developer (François de Metz) added this feature to his app. Three months ago his app only showed you your cards on a basic calendar he created without any integration to outside calendars, which wasn’t helpful to me, but he has since added a way to get an iCal feed. We both even used Twitter Bootstrap so our sites look pretty similar. This coincidence isn’t magical though – Bootstrap is popping up everywhere and it’s easy to deploy, but I think I’ll steer clear of it in my next projects. I’ve had some problems with LESS support on Windows and I’ve also seen some backlash and lack of trust from developers when they see a vanilla Bootstrap site.

If you have any feedback and comments you can use this Trello board (how meta) or just contact me using the usual methods.

Who Needs Code Comments?

A recent Hacker News discussion about source code comments has grown into a debate about whether you need them or not. Apparently it is a contentious issue. The comment that started it all included “Comments are for the weak” and these 5 words incited a hefty discussion and this post. Some people argued code comments are a bad code smell and others said that comments are essential for people to understand code. My theory is that this divide is between people using low level languages like Assembly, C, C++ and to some degree Java and people using high level languages like Python and Ruby.

The biggest reason I think that is that low level languages need a lot more lines to do the same thing and those lines are harder to understand. A simple but poignant example is opening a file and reading its content:
C code (taken from here):
char * buffer = 0;
long length;
FILE * f = fopen (filename, "rb");
if (f) {
    fseek (f, 0, SEEK_END);
    length = ftell (f);
    fseek (f, 0, SEEK_SET);
    buffer = malloc (length);
    if (buffer) {
        fread (buffer, 1, length, f);
    }
    fclose (f);
}

Python code:
buffer = open(filename).read()

The Python example is one line long and uses descriptive names for the actions – open and read – and doesn’t bother you with implementation details of allocating memory for the data. The C example is about 12 lines long and exposes a lot of implementation details both about memory and about how file systems work (seek etc.). Both methods have their uses and advantages but one thing is for sure – anyone not familiar with C will have a hard time understanding the C code, while even non-programmers can understand roughly what the Python code does.
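
One caveat about the one-liner: it never closes the file explicitly and leaves that to the interpreter. A slightly longer sketch that mirrors the C version’s binary mode and its fclose call, while staying readable, could look like this. The helper name and the temporary file are made up for the example.

```python
import tempfile
from pathlib import Path


def read_file(filename):
    """Return the whole file as bytes, closing the handle even if reading fails."""
    with open(filename, "rb") as f:  # "rb" mirrors the binary mode of the C example
        return f.read()


# Usage: write a small scratch file, then read it back.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "example.bin"
    path.write_bytes(b"hello")
    buffer = read_file(path)

print(buffer)
```
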
One commenter specifically caught my attention giving an example of “readable” C code from the Unix source code. Here is the second function from the source file:

/*
 * Wake up all processes sleeping on chan.
 */
wakeup(chan)
{
    register struct proc *p;
    register c, i;
    c = chan;
    p = &proc[0];
    i = NPROC;
    do {
        if(p->p_wchan == c) {
            setrun(p);
        }
        p++;
    } while(--i);
}

This is part of one of the most influential operating systems ever written, but at least in my book this code wouldn’t pass code review. It will fail because:

  • using one letter variable names is bad – this is the biggest offender by far.
  • wakeup is a really general name for such a specific function. wakeup_on_channel is better.
  • Abbreviating “Number” to N in NPROC.
  • Not declaring input variables type (TiL that the default is int…)

This brings me to the second contributing point to my theory – writing something hard to read will be discouraged more strongly by some communities than by others. High level languages are written with the axiom that code must be easy to understand. It’s even in their name – high means farther from machine code and closer to humans while low-level means closer to machines and their language.

Looking back at the HN discussion, you can make some good educated guesses about who in that thread is a high level programmer and who works closer to the metal. These two groups might not get each other’s context and so this discussion goes round and round. Both groups need to acknowledge that languages like C will need more comments and documentation to be understandable and that while commenting is good it might be a strong code smell that you need to refactor in a higher level language. I usually use this Python idiom – if you feel the need to comment something, make it into a function and write a docstring.
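
As a small illustration of that idiom, here is a before-and-after sketch. The card records and function names are invented for the example; the point is only that the inline comment turns into a named function with a docstring.

```python
from datetime import date


def report_before(cards, today):
    """Version with an inline comment explaining the condition."""
    names = []
    for card in cards:
        # a card is overdue if it has a due date and that date has passed
        if card["due"] is not None and card["due"] < today:
            names.append(card["name"])
    return names


def is_overdue(card, today):
    """Return True if the card has a due date and that date has passed."""
    return card["due"] is not None and card["due"] < today


def report_after(cards, today):
    """Same result, but the condition now has a name and a docstring."""
    return [card["name"] for card in cards if is_overdue(card, today)]


cards = [
    {"name": "Write post", "due": date(2012, 8, 1)},
    {"name": "Refactor bot", "due": None},
    {"name": "Ship v2", "due": date(2012, 9, 1)},
]
today = date(2012, 8, 15)
print(report_after(cards, today))
```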

Blog and Podcast Roll

I have 138 RSS subscriptions and 29 podcast subscriptions. I have about 200 RSS items to wade through each week, and even if all I do for a week is listen to podcasts non-stop, I will still have unlistened podcasts. It’s pretty safe to say I’m addicted to passive information, news and entertainment. I’m even listening to a podcast as I’m writing this. Meta, I know.

I love “Really Simple Syndication” or RSS. This protocol allows content creators to propagate their new content passively – update the feed and everyone will eventually get the update. What I love about it is that it’s asynchronous, as is the nature of most “pull” communication methods – I don’t get notified about every RSS item or new podcast episode – as opposed to email which is synchronous and immediate. I am not subscribed to even one blog by mail because I have this separation between immediate items and the rest. I’m up to about 4 hours to go through my weekly RSS reading, but I think it’s a worthwhile investment for now, and it’s fun. I’ve picked a few highlights to create a blog roll and added a justification here.

Twenty Sided by Shamus Young and friends
www.shamusyoung.com/twentysidedtale (RSS)
If I can choose one blog to model mine after, it will be Twenty Sided. It has articles about graphic programming, game design and general entertaining commentary. I love the community that sprung around the let’s play “Spoiler Warning Show”. You should really check it out and specifically the new “New here” section.

Coding The Wheel
http://www.codingthewheel.com/ (RSS)
This blog needs way more posts. The author is obviously a knowledgeable programmer who writes about code and Poker and gives a look into the fascinating world of Poker Bots. You should definitely read the series.

Joel on Software by Joel Spolsky
http://www.joelonsoftware.com/ (RSS)
Joel on Software needs no introduction but I still wanted it to be high on the list of blogs because of how influential it was on my decision to become a programmer and entrepreneur. This is where I first discovered a lot of concepts about starting and running a company, treating customers and about Fog Creek’s unique philosophy and company culture. He has great reading lists according to what you do and they are all worth your time.

Beta List
betali.st (RSS)
This site aggregates a lot of new startups that are in Beta. This is a great way to stay updated on new startups and ideas, and sometimes I even sign up for some. I’m sure everyone can find a startup to check out from this list and this is mutually beneficial to you and to the startup – a win win!

tynan.com by Tynan
tynan.com (RSS)
A truly unique and interesting blogger with a wide selection of topics – minimalism, software, travel, living in an RV and picking up women. This is one blog where I consistently enjoy every post, which is pretty hard to achieve.

Procrastineering by Johnny Chung Lee
procrastineering.blogspot.com (RSS)
This blog is by a guy who is consistently working on the coolest projects around. From do-it-yourself head tracking using the Wii to Kinect to his work at Google – there is no one that sold me on the field of HCI more than him.


Security Now! by Steve Gibson and Leo Laporte
www.grc.com/securitynow.htm (RSS)
Hands down the one podcast that beginner programmers or people with an interest in the behind-the-scenes of computers should listen to weekly, and add another from the archives because it has been around for years. Steve is a genius and a hacker, but most importantly he has that elusive talent of being able to explain hard and complicated technical issues clearly and methodically in a way that is understandable even to laymen but is not oversimplified. If you never listened to podcasts, you should make this your first and you’ll be as addicted as me in no time.

Radio Free Python
radiofreepython.com (RSS)
A podcast about Python, how can it not be awesome? I only listened to the first two episodes and they have interviews with the greatest pythonistas around, including the BDFL himself, Guido van Rossum. Definitely worth a listen if you want some programming in audio form.

Stanford University’s Entrepreneurship corner
ecorner.stanford.edu/podcasts.html (RSS)
A lecture and a Q&A by a successful entrepreneur, VC or other startup insider? YES PLEASE! If you need motivation to finish a project or to go out and start a company, just listen to a random episode and you’ll be pumped. The message is – just do it, and while you’re at it here are some tips and common mistakes to avoid. The archive includes people like Marissa Mayer, Steve Ballmer, Mark Zuckerberg, etc. and basically every hot startup and successful company is represented.

A Life Well Wasted
alifewellwasted.com (RSS)
A short-lived but prominent podcast about games and why we play them. Only a few episodes but they are really insightful and have great production value and atmosphere. This podcast is not updated anymore but you should listen to the past episodes and just enjoy the feeling of nostalgia, bliss and pure innocent happiness.

If you still can’t get enough of RSS, here are my full RSS and podcast OPMLs (an XML format for a collection of RSS feeds). Be warned though – some of the podcasts are adult only and a lot of them are not programming related. You should customize it to your tastes and time constraints. Some of them are in Hebrew and one is in Russian. You have been warned!

Podcasts OPML

Lastly – you should consider subscribing to my RSS feed. It is the best way to get updates, and who wouldn’t want more of this?

Got interesting items in your RSS feeds? Share them in a comment and feed my addiction.

Reflections: OmegleBot

Omegle.com allows people from around the world to converse with each other “anonymously”. It is one of those sites that let you start a text or video chat with someone online and consequently makes you doubt the intelligence of mankind. On sites like this, text chat follows the famous “Greater Internet Fuckwad Theory” and the video chat is… Well it’s probably a phallus. Omegle started as a way for strangers to connect and talk with each other, but has since devolved and the chance of finding some meaningful conversation on it is minuscule, which is a shame because random chatting is a fun concept. I would add a premium feature that administers an IQ test and matches you to someone according to that but that is an idea for a different time.

A typical Omegle chat

When I first discovered Omegle I quickly got tired of trying to find someone to talk to. The idea of Omegle is not new or revolutionary. IRC and chat rooms were there before but this made it as easy as can be. Since I already spent a ton of time in online social communities with people who have the same interests as me, I dismissed it as not being a cost-efficient way of communicating. I did find a use for Omegle though – there was nothing preventing me from spying on a random conversation and recording it. A nice challenge and it seemed fun. This was years before Omegle itself introduced the “Spy mode” so I guess there is something there. The concept of Spy Mode might look like something “evil” to do – spying on other people’s conversations is an ethical gray area in the real world, but is it online?
The answer to this question depends on how much you know and are aware of privacy online. In theory everything you do online can be (and is) monitored by a number of entities including your ISP (that can read all of your online activity), your operating system and other programs on your machine, a lot of routers on the Internet and at least a few governments. That is all beside the point – I thought about ethics for a second but to be honest I don’t see this as anything but a technological challenge (also, it isn’t illegal per se). My goal wasn’t to spy on people but to hack a bot together and as such I probably only ran the finished script once – to make sure all the bugs were solved. I call it the Hacker’s Mindset :).

This is how OmegleBot was born: a simple, very quick and dirty, unrefactored Python script. Before that script I didn’t have a lot of knowledge about HTTP, httplib and urllib because I used raw sockets to talk HTTP (poorly) in the past. This was a perfect project to help me understand the Python libs relating to HTTP and JSON. The bot opens two simultaneous connections to Omegle and sends them both a simple greeting, “asl?”, which is the way most conversations in chat channels start. It then proceeds to proxy their conversation and also record it into a text file. The most interesting part is the post function. It started as a simple call to connection.request and evolved to include a variety of HTTP headers including a faked user-agent and referer needed to defeat some of Omegle’s “security checks”. Usually services will have more server side security checks (“never trust user input”), but unfortunately Omegle doesn’t have a choice here. Because they are open and allow anonymous chatting, they are left with only so many ways to ensure I’m a real client, and I masqueraded as one well. Omegle uses JSON to pass data about events like whether the other user is typing, the message the user sent and of course when a user disconnects. Reverse engineering it was the hardest part of this project (and it wasn’t all that hard). I think the only challenge I faced was understanding why Omegle blocked the first iterations of the bot and adding various headers until I passed for a client in their book.
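
To give a feel for what such a post function and event parsing might look like with today’s standard library, here is a rough sketch. Everything specific in it is a placeholder: the base URL, the path, the parameter names, the header values and the event names are invented for illustration and are not Omegle’s real protocol or the original script’s code. The request is only built, never sent.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical values: the real endpoints and header checks are not known here.
BASE_URL = "http://chat.example.invalid"
BROWSER_HEADERS = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0",  # faked browser identity
    "Referer": BASE_URL + "/",
    "Content-Type": "application/x-www-form-urlencoded; charset=utf-8",
}


def build_post(path, params):
    """Build (but do not send) a form-encoded POST that looks like a browser's."""
    body = urlencode(params).encode("utf-8")
    return Request(BASE_URL + path, data=body, headers=BROWSER_HEADERS, method="POST")


def parse_events(raw):
    """Decode a JSON array of [event_name, *args] lists into (name, args) tuples."""
    return [(event[0], event[1:]) for event in json.loads(raw)]


request = build_post("/send", {"id": "session-a", "msg": "asl?"})
events = parse_events('[["typing"], ["gotMessage", "hi"], ["strangerDisconnected"]]')

for name, args in events:
    print(name, args)
```

A proxying bot would keep two such sessions open, forward each message event from one side as a post to the other, and append both to a log file.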

I also attached a sample output file with a few conversations. There is nothing interesting there nor did I capture anything interesting. All the conversations are very short which is definitely a symptom of Omegle – long and meaningful conversations are few and far between. I even sent “typing” statuses every few iterations to encourage people to converse and it didn’t help.
What can we learn from this? Masquerading as a browser is easy. Writing bots is easy. As a person on the internet you should take from this that bots are everywhere on the web. You should be aware of that because a lot of spam and fraud is done by bots – you can trivially change this bot to spam on Omegle (although ChatRoulette, a similar site has a “spam” button that might be useful against that). Radiolab even had a podcast on a bot that had an online relationship with a human. It is a fact that bots are becoming better and better at passing for human beings. Soon they might even be good enough to write a programming blog, and then what will I do?

South Park’s “dey took er jerbs” guy

(program them, probably)

P.S. Unfortunately the bot stopped working. It could be that Omegle changed the protocol a bit or added some more security, or that I have a bug. Feel free to fork it and bring it back to life!

Crysis 2 Design – Console to PC

I bought Crysis 2 during the recent Steam Summer Sale. I’m actually quite proud of myself for buying only one game, since I have never even installed more than a third of the games I own, but I knew that I’d finish Crysis 2 pretty quickly. I was a huge fan of Crysis 1 when it came out, especially since I bought a state-of-the-art gaming computer in 2008 and Crysis was the benchmark. The first play-through was an amazing experience, both graphically and gameplay-wise. In my opinion Crysis 1 was the most innovative and awesome FPS since Half-Life 2.

It took me a week to get through the 18 hours needed to finish Crysis 2, and I wanted to write about the design of the game and the gameplay differences between the two games. I’ll start with a sweeping declaration: Crysis 2 was designed to be a console game first, while Crysis 1 was designed to be a PC game. This will be a recurring theme today and I don’t care if it is an overused, old and corny observation.

A computer showing an old TV rainbow test screen.

Is that how people in Crytek think computer monitors behave?

The first indication of it being a console title is the “press any button to start” screen, an annoying reminder that I’m playing on the wrong platform. The first gameplay difference was the checkpoint system and the lack of quick saving. Quick saves are considered a PC feature because of disk space issues on consoles. A quick save has to describe the whole state of the map and all the information about the player and his enemies, which adds up to a lot of data in games with a lot of movable items and destructible environments. Checkpoint saves only need to record some data about the player because they are placed at locations where the state of the game is controlled (for example, you can’t return to the last area and see the destruction you caused, and you can’t yet do anything to affect the area ahead). On consoles you have less disk space and tighter memory constraints, so the more data-efficient solution, checkpoints, is used.

Quick saves are not the only place where the console design makes this a worse PC game. Hard mode is another example. In Crysis 2 hard mode is not hard, just tedious. Crysis 1’s hardest mode was awesome – the enemies talked in Korean so that you could not understand their intentions and strategies, your crosshairs were removed, forcing you to use the various scopes and iron sights, and there was no longer a huge red HUD warning when a grenade was thrown at you (among other things like making enemies stronger etc). Hard mode in Crysis 2 doesn’t have any of the creative features above. They just added more enemies and gave them more health. Tedious. I presume these changes were necessary because removing crosshairs cripples people playing with stick controls, where it is harder to aim, and hiding grenade indicators is bad for people playing on bigger displays with less detail, where grenades are harder to spot.

To continue the list of downgrades in the game:

  • You can’t ruin the environment as much. Another performance issue – physics engines with dynamic environment destruction are just slower and take a bigger toll on the hardware. Coupled with the amazing graphics, it is probably beyond current-gen consoles’ abilities.
  • AI sucks. I guess it wasn’t too good in the first game, but the baddies in Crysis 2 just love to chat on their open radio com telling me exactly what they are doing and where they are going. C.E.L.L really needs to invest in some scrambling tech.
  • Soldier standing looking at a wall with his back open to attack.

    Apparently this soldier is so afraid he decided to literally hide in a corner.

  • I feel there is less variety in killing enemies. They did add stealth take-downs which are very fun, but they took away a lot of environmental ways to kill enemies (see point one). One of the most memorable moments in Crysis 1 is collapsing an outhouse on a soldier, killing him in a most uncomfortable manner. Crysis 2 had none of that. Even picking up enemies and throwing them is less fun.
  • Fewer modes – strength and speed modes on the suit were combined into the jump and sprint abilities and their effects lessened. I guess this is done to make the game easier for people who have fewer than 10 buttons to work with. I think that as a design decision it makes the game easier to play but again removes variety, which is bad.
  • Less visual and locational variety – it might be because of the location; New York City is not as visually varied as a tropical island with a spectacular alien cave and a frozen side. The visuals are actually very good, colorful and a notch above the usual brownish palette of modern FPSs, but still not as good as Crysis 1.
  • I feel like the story got in the way of playing while not being good or memorable. In Crysis 1 there wasn’t much story, which also meant fewer immersion-breaking cut scenes and unskippable sequences, which is pretty good. I could write a whole post only about that, but Shamus Young already did and I agree with pretty much everything (also don’t miss the second part about player volition).

Despite all of that, I really loved playing Crysis 2. After modding the game to include quick saves, I must say the basic Crysis formula is still there and is still as fun. I really love some of the new features, like the silencer attachment not taking you out of stealth. Also, detaching mounted machine guns really lets you feel like the superhero you are, especially when you combine it with armor mode and just walk in the open mowing down enemies like an uber-charged Heavy from TF2. It’s pointless to comment on the graphics because the Crytek engine is synonymous with amazing views, even on a 4-year-old PC like the one I was playing on. Another thing I liked is that the aliens in this game were pretty interesting and had some scary abilities like jumping from place to place, charging, and EMPing your energy and cloak away. These could have made for a good experience if only the enemies themselves weren’t dumb as hell and pretty much on rails most of the time.

Using the environment to guide the player forward.

To summarize I enjoyed the game about half as much as Crysis 1 but it still was a fun experience. It is also interesting to see how the series is moving away from its PC start and into Console land (with PC as a secondary platform) and the gameplay consequences of this business decision.

Practical Programming: Technical Debt

This is part of the “Practical Programming” series in which I try to say something profound about the craft of programming.

Technical debt is a very important factor in any software project yet you might never even encounter it until you start working in the field and having long-term projects. Even then, technical debt is something that creeps up. It’s not a bug, nor a feature you forgot or a meeting you dozed off in. It is more of a feeling, a state, a cup that slowly fills up until it overflows.

My definition of technical debt is the difference between the code right now, and the code if it was written perfectly by a thousand coders with unlimited time. This debt includes missing documentation, no tests of any kind, no release process, no design, no consistency in the code and other such issues that are usually regarded as secondary to actual working code. The metaphor is aptly named because it also has interest – the more the technical debt grows, the harder it is to scale development, be it adding new features, adding new developers to the team or being efficient in solving bugs. In the end you go broke and declare code bankruptcy – a rewrite.
(Some great resources that explain it in more depth: Martin Fowler’s article, c2 wiki page)

Let’s unwind the stack of metaphors and get back to why technical debt is a problem – to create a functional program all you need is code that compiles, but to create a sustainable development environment you need code, design, documentation, tests, deployment procedures, etc. This difference is the technical debt of your project. If a lot of those basic things are missing from your project it’ll get tougher and tougher to make even the smallest change without ruining something. As with any debt, the interest will keep on climbing until it is obvious that a rewrite is easier than fighting the code. This step always comes. You can ignore it and expect developers to work around these problems, but eventually this will stop being cost-efficient – bugs will pop up everywhere, taking time from developing new features, and new features will be buggy and produce unexplained behavior (the old “compile it and let’s see what happens”).

There are several ways to lessen the burden of technical debt. The easiest is just slowing down the development process and allowing time for basic software development practices. This is definitely the preferred way to do your projects (hehe). The other extreme is rewriting the whole code base, adding 1 to the version number and starting a marketing campaign – “new packaging”.

I’ve recently had to think about it and I’m advocating a hybrid solution – allocating time in each development sprint to pay the technical debt of a specific class or package in your code base. I have summarized my approach in five easy steps:

  1. Documentation – create a page for the package using whatever documentation repository you use. Document the following about the code:
    1. How should it work? This means a general overview of the correct working procedure for the code and comments, if there are any, about how the current implementation differs from this ideal.
    2. Class level documentation – a few words about each class’s responsibilities and its interactions with other classes. If there are more than 3 classes a diagram might be needed.
    3. Other things that will need to be documented: XML and other data exchange formats, client server interactions, security measures, relevant databases, etc.
  2. Code Documentation – Document the code itself:
    1. Documentation for classes and all public functions. Use your platform’s most popular documentation framework.
    2. Document algorithms and private functions as needed.
    3. Add a header to the file. A header should contain who is responsible for the code (usually the author), a date and a short description of the file’s content. Some places will also want the license added to the code.
  3. Unit-test the code – Write at least one unit test for every public function in the class. Try to hit the major code paths. This will make refactoring easier because you’ll have the safety net of knowing the tests pass, and having one unit test makes the mental barrier of adding more way smaller.
  4. Make easy and safe refactorings:
    1. Magic values into global constants. Strings to external files for easy L10N.
    2. Don’t Repeat Yourself (DRY) problems – if some code is copy-pasted, make it into a helper function / its own module / whatever. Sometimes this needs to be said… I know.
    3. Run Lint on your code and see what is easy to fix.
  5. Code Review – Review the code and document the major changes you’ll need to make to make it more robust. Create tasks, put them in the backlog and allocate time for them to be fixed alongside bug fixes and features. This might seem like cheating because you’re still deferring the problem, but knowing what needs to be done is half the battle. If you have it among your other tasks it is easy to schedule it and consider it part of development instead of some concept that doesn’t contribute to the push forward.
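To show how small the unit-testing and magic-value steps can be, here is a toy sketch in Python. The function, the constant and the numbers are invented for illustration and do not come from any real code base; imagine the 17 used to be a bare literal buried in the middle of the function.

```python
import unittest

# Hypothetical example: this used to be a bare 17 inside the function body.
VAT_PERCENT = 17


def price_with_vat(net_cents):
    """Return the gross price in cents for a non-negative net price in cents."""
    if net_cents < 0:
        raise ValueError("net_cents must be non-negative")
    return net_cents * (100 + VAT_PERCENT) // 100


class PriceWithVatTest(unittest.TestCase):
    """One test per major code path of the public function."""

    def test_adds_vat(self):
        self.assertEqual(price_with_vat(10000), 11700)

    def test_rejects_negative_price(self):
        with self.assertRaises(ValueError):
            price_with_vat(-1)
```

Assuming this lives in a file matching the default discovery pattern (say, test_prices.py), running python -m unittest picks it up. Two tests are not much of a safety net, but once the file exists, adding a third takes a minute.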

Is there a way to write software without getting in debt? Probably, but it might not be practical. Let’s not have perfect get in the way of good enough and ask instead: is there a way to write software without getting into much debt? Of course. The best way is identifying those moments where you are deciding between a “quick and dirty” solution and a slower but better solution, and understanding that the “quick” in “quick and dirty” is only short-term and it might be slower in the long run because of the effects of technical debt.

Reflections: UnUrlShield – Fighting CAPTCHAs

Update: A challenger appears. My security researcher friend Fox has challenged me to a duel. See her blog post for the details.

This is part of the Reflection series in which I go through my old projects and tell their story.

CAPTCHAs have become an integral part of the web in the last few years. Almost everyone on the web has encountered those twisted pictures, probably when signing up to an email service. They come in various shapes, sizes, colors and cats. When they first became popular, there was an explosion of different types of schemes that services used (who can forget Rapidshare’s cat captcha?).
Now, as with every security measure, there is a compromise between usability and protection. Some of the easier CAPTCHAs were broken using only OCR software, while some of the latest reCAPTCHA images are hard even for a human to solve (interesting but outdated chart).

Various captchas from wikimedia.

One such service was UrlShield. You would give UrlShield a URL you want to protect from bots and it created a page with a CAPTCHA that when solved correctly redirected you to your original URL. Simple enough. I can certainly see a use for such a service, for example if you want to give out promotional coupons and don’t want bots to snatch all of them. The service became popular in some file sharing sites for the same exact reason.
The particular image this site was generating had a checkerboard background with 4 characters all in different colors, sometimes overlapping. It was pretty easy for a human to parse it.
Example of UrlShield generated images and their OCR.
It even works pretty well against OCR. I used Microsoft OneNote’s OCR feature, which uses the commercial OmniPage software, to create the second column.

So far so good? Well, no. This scheme is flawed because it is easy to transform the image – remove the background and segment it (split it into regions that each contain a single character), allowing OCR tools to easily get the letter. To remove the background you just clear all the black pixels out of the image. To segment it, all you need to do is choose one color and mask all the others, which means you’ll end up with a single letter, as each letter is in a different color. This is what you end up with:
Parsed captcha images
OneNote has no problem parsing each of these to a letter.

The process described above is exactly what UnUrlShield does. It’s a simple Python script that uses the Python Imaging Library to read the image. Then it counts all the colors that appear more than a certain threshold (MIN_PIXEL_PER_LETTER_COUNT) and saves each color’s pixel locations. Lastly it goes through the colors, creating an image with only that color’s pixel locations.
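Here is a rough sketch of that idea, rewritten from memory rather than copied from the script. To keep it free of dependencies it works on a plain grid of RGB tuples instead of an image loaded through the imaging library, and it assumes the checkerboard is pure black and white; the tiny threshold and the toy image are made up for the example.

```python
from collections import defaultdict

BLACK = (0, 0, 0)
WHITE = (255, 255, 255)
# Assumption: the checkerboard background is made of exactly these two colors.
BACKGROUND_COLORS = {BLACK, WHITE}
# Tiny value for the toy image below; a real CAPTCHA needs a far larger one.
MIN_PIXEL_PER_LETTER_COUNT = 3


def segment_by_color(pixels):
    """Return {color: mask}, one black-on-white mask per frequent color."""
    locations = defaultdict(list)
    for y, row in enumerate(pixels):
        for x, color in enumerate(row):
            if color not in BACKGROUND_COLORS:
                locations[color].append((x, y))

    height, width = len(pixels), len(pixels[0])
    masks = {}
    for color, points in locations.items():
        if len(points) < MIN_PIXEL_PER_LETTER_COUNT:
            continue  # too few pixels to be a letter, treat it as noise
        mask = [[WHITE] * width for _ in range(height)]
        for x, y in points:
            mask[y][x] = BLACK
        masks[color] = mask
    return masks


RED, BLUE, GREEN = (255, 0, 0), (0, 0, 255), (0, 255, 0)
toy_image = [
    [BLACK, RED, WHITE, BLUE],
    [WHITE, RED, BLACK, BLUE],
    [BLACK, RED, GREEN, BLUE],  # the lone green pixel is dropped as noise
]
masks = segment_by_color(toy_image)
```

With Pillow installed, the grid could be filled from Image.open(path).convert("RGB") using getpixel, and each mask written back out as its own image for the OCR tool.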

Is there a lesson here about CAPTCHAs? I think so. UrlShield is now some kind of ad/malware site. Even complicated CAPTCHAs can be broken, or even better – be defeated by side-channel attacks like having an army of low-cost workers break them on demand (the comments of this article are a treasure trove of irony), and sometimes people are even fooled into breaking CAPTCHAs. This is why it amazes me they are still around, annoying regular people while also being broken by even slightly motivated attackers.
Are there no solutions to spam? Of course there are! In fact gmail does a great job at stopping 100% of my spam using things like blacklisting known spammers, Bayesian filtering, “crowd-sourcing” protection (the “mark as spam” button) and other tools that don’t rely on CAPTCHAs.

Do you have good examples of silly, easily broken or bizarre CAPTCHAs? Did you find an easy way around some services blocked by CAPTCHAs? Leave a comment below and tell me about it!