Categories
Programming

Frindcare, some improvements

After getting to a working level with Friendcare, my web app to track friending and defriending on Facebook, I was still doing some tweaks. Some of the tweaks is to improve functionality, some of them to fix some broken behaviour by the services I use (especially Facebook). This is a little summary of what I have learned in this short time.

Changes

Heroku

The very day of my last blogpost, when I actually had others sign up as well, Heroku had a few hour long outage. That was quite unfortunate, and it was annoying to see the “Application Error” page, which pretty much hides what really happened. In real production environment probably I would have to run things on multiple independent platforms so if the platform itself is the one that crashes, the site could be just switched over. This time I’m just taking it simple, make a personalized maintenance page, which I can turn on in this case, or when I’ll really need some maintenance time. Also, played around Google Web Fonts, so can have the good old school Press Start 2P on this one.

Friendcare is offline for maintenance
Friendcare maintenance screen

It would be nice if in case of trouble I could use a web interface to trigger maintenance mode. Might build a service for that, though now that I think of it, that will have to be hosted outside of Heroku. Also, later might do an application error page as well, just in case. Might even choose a mascot for that. Though probably not.

Fonts

Besides the error page, I was playing around with other fonts for the main interface as well. In the end the current one is Crimson Text, which should be quite readable even with very different font size. It is mostly good, though it doesn’t play completely well with Twitter Bootstrap, the button texts for example are not totally well aligned vertically. Might have to look around the fonts a little bit more.

Friendcare header layout
Tweaked layout, and it actually already has results

Facebook

Facebook has so many weird things going on, that I can barely wrap my head around it. One of my impression, which probably shouldn’t surprise me, that they are most likely not using their own APIs, otherwise it wouldn’t have such huge bugs in it. Another impression is, that somehow they manage make everything almost good, but in some way completely bad.

Graph API reliability

Since the entire site relies on regularly polling the user’s friend list, getting that list consistently is a must. Somehow the Graph API occasionally misses some friends: they don’t appear in the list and I mistakenly assume as lost connections. An hour later, when polling again they are back in the list, so they show up as friendship gain. That doesn’t work very well. This happens a lot, so I had some algorithms in place for that and sometimes had to manually clean up the database (which I really shouldn’t do if it can be avoided).

FQL

I can also use the Facebook Query Language to get the information I want. It is quite straightforward, they even have the expression among their examples:

SELECT uid2 FROM friend WHERE uid1=me()

This is pretty fast as well, maybe half the time of the Graph API request. There’s one problem, though: it returns wrong results. Not entirely wrong, just wrong enough. The list I receive seem to have a bunch of invalid user IDs, such that it doesn’t belong to anyone. They just don’t exists, but I still receive them. So while Graph API suffered from false negatives, FQL suffers from false positives. Fortunately there seems to be a workaround, in the form of a deeper query such that

SELECT name, uid FROM user WHERE uid in (SELECT uid2 FROM friend WHERE uid1=me())

Here I will actually throw away the ‘name’ part, but at least I know that the IDs that I receive do belong to actual people. So far this seems to be the most reliable, though it takes as much time as the Graph API request, or maybe a bit more.

In the end, currently I query both the Graph and FQL in sequence, compare the results, note if they differ (for debugging purposes) and use the FQL result since that seems to be more reliable. Will check back later to this part, when I have more information to go on.

Offline access

Offline access is pretty important for this kind of service, because it’s not that useful if we can only check the results when the user comes to the page. Unfortunately, the ‘offline_access’ permission has just been removed. Instead the access token lives for 60 days, after which one has to get a new one by logging the user in again. I’m not saying it’s unreasonable, maybe even better since I don’t have to ask for any extended permission over the ones that are granted to every app automatically, but have to keep in mind. It might complicate things. Also reminds me that other apps that I’m using with Facebook (e.g. ifttt) will have some problem because of this thus I will have problem with this. Better check with them.

Send message prompt dialog

I kinda understood when they have removed the ability to send message to others on the user’s behalf without their interaction, that’s such a tempting spam delivery system for most, I presume. On the other hand, for a long time at least they had (or I seem to remember they had) a send dialog, where I could prompt a user to send a message to someone else: they write the actual message and they click send. On the other hand, the current Send Dialog has a required parameter ‘link’, this it is no longer possible to send “just a message”. This must be a relatively new chance, since the dialog’s page describes it as:

The Send Dialog lets people to send content to specific friends.

Fair enough, that’s exactly how does it work. On the other hand, just one level up, the Dialogs Overview says this:

The Send Dialog allows a user to send a Facebook Message to one or more of their friends.

Content vs. message, subtle but crucial difference. Thus in my app I cannot have a “send message to this user” button, unless I attach some kind of link to the message. Now this feels really spammy to me.

Overall

It is good to add a few more features, because I can use it better as well, and I learned a lot too. Might add a proper dashboard, a deregister option, notification to email or (if I can figure out how) Facebook message, stats, better layout. Or whatever suggestion I receive.

Finally, the most important thing that it works. In the last week I found two people who defriended me for whatever reason, and I wouldn’t have known about otherwise. They were not close ones, so not going to pursue them, but it could have been otherwise, so at least it’s good to know.

Categories
Programming

Facebook Hacker Cup 2012 Qualifier 1

This is that time of the year once again, when coders gather to take part in some good programming fun, the Facebook Hacker Cup. It’s only the first qualifier round, and while I hoped it will go better than last year, well, it didn’t. Not that I’m really surprised.

My Facebook Hacker Cup 2012 Qualifier score
I needed one right to qualify, but I wish I haven't messed up the easiest problem.

After that 72 hours results are in, as is the explanation and example source code for the solutions. It’s good that the example solutions are in Python, I might even learn a trick or too.

Alphabet Soup

The problem setting is easy enough. Funny thought that no matter how many times I counted the letters in the world “HACKERCUP”, I didn’t notice that there are two Cs. I mean, duh! As usual, the example input set was designed such that it wouldn’t trigger the bug of miscounted Cs. This carelessness is one thing that comes up quite often in my programming, probably should take better care of it.

Billboards

This problem actually worked, which means I’m in Round 2, but I guess it can be improved quite a bit, make it more efficient or come up with some heuristics. Or maybe it doesn’t matter much.

Auction

This problem was on a whole different level. While the first two had apparently more than 5000 and 3000 correct solutions respectively, this had only 28… I was thinking about it for quite a while, drawing diagrams, trying to use my intuition and imagination to see where the trick is since the naive O(N^2) algorithm is definitely unusable on the N~10^18 level. On the other hand, I might have tricked myself. Reading the solution the trick was completely different than I expected – I thought there’s some weakness in the random number generator that can be used to express everything analytically, while it is actually just about keeping good track of things. There’s no fancy algorithm to break this problem, just pure logical thinking. Now that’ll teach me as well…

Looking out

This of course means that I have a lot more to learn, and most likely I’m not cut out to be a Facebook caliber hacker. That’s no problem, but good to know. Whenever I think about it, the picture that comes to me is the hacking competition scene from The Social Network, where they hire their first employee. I’d love to be in the middle of such brainfest, such intense creation, such inspired learning from one another while having an an amazing time. Well, fortunately there are other places where I can have that experience, like the Startup Bus. And maybe, I can also set out to create that environment over here in Taiwan.

But first, let’s get ready for round 2, should make that one better.

 

Categories
Computers

Social networking exit strategy

This is the kind of thing I read so many times on Hacker News: someone manages a “page” on Facebook, only to have it disabled/deleted one day, out of the blue. The difference is that this time it was the page I was managing.

A few years back I took over the page for Inkscape, and updated it every now and then. Quite enjoyed it and even a relatively small following (3774 at the last count I had) people were very active. Then this week once when I logged in and wanted to send out an update – I just couldn’t find the page. Disappeared from the search, from the control panel, everywhere. I could only see the page that was created from its Wikipedia article, and there was the Inkscape “group” which is much smaller.

Inkscape
Draw up some good stuff with Inkscape

I was looking around for a long time for a support contact, and there was practically nothing. It’s a hell of a challenge for coming up with Google search keywords because there’s just too much noise for Facebook + support related terms (that’s a pretty bad sign). In the end I go to this form, that supposed to be used to report problems with a page. Too bad half of it I couldn’t fill out. What is my page’s web address? I don’t know, it was “on Facebook”. What happened to it? Have no idea, I haven’t had any notice about it. Attach a screenshot. I wish I could but did I mention that it disappeared? Anyway, after submission I’ve been told that I’ll receive an email and it is crucial to act on it, otherwise the procedure won’t go forward. That’s been 4 days ago, and no email since.

Contingencies

By now I have given up on getting it back, it’s just does not worth the effort. On the other hand, I do have another page, Ignite Taipei which we manage/organize together with a couple of friends. That would hurt much more if that would disappear, especially because there’s not much presence elsewhere on the net just yet.

Strange, but actually this whole issue bothers me much less than it would have about a year ago, even if I got locked into Facebook even more. Maybe I know now better that shit will always hit the fan, whatever you do, just have to be prepared for it.

Sometimes it's time to leave
Please exit in an orderly fashion

So, how about extending the idea of the World Backup Day to social identities? How can I prepare to lose the least in case my pages or even my profile gets blocked/hacked/deleted?

  • The personal information is the most valuable, who are my friends and how to reach them. Should write a script that can run somewhere in the background, backing that info up periodically. At least names and emails. Maybe even import them into GMail contacts right away…
  • Establish links with people outside of Facebook. I already use email instead of Facebook messages whenever I can, now I have to extend that really to everyone.
  • Build a proper site / blog for the pages I’m involved with and make people aware of it. Always have a point of contact outside of FB.
  • …. what else did I forget? (leave me a message in the comments if you have more idea)

Why Facebook?

Do I ever fret that Google might block me? That Tumblr would delete my stuff? That Twitter removes my account? Nope, didn’t even really occur to me, they are “not like that”. On the other hand, the power Facebook has over online identities makes people cringe. Why are they different? I feel it’s because they not at all transparent in their decisions and also quite arbitrary. Too many innocent were punished to go unnoticed, and every decision is final. Unless of course one is so big like Ars Technica who had contacts within the HQ.

I was reading an article lately how Facebook want to manage their growing pains algorithmically: more math instead of more people. To me that means that there will be just more arbitrariness and even less transparency…. I wish if instead they would have more support people and start to clean up the mess instead. But of course, if I’d know how to manage Facebook, I would have invented Facebook. :)

Categories
Programming

Hacker Cup Round 1 Redux

Last night’s Hacker Cup round, which was a second try of the first round after last week’s disaster, didn’t go quite as well for me as I hoped. Probably not that much for the others either, as even if 1000 people will advance, less than that have submitted code.  Can be disappointment due to last week’s failure, I too was hesitating to take part, but also because the problems were I think at least one level up. Which is not a problem, just observation.

by jailman @ Flickr
White Keyboard and Coffee by jailman @ flickr

Got my coffee at 1am, set in to mood and started off right on time at 2am.  In the next 3 hours I finished one of the problems, and got halfway with another one, which is below par, but what can I do. Need to learn more. One thing that I’ve noticed is if one just searched the web with the right keywords, Problem 1 and 3 could be handed over on a silver plate, more or less… Which is kind of random for a programming competition.  Did they want people who can find the right solutions or maybe someone who knows the solution for all of these problems alread? The first one begs the question whether there will be  Wikipedia access in the final. The second one begs the question: who can be such a person? And if neither – why choose such problems that cannot possible be solved in the allotted time without prior knowledge or external input?

Well, I wonder what would be the right keywords for Problem 2, which I could not solve yet.

I’m also bothered, that it seems that the input file one had to download for Problem 3 does not conform to the input specs. Why? Especially when if you download the file, the system gives you 6 minutes to submit. This is rather underhanded….

I will certainly try to check out the other two sub-rounds as well, even if they are 5am next Wednesday and next Sunday local time. Not doing this for the glory but for the education anyway. :)

Notes on solutions

Warning: Spoilers ahead. Don’t read if you want to solve them yourself.

Almost all my code for this can be found in this Github repo, together with other rounds’ code.

1) Wine Tasting

This one is more mathematics than anything. Let’s have G glasses of wine and minimum C guesses to be right. For i = C to  G calculate the number of : how many ways one can have good guesses, multiplied by how many ways none of the remaining guesses are right. Sum these numbers up, and that’s your result.

I was thinking for a long time how to calculate the number of ways none of the guesses are right (i.e. the number of permutations where non of the elements are at their original place). Finally I found there’s a name for that: derangement. And it is not at all simple, so I guess I wouldn’t have figured it out myself.

In the end, what we have to return is:
$$\mathrm{result} = \sum_{i = C}^{G}{G \choose i} (G – i)! \sum_{j = 0}^{(G-i)} \frac{(-1)^{j}}{j!}$$

Now that it actually makes sence, it turned out this very expression has a name too: recontres numbers.

Update: fixed my program and now should be correct.

This can be a very large number due to the factorials, so got to use the right types. In my python code I had to use 64bit numbers, but it’s rather ad-hoc, just choose the ones not to break. I do have to revisit this again and fix the types.  (And don’t forget the modulus).

Also, I used SciPy to get the results, but only needed the factorial. If I quickly write a fast factorial subroutine then SciPy is not needed – another thing to fix in my code.

It turned out that my code has failed, so basically this round is 0 out of 3. Nevertheless, wanted to see what was wrong, so I rewrote most of the math code. After messing around for a while, it turns out that all my problems came from

from __future__ import division

which is basically the newer, py3k-style integer division code. If I excluded that, suddenly everything was fine. I really have to remember this, because I used this include in practically every math-related code I wrote lately. Besides this change, I kept the other upgrades: fast factorial and choose. My output for the given input file has md5sum of d17dfc9e9fef637f771da3d693d7920b

2) Diversity Number

This one I failed on, and seems very few people actually succeeded in general (less than 60 people in the whole round). I start to understand bits and pieces of the possible solution, but I have nothing complete yet.  I wonder if there’s such an insight into this one as in the previous problem with the derangement.

Anyway, one thing I know: if there’s a list A, sorted, with length n and index starting at 0, then the diversity number is
$$\mathrm{diversity} = \prod_{i=0}^{n-1} \left(A_{i} – i\right)$$
or in Python

prod = lambda array: reduce(lambda x, y: x*y, array)
diverse = lambda array: prod([array[i]-i for i in xrange(len(array))]) if len(array) > 0 else 0

Now just have to figure out, what to do, not to enumerate all the $$2^{100} > 10^{30}$$ subsequences in the worse case scenario. :-?

(I’ll update this when I’ve found the solution).

3) Turn on the Lights

This one is pretty much straight out of the game Lights Out, with the only twist that one has to turn on all the lights instead of turning off. This I found only well after the round ended but this makes the puzzle a bit of a duh for me, nevertheless it is worth doing, since there are some optimization choices that are not trivial.

The method I used to get the solution, which follows most of the steps as outlined on the above link. Let’s call the button in row i and column j as [i, j].

  1. While importing the input, turn ever “on” into “off” to get the right setting for this problem.
  2. Turn off every light down to the bottom row. This is done by pressing button at [i, j] if [i-1, j] is on. Save all the buttons that are pressed.
  3. Check the state of last row at that point and keep it as a “target”.
  4. Starting from a blank puzzle (all lights off), turn on a single line in the first row and propagate it down just like in step 2. Save the last row. Do this for all of the lights in the first row, which will create the “base vectors” of the solutions we seek.
  5. We have to find the combination of those base vectors that equals to the above “target”. This can be done (i suppose) with a Gaussian elimination kind of method, but that’s just too much to implement for this little puzzle. According to the specs, the maximum number of colums is 18, thus there are $$2^{18}=262144$$ sett different configurations of lights at max.  That is not too bad, if there’s a fast comparison method, one can do a brute force search. That’s what I did. Also, I made use of binary expression of the light settings, e.g. switching lights can be done with XOR-ing the light pattern and the switch pattern.
  6. If found a match in the previous step then do the swicthing in the first line according to that result and do the propagation again. The final result should be an all-off board. Save all the button presses again.
  7. Check the pressed buttons. Since pressing a button twice is equal to not pressing it at all, if a button is pressed N times it is equivalent to N mod 2. Sum up all those presses, and that should be less than (rows * columns).

This is probably not the quickest method due to the brute force search, but it should be good enough for a programming competition where the speed of programming is more important the the speed of the program.

I do have one big complaint that I mentioned int the intro: the test file downloaded from Facebook did not follow the input specifications. In the specs every puzzle is a single line, the rows of the puzzle separated by whitespace. In the downloaded file for every puzzle every row is a new line.

The md5sum of my solution for the downloaded input file is 1e38422048a6aa9aeb007955d8b66f46