Thursday, November 5, 2009

Wrong way of thinking about it?

A lot of people are thinking about misinformation and about creating networks of people. But as long as geeks are fairly well distributed, they will be posting this stuff on the internet because they will be so excited to see a balloon. News organizations will probably report on it too. Maybe all that needs to be done is to gather the data off the web in real time. Then, if necessary, call somebody near the location and ask them to confirm.

Charity

Some people started by saying they would give you a cut if you were the first person to submit the location of a balloon (there's an authentication thing there: What, no you weren't first!). Lots of teams are saying they will donate the proceeds to charity. But why stop there? If it's not about the money, it's about the fame. Who says you shouldn't pay money for that? It seems rational to pay out more than the total value of the prize. Think of all of these rocket ship competitions (e.g. the Ansari X Prize). Certainly the prize money did not cover the competitors' expenses. But then again, they are sending ships up into space, and you are finding red balloons.

$40,000 or $1M

Why isn't the prize $1 million?

Some of the previous DARPA challenges have run that high.

Could be budget. My guess: Nobody's proven that this is actually "DARPA-hard". So the solution could turn out to be a bit embarrassing. Kind of like these "Battlebots" programming contests where students program robots to fight each other in an online environment. The optimal solution was to program your robot to hide while the other robots killed each other.

Authenticating Locations

Okay, so suppose that somebody says they know the location of one of the balloons, and gives you a site. How do you authenticate? Easy: go to the web and look up the phone number of a random person who lives near the location. They'll probably check it for you, because it sounds wild and more exciting than whatever they are doing at that moment. If you're not sure they'll do it, call up several people and have them check in parallel. It's very unlikely that a randomly picked person will even know about the contest. You can ask them for some detail of the scene to confirm. Ideally you would tell them to ask the DARPA agent some question whose answer is not available on the web. Of course, this only works during the contest. After the contest, it's up to documentation: is there a photo, some kind of cryptographic proof, does the DARPA site confirm?

Submit one at a time; or all at once; Notification or no Notification?

The DARPA site actually says:

"Starting December 5, submit locations to the web site immediately after you find them."

This implies that you may actually submit them one at a time. If so, they will need to limit the number of submissions you make. According to my previous calculations, there are only 2.5M or so points. Presuming that they limit the number of submissions, but not all the way down to 10, it may also be valuable for people to pool locations where they are sure there are no balloons. One could imagine some kind of statistical algorithm that collates the data and comes up with likely candidates based on both positive and negative reports. It would also help identify areas that have been under-explored.
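One way such a collation might work is a naive Bayes update per grid cell, treating every report as independent. This is only a sketch: the function name `score_cells`, the prior (10 balloons over roughly 2.5M points), and the `hit_rate` and `false_alarm_rate` values are all assumptions made up for illustration, not anything measured.

```python
import math
from collections import defaultdict


def score_cells(reports, prior=10 / 2_500_000, hit_rate=0.8, false_alarm_rate=0.05):
    """Combine positive and negative sighting reports into a probability per cell.

    reports: iterable of (cell, saw_balloon) pairs, where cell is any hashable
    label for a one-arc-minute grid square and saw_balloon is True or False.
    All three numeric parameters are illustrative guesses.
    """
    # Every cell starts at the prior log-odds of containing a balloon.
    log_odds = defaultdict(lambda: math.log(prior / (1 - prior)))
    for cell, saw_balloon in reports:
        if saw_balloon:
            # A sighting is more likely if a balloon is really there.
            log_odds[cell] += math.log(hit_rate / false_alarm_rate)
        else:
            # "Nothing here" is more likely if the cell is empty.
            log_odds[cell] += math.log((1 - hit_rate) / (1 - false_alarm_rate))
    # Convert log-odds back to probabilities.
    return {cell: 1 / (1 + math.exp(-value)) for cell, value in log_odds.items()}


if __name__ == "__main__":
    reports = [("A", True), ("A", True), ("B", True), ("B", False), ("C", False)]
    for cell, p in sorted(score_cells(reports).items(), key=lambda kv: -kv[1]):
        print(cell, f"{p:.2e}")
```

Cells that never show up in the result have no reports at all, which is one crude way to spot under-explored areas.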

Actually, if DARPA confirms each submission (i.e. yes, you got one right!), then one attack is simply to have many users (or a captcha-killer script) create accounts and exhaustively submit all possible locations to the site, compiling a list of which ones are accepted. So it would probably not be wise for DARPA to confirm individual submissions.

Monopoly Pieces; and Sharing

The ten balloons are not unlike McDonald's Monopoly pieces, except that the relative values of the balloons haven't been published. As with the Netflix competition, it seems conceivable that at a late stage various groups may share information or even merge in order to get access to a greater number of Monopoly pieces. However, since $40,000 is unlikely to amount to much given the number of contestants, it seems like the real prize is the fame associated with leading the team that wins. As a result, I believe it's more likely that people will merge (versus share), because merging decreases their probability of losing.

In a simple analysis, it would appear that the winning competitor should work as hard to prevent others from winning as they do to win themselves. For instance, by having 40 people scattered across the US, each hanging out with a big red balloon and a large DARPA poster, they make it massively harder for others to win. This matters most in the final "brute force" phase of the contest, where people iterate through combinations of potential hits in order to find the actual ten. Given the time frames involved, people will not necessarily be able to revisit the balloon site to confirm that it was actually a DARPA representative and not just somebody with a DARPA poster. Presumably DARPA could handle this issue by having the agent provide people with a cryptographic signature that can be validated on the DARPA site. However, doing so makes the problem a lot less interesting.
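To make the idea concrete, here is a minimal sketch of what such a validation could look like. It swaps in an HMAC (a keyed hash, not a true public-key signature) because that is what Python's standard library offers. The secret, the balloon identifiers, and the function names are all hypothetical, and nothing here reflects how DARPA actually runs the site.

```python
import hashlib
import hmac


def make_token(secret: bytes, balloon_id: str) -> str:
    """Token the (hypothetical) agent at a balloon would hand to visitors."""
    return hmac.new(secret, balloon_id.encode("utf-8"), hashlib.sha256).hexdigest()


def verify_token(secret: bytes, balloon_id: str, token: str) -> bool:
    """Check a submitted token on the (hypothetical) contest site."""
    expected = make_token(secret, balloon_id)
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(expected.encode("utf-8"), token.encode("utf-8"))


if __name__ == "__main__":
    secret = b"known only to the organizers"  # assumption: a shared secret
    token = make_token(secret, "balloon-3")
    print(verify_token(secret, "balloon-3", token))       # genuine agent
    print(verify_token(secret, "balloon-3", "made-up"))   # poster impostor
```

Somebody with just a poster and a balloon could not produce a token that verifies, since they would not know the secret.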

The Last Two Balloons; and Eliminating False Positives

Depending on how the DARPA submission website works, one possibility for finding the remaining one or two balloons is to brute-force submit every minute/degree combination: something around 14*60*49*60 = 2.5 million combinations. However, because of false positives, it may be necessary to use some degree of brute forcing just to select among the candidates. For instance, if we had 30 candidates, we would have 30 choose 10 = ~30 million possibilities to try.
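Both counts are easy to check. The sketch below just redoes the arithmetic with the figures from this post. The post does not say which of the two spans is latitude and which is longitude, so the variable names are deliberately neutral.

```python
from math import comb

MINUTES_PER_DEGREE = 60

# The two degree spans used in the post (14 and 49).
span_a_degrees = 14
span_b_degrees = 49

# One candidate point per arc-minute in each direction.
grid_points = (span_a_degrees * MINUTES_PER_DEGREE) * (span_b_degrees * MINUTES_PER_DEGREE)

# Number of ways to pick the 10 real balloons out of 30 candidate sightings.
candidate_sets = comb(30, 10)

print(f"grid points:    {grid_points:,}")
print(f"candidate sets: {candidate_sets:,}")
```

Note that picking 10 out of 30 candidates is already about twelve times more work than sweeping the whole grid one point at a time, provided the site confirms single submissions.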

Satellites

In this post, we'll examine using satellite footage as an option.

Some organizations, particularly governmental organizations, may have access to semi-real-time satellite feeds that could potentially scan large areas of terrain for the balloons. Computers could scan this imagery for red regions (or, depending on the resolution, a red pixel!). The continental US is 3,119,885 square miles, which is a lot to scan.

However, there are a few problems with this approach. We look at a few, in increasing order of importance.

Resolution: Not all satellites have the resolution to resolve 8 ft objects. However, this is certainly within the resolution of modern commercial and spy satellites.

Processing: 3,119,885 square miles is roughly 8.7x10^13 square feet; at a byte per square foot, that's about 87 terabytes of data for one snapshot of the continental US. Storage and processing-wise, if you have access to satellites with these capabilities, it's not likely to pose huge challenges.
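The unit conversion can be checked in a few lines. The area figure is the one quoted in this post, and the byte-per-square-foot budget is the same back-of-the-envelope assumption.

```python
FEET_PER_MILE = 5280
CONTINENTAL_US_SQ_MILES = 3_119_885  # figure quoted in the post

# Square miles to square feet.
square_feet = CONTINENTAL_US_SQ_MILES * FEET_PER_MILE ** 2

# Assumption: one byte per square foot, and decimal terabytes (10^12 bytes).
terabytes = square_feet / 1e12

print(f"{square_feet:.2e} square feet")
print(f"{terabytes:.0f} TB per snapshot")
```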

Coverage: It's not clear that we have enough satellites to pull in this kind of imagery. This link suggests that one commercial satellite can cover 200,000 square miles per day. The continental US is 3,119,885 square miles, so optimistically, for a 6 hour period, we are talking about more than 60 such satellites. I'm sure someone who is more of an expert on satellite imagery will be able to comment on this. It's feasible, but obviously only a select group would have access to this kind of resource.
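Here is the same estimate as a quick calculation. It assumes that a satellite's coverage scales linearly with time and that no two satellites ever image the same ground, both of which are optimistic.

```python
import math

CONTINENTAL_US_SQ_MILES = 3_119_885       # figure quoted in the post
SQ_MILES_PER_SATELLITE_PER_DAY = 200_000  # figure quoted in the post
WINDOW_HOURS = 6                          # length of the balloon display

# Area one satellite covers during the contest window.
coverage_per_window = SQ_MILES_PER_SATELLITE_PER_DAY * WINDOW_HOURS / 24

# Whole satellites needed to cover everything once.
satellites_needed = math.ceil(CONTINENTAL_US_SQ_MILES / coverage_per_window)

print(satellites_needed)
```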

Cloud Cover: As you can see here, cloud cover could well be a problem, at least for identifying all balloons. However, you might be able to get some leads in the clearer areas of the US.

False Positives: One issue is whether a given balloon is actually one of the balloons we are interested in. Surely there are more than 10 red balloons floating in the US. One approach is to diff the image on the contest day against images from previous days. Although this could narrow it down, even if things move a little, there would still be many false positives. More on this issue later.
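A toy version of the diffing idea might look like the following. It assumes the two images are already perfectly aligned and lit the same way, which real satellite imagery would not be, and the colour thresholds in `is_red` are arbitrary guesses. Images are plain nested lists of (R, G, B) tuples so that no imaging library is needed.

```python
def is_red(pixel, min_red=180, max_other=90):
    """Crude test for a strongly red pixel; thresholds are illustrative guesses."""
    r, g, b = pixel
    return r >= min_red and g <= max_other and b <= max_other


def new_red_pixels(before, after):
    """Return (row, col) positions that are red in `after` but were not in `before`."""
    hits = []
    for row, (old_row, new_row) in enumerate(zip(before, after)):
        for col, (old_px, new_px) in enumerate(zip(old_row, new_row)):
            if is_red(new_px) and not is_red(old_px):
                hits.append((row, col))
    return hits


if __name__ == "__main__":
    grass, barn, balloon = (40, 120, 50), (200, 30, 30), (230, 20, 25)
    yesterday = [[grass, barn], [grass, grass]]
    today = [[grass, barn], [balloon, grass]]
    print(new_red_pixels(yesterday, today))  # the barn is ignored, the new red spot is not
```

The red barn drops out because it was already red the day before, but a red car that parked overnight would still show up as a candidate.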

Conclusion: Satellite footage could potentially give a well-positioned group an image of the balloons; however, it would have to be combined with other approaches to handle cloud cover and false positives. It could, for instance, be used to confirm reports of a balloon, or to serve as a starting point for a wider social-networking approach. However, the issue of false positives is quite worrisome.

DARPA Network Challenge

The DARPA Network Challenge has begun. Well, you might say, it really begins December 5, but whoever accomplishes the challenge will surely be planning things right now.

If this challenge is anything like the Netflix challenge, there will be a bunch of semi-professional research teams that will actually be the true contenders.

Here are the rules in more detail:

"The challenge is to locate ten moored red weather balloons located at ten fixed locations in the continental United States. Balloons will be in readily accessible locations, visible from nearby roadways and accompanied by DARPA representatives. All balloons are scheduled to go on display at all locations at 10:00AM (ET) until approximately 4:00 PM (local time) on Saturday, December 5, 2009. Should weather or technical difficulties arise with the launch, the display will be delayed until Sunday, December 6 or later, depending on conditions. If, for any reason, the balloon is displayed in one location then moved to a second location, either location will be accepted. Entrants are required to register and submit entries on the event website. Latitudes and longitudes are entered in degree-minute-second (DDD-MM-SS) format as explained on the website Coordinates must be entered with an error of less than one arc-minute to be accepted. In the event that one or more balloons is never displayed, this fact will be noted on the event website and the rules adjusted accordingly."