Sunday, March 29, 2009

SOAP over HTTP, SOAP over Bus, and SOAP over Fedex

I was at a talk by Genevieve Bell (an anthropologist), and one example I remember is how India provided connectivity to a few rural villages (for the E-Seva initiative, I believe).

The village did not have connectivity, but it had a bus that went back and forth to the nearest town every day. What they did was install a Wi-Fi receiver in the bus. When at the village, the bus automatically reads messages from a (Wi-Fi enabled) computer located at the bus stand; when at the town, it sends all the messages to the Internet and brings back whatever messages the world has sent in reply.
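The bus is essentially a store-and-forward relay. Here is a minimal sketch of that idea; the class names (`Node`, `Bus`), the method names, and the message format are all made up for illustration and are not taken from the actual deployment.

```python
from collections import deque


class Node:
    """A fixed endpoint (village kiosk or town gateway); hypothetical model."""

    def __init__(self, name):
        self.name = name
        self.outbox = deque()  # messages waiting for the next bus visit
        self.inbox = []        # messages the bus has delivered

    def send(self, message):
        # Nothing leaves immediately; the message waits for the bus.
        self.outbox.append(message)


class Bus:
    """Physically carries queued messages between nodes."""

    def __init__(self):
        self.cargo = {}  # destination name -> list of messages

    def unload(self, node):
        # Deliver everything addressed to the node we are parked at.
        node.inbox.extend(self.cargo.pop(node.name, []))

    def load(self, node, destination):
        # Pick up everything the node has queued for the other end.
        while node.outbox:
            self.cargo.setdefault(destination, []).append(node.outbox.popleft())


village, town, bus = Node("village"), Node("town"), Bus()

village.send("request: status of application 42")

# Bus at the village: deliver (nothing yet), then collect the request.
bus.unload(village)
bus.load(village, destination="town")

# Bus at the town: deliver the request, let the town answer, collect replies.
bus.unload(town)
for msg in town.inbox:
    town.send("reply to " + msg)
bus.load(town, destination="village")

# Bus back at the village: the reply arrives one round trip later.
bus.unload(village)
print(village.inbox)
```

The point of the sketch is that neither end ever talks to the other directly; the latency of every exchange is one bus round trip.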

India is going to have (maybe already has) satellites that cover its breadth, and then this solution will be obsolete. Nevertheless, it strikes me by its simplicity (or rather, by how out-of-the-box it is). Different versions of the same idea can be found in other places. For example, in PolarGrid the use case is as follows (from what I heard): there is a lot of equipment installed across a large region that continuously collects data, and rather than setting up a communication network, a small plane flies through the area and collects data from the equipment via Wi-Fi. Similarly, Professor Tanenbaum said "never underestimate the bandwidth of a station wagon full of tapes hurtling down the highway", and in the paper "Above the Clouds: A Berkeley View of Cloud Computing" the authors suggested that it could be cheaper to FedEx the disks to the cloud computing provider.
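To get a feel for the "bandwidth" of a shipment, here is a back-of-the-envelope sketch. The numbers (ten 1 TB disks, 24 hours door to door) are assumptions picked for illustration, not figures from the paper.

```python
def effective_bits_per_sec(num_bytes, transit_seconds):
    """Average throughput of physically shipping num_bytes of data."""
    return num_bytes * 8 / transit_seconds


# Hypothetical shipment: ten 1 TB disks delivered overnight.
payload_bytes = 10 * 10**12
transit_seconds = 24 * 60 * 60

rate = effective_bits_per_sec(payload_bytes, transit_seconds)
print(f"Effective bandwidth: {rate / 1e6:.0f} Mb/s")  # prints 926 Mb/s
```

The latency is terrible (a full day before the first byte arrives), but the throughput scales with however many disks fit in the box.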

Also, another set of use cases is emerging. With projects like the Large Hadron Collider and large telescopes, the size of data is going out of bounds. For example, the March issue of IEEE Spectrum reported an optical receiver running at 640 Gb/s. With these systems, petabytes (10^15 bytes) of data are common. The problem is that even 10 Gb/s Ethernet (yep, TeraGrid is connected via 10 Gb/s Ethernet) needs at least 13 minutes to send 1 terabyte, and more than 9 days to transfer a full petabyte. Therefore, it might be common in the future that we literally receive a container full of data.
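The transfer times can be checked with a few lines of arithmetic. This is an idealized sketch: it assumes decimal units (1 TB = 10^12 bytes) and a link that runs at its full rate with no protocol overhead, so it gives a lower bound and real transfers will take longer.

```python
def transfer_seconds(num_bytes, link_bits_per_sec):
    """Time to push num_bytes through a link at its full line rate."""
    return num_bytes * 8 / link_bits_per_sec


TERABYTE = 10**12      # bytes, decimal units assumed
PETABYTE = 10**15      # bytes
LINK = 10 * 10**9      # 10 Gb/s in bits per second

tb_minutes = transfer_seconds(TERABYTE, LINK) / 60
pb_days = transfer_seconds(PETABYTE, LINK) / 86400

print(f"1 TB: {tb_minutes:.1f} minutes")  # prints 13.3 minutes
print(f"1 PB: {pb_days:.1f} days")        # prints 9.3 days
```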

These kinds of asynchronous communications (with very large latencies) lead to different kinds of interactions, and call for different types of use cases. It is a challenge to figure out how best to use them, and how best to present them to the user. For example, client-side validation and preprocessing become very important, because a mistake in a request costs a full round trip, and it might make sense to send back additional data that could be useful alongside the result. For example, if you were searching Google this way (nobody would if they had a choice), you might need to get back the actual results and not only the links (maybe a small crawl of the first few results).
