More than data
Tuesday, October 20th, 2009
There’s a trend occurring right now where people are interested in collating sensor data and providing a nice Web 2.0 interface. I can count several such projects besides Sensorpedia, including Sensorbase (CENS), Sensormap (MSR), and Sensor.Networks (Sun). Sensorbase provides a nice interface to construct and query relational tables, while Sensor.Networks provides a nice interface to interact with Sun Spot devices. Sensorpedia, in contrast, emphasizes a loosely-coupled approach by which “sensors” simply publish data in their own format and register specific URLs.
Although these tools do their job well within their original assumptions, they all lack a general way of interacting with the data sources as programmable units (i.e., as computational sensor nodes). For example, Sensorpedia allows users to view sensor data, but does not provide any convenient method to manipulate that data. If a user wants to monitor water levels in a particular geographic area, he or she will need to download the relevant sensor data, write a script to parse that data, and finally execute the relevant computation. Unless there’s a way for the server to restrict which data sources to give back to the user, the user will most likely need to download all the data, wasting time and bandwidth. In addition, these tools provide limited interfaces to interact with the data, and do not provide a simple way to construct additional interfaces.
However, imagine if users had access to a programmable substrate and could write scripts for virtual sensor nodes. Users could then write scripts to transform the native sensor format (i.e., whatever Sensorpedia finds on the internet) to a uniform, structured format, enabling a variety of end-user interfaces, including Tables and SQL. In turn, additional scripts could perform analysis on the structured data.
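To make the idea of a conversion script concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the native feed format (`timestamp, quantity, value` lines), the `Reading` record, and the `convert_native` and `mean_value` helpers are hypothetical and not part of Sensorpedia.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    """Uniform, structured record (hypothetical schema)."""
    node_id: str
    timestamp: str
    quantity: str
    value: float


def convert_native(node_id: str, raw: str) -> list[Reading]:
    """Parse hypothetical 'timestamp, quantity, value' lines into Reading rows."""
    readings = []
    for line in raw.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        timestamp, quantity, value = (field.strip() for field in line.split(","))
        readings.append(Reading(node_id, timestamp, quantity, float(value)))
    return readings


def mean_value(readings, quantity):
    """Example analysis script: average of one quantity, or None if absent."""
    values = [r.value for r in readings if r.quantity == quantity]
    return sum(values) / len(values) if values else None


# Assumed native feed; a real source would publish its own format.
raw_feed = """# hypothetical native feed
2009-10-20T08:00, water_level_m, 1.5
2009-10-20T09:00, water_level_m, 2.5
2009-10-20T09:00, air_temp_c, 11.0
"""

rows = convert_native("node-17", raw_feed)
average_level = mean_value(rows, "water_level_m")
```

Once every source is reduced to rows like these, a table or SQL-style interface only has to understand one schema, and analysis scripts such as `mean_value` can run without knowing where the data came from.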
So what would these scripts look like? Each script would be associated with either sensor nodes or specific tags. A user may write a script applicable to all nodes (for example, one exposing the node ID or URL) or a script applicable only to specific nodes. By associating a script with a tag, all nodes that share that tag would inherit its functionality. This way, all nodes tagged with Dasmet would automatically employ the Dasmet conversion script. Since these scripts are designed to interact with sensor data, it makes sense to expose an event-driven, data-oriented interface. As scripts generate new data, they may trigger additional scripts, and so on. That way, users will be able to write relatively short scripts that can interact with other scripts in a loosely-coupled, flexible manner.
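One way this tag-based, event-driven model might look is sketched below in Python. The scripting environment has not been designed yet, so this is only an illustration: `ScriptRegistry`, the `"*"` wildcard tag for scripts that apply to all nodes, the event kinds, and the centimetre-based Dasmet payload are all assumptions.

```python
from collections import defaultdict, deque


class ScriptRegistry:
    """Hypothetical registry mapping (tag, event kind) to scripts."""

    def __init__(self):
        self._scripts = defaultdict(list)
        self._node_tags = defaultdict(set)

    def tag_node(self, node_id, tag):
        self._node_tags[node_id].add(tag)

    def on(self, tag, kind):
        """Decorator: run the script for events of `kind` on nodes with `tag`."""
        def register(func):
            self._scripts[(tag, kind)].append(func)
            return func
        return register

    def publish(self, node_id, kind, payload, max_events=100):
        """Run matching scripts; events they return trigger further scripts."""
        queue = deque([(kind, payload)])
        handled = []
        while queue and len(handled) < max_events:  # guard against cycles
            kind, payload = queue.popleft()
            handled.append((kind, payload))
            # "*" is an assumed wildcard tag that every node inherits.
            for tag in sorted(self._node_tags[node_id] | {"*"}):
                for script in self._scripts[(tag, kind)]:
                    result = script(node_id, payload)
                    if result is not None:
                        queue.append(result)
        return handled


registry = ScriptRegistry()
registry.tag_node("node-17", "Dasmet")


@registry.on("Dasmet", "raw")
def dasmet_convert(node_id, payload):
    # Assumed format: the raw payload is a water level in centimetres.
    return ("reading", {"node": node_id, "level_m": float(payload) / 100})


@registry.on("*", "reading")
def flood_alert(node_id, payload):
    if payload["level_m"] > 2.0:
        return ("alert", f"{node_id}: high water")
    return None


events = registry.publish("node-17", "raw", "250")
```

Here the conversion script only fires for nodes tagged Dasmet, while the alert script applies to every node that produces a `reading` event. Neither script knows about the other; they are coupled only through the events they emit and consume.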
Since scripts are associated with specific nodes and tags, there may be many scripts executing at any given time. It’s easy to imagine that most data sources will be associated with a basic conversion script. Many virtual sensor nodes will also be associated with some analysis scripts and one or more interface scripts. Consequently, as Sensorpedia becomes popular, scalability will become an issue. Since these are virtual sensor nodes, it makes sense to explore using virtual machines to implement this scalability. As load increases, the virtual sensor nodes could instantiate themselves on additional physical nodes. As load decreases, the sensor nodes could merge back onto fewer physical nodes.
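The spread-out and merge-back behaviour can be sketched as a simple placement function. This is a rough illustration under stated assumptions, not a design for the execution environment: `place_nodes`, the abstract load units, and the fixed per-host `capacity` are hypothetical, and the greedy heuristic balances load without guaranteeing that no host exceeds its capacity.

```python
import math


def place_nodes(loads, capacity):
    """Assign virtual sensor nodes to physical hosts (greedy heuristic).

    loads: dict of node_id -> current load (abstract units, assumed known)
    capacity: load one physical host is assumed to handle
    """
    if not loads:
        return []
    # Use as few hosts as the total load suggests, but at least one.
    host_count = max(1, math.ceil(sum(loads.values()) / capacity))
    hosts = [{"load": 0.0, "nodes": []} for _ in range(host_count)]
    # Place the heaviest nodes first, each on the least-loaded host so far.
    for node_id, load in sorted(loads.items(), key=lambda kv: -kv[1]):
        target = min(hosts, key=lambda h: h["load"])
        target["nodes"].append(node_id)
        target["load"] += load
    return hosts


busy = {"a": 30, "b": 30, "c": 30, "d": 30}   # total 120 -> two hosts
quiet = {"a": 10, "b": 10, "c": 10, "d": 10}  # total 40 -> one host

busy_placement = place_nodes(busy, capacity=100)
quiet_placement = place_nodes(quiet, capacity=100)
```

Re-running the placement as load changes gives the behaviour described above: the same four virtual nodes occupy two hosts when busy and merge back onto one when quiet. A real system would also need to account for the cost of migrating a node between hosts.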
So where do we go from here? Ultimately, it’s my goal to implement something that resembles the embedded figure (in terms of functionality). Tables should be able to execute over this scripting layer, along with other interesting querying interfaces. In the meantime, I still need to design the scripting environment, produce some examples, and implement a scalable execution environment. I plan on posting implementation details and results as time goes on. As always, please feel free to contribute ideas.



