Archive for the ‘Uncategorized’ Category

A lesson learned… again

Thursday, June 17th, 2010

In many ways, it was just another day of class. But as I sat there awaiting the final exam of my first graduate computer science course, Advanced Algorithm Analysis and Design, I wondered if my studies had prepared me to succeed at this level. Most of the students surrounding me were PhD students, and their experience in this subject was greater than mine. For a moment my mind flashed back to a time three years earlier. It was the day I told my wife that I wanted to go back to school, starting over in a new subject, to get a Masters degree in computer science. Then I plotted a course to make it happen.

Unlike the degree that I completed nine years earlier, there were no liberal arts or elective classes in my plan for this new degree; it was all math and computer science. I wasn’t sure how well I would fare since the last math class I had taken was when I was a junior in high school. But I moved forward with my plan and one by one, I received an “A” in all of the undergraduate math and computer science courses that led to me sitting in that classroom surrounded by fellow graduate students. Would this class end in the same manner?

Two weeks later, I was walking across the campus of the Oak Ridge National Laboratory. I was starting my second summer internship with Sensorpedia, so I was heading over to meet up with my mentor and lead developer of Sensorpedia, David Resseguie. After catching up on what we’d been up to the past several months, David told me that one of my first tasks of the summer was to write a blog post. He said to write about something I had learned this past school year. For those of you who don’t know David, that might sound a little silly… to just write about something that I learned. For those of you who do know David, you know that learning is very important to him. He knows that good ideas can come from a wide variety of sources, and a great way to facilitate that is to continue to learn new things.

For those of you still hanging on to my opening story about my Advanced Algorithms class, I can happily report that my 4.0 GPA in this subject remains intact. So the question was now before me: Of all the things I learned this past school year, what did I feel was worthy of writing about for this blog? A quick thought back revealed several possibilities. Should I write about the inner workings of the proof of Rice’s theorem from Theory of Computation class (which gives us the fact that only trivial semantic properties of programs are algorithmically decidable)? Or should I write about the mangled NP-complete proof for the Hamiltonian Cycle problem (where we transform an instance of Vertex Cover to an instance of Hamiltonian Cycle)? Perhaps the topic should be more programming related. I could write about the nights spent debugging my bipartite network flow program. Or just as easily I could tell of my experiences implementing Unix pipes. Which topic truly deserves the honor?

To answer this question, I will ask one of my own. Why do we ask our kids to clean their rooms? Is it to rid ourselves of the feeling of chaos? Sure. But isn’t another main reason that, as adults, we understand the importance of having the things we need accessible when we need them? We know that the investment of time to clean one’s room pays off. Now, how does this relate to my past semesters of math and computer science studies? As I reflected over these months of study, it dawned on me that the things that I truly learned were things that I had already learned before: the power of preparation, hard work, and in particular, organization. Of all the things that I’ve studied in this journey of my new degree, it is these simple, timeless qualities that I have again learned to be priceless. It’s true that I might not have the raw intelligence of some of my classmates. But I have found that I can make up that difference with these other qualities. I found that if I spent the time necessary to organize complex issues, I could perform at a high level. I found my investment in organization paid off.

Others also know the power of organization. When Bryan Gorman and David Resseguie started Sensorpedia, they knew the potential of organizing the world’s sensor data. Sensor data was somewhat available, but not organized. One type of sensor data was often times not compatible with other similar types of sensor data. There was one proprietary format after another. It was harder to have access to certain data when you needed it. Sensorpedia seeks to fix these problems. It seeks to organize complex issues and data sets, so that our users can perform high level tasks. How ironic is it that my lesson learned has been a core goal of Sensorpedia all along?

Sensorpedia highlighted in Sensors Magazine article

Wednesday, May 12th, 2010

I’m happy to announce that an article about Sensorpedia has been published by Sensors Magazine. I’d like to thank my co-author Scott Fairgrieve of Northrop Grumman for his input and development of a translation tool to simplify the registration of standards-based sensor systems that use the OGC Sensor Observation Service (SOS) Interface Standard. Watch for a guest blog post by Scott describing his effort in the near future.

Here is a link to the Sensors Magazine Article:

Unifying Isolated Sensor Systems Using Web 2.0 and Open Standards

“Taking a page from social networking sites that offer users the ability to share and manipulate data in novel ways, Sensorpedia allows users to find, share, and use sensor data online.”

I’m thankful for the questions and requests for more information I’ve received since the article was published earlier this week. If you haven’t already done so, please check out the Sensorpedia sneak peek. I know the blog and Twitter feed have been fairly quiet over the last few months. That doesn’t mean there’s been nothing going on with Sensorpedia. In fact, we’ve been very busy. As part of the private beta effort, we’ve been collecting feedback on how to improve the Sensorpedia API and web application. In addition to the beta testing, we’ve also been applying the Sensorpedia concepts and software to other related domains as part of our ongoing work here at ORNL. We’ll be writing more about these individual efforts in the near future. But the key point for now is that we’ve been incorporating the valuable feedback and are preparing to migrate Sensorpedia to an updated API. It’s still Atom-based, but is much more powerful and flexible. The updated API will fully support the Atom Publishing Protocol (AtomPub) and additional querying capabilities. I’m very excited about where we are and the changes we’ve been making. I’ve already used an alpha version of the new services internally on several projects with lots of success.

Because of the particular interest I’ve received regarding technical details on the framework and the new API (thanks @freaklabs, @rafik, @SiliconFarmer and others!), I will plan to post a draft version of the updated API documentation before even completing the development work to incorporate it into the beta web application. When I do, I’ll post links on the blog and @Sensorpedia Twitter account. I’d love to hear your thoughts on the changes and the direction we’re taking with Sensorpedia in general.

More than data

Tuesday, October 20th, 2009

There’s a trend occurring right now where people are interested in collating sensor data and providing a nice Web 2.0 interface. I can count several projects besides Sensorpedia, including: Sensorbase (CENS), Sensormap (MSR), and Sensor.Networks (Sun). Sensorbase provides a nice interface to construct and query relational tables, while Sensor.Networks provides a nice interface to interact with Sun Spot devices. Sensorpedia, in contrast, emphasizes a loosely-coupled approach by which “sensors” simply publish data in their own format and register specific URLs.

Although these tools do their job well within their original assumptions, they all lack a general way of interacting with the data sources as programmable units (i.e., as computational sensor nodes). For example, Sensorpedia allows users to view sensor data, but does not provide any convenient method to manipulate that data. If a user wants to monitor water levels in a particular geographic area, he or she will need to download the relevant sensor data, write a script to parse that data, and finally execute the relevant computation. Unless there’s a way for the server to restrict which data sources to give back to the user, the user will most likely need to download all the data, wasting time and bandwidth. In addition, these tools provide limited interfaces to interact with the data, and do not provide a simple way to construct additional interfaces.

script_architecture

However, imagine if users had access to a programmable substrate and could write scripts for virtual sensor nodes. Users could then write scripts to transform the native sensor format (i.e., whatever Sensorpedia finds on the internet) to a uniform, structured format, enabling a variety of end-user interfaces, including Tables and SQL. In turn, additional scripts could perform analysis, etc.

So what would these scripts look like? Each script would be associated with either sensor nodes or specific tags. A user may write a script applicable for all nodes (e.g., exposing the node ID or URL) or write a script only applicable for specific nodes. By associating a script with a tag, all nodes that share that tag would inherit functionality. This way, all nodes tagged with Dasmet would automatically employ the Dasmet conversion script. Since these scripts are designed to interact with sensor data, it makes sense to expose an event-driven, data-oriented interface. As scripts generate new data, they may trigger additional scripts, and so on. That way, users will be able to write relatively short scripts that can interact with other scripts in a loosely-coupled, flexible manner.
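To make the idea a little more concrete, here is a rough Python sketch of tag-associated, event-driven scripts. Nothing here is an existing Sensorpedia interface: the `ScriptRegistry` class, the event kinds "raw" and "reading", and the "temp=35.0" input format are all made up for illustration (the real Dasmet format is not shown in this post).

```python
from collections import defaultdict


class ScriptRegistry:
    """Hypothetical registry that runs scripts attached to tags."""

    def __init__(self):
        self._scripts = defaultdict(list)  # (tag, event kind) -> scripts
        self._tags = defaultdict(set)      # node id -> tags on that node

    def tag_node(self, node_id, tag):
        self._tags[node_id].add(tag)

    def register(self, tag, kind, script):
        self._scripts[(tag, kind)].append(script)

    def publish(self, node_id, kind, data):
        # Every node carrying a tag inherits the scripts bound to that tag.
        for tag in sorted(self._tags[node_id]):
            for script in self._scripts[(tag, kind)]:
                result = script(node_id, data)
                if result is not None:
                    # New data produced by a script triggers further scripts.
                    new_kind, new_data = result
                    self.publish(node_id, new_kind, new_data)


alerts = []


def dasmet_convert(node_id, raw):
    # Assumed toy format "name=value"; stands in for a real conversion script.
    key, value = raw.split("=")
    return ("reading", {key: float(value)})


def record_high_temp(node_id, reading):
    if reading.get("temp", 0.0) > 30.0:
        alerts.append((node_id, reading["temp"]))
    return None


registry = ScriptRegistry()
registry.register("Dasmet", "raw", dasmet_convert)
registry.register("Dasmet", "reading", record_high_temp)
registry.tag_node("node-1", "Dasmet")
registry.publish("node-1", "raw", "temp=35.0")
print(alerts)  # [('node-1', 35.0)]
```

Each script stays short and only knows about the event kind it consumes and the one it emits. A real design would also need to guard against scripts that trigger each other in a cycle, which this sketch does not do.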

Since scripts are associated with specific nodes and tags, there may be many scripts executing at any given time. It’s easy to imagine that most data sources will be associated with a basic conversion script. Many virtual sensor nodes will also be associated with some analysis scripts and one or more interface scripts. Consequently as Sensorpedia becomes popular, scalability will become an issue. Since these are virtual sensor nodes, it makes sense to explore using virtual machines to implement this scalability. As load increases, the virtual sensor nodes could instantiate themselves on additional physical nodes. As load decreases, the sensor nodes could merge back onto fewer physical nodes.

So where do we go from here? Ultimately, it’s my goal to implement something that resembles the embedded figure (in terms of functionality). Tables should be able to execute over this scripting layer along with other interesting querying interfaces. In the meantime, I still need to design the scripting environment, produce some examples, and implement a scalable execution environment. I plan on posting implementation details and results as time goes on. As always, please feel free to contribute ideas.

Discovering new sensor data

Friday, October 9th, 2009

While David has been working hard on Sensorpedia’s infrastructure, I’ve been thinking about different ways to automate the process of identifying, tagging, and extracting sensor data from the internet. This would be handy for several reasons: 1) we wouldn’t have to spend valuable human time performing a relatively mundane task and 2) having a sensor crawler would ensure that we would discover new sensors as they come online. Overall this is a pretty ambitious task, but to get started I’ve been asking: what is sensor data anyway? For the purpose of this small experiment, I decided sensor data is any numeric data that contains some textual elements that describe that data. This is probably too simple of a definition, but it will do for now. Using this simple definition, it should be relatively straightforward to examine the number of numeric characters in a document to determine if a page has “sensor” data.

known_sensor_data_static_threshold

random_data_static_threshold

In the first figure I took the list of known sensor sources from the Sensorpedia database. The sources were filtered to only include ‘text/html’ and ‘text/plain’ to avoid images, video, etc. For each data source I downloaded the page and graphed the ratio of numeric characters that appear in the main body (excluding any HTML tags and punctuation). For example, if the page contained exactly two characters (an ‘a’ and a ‘5’), then the ratio would be 0.5.
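The ratio described above can be sketched in a few lines of Python. This is not the script used for the figures; it is a minimal illustration that assumes tags can be stripped with a simple regular expression and that whitespace is ignored along with punctuation (the post does not say how whitespace was handled).

```python
import re
import string

TAG_RE = re.compile(r"<[^>]+>")  # naive tag stripper, fine for a sketch


def numeric_ratio(page_text):
    """Fraction of the remaining characters that are digits."""
    body = TAG_RE.sub("", page_text)
    # Drop punctuation and (by assumption) whitespace before counting.
    chars = [c for c in body
             if not c.isspace() and c not in string.punctuation]
    if not chars:
        return 0.0
    digits = sum(c.isdigit() for c in chars)
    return digits / len(chars)


print(numeric_ratio("<p>a5</p>"))  # 0.5
```

A real crawler would want a proper HTML parser instead of the regular expression, since script and style blocks would otherwise be counted as body text.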

It’s pretty evident that most of the data sources contained between 30 and 50 percent numeric characters. The only exceptions to this were the first few sources and the very last source. As for the first few sources, I found that they were PHP files that contained images of sensor graphs instead of alphanumeric content (apparently mislabeled in the Sensorpedia database). The last source supposedly contained 100% numeric data (after punctuation removal). This is a little weird since most users would have no way of understanding this data, but presumably somebody is publishing this data for their own benefit. After removing these two extreme groups, we get an average of about 37%.

As for the second figure, I did the exact same thing except I substituted the known sensor sources with 2695 random webpages (I wrote a small Ruby crawler to do this for me). It’s pretty striking how different the figures are. There appear to be two distinct groups of pages. The great majority of the webpages contained less than 1% numeric data. There’s also a smaller group that contains about 20% numeric data. Oddly enough, many of the ones with 20% numeric data seemed to be pointing to some Japanese website discussing weather data. I can’t read Japanese, so I’m not quite sure what it’s all about. Finally, there’s at least one page with nearly 50% numeric data. Upon closer inspection, that extreme page ended up being a UPS page that contained lots of actual data (see screenshot).

random_with_lots_of_numeric

Once I graphed this data I wanted to know if a simple threshold test would work to differentiate the two types of webpages. The threshold I used was the average numeric ratio of the known sensor data minus one standard deviation. This excludes the random webpages, but also excludes several of the legitimate sensor sources. Using two standard deviations (the lower brown line) still excluded most of the random pages, but also included all the known sensor data. For a first pass, this test seems to work pretty well!
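For anyone who wants to try the same test, here is a small Python sketch of the threshold. The ratios in `known` are made-up placeholder values, not the measured data, and the use of the sample standard deviation is an assumption.

```python
from statistics import mean, stdev


def sensor_threshold(known_ratios, k=2.0):
    """Mean of the known sensor ratios minus k standard deviations."""
    return mean(known_ratios) - k * stdev(known_ratios)


def looks_like_sensor_page(ratio, threshold):
    # One-sided test: anything at or above the threshold passes.
    return ratio >= threshold


known = [0.30, 0.35, 0.40, 0.45]  # placeholder ratios for illustration only
threshold = sensor_threshold(known, k=2.0)

print(looks_like_sensor_page(0.01, threshold))  # False
print(looks_like_sensor_page(0.30, threshold))  # True
```

Because the test only has a lower bound, a page with an unusually high numeric ratio (such as the UPS page) still passes, which matches the observation that only most of the random pages are excluded.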

There’s still a lot of work to do (e.g., differentiating sensor data from any old table of data) and I haven’t even thought about graphs, images, and video… Until then, please send me suggestions (or better yet, results)!

Introductions

Friday, October 9th, 2009

Hello everybody! I’m new here (three weeks in Knoxville!) and I just wanted to introduce myself to the Sensorpedia community. I just started working at Oak Ridge National Lab in the Data Systems Sciences & Engineering group and will be involved with various Sensorpedia-related projects. Before arriving here, I was doing post-doctoral work at the Renaissance Computing Institute, a research institute in Chapel Hill, NC affiliated with UNC. It was fun being a postdoc and I will miss Franklin St., but I’m pretty stoked about the exciting work happening here at the lab. Before then I was in sunny Albuquerque, NM where I received my PhD in CS from UNM (with advisor Barney Maccabe). As time goes on, I plan on posting some of my Sensorpedia-related research, results, and ideas here on this blog. Anybody should feel free to comment, give suggestions, and of course collaborate to provide results!

Sensorpedia Sneak Peek

Thursday, October 1st, 2009

Sensorpedia is still in a limited beta testing phase, but we’re happy to announce a new Sneak Peek at the application. We’d love to hear your thoughts on our progress so far.

The Sneak Peek provides read-only access for non-beta users to search and explore the data currently in Sensorpedia. Contribution of data is currently still limited to beta testers. Sign up for our Sensorpedia mailing list to be notified when we move to open beta. (If you’ve got some really cool data you’d like to interface to Sensorpedia that just can’t wait, please contact us with details.)

Check out the Sensorpedia Sneak Peek!

Here are some sample searches to point you in the right direction to start exploring:

Click the + sign in the search sidebar to add it to the active layers list and expand out into individual sensor locations.

(Note, some feeds like the ICAO weather data for the US and some of the buoy data sets are rather large and take some time to pop up when you add them to the active layers list.)

Please send us your feedback on the Sensorpedia Sneak Peek so we can incorporate your suggestions into the final product.

UPDATE (Jul 6, 2010)
We have been incorporating beta user feedback since we released the “sneak peek” and are looking forward to releasing a new version with improved interface and more powerful API this summer! Stay tuned to the blog for all the latest news.

Thank You Chris!

Friday, February 27th, 2009

This was the last week of Christopher Tomkins-Tinch’s current co-op rotation here at ORNL. Chris played a vital role in the design and development of Sensorpedia (and several other related projects). We can’t thank him enough for his hard work and endless supply of great ideas. We wish him the best this semester at Rochester Institute of Technology and hope to have him back again this summer.

Are you interested in an internship at ORNL?

How To: Interface an irradiance sensor with Sensorpedia (guide 4)

Thursday, February 26th, 2009

tsl230_side_view

We have previously detailed how to read in an LM34 temperature sensor, an ultrasonic rangefinder, and a light color sensor. This guide will show how to interface with a TSL230 light-to-frequency converter.

(more…)

How To: Interface a light color sensor with Sensorpedia and serve Atom! (guide 3)

Friday, February 20th, 2009

We have previously documented how to interface two sensors with Sensorpedia: an LM34 temperature sensor and a Maxbotix-EZ distance sensor. For this guide we will be switching things up a bit. Instead of using a Make Controller to communicate with sensors, we will be using another popular DIY/tinkerer microcontroller: the Arduino (Duemilanove). Instead of pushing our data to Twitter, we will now be generating an Atom feed and serving it ourselves.

The color sensor we will be working with is the ADJD-S371. First, a note on voltage requirements. The color sensor is designed to operate at 3.3 VDC with a maximum operating voltage of 3.6 VDC. We’ll want to be sure to use the 3.3 V supply from the Arduino board. Proper calibration of the ADJD-S371 will be left as an exercise for the reader (half a ping-pong ball works reasonably well as an integrator).

color_sensor_11

color_sensor

The wires were connected as follows:

(more…)

How To: Interface an ultrasonic rangefinder with Sensorpedia via Twitter (guide 2)

Friday, February 6th, 2009

Previously, we detailed how to read in an LM34 temperature sensor. This guide will show how to interface a Maxbotix-EZ ultrasonic rangefinder with Sensorpedia.

maxbotix_1

The Maxbotix-EZ measures the distance between its detector and the nearest solid object. It emits a sound at an inaudible 42 kHz frequency and listens for the echo to be reflected back. Based on the time between these two events, it is able to calculate the distance to the reflecting object.
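The sensor does this calculation internally, but the underlying time-of-flight relation is simple enough to sketch in Python. This is only the general formula, not the sensor's firmware; the speed of sound used here (about 343 m/s in dry air at 20 °C) and the function name are assumptions for illustration.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # approximate, dry air at about 20 °C


def echo_distance_m(round_trip_s, speed=SPEED_OF_SOUND_M_PER_S):
    """Distance to the reflecting object from the echo's round-trip time."""
    # The sound travels out and back, so halve the total path length.
    return speed * round_trip_s / 2.0


print(round(echo_distance_m(0.01), 3))  # 1.715
```

An echo that returns after 10 ms therefore corresponds to an object roughly 1.7 m away.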

The Maxbotix sensor is easy to interface. As with the LM34, we only need three wires: Vcc, ground, and analog output.  Here are the pins we will use:

maxbotix_3

(more…)