
Crowdsourced seismic sensors might save your life someday.

Japan's Earthquake Early Warning system, which could be implemented in the US with the help of crowdsourced seismic sensors. Photo: Wikimedia/Denelson83/Japan Earthquake Warning System/Creative Commons Attribution-ShareAlike

Crowdsourced Seismic Sensors?

A frequent topic on this blog is the use of Arduino and crowdsourced technologies to address air quality issues. Can similar technologies be adapted from air quality monitoring to improve seismic prediction? It turns out the answer is yes.

Unless you’ve been living under a large rock these last few days, you’ve probably heard that Los Angeles was struck in the last two weeks by what the USGS describes as a “moderate” 5.1 earthquake, with “light” fore- and aftershocks of around 4.5. (The Saint Patrick’s Day foreshock temblor prompted our earlier article on robot-written newspaper articles, music, and movies.)

During that same period, there were similar or slightly larger quakes in Chile, Alaska, Greece, and Japan. And let’s not forget the 5.8 quake that struck DC back in 2011 to much mirth on Facebook. There were significant differences between those quakes and the ones in Los Angeles: (1) they didn’t occur underneath a megalopolis of some 13+ million people, (2) they didn’t occur under one of the world’s major media capitals, where celebrities and publicists are conditioned, like Pavlov’s dog, to associate earthquakes with the salivating opportunity to tweet against a trending hashtag, emergency smartphone power at the ready, and (3) they didn’t have 100 aftershocks within a 24-hour period.

Using Analytics to Figure Out if It’s Time for Vegas, Baby

[Tweet: “LA quakes: Time to head for Vegas, Baby?”] Some of the tweets suggested Angelenos should beat a hasty retreat out of town. (Easier said than done for an area with 13 million citizens and serious traffic problems. Fortunately, most people aren’t following that advice, and our freeways remain open.) But this is a blog where we’ve talked about data analysis. In fact, one of the slogans for our blog should be:

So are we in danger? Is it time, as some tweets suggested, to visit Vegas again, so soon after just blogging about CES?

We’re going to do another blog article on why formal decision making using data is useful in overcoming known biases in human decision making. (Scary, media-exaggerated, life-and-death situations like these are rife with demonstrable decision-making bias). Nate Silver, in his best-selling book on big data analysis, spends an entire chapter on data overfitting and the hazards of earthquake prediction.

Earthquake prediction from data is not easy. There is a ton of data from seismographs, but, like certain problems in economics or business, the underlying processes are very poorly understood. It’s easy to overfit the data and come up with a model that boasts “99% R^2” or some similar supposed statistical accolade, only to have, in reality, a very poor model that fits mainly noise and is not very predictive. Silver writes that many seismologists have fallen into this trap with bogus models that can “predict” the past but not the future. He explains this very clearly using simple analogies, and we’ll refer interested readers to his book.
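
To make the overfitting trap concrete, here is a minimal, purely illustrative sketch (synthetic data, nothing to do with real seismograph records): a wildly flexible model can post a spectacular in-sample R^2 on pure noise and still have no predictive power on fresh data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Pure noise: there is no real relationship between x and y at all
x = rng.uniform(0, 1, 20)
y = rng.normal(0, 1, 20)

# Fit a 15th-degree polynomial -- far too flexible for 20 noisy points
coeffs = np.polyfit(x, y, 15)
fit = np.polyval(coeffs, x)

# In-sample R^2 looks spectacular...
r2_in = 1 - np.sum((y - fit) ** 2) / np.sum((y - y.mean()) ** 2)
print("in-sample R^2:", round(r2_in, 3))

# ...but on new noise drawn from the same process, the "model" is useless
x_new = rng.uniform(0, 1, 20)
y_new = rng.normal(0, 1, 20)
pred = np.polyval(coeffs, x_new)
r2_out = 1 - np.sum((y_new - pred) ** 2) / np.sum((y_new - y_new.mean()) ** 2)
print("out-of-sample R^2:", round(r2_out, 3))
```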

So what does work? Simple models making long-term “forecasts” (not “predictions”; the USGS insists the two are different) are quite useful. For Los Angeles, looking at past quake history, the USGS estimates a “major” quake (defined as magnitude 6.5 or larger) occurs about once every 40 years (not as bad as a number of other heavily populated regions). This does not mean that quakes happen exactly every 40 years, or that if one hasn’t happened in 40 years one is overdue; it is simply the long-term average. (Which the press has again gotten wrong in the most recent reporting. Just today a major newspaper reported that a certain LA fault produces a major quake every 2,500 years, but that no one knew when the last such quake was, so they weren’t sure if we were “overdue.” This is a statistical frequency; when the last quake was tells you nothing about tomorrow.)

OK, a major quake once every 40 years. That means on any given day Los Angeles has roughly a one-in-12,000 chance of a quake of 6.5 or greater. (Of course, most of the city will survive that quake, thanks to modern earthquake construction codes and perhaps even a future earthquake early warning system, which we’ll get to in a bit. The USGS estimates the lifetime odds of a US citizen dying in an earthquake at 1 in 131,890. Obviously, those odds are higher for residents of California, but not by much, since 10% of the country lives here. One in 100 US citizens will die in a motor vehicle accident, and, at about 1 in 100,000, a US resident is more likely to die from a snake bite or bee sting than from an earthquake, according to the CDC.) So, even with 1-in-12,000 odds of a 6.5 quake striking on any given day, we’ll take Los Angeles over, say, cities in rattlesnake country any time.
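
For those who like to see the arithmetic, here is a back-of-the-envelope sketch converting a long-run recurrence interval into daily odds, using the figures above. (A strict 40-year interval actually works out to roughly 1 in 14,600 per day; the 1-in-12,000 figure quoted above is the same order of magnitude, presumably reflecting different rounding in the underlying estimate.)

```python
# Convert a long-run recurrence interval into a rough daily probability.
# Numbers are the article's; this is a sanity check, not a forecast.
recurrence_years = 40
days_between_quakes = recurrence_years * 365.25
p_daily = 1 / days_between_quakes

print(f"daily chance of a M6.5+ quake: about 1 in {days_between_quakes:,.0f}")
print(f"as a probability: {p_daily:.6f}")
```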

What about the aftershocks? Do they increase the odds? The big problem is that the underlying processes behind earthquakes are extremely poorly understood. Some models have described the earth as similar to a complex tangle of rubber bands. Pressure builds up over geologic time due to plate tectonics. The rubber bands get stretched very slowly over many years until, suddenly, a bunch break all at once, and you have an earthquake. This may destabilize some of the remaining rubber bands, which may then break shortly thereafter, and you end up with fore- and aftershocks.
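
The “tangle of rubber bands” picture can be turned into a toy simulation. The sketch below is a highly simplified stress-transfer model of our own devising (not taken from any seismology package): stress loads slowly everywhere, a patch that exceeds its threshold snaps and dumps part of its stress onto its neighbors, and occasionally that triggers a cascade, the analogue of a quake with its fore- and aftershocks.

```python
import random

N = 200              # number of "rubber bands" (fault patches) in the toy model
ALPHA = 0.2          # fraction of a snapped patch's stress passed to each neighbor
THRESHOLD = 1.0      # stress level at which a patch snaps
LOAD = 1e-4          # slow "tectonic" loading added to every patch each step

random.seed(1)
stress = [random.uniform(0, THRESHOLD) for _ in range(N)]
events = []          # (time step, number of patches that snapped together)

for step in range(100_000):
    for i in range(N):
        stress[i] += LOAD                       # slow, steady loading
    to_check = [i for i in range(N) if stress[i] >= THRESHOLD]
    size = 0
    while to_check:                             # cascade: one snap can trigger neighbors
        i = to_check.pop()
        if stress[i] < THRESHOLD:
            continue
        released = stress[i]
        stress[i] = 0.0
        size += 1
        for j in (i - 1, i + 1):
            if 0 <= j < N:
                stress[j] += ALPHA * released
                if stress[j] >= THRESHOLD:
                    to_check.append(j)
    if size:
        events.append((step, size))

print(f"{len(events)} events; largest cascade involved "
      f"{max(s for _, s in events)} patches")
```

Most cascades in a run like this are tiny, while a few are large, which is loosely the pattern real catalogs show; the point is only to illustrate the analogy, not to predict anything.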


But, ultimately, earthquakes are a physical process. An earthquake releases physical energy that has been stored in the earth. So shouldn’t a series of small, frequent quakes reduce the chances of a large one? Possibly. The town of Moodus, CT experiences very frequent micro-quakes, to the point of supposedly inspiring native religious beliefs around the phenomenon. Perhaps as a result, the area is considered seismically stable, with large quakes rare in nearly 500 years of recorded history.

A similar unanswered question is whether there are physical limits on maximum quake size.

(A certain “non-fiction” cable channel frequently carries a “documentary” about a magnitude-11 quake that supposedly hit the St Louis area in the 1700s, when the region was sparsely populated, “but would destroy Chicago if it happened today.” Although St Louis is slightly seismically active, according to USGS quake maps Chicago is not. Moreover, even after reviewing historical accounts, seismologists think the physical limit for a quake is probably somewhere around magnitude 9-10. The same “non-fiction” cable channel also frequently carries a “documentary” asserting that Nostradamus died standing, which we assure you is impossible unless you are a pirate in a ride at Disneyland. Maintaining bipedal balance is computationally challenging for robots and even most animals; consciousness (or physical supports) is required to remain standing. Cable channel buyer beware. If the media can get away with distortions on such well-understood topics, imagine how bad the reporting on earthquakes must be.)

As Nate Silver discusses in the aforementioned book chapter, the question of physical limits actually turns out to be important. Apparently, Japanese seismologists (or at least the ones consulted in the design of Fukushima) believed the sea floor near Japan limited especially strong earthquakes. In this, they ignored the historical and oral records of powerful tsunamis hitting the area over the centuries. This led them to adopt a more complicated model that may have “overfit” the earthquake data, underestimating the odds of the magnitude-9 earthquake and resulting tsunami that ultimately caused a major nuclear accident.

Lack of Predictive Power in Existing Earthquake Models Due to Lack of the Right Data

Unfortunately, with the current models that have any demonstrated predictive power, very little can be said. The 100 aftershocks might perhaps constitute an “earthquake swarm” like the one seen in Italy, but that only slightly increases the short-term chances of a major quake. (It is not much of an “earthquake swarm” yet, either. If you look at USGS records of California earthquakes of magnitude 3.5 and above, you see a typical pattern of 7-14 per month. March was different, with over 30 so far. But that’s only about twice the normal rate. Most of the 100 aftershocks quoted by the media have been below magnitude 3.5, barely perceptible, and likely well within “the noise,” given that no predictive model to date has successfully used this information.)
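
For the curious, here is a rough check of why counting quakes naively can mislead, using the assumed 7-14-per-month baseline from the paragraph above. A simple Poisson model that (wrongly) treats quakes as independent makes 30 in a month look wildly improbable; but aftershocks cluster, so the independence assumption fails badly and the count carries far less information than the naive model implies, which is exactly the sort of trap discussed earlier.

```python
from math import exp, factorial

typical_rate = 10.0     # assumed events of M3.5+ per month, middle of the 7-14 range
observed = 30           # roughly what March saw

# Probability of seeing >= 30 events in a month under a naive Poisson model
p_tail = 1 - sum(exp(-typical_rate) * typical_rate ** k / factorial(k)
                 for k in range(observed))
print(f"P(>= {observed} events | rate {typical_rate}/month): {p_tail:.1e}")

# The tiny tail probability is misleading: aftershocks are strongly clustered,
# so the events are not independent, and the naive calculation vastly overstates
# how "surprising" the count is.
```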

The one valid observation from a predictive model is that, with the 5.1 quake, there is now a 5% chance of a stronger quake within the next three days. (Put another way, there is a 95% chance that there will not be a stronger quake within the next three days, or a 95% chance that the much-feared “Big One” will not happen within the next three days. After three days, we go back to the best predictive model we have, which is the old 1-in-12,000 chance of a 6.5 quake on any given day in Los Angeles.)

A 5% chance (of a quake stronger than magnitude 5.1) is certainly higher than the normal 1-in-12,000 chance of a 6.5 quake (itself more than ten times stronger than a 5.1), but it is still low odds. (And, based on USGS mortality statistics, a 6.5 quake amid modern earthquake-resistant construction should be very survivable.) We won’t be temporarily relocating to rattlesnake country or increasing our life insurance coverage over this. We will double-check our earthquake-preparedness kit, however.
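
Putting the two probabilities side by side (with the caveat, noted above, that they refer to different magnitude thresholds) shows why 5% is elevated but still low. A quick sketch with the article’s numbers:

```python
# Article's numbers: a 5% chance of a quake stronger than the 5.1 within three days,
# versus a baseline ~1-in-12,000 daily chance of a M6.5+ quake. The two refer to
# different magnitude thresholds, so this is only a rough comparison.
p_elevated_3day = 0.05
p_baseline_daily = 1 / 12000
p_baseline_3day = 1 - (1 - p_baseline_daily) ** 3

print(f"baseline 3-day odds of a M6.5+:  {p_baseline_3day:.4%}")   # about 0.025%
print(f"post-quake 3-day odds of a >5.1: {p_elevated_3day:.0%}")
print(f"still a {1 - p_elevated_3day:.0%} chance of no stronger quake in 3 days")
```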

So why are these predictive models so poor, and is there anything we can do about it? Well, the fundamental problem is that we don’t understand the underlying physical processes very well, because we can’t easily observe the pressure inside rock. This will change. As Arthur C. Clarke once said, any sufficiently advanced technology is indistinguishable from magic. Although seeing through rock with technologies such as ground-penetrating radar is currently limited, with time and computing power these technologies will get better. Moreover, it will become increasingly feasible to place pressure sensors and ground-penetrating radar in very deep wells, despite the costs and enormous heat. Although there are some experimental deep seismic monitoring wells, building vast arrays of them is currently cost-prohibitive. But technology is advancing exponentially, and the cost of placing deep sensors in seismically active areas won’t stay prohibitive forever. (This is where we disagree with Nate Silver, who speculates we may never be able to predict earthquakes because we will never be able to see through rock.)


Can Crowdsourcing Help?

This brings us to the main point of this article: is there something cheap but effective that can be done today to provide better or earlier warnings of earthquakes in California? The answer is a resounding yes, and it involves the same Internet of Things and crowdsourced sensors, like the Air Quality Egg (supported by our smartphone app), that we have discussed in past blog articles in an air quality context.

There is an Earthquake Early Warning system in Japan. Earthquake waves travel slowly (at least compared to the speed of light), so a sensor near the source of a quake can provide a few seconds of warning to other parts of the quake zone. In Japan, this is used to have elevators open their doors at the next floor (while power is still available), brake trains, begin shutting down nuclear reactors, halt airplane landings and takeoffs, and let surgeons pause delicate procedures.
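
The physics behind those few seconds is easy to sketch: damaging S-waves travel at a few kilometers per second, while a network alert moves at effectively the speed of light, so warning time grows with distance from the epicenter. The wave speed below is a textbook approximation (not a figure from the article), and detection and processing delays are ignored.

```python
# Rough warning-time estimate: damaging S-waves travel at roughly 3-4 km/s,
# while an alert relayed over data networks is effectively instantaneous.
S_WAVE_SPEED_KM_S = 3.5  # approximate crustal S-wave speed

for distance_km in (20, 50, 100, 200):
    warning_seconds = distance_km / S_WAVE_SPEED_KM_S
    print(f"{distance_km:>4} km from the epicenter: ~{warning_seconds:.0f} s of warning")
```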

The USGS is working on a similar system for California. (The folks at CalTech, located near Los Angeles, are participants in the still-experimental system; the campus reportedly got a 4-second warning to take cover before the recent temblor.) Unlike Japan, however, California has a far lower density of public seismic sensors “with robust communications.” And, unlike the Japanese mobile phone system, there is no way to send a rapid alert to US mobile phones.

We’ll point out that all of the major smartphone OS providers are headquartered in major seismic areas (Apple and Google in the Bay Area and Microsoft in Seattle). It is difficult to imagine they would not cooperate with efforts to bake a robust earthquake alert system into their operating systems. (If it uses push mechanisms over existing cell data networks — pervasive in heavily populated areas — we doubt it would require changes by cell phone operators.) It probably needs to be even more robust than the existing Amber alert system Apple has added to iPhones, in that the phone should send an earthquake alert to everyone in the area even if the ringer is silenced.

The other problem is a lack of sensors. One suggestion is to use inexpensive, crowdsourced seismic sensors to supplement the public ones. These could be used not only to provide earthquake early warning but also to build a more detailed shake map in real time, letting first responders know immediately where to concentrate their efforts in the chaotic moments after a major quake.
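
As a rough illustration of the shake-map idea (the readings, grid resolution, and averaging scheme are entirely hypothetical, not any agency’s actual method), crowdsourced reports could be binned into grid cells and averaged to show where shaking was strongest:

```python
from collections import defaultdict

# Hypothetical crowdsourced reports: (latitude, longitude, peak acceleration in g)
reports = [
    (34.05, -118.25, 0.12), (34.06, -118.24, 0.15),
    (34.20, -118.40, 0.04), (34.19, -118.41, 0.05),
]

# Bin reports into ~0.1-degree grid cells and average them: a crude shake map
cells = defaultdict(list)
for lat, lon, pga in reports:
    cells[(round(lat, 1), round(lon, 1))].append(pga)

for cell, values in sorted(cells.items()):
    print(cell, f"mean peak acceleration: {sum(values) / len(values):.2f} g")
```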

Crowdsourced Seismic Sensors

There are at least two such government-funded projects, including the Community Seismic Network at CalTech. If you live in one of their program areas (which include Greater Los Angeles as well as the Bay Area), you can obtain a seismic sensor from them for free. Unfortunately, an examination of their website suggests they are using older (2011) technology that is not nearly as good as the crowdsourced technology in more recent air pollution devices like the Air Quality Egg.

The biggest problem we see with these devices (at least the ones we were able to find on the projects’ websites) is that they require a USB connection to an always-on computer to report their life-saving seismic data. In addition, special drivers need to be installed on that computer.

This is so 2011. Installing drivers is a security risk and requires administrative privileges. Many business IT departments will be reluctant to take part if privileged software needs to be installed on their machines. (In some cases it may violate contracts with clients.) It is also unnecessary in this day and age, as the Air Quality Egg and inexpensive Arduino devices prove.

The devices are meant to be installed on the lowest level of a building, often with a specific orientation (one of the devices ships with a compass so it can be pointed at magnetic north). Since there is a physical limit to the length of a USB cable, this arrangement creates further problems. (The lowest level will often be a garage or building lobby, possibly far away from any computer with an Internet connection.)

Applying Arduino and Air Quality Crowdsourcing Technology to Seismic Sensors

An Arduino Yun costs around $80 these days, and that’s for the prototyping system, not a finished product. It is a full-fledged Linux system-on-a-chip, complete with WiFi and USB interfaces. It can be programmed with all the major web development languages, so if CalTech has already written USB drivers for Mac and Windows, it should be no problem to write a Linux ARM driver for the Arduino.

Once that’s done, you have a fully working prototype of your seismic sensor, together with a working WiFi interface, for $80 more than what you have now. The Arduino is an open-source hardware prototyping platform, which means you also now have the full schematics for the product you’ve developed, and the rights to use them in your product. You can simplify the design: you probably don’t need the USB components (the seismic sensor can connect directly to the Arduino’s internal serial bus), nor some of the other microcontrollers or duplicated power supply components. And the Arduino Yun is at the high end. You can get a Raspberry Pi (also Linux) for $35 or less with Ethernet; WiFi adds $10 or less to that. There are numerous Arduino and Arduino-like boards under $20 that include Ethernet and/or WiFi, including some designed to cut costs to the absolute bone.
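
As a concrete (and entirely hypothetical) illustration of how simple the Linux side of such a device could be, here is a sketch for the Yun’s Linux processor or a Raspberry Pi: it reads lines from a serial-attached sensor and forwards them in batches over WiFi. The device path, upload URL, and line format are all assumptions, and it relies on the third-party pyserial and requests packages.

```python
#!/usr/bin/env python
# Hypothetical sketch: read samples from a serial-attached seismic sensor and
# forward them in batches to a (made-up) collection endpoint over the network.
import json
import time

import requests
import serial

SENSOR_PORT = "/dev/ttyACM0"              # assumed serial device for the sensor
UPLOAD_URL = "https://example.org/quake"  # placeholder endpoint, not a real service

def main():
    with serial.Serial(SENSOR_PORT, 115200, timeout=1) as port:
        batch = []
        while True:
            line = port.readline().decode("ascii", errors="ignore").strip()
            if not line:
                continue                  # timeout or empty read; keep polling
            batch.append({"t": time.time(), "raw": line})
            if len(batch) >= 50:          # upload in small batches
                requests.post(UPLOAD_URL, data=json.dumps(batch),
                              headers={"Content-Type": "application/json"},
                              timeout=5)
                batch = []

if __name__ == "__main__":
    main()
```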

Cheaper still is what the Air Quality Egg (AQE) has done, taking advantage of some of the new mass-produced home-networking mesh chips. These chips cost around $5 and communicate via a mesh with a central home node, which is in turn connected to the outside Internet via WiFi or Ethernet. The AQE uses a Nanode, an inexpensive Arduino-like board, together with one of these home-networking chips. The same chips are also widely used by DIY home automation enthusiasts together with Arduinos or Raspberry Pis.

CalTech could probably hook their seismic sensor up to a Nanode or another Arduino and have it report data to an Air Quality Egg, which would then relay it out to the Internet. (The Air Quality Egg is also open-source hardware, so schematics for its remote nodes are fully available.)

The advantage of this approach is that it is plug-and-play with an always-on Internet device constantly recording data in the cloud. In addition, the seismic sensor can sit far from the Internet connection, wirelessly, well beyond the physical limits of a USB cable. Business IT departments will not need to grant privileges to an unknown USB device; the AQE or a similar gateway can be placed on an unprivileged guest network. And, finally, as the AQE proves, all of this is very inexpensive (and continuously getting cheaper).

Next steps: Check out our YouTube channel for more great info, including our popular "Data Science Careers, or how to make 6-figures on Wall Street" video (click here)!