On This Day in 1983, Data Analytics Might Have Been a Fail
On 26 September 1983, Stanislav Petrov took a stand against what his systems were telling him, and he may have changed the course of history. Petrov was working as a duty officer at the command center for the Oko nuclear early warning system. This is the place where the Soviets monitored incoming attacks, much like the US command center you remember from WarGames. Earlier that month, the Soviet Union shot down a Korean commercial jetliner over the Sea of Japan, claiming that it was on a spy mission. 269 people died in that incident, including a US Congressman. Some in the Soviet Union feared a retaliatory strike by the US. Cold War tensions were high.
At the command center, Petrov was getting data showing that five missiles had been launched from the US towards the Soviet Union. But instead of just reading that dashboard and acting, he used his own inner analytics system to process the data and decided not to report or react.
Had Petrov reported incoming American missiles, his superiors might have launched an assault against the United States, precipitating a corresponding nuclear response from the United States. Petrov declared the system’s indications a false alarm. Later, it was apparent that he was right: no missiles were approaching and the computer detection system was malfunctioning. It was subsequently determined that the false alarms had been created by a rare alignment of sunlight on high-altitude clouds and the satellites’ Molniya orbits, an error later corrected by cross-referencing a geostationary satellite.[5]
Petrov later indicated the influences in this decision included: that he was informed a U.S. strike would be all-out, so five missiles seemed an illogical start;[1] that the launch detection system was new and, in his view, not yet wholly trustworthy; and that ground radars failed to pick up corroborative evidence, even after minutes of delay.[6]
- Wikipedia contributors. "Stanislav Petrov." Wikipedia, The Free Encyclopedia. Wikipedia, The Free Encyclopedia, 26 Sep. 2012. Web. 26 Sep. 2012.
I’ve always wondered whether he would have been able to distrust the data if the system he was using had a bunch of fancy dashboard features, like shiny 3D pie charts, moving average lines and drill-down-capable reports. I’ve seen this sort of over-trust of data with data model diagrams. It seems the prettier or more advanced the presentation of the data is, the more people want to believe it is right. In fact, I’ve learned to present draft documents to people on my teams with hand-written notes and comments on them, to sort of "break the ice" and show people that they are drafts. A modern solution might have included some sort of decision-making guidance that says "Confidence Factor of Attack: 99%" or something like that. And it would have been highlighted by some sort of red bar, showing just how confident the system was based on the data. Bad data, it turns out.
More details about Petrov and his actions in the video above from History.com
Ethics.Data.Gov – Where Open Data is Taking Us
I came across this video via Twitter from my friend Jim Hendler (blog | @jahendler). It’s a walkthrough by US Deputy Chief Technology Officer Chris Vein of http://ethics.data.gov .
This website brings together key open data sets such as White House visitors, lobbying, campaign donations, etc. As the URL shows, it’s a subsite of the overall US open data project, http://data.gov. You can see in the image below the datasets that comprise the Ethics data site:
The data is available for download and the website offers some nifty ways of working with, visualizing, and embedding the data. For instance, I’ve embedded the White House Visitor data right here. Go ahead, do some searching or filtering, right here.
You can change the column order by using the Manage button:
You can set up some fairly decent filters (is, contains, etc.) on the columns, too. Here are the visitors named Karen Lopez:
That’s not me. (I seem to recall that I am mayor of the Lincoln Bedroom on Foursquare, though.) This is the problem with trying to use something like First Name and Last Name as a primary key. My data does show up in the Federal Campaign donations list, though. Only one donation…my other donation was returned to me because "Canadians can’t donate to US campaigns". Unfortunately for that candidate, they assumed that I was Canadian based on my residency, not my citizenship. They lost the money, but the other campaign got to keep my money. The entire world is one big data modeling problem, I tell ya. Get your semantics and your syntax right and you can take over the world. Or at least the US.
The real power in open data is being able to find correlations. As Deputy CTO Vein mentions, one could match up the data from the White House visitors, lobbyists and campaign donations to see if the same people or organizations turn up in more than one list. That’s not bad; it’s just more information. This is tough to pull off with any certainty, though, due to that dang primary key issue I mentioned above. What might help this? URIs. Or some other way of uniquely identifying people and organizations.
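Here is a minimal sketch of why a name-only match is shaky. The records are made up for illustration, and the field names (`first`, `last`, `visit_date`, `city`) are my assumptions, not the actual column names in the data.gov sets. It matches visitors to donors on a normalized first and last name, then flags any visitor whose name hits more than one donor record.

```python
from collections import defaultdict

# Made-up records for illustration only; field names are assumptions,
# not the real column names from the ethics.data.gov datasets.
visitors = [
    {"first": "Karen", "last": "Lopez", "visit_date": "2012-03-01"},
    {"first": "Pat", "last": "Smith", "visit_date": "2012-04-15"},
]
donors = [
    {"first": "Karen", "last": "Lopez", "city": "Toronto"},
    {"first": "karen ", "last": "LOPEZ", "city": "Austin"},
    {"first": "Lee", "last": "Jones", "city": "Denver"},
]


def name_key(record):
    """Build a case-insensitive, whitespace-trimmed (first, last) key."""
    return (record["first"].strip().lower(), record["last"].strip().lower())


def match_by_name(left, right):
    """Pair each left record with every right record sharing its name key.

    Left records with no name match on the right are left out.
    """
    index = defaultdict(list)
    for rec in right:
        index[name_key(rec)].append(rec)
    return [(rec, index[name_key(rec)]) for rec in left if name_key(rec) in index]


matches = match_by_name(visitors, donors)
for visitor, candidates in matches:
    status = "ambiguous" if len(candidates) > 1 else "single candidate"
    print(visitor["first"], visitor["last"], len(candidates), status)
```

Even a single candidate is only a name match, not proof that it is the same person. That is the gap a URI or other unique identifier would close.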
To cross-match data, you’ll need to use one of the Export methods or the API (Socrata), or download the data to your own tools.
Data is available for download in these formats:
You can also discuss the datasets right on the site (registration required). There are only 7 datasets that are part of this ethics website, but the data stewards are eager to find out what datasets you’d like to see added. I’d also like to hear what data you think should be part of an ethics website focused on data. I’m thinking:
- Expenditures that required extra approval/oversight
- Travel data (who went where and why)
Some of the criticism that I’ve heard about data.gov is that there are too few datasets or that so much more could be provided. I’ve even heard complaints about money being spent on this service. As Tony Clement, Canadian MP and President of the Treasury Board (site | @tonyclementCPC), said recently about the Canadian open data initiatives: open data is about transparency. We can’t wait until we have all the data, in a perfect format, to share it. He also mentioned that open data is saving the Canadian Government money by significantly reducing the cost of Freedom of Information Access requests. Think about it. What open data will become is self-serve FOIA. No waiting around for someone to spend weeks or months to find some data, then thousands of dollars to prepare and provide it.
I’m also hoping that the move to open data will allow government data architects to influence good data management practices. Exposing the data to sunshine is going to allow us, the people who fund the data collection and processing, to point out where the data is of poor quality. Usability and the ability to integrate data sets are going to be key in making the data useful.
I’m thinking that I’d like to use some of these sets and others from data.gov for some upcoming demos.