
On This Day in 1983, Data Analytics Might Have Been a Fail

Sep 26, 2012   //   by Karen Lopez   //   Blog, Data, Data Visualization  //  2 Comments
Stanislav Petrov – Human decision making

 

On 26 September 1983, Stanislav Petrov took a stand against what his systems were telling him, and he may have changed the course of history. Petrov was working as a duty officer at the command center for the Oko nuclear early warning system. This is the place where the Soviets monitored incoming attacks, much like the US command center you remember from WarGames. Earlier that month, the Soviet Union had shot down a Korean commercial jetliner over the Sea of Japan, claiming that it was on a spy mission. 269 people died in that incident, including a US Congressman. Some in the Soviet Union were fearful of a retaliatory strike by the US. Cold War tensions were high.

At the command center, Petrov's systems were reporting that five missiles had been launched from the US towards the Soviet Union. But instead of just reading that dashboard and acting on it, he used his own inner analytics system to process the data, and he decided not to report or react.

Had Petrov reported incoming American missiles, his superiors might have launched an assault against the United States, precipitating a corresponding nuclear response from the United States. Petrov declared the system’s indications a false alarm. Later, it was apparent that he was right: no missiles were approaching and the computer detection system was malfunctioning. It was subsequently determined that the false alarms had been created by a rare alignment of sunlight on high-altitude clouds and the satellites’ Molniya orbits, an error later corrected by cross-referencing a geostationary satellite.[5]

Petrov later indicated the influences in this decision included: that he was informed a U.S. strike would be all-out, so five missiles seemed an illogical start;[1] that the launch detection system was new and, in his view, not yet wholly trustworthy; and that ground radars failed to pick up corroborative evidence, even after minutes of delay.[6]

- Wikipedia contributors. "Stanislav Petrov." Wikipedia, The Free Encyclopedia. Wikipedia, The Free Encyclopedia, 26 Sep. 2012. Web. 26 Sep. 2012.

I’ve always wondered whether he would have been able to distrust the data if the system he was using had had a bunch of fancy dashboard features, like shiny 3D pie charts, moving average lines and drill-down reports. I’ve seen this sort of over-trust of data with data model diagrams. It seems the prettier or more advanced the presentation of the data is, the more people want to believe it is right. In fact, I’ve learned to present draft documents to people on my teams with hand-written notes/comments on them, to sort of "break the ice" and show people that they are drafts. A modern solution might have included some sort of decision-making guidance that says "Confidence Factor of Attack: 99%" or something like that. And it would have been highlighted by some sort of red bar, showing just how confident the system was based on the data – bad data, it turns out.

More details about Petrov and his actions in the video above from History.com

Ethics.Data.Gov – Where Open Data is Taking Us

Mar 15, 2012   //   by Karen Lopez   //   Blog, Data, Data Visualization, Open Data  //  No Comments

I came across this video via Twitter from my friend Jim Hendler (blog | @jahendler).  It’s a walkthrough by US Deputy Chief Technology Officer Chris Vein of http://ethics.data.gov

Walkthrough of Ethics.Data.Gov

 

This website brings together key open data sets such as White House visitors, lobbying, campaign donations, etc. As the URL shows, it’s a sub-site of the overall US open data project, http://data.gov. You can see in the image below the datasets that comprise the Ethics data site:

Ethics.data.gov datasets list

The data is available for download and the website offers some nifty ways of working with, visualizing, and embedding the data. For instance, I’ve embedded the White House Visitor data right here. Go ahead, do some searching or filtering, right here.


 

You can change the column order by using the Manage button:

Show and hide columns

You can set up some fairly decent filters (is, contains, etc.) on the columns, too.  Here are the visitors named Karen Lopez:

Filter Columns

That’s not me.  (I seem to recall that I am mayor of the Lincoln Bedroom on Foursquare, though.) This is the problem with trying to use something like First Name and Last Name as a primary key.  My data does show up in the Federal Campaign donations list, though.  Only one donation…my other donation was returned to me because "Canadians can’t donate to US campaigns".  Unfortunately for that candidate, they assumed that I was Canadian based on my residency, not my citizenship.  They lost the money, but the other campaign got to keep my money.  The entire world is one big data modeling problem, I tell ya.  Get your semantics and your syntax right and you can take over the world.  Or at least the US.

The real power in open data is being able to find correlations. As Deputy CTO Vein mentions, you could match up the data from the White House visitors, lobbyists and campaign donations to see if you find any matches. A match isn’t necessarily bad; it’s just more information. This is tough to pull off with any certainty, though, due to that dang primary key issue I mentioned above. What might help with this? URIs. Or some other way of uniquely identifying people and organizations.

To cross-match data, you’ll need to use one of the Export methods: use the API (Socrata) or download the data to your own tools.
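Once the data is in your own tools, a naive cross-match on names might look like the sketch below. It is only an illustration of the primary key problem: the rows, the column names (`first_name`, `last_name`, `visit_date`, `city`, `amount`) and the helper functions are all hypothetical, not the actual schema of the ethics.data.gov datasets.

```python
from collections import defaultdict

# Hypothetical sample rows; the real datasets use different columns
# and contain far more records.
visitors = [
    {"first_name": "Karen", "last_name": "Lopez", "visit_date": "2011-03-01"},
    {"first_name": "Karen", "last_name": "Lopez", "visit_date": "2011-07-15"},
    {"first_name": "Pat", "last_name": "Smith", "visit_date": "2011-05-09"},
]
donations = [
    {"first_name": "Karen", "last_name": "Lopez", "city": "Toronto", "amount": 100},
    {"first_name": "Lee", "last_name": "Wong", "city": "Austin", "amount": 250},
]


def name_key(row):
    """Normalize first and last name into a (weak) join key."""
    return (row["first_name"].strip().lower(), row["last_name"].strip().lower())


def match_by_name(left, right):
    """Return every (left_row, right_row) pair whose names are equal."""
    index = defaultdict(list)
    for row in right:
        index[name_key(row)].append(row)
    return [(l, r) for l in left for r in index.get(name_key(l), [])]


matches = match_by_name(visitors, donations)
for visitor, donor in matches:
    # Equal names do not prove that these rows describe the same person.
    print(visitor["visit_date"], "<->", donor["city"], donor["amount"])
```

Both made-up visits pair up with the made-up donation, yet nothing in the data says the visitor and the donor are the same Karen Lopez. A shared unique identifier, such as a URI per person or organization, is what would turn these name collisions into real matches.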

Data is available for download in these formats:

Download As

You can also discuss the datasets right on the site (registration required).  There are only 7 datasets that are part of this ethics website, but the data stewards are eager to find out what datasets you’d like to see added.  I’d also like to hear what data you think should be part of an ethics website focused on data. I’m thinking:

  • Expenditures that required extra approval/oversight
  • Travel data (who went where and why)

Some of the criticism that I’ve heard about data.gov is that there are too few datasets or that so much more could be provided. I’ve even heard complaints about money being spent on this service. As Tony Clement, Canadian MP and President of the Treasury Board (site | @tonyclementCPC), said recently about the Canadian open data initiatives: open data is about transparency. We can’t wait until we have all the data, in a perfect format, to share it. He also mentioned that open data is saving the Canadian Government money by significantly reducing the cost of Freedom of Information Access requests. Think about it. What open data will become is self-serve FOIA. No waiting around for someone to spend weeks or months to find some data, then thousands of dollars to prepare and provide it.

I’m also hoping that the move to open data will allow government data architects to influence good data management practices. Exposing the data to sunshine is going to allow us, the people who fund the data collection and processing, to point out where the data is of poor quality. The usability of the data sets and the ability to integrate them are going to be key in making them useful.

I’m thinking that I’d like to use some of these sets and others from data.gov for some upcoming demos.
