Ethics.Data.Gov – Where Open Data is Taking Us
I came across this video via Twitter from my friend Jim Hendler (blog | @jahendler). It’s a walkthrough by US Deputy Chief Technology Officer Chris Vein of http://ethics.data.gov.
This website brings together key open data sets such as White House visitors, lobbying, campaign donations, etc. As the URL shows, it’s a subsite of the overall US open data project, http://data.gov. You can see in the image below the datasets that comprise the Ethics data site:
The data is available for download and the website offers some nifty ways of working with, visualizing, and embedding the data. For instance, I’ve embedded the White House Visitor data right here. Go ahead, do some searching or filtering.
You can change the column order by using the Manage button:
You can set up some fairly decent filters (is, contains, etc.) on the columns, too. Here are the visitors named Karen Lopez:
That’s not me. (I seem to recall that I am mayor of the Lincoln Bedroom on Foursquare, though.) This is the problem with trying to use something like First Name and Last Name as a primary key. My data does show up in the Federal Campaign donations list, though. Only one donation…my other donation was returned to me because "Canadians can’t donate to US campaigns". Unfortunately for that candidate, they assumed that I was Canadian based on my residency, not my citizenship. They lost the money, but the other campaign got to keep my money. The entire world is one big data modeling problem, I tell ya. Get your semantics and your syntax right and you can take over the world. Or at least the US.
The real power in open data is being able to find correlations. As Deputy CTO Vein mentions, you could match up the data from the White House visitors, lobbyists and campaign donations to see if the same people or organizations show up in more than one list. That’s not bad, it’s just more information. This is tough to pull off with any certainty, though, due to that dang primary key issue I mentioned above. What might help this? URIs. Or some other way of uniquely identifying people and organizations.
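To make the primary key problem concrete, here is a minimal sketch in Python. The records are made up, and the "uri" field is a hypothetical identifier that I am assuming for illustration; it is not a column in the real datasets. It matches visitors to donors twice, once on first and last name and once on the assumed unique identifier.

```python
from collections import defaultdict

# Made-up records for illustration only. The "uri" field is a hypothetical
# unique identifier, not something the real datasets are known to provide.
visitors = [
    {"first": "Karen", "last": "Lopez", "uri": "http://example.org/person/1"},
    {"first": "Karen", "last": "Lopez", "uri": "http://example.org/person/2"},
]
donors = [
    {"first": "Karen", "last": "Lopez", "uri": "http://example.org/person/2"},
]


def match_on(key_fn, left, right):
    """Return every (left, right) pair of rows whose keys are equal."""
    index = defaultdict(list)
    for row in right:
        index[key_fn(row)].append(row)
    return [(l, r) for l in left for r in index[key_fn(l)]]


def name_key(row):
    # Name-based key: two different people with the same name collide.
    return (row["first"].lower(), row["last"].lower())


def uri_key(row):
    # Identifier-based key: one value per real-world person.
    return row["uri"]


by_name = match_on(name_key, visitors, donors)
by_uri = match_on(uri_key, visitors, donors)
print(len(by_name), "matches on name;", len(by_uri), "match on URI")
```

Matching on name links both visitors to the one donor, even though only one of them is the same person. Matching on the identifier links only the right one.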
To cross match data, you’ll need to use one of the Export methods, use the API (Socrata), or download the data to your own tools.
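If you go the download route, the cross match might look something like the sketch below. It assumes CSV exports, and the column names (NAMELAST, NAMEFIRST, contributor_name, amount) and all of the rows are invented stand-ins, so check the real headers before borrowing any of it. Because it matches on names, treat whatever it returns as candidates to investigate, not confirmed matches.

```python
import csv
import io

# Stand-ins for two downloaded CSV exports. Column names and rows are
# invented for illustration; the real files may look quite different.
visitors_csv = """NAMELAST,NAMEFIRST,visit_date
Lopez,Karen,2012-03-01
Smith,Pat,2012-03-02
"""

donations_csv = """contributor_name,amount
"LOPEZ, KAREN",250
"JONES, SAM",100
"""


def normalize(first, last):
    """Build a case-insensitive, whitespace-trimmed (first, last) key."""
    return (first.strip().lower(), last.strip().lower())


def load_visitor_names(text):
    """Collect the normalized names found in the visitor export."""
    reader = csv.DictReader(io.StringIO(text))
    return {normalize(row["NAMEFIRST"], row["NAMELAST"]) for row in reader}


def candidate_matches(visitors_text, donations_text):
    """Return donation rows whose contributor name also appears as a visitor."""
    seen = load_visitor_names(visitors_text)
    matches = []
    for row in csv.DictReader(io.StringIO(donations_text)):
        # Assumes the "LAST, FIRST" layout used in the sample data above.
        last, _, first = row["contributor_name"].partition(",")
        if normalize(first, last) in seen:
            matches.append(row)
    return matches


candidates = candidate_matches(visitors_csv, donations_csv)
for row in candidates:
    print(row["contributor_name"], row["amount"])
```

With real downloads you would read the text from the files (for example with open(path, newline="")) instead of the inline strings.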
Data is available for download in these formats:
You can also discuss the datasets right on the site (registration required). There are only 7 datasets that are part of this ethics website, but the data stewards are eager to find out what datasets you’d like to see added. I’d also like to hear what data you think should be part of an ethics website focused on data. I’m thinking:
- Expenditures that required extra approval/oversight
- Travel data (who went where and why)
Some of the criticism that I’ve heard about data.gov is that there are too few datasets or that so much more could be provided. I’ve even heard complaints about money being spent on this service. As Tony Clement, Canadian MP and President of the Treasury Board (site | @tonyclementCPC), said recently about the Canadian open data initiatives: open data is about transparency. We can’t wait until we have all the data, in a perfect format, to share it. He also mentioned that open data is saving the Canadian Government money by significantly reducing the cost of handling Freedom of Information Access requests. Think about it. What open data will become is self-serve FOIA. No waiting around for someone to spend weeks or months to find some data, then thousands of dollars to prepare and provide it.
I’m also hoping that the move to open data will allow government data architects to influence good data management practices. Exposing the data to sunshine is going to allow us, the people who fund the data collection and processing, to point out where the data is of poor quality. Usability and the ability to integrate data sets are going to be key in making the data useful.
I’m thinking that I’d like to use some of these sets and others from data.gov for some upcoming demos.