Grab this Free eBook on What’s New in SQL Server 2012 (288 pages)
Ross Mistry (site | @RossMistry) and Stacia Misner (blog |@StaciaMisner) have written Introducing SQL Server 2012, a 288 page eBook and it’s free for download. It’s currently available only as a PDF, but according to the publisher, .mobi and EPUB version will be available by 23 March.
Ross covers the newest features and enhancements on the database administration side while Stacia describes what’s new with business intelligence, data quality, master data, analysis services, integration services and reporting services.
Once you get the book, you’re going to want to get your hands on the pre-release (RTM) version of SQL Server 2012.
So head over to http://www.microsoft.com/sqlserver/en/us/default.aspx , download the RTM of SQL Server 2012 and get started. You should also check out the demos from the SQL Server 2012 Virtual Launch site for overviews of some of the newest features.
New in SQL Server 2012 SEQUENCEs: Why They Aren’t Just for Surrogate Keys
Starting with SQL Server 2012, database designers will have the ability to use SEQUENCEs to generate numbers, such as for populating surrogate keys (primary Keys with no meaning). Sounds like the IDENTITY property, right? It’s not quite the same. In SQL Server (and other DBMSs), the IDENTITY property is a characteristic of a column in a table. A database designer can set this property up to automatically generate a value in that column and specify certain features such as what number to start with and what number to increment by. These properties are set for a specific column and only one column in a table can have this identity property. That’s why IDENTITY is almost always used as a surrogate primary key on a table.
To create a column with the IDENTITY property in SQL Server 2012:
CREATE TABLE
[ database_name . [ schema_name ] . | schema_name . ] table_name
[ AS FileTable ]
( { <column_definition> | <computed_column_definition>
| <column_set_definition> | [ <table_constraint> ] [ ,...n ] } )
[ ON { partition_scheme_name ( partition_column_name ) | filegroup
| "default" } ]
[ { TEXTIMAGE_ON { filegroup | "default" } ]
[ FILESTREAM_ON { partition_scheme_name | filegroup
| "default" } ]
[ WITH ( <table_option> [ ,...n ] ) ]
[ ; ]
<column_definition> ::=
column_name <data_type>
[ FILESTREAM ]
[ COLLATE collation_name ]
[ NULL | NOT NULL ]
[
[ CONSTRAINT constraint_name ] DEFAULT constant_expression ]
| [ IDENTITY [ ( seed ,increment ) ] [ NOT FOR REPLICATION ]
]
[ ROWGUIDCOL ] [ <column_constraint> [ ...n ] ]
[ SPARSE ]
http://msdn.microsoft.com/en-us/library/ms174979(v=sql.110).aspx
I’ve highlighted the syntax that applies the IDENTITY to a column. SQL Server will know which column has the identity property (and you can only have one of these columns in the same table). Notice that there are only three options for an IDENTITY.
To create a SQUENCE in SQL Server 2012:
CREATE SEQUENCE [schema_name . ] sequence_name
[ AS [ built_in_integer_type | user-defined_integer_type ] ]
[ START WITH <constant> ]
[ INCREMENT BY <constant> ]
[ { MINVALUE [ <constant> ] } | { NO MINVALUE } ]
[ { MAXVALUE [ <constant> ] } | { NO MAXVALUE } ]
[ CYCLE | { NO CYCLE } ]
[ { CACHE [ <constant> ] } | { NO CACHE } ]
[ ; ]
http://msdn.microsoft.com/en-us/library/ff878091(v=sql.110).aspx
Notice that there is no TABLE or COLUMN referenced in that TSQL. You are just creating a SEQUENCE. SQL Server will have no idea what you are going to use it for or where it will be used. In fact, you could create a SEQUENCE and never use it. Notice there are several more options for a SEQUENCE.
SEQUENCEs are separate objects in a database and can be used to populate more than one column even across several tables. Because they aren’t bound to a table in any way, they can be used in ways that IDENTITY columns can’t be used. For instance:
- One SEQUENCE can be used across more than one table. For instance, a Control Number or Document Number on several tables. I’ve seen this used in other DBMSs to generate document numbers across subtypes, say Purchase Orders, Shipping Notices, Return to Vendor Numbers, etc. There’s a requirement that they be unique across all types of documents. This could not be done with IDENTITY.
- Using a SEQUENCE allows tables to have more than one auto-generated number in multiple columns. Think of Order Number and Cancellation Number, both on the Order table. You can’t do that with IDENTITY.
- Since sequences are generated completely outside a table, you can chose to store the resulting value in a different format, say VARCHAR or CHAR. Or, heaven forbid, you might want to append data to it or do other kinds of evil to it before storing it.
- Since sequences are generated completely outside a table, you can store the results in a NULLable column.
I think the confusion sets in because we tend to think of auto-generated number as useful only as surrogate keys. But we have real data needs for generating numbers other than just row identifiers, even if that’s the primary use for this. In fact, one could have a table with a PK that uses IDENTITY and an unlimited number of columns that use a SEQUENCE to set their values. Of course, SEQUENCEs can be used for surrogate primary keys, too. The existence of SEQUENCEs gives us another option over IDENTITY.
So SEQUENCE is a feature that just generates numbers based on some characteristics. A designer has many ways to use the number once it is generated and tables are not dependent on that method to get that number, nor do they even “know” about how that value was provided.
This separation of the COLUMN from how the data is populated is a feature of using SEQUENCEs. How one uses this separation depends on the technical and business requirements for the data.
Preview to Tuesday, 7 February Webinar on Tailoring Data Models
Here’s a preview data model for my webinar on 7 February (yes, today!) at 1PM EST. My topic is Help your Business Love its Data (Models): Tailoring Data Models for your Audience and it’s Part 2 of a three part serious on Getting Down to Business, sponsored by CA.
It’s free to register at https://www.ca.com/us/events/webcasts/na/Lets-Get-Down-to-Business-Help-your-Business-Love-its-Data-Models/02-07-2012.aspx
This is what I’ll be covering:
7 Feb 2012
1:00 p.m. – 2:00 p.m. ESTDuration 60 min.
There’s no one data model hiding in your modeling tool. There are actually thousands of them – not just multiple data model files, but different views and presentations of the same data model, each one ready to be used by different purposes and outcomes. In this session, Karen Lopez will discuss the steps in of preparing and presenting the “right” data model for the right audience, as well as making them accessible via the web. We will also cover the 10 tips for ensuring that your audience is happy they attended the data model presentation and looks forward to attending the next one.
As usual for my presentations, this will have a bit of snark and talk about good things to do in business data models an show some anti-patterns for modeling, too. That’s probably where the snark will come it. Can’t guarantee it, but it sounds about right.
Bring your ideas about tailoring the presentations of data models or any type of design. Oh, and if you have any action figures, bring those, too.
Deadlines and Data Architects: You know the Date, Do You Know the Deliverable? #MemeMonday
It appears I’m an expert at deadline-driven accomplishments. I mean, it’s meme Monday and even though I’ve known about the deadline for today’s post for weeks, I’m writing it at lunch on said day. I’ve been thinking about what I’d write about for all this time, so it’s not like I’m just starting…but still, the deadline is what making this post happen. I wish I could be the type of person who gets stuff done because it just needs to be done, but deadlines are what drive me.
The ultimate inspiration is the deadline.
-Nolan Bushnell
I’ve been thinking about some experiences I’ve had working with other data architects and working with deadlines. Usually as part of a team, we don’t get to set our own deadlines; they are set by a project manager or if we are lucky, via negotiation with the DBAs, developers and other architects on the team. Our deliverables are typically the input into their deliverables, so their success is depends our getting our deliverables to them. In my experience, though, it’s too common for data architects and data modelers to forget what deliverable those teams are waiting for or for them to make it available in the right format.
In his post for #mememonday, Thomas LaRock (@sqlrockstar | blog ) give his best tip for working towards a deadline: work backwards.
What is Your Deliverable?
This is the most frustrating question. It should be very apparent to a data architect what deliverables the team needs from them. But time after time I see modelers focused 99.999% on the data models. Yes, your data models should be beautiful. They should be complete, correct and pretty. Every single one of them should get a trophy. All the meta data you want to capture about them should be in the data model. Do you think, though, that your development DBA and developers are tapping their feet waiting for the PDFs of the data models? Or do you suppose that they just might want that alter script to run in their environment so that they can get to work on their deliverables?
Ideally, the team wants both. They want the model to be created and they want the scripts with all the changes needed for their next deliverable. I can assure you, though, that the scripts are what they want first. I can also assure you that no one will love the data models more than you do, Mr. Data Architect. So embrace the love you feel for the data models and funnel it into getting the "real" deliverable done: DDL. Sure, it hurts a bit that they don’t love your models as much as you do, but they will learn to love them because they produce beautiful databases.
On most my projects (mostly development ones) I never lose sight of the fact that ultimately, the data models are a method to get to good quality databases and data.
What Makes for a Good Deliverable?
I worked with a data architect who was very passionate about her data models. She spend days making them beautiful. She filled them with all kinds of nifty meta data to help people understand their data better. She wrote definitions that were paragraphs long, then added more notes to them just to make it even better. She often missed review meetings because she was too busy making a more beautiful data model. She missed deadlines for delivering DDL because she wasn’t quite sure on how to do that. She researched data modeling layout techniques to make models more readable. She demoed different layouts and took surveys about the alternatives to find just the right one. And she got further and and further behind on delivering revisions to development, her only direct customer of her work. The team had gone from every other day deliveries of fixes and enhancements to the development databases to every other week.
Our modeler didn’t have time to learn the features of our target DBMS because was busy making the model a perfect data model. Our sprints were failing because database changes were coming too slowly and were often done incorrectly. The DBAs and developers had to spend days cleaning up the DDL she had produced because she didn’t know how to use those features of our modeling tool. So not only did she miss the deadlines, she missed the deliverables. She didn’t understand this, though. In her mind, the data model was all anyone needed. In reality, what they really needed was working DDL scripts to apply to their environments. That was what good meant to the team.
What Format?
I worked with another data architect who would spend months working on the data model, then publish it inside a PDF of a Word document. The actual data model file was not shared. No DDL was shared. Nothing was delivered that teams could actually import into their tools or apply to their local development environment. This also meant that it was nearly impossible to comment on the data model or easily provide corrections or feedback. The modeler wanted corrections to come to him in emails and he would make them and generate the Word documents and PDFs months later. Requests for the data model file itself were met with "You don’t need that, just retype the model in whatever tool you use". We can’t build stuff on PDFs. Produce them, sure. But they aren’t the deliverables we are looking for. Your team (your customers) can tell you what format works best for them. Just ask.
Where Do You Deliver the Deliverables?
I work with many data architects who don’t use the same deliverable locations as the rest of the project team and I think that’s a huge failure. Professional development teams use some sort of versioning or configuration control environment to share their work product. Data architects should use those, too. Yes, we have our repositories and model marts, but those are typically only accessible by people who use the modeling tools. They are my versioning control for the models, but I also deliver to the teams in the same place they expect to find all the components of their systems. Maybe it’s just a wiki, or an intranet location. Where ever it is, I’m delivering there.
I worked with a data modeler who refused to use or learn the team’s versioning system. So he just emailed around files, with no version numbers and expected people to be able to search through their emails to find scripts to deploy them in production environments. He seemed to randomly include and exclude people in the distribution list. He often just attached a bunch of files, then said "here they are" in the body of the message. Email is the worst versioning system in the world. Don’t use it for deliveries.
Another approach I find annoying is fairly new, though. I work with a an architect that keeps all the files he works on for several clients on Dropbox. All in one giant folder. When he gets his work done, he sends out an email saying: it’s all on Dropbox. We as a team have to try to figure out what the files are, which is the right version and try to ignore all the other stuff that sitting there in the folder. As afar as I’m concerned, he might as well randomly generate his data models.
Work Backwards
If you have the answers to those questions above, you can work backwards to meet your deadlines. My inefficient team members mentioned above don’t do this. They start with a generic to-do list of how to produce data models, start at step one and work their way as far as they can before they hit the deadline. They often miss deadlines because they haven’t started at the end and worked backwards. Don’t be that guy (or gal). Know what you are expected to deliver, when, where, and in what format.
I start with what my deliverable is, what format I need to produce it in, and what measures I should use to ensure it’s good enough. Then I start there with the tasks that will get me closest to that completion. It’s only when those are completed do I keep working backwards to fill in the other high priority tasks. If I still have time left, I do more. I make it prettier. I make it more useful. I make it beautiful. I love it.
Not all data architecture is done on development projects. I understand that. If your duties include supporting development teams, though, you need to support them. That means loving your data models, the data AND databases. There’s no reason why you can’t have it all. Just remember which parts you’re supposed to deliver first.
Have you worked with data architects or modelers who worked backwards and got the job done? What about people like I’ve mentioned in this post? What would you wished they had done instead?
Get Hands On with SQL Server 2012 – Virtual Labs
Want to get your hands on SQL Server 2012 right now? You can be up and running without needing a server or to install SQL Server locally. Just visit SQL Server Virtual Labs (use IE) and choose a lab to get started. The labs run in a virtual machine, so these aren’t just slides and a demo. This is real, hands-on working with the tools. Here’s your chance to get up to speed on SQL Server before the other 99%.
.
You can get real experience with new SQL Server 2012 features such as:
- AlwaysOn, SQL Server’s new High Availability feature. I have to say this is fun to play with. Pull the plug and watch SQL Server gracefully fail over. Like nothing happened at all. Monitor the status of all the AlwaysOn components. Very cool stuff here.
- Master Data Services, SQL Server’s new feature for managing reference and master data.
- Data Quality Services, a new feature for helping you Love Your Data even more.
- Sparklines and Data bars, new data visualization features.
- Columnstore Indexes, SQL Server’s new feature to make queries just fly.
- Spatial and Location Services for mashing your data up with location based services.
While you are running the VM set up, register for the SQL Server 2012 Launch events coming in March in the US. I hope to make it to one or two of those.
Another Zombie Job Posting…Data Architect Designer Implementer Operational Support
I blogged over on Dataversity about Hiring Data Professionals: Mason Dixon Lines and Zombies in Your Job Postings . In that rant, I talk about organizations that want to hire people who can do everything in the data column of the Zachman Framework.
I call these people "wonder candidates" and write about how they don’t exist in sufficient numbers to supply all the organizations in the world:
It would seem to make sense that if you were hiring a data professional you’d design a position that fills in the Data column, right? No? It turns out, though, that most people don’t think and work along a column. In my experience, people aren’t passionate about tasks that span columns from top to bottom. They normally aren’t skilled along the whole column, either. Referring to the Zachman Framework, what sorts of skills and passions would this candidate need: planning, architecting, designing, building systems, building parts, keeping the systems up and running.
I thought about my rant in this area while reading a job posting on Dataversity for a Data Architect. I’m sure the people at Miami Children’s Hospital do amazing things, probably with very limited budgets. That’s why these hiring organizations tell me they have to fill their positions with Architect Designer Developer Implementer Operational Support Wonder Candidates. I’m going to pick on this posting, so apologies to the hospital for using them as an anti-pattern for finding good data architects. I’m sure they are nice people there and really want to get to successful database and data warehouse solutions. You might even want to apply for that job.
“Designs and constructs very large relational databases for data warehousing. Develops data models and is responsible for data acquisition, access analysis and design, and archive/recovery/load design and implementation. Integrates new data with existing data data warehouses in design and planning.”
Right there we have the keywords design, constructs, develops, implementation. These activities are done in different rows in the data column of the Zachman Framework. There’s also this:
performance tuning, data retention policies, data classification, data security, and data acquisition….Data modeling experience. Database and application object management, including DDL, table constraints and triggers, clusters, object storage allocation and tuning, indexing options and tradeoffs, partitioning, etc., experience.”
Those activities are clearly down in the lower half of the Framework. Yet data modeling, which exists along the entire data column, is not typically a strong skill set for people who work so far down in the Framework. So my guess is that professional data architects and modelers will not be qualified to do the clustering/partitioning/indexing/performance tuning part of the job and that implementers who can won’t be qualified to prepare and maintain the data models they also want out of this role.
If I were interviewing for this type of position, I’d focus on why this organization wants data models but doesn’t seem to want to fund a data architect. It’s sounds crazy, but I recommend that organizations not incur the costs of preparing and maintaining data models when they don’t want to work with professional data modelers. They won’t see many of the benefits of having an active data model but will incur all the costs and the risks associated with preparing incorrect ones.
I realize that there are many successful IT professionals who can work along many rows and columns. I’ve worked with these amazing people. But staffing a team of these amazing people is costly: they are difficult to find, expensive to hire, and tough to keep around because:
There may be people who can do a lot of those things, but in my experience they aren’t passionate about all of them. New hires won’t be happy and the organization will not realize the economies that they think they will.
I recommend that if organizations want to combine responsibilities that they do so across the columns in the same range of rows. Combining positions where thought processes are similar (business and data analysts, DBAs and developers, etc.). Analysts in general make for good analysts in other columns. Operational people tend to think operationally, builders tend to think mostly of building, not planning well. Let’s not drag people up or down the rows.
Go now and check your job postings. Do they reflect the true nature of the job? Or are they actually full of zombies ready to drag someone to an assignment that they don’t really want?
Do you work with any of these Zombies? People who have been hired to fill several jobs, but don’t have the passion or skills to do all of them? How is that working?

Subscribe via E-mail
Recent Comments
- Karen Lopez on Strutting: We all Know When You are Doing It. So Stop.
- Joey D'Antoni on Strutting: We all Know When You are Doing It. So Stop.
- Karen Lopez on Strutting: We all Know When You are Doing It. So Stop.
- Thomas LaRock on Strutting: We all Know When You are Doing It. So Stop.
- Karen Lopez on Strutting: We all Know When You are Doing It. So Stop.
Recent Posts
Downloads
- EDW 2013 Karen Lopez Get Blogging
- Karen Lopez presentation DAMA PS 2012
- Data Modeling Contentious Issues - DAMA Nebraska
- Karen Lopez - 10 Physical Blunders - DAMA
- Career Success In Data Profession - DAMA
- The Straw Poll
- You've Just Inherited a Data Model CheckList
- KarenLopez - 5 Physical Blunders - 24HOP-2011
- Handouts for OEMUG / CA Global Modeling User Group Why Be Normal Webcast
- Handouts Database Design Contentious Issues - New York 2010
- Handouts Database Design Contentious Issues - DC 2010
Archive
- May 2013 (4)
- April 2013 (5)
- March 2013 (4)
- February 2013 (7)
- January 2013 (12)
- December 2012 (2)
- November 2012 (3)
- October 2012 (3)
- September 2012 (13)
- August 2012 (5)
- July 2012 (17)
- June 2012 (2)
- May 2012 (4)
- April 2012 (4)
- March 2012 (8)
- February 2012 (11)
- January 2012 (3)
- December 2011 (10)
- November 2011 (8)
- October 2011 (5)
- September 2011 (3)
- August 2011 (9)
- July 2011 (5)
- June 2011 (5)
- May 2011 (5)
- April 2011 (9)
- March 2011 (4)
- February 2011 (9)
- January 2011 (8)
- December 2010 (15)
- November 2010 (27)
- September 2010 (2)
- August 2010 (1)
- July 2010 (4)




