
Apr 01
Choosing DATETIME2 as your PK can lead to some exciting challenges

Category: DBA

One year ago, when I really started focusing on SQL Server Database Administration, I was challenged by a talk on choosing the right index. At that stage I was convinced there was only one answer – INT IDENTITY – and was very surprised when somebody suggested a DATETIME2. I was fascinated by the challenge and could imagine all sorts of applications, as well as solutions to the performance issues I was dealing with at the time when handling high-velocity data.

The things that excited me most about the solution were the following points:

  1. DATETIME2 adheres to the Clustered Index Design Guidelines as defined by MSDN
    1. Basically, you should ensure that your Clustered Index key is narrow, unique, static and ever-increasing
    2. Kimberly Tripp goes into further detail in her blog post: The Clustered Index Debate Continues
  2. DATETIME2 also supports range queries, which is very helpful for SELECT statements (see the sketch after this list)
  3. It avoids the need for an AutoID INT IDENTITY column, which is rather meaningless from a business point of view
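
To make this concrete, here is a minimal sketch of such a table and the kind of range query it serves well. The table and column names (dbo.DemoTransaction and friends) are hypothetical, not the actual CPS design; the unique business key from our design session is included:

    -- The DATETIME2 column doubles as Primary Key and Clustered Index
    CREATE TABLE dbo.DemoTransaction
    (
        InsertDateTime     DATETIME2(7)  NOT NULL
            CONSTRAINT PK_DemoTransaction PRIMARY KEY CLUSTERED,
        fkTransactionRowId INT           NOT NULL,
        BusinessKey        VARCHAR(36)   NOT NULL
            CONSTRAINT UQ_DemoTransaction_BusinessKey UNIQUE,
        Amount             DECIMAL(18,2) NOT NULL
    );

    -- A range query reads one contiguous slice of the clustered index
    SELECT InsertDateTime, BusinessKey, Amount
    FROM   dbo.DemoTransaction
    WHERE  InsertDateTime >= '2014-04-01'
    AND    InsertDateTime <  '2014-04-02';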

I was aware of the fact that we could get duplicates, and we discussed this during the design session – the application doing the INSERTs has the following characteristics:

  1. INSERTs are performed one at a time, which I believed meant that we wouldn't get duplicates if I used a DATETIME2 column as both the Clustered Index and the Primary Key
    1. I also added the DATETIME2 column with a default, as we did not want to change the .Net Application (see the sketch after this list)
  2. Only one instance of the Application should ever be running at the same time – multiple instances have some serious negative side-effects
  3. We required a unique Business Key to ensure that transactions were not settled multiple times
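
Point 1.1 deserves a sketch of its own: with a default in place, the application's INSERT statement never has to mention the new column. Continuing with the hypothetical table from above:

    -- The PK column fills itself in via a default, so the .Net code is untouched
    ALTER TABLE dbo.DemoTransaction
        ADD CONSTRAINT DF_DemoTransaction_InsertDateTime
        DEFAULT (SYSDATETIME()) FOR InsertDateTime;

    -- Application-style INSERT: one row at a time, no InsertDateTime supplied
    INSERT INTO dbo.DemoTransaction (fkTransactionRowId, BusinessKey, Amount)
    VALUES (1, 'TXN-0001', 99.95);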

Design work was done and the TABLE design went through all the levels of testing without any complaint. Yes, we did have a challenge with existing data that violated the Unique Key, but that was resolved with a data fix which removed the dirty data from the existing table.

It wasn't long after the table changes went live that we started to experience problems with the INSERTs failing on a DUPLICATE KEY error. This immediately raised many concerns, and we had to go through a series of discussions and SQL Profiler tracing sessions to understand exactly what was happening.

Very early on we ruled out some possible causes:

  1. There were no multiple application instances running
  2. The SQL Profiler trace confirmed that the application was not doing batch INSERTs

So what was the problem?

One of the things that I noticed whilst analysing the SQL Profiler trace was that the start times for 3 INSERTs were exactly the same – how is this possible?
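
One likely contributor is the resolution of the system clock behind SYSDATETIME(): calls made close enough together can return identical values. A quick experiment (my own sketch, not the production trace) makes this visible:

    -- How many consecutive calls land on the exact same timestamp?
    DECLARE @t TABLE (ts DATETIME2(7));
    DECLARE @i INT = 0;

    WHILE @i < 10000
    BEGIN
        INSERT INTO @t (ts) VALUES (SYSDATETIME());
        SET @i += 1;
    END;

    SELECT ts, COUNT(*) AS CallsOnSameTick
    FROM   @t
    GROUP  BY ts
    ORDER  BY CallsOnSameTick DESC;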

Early this week I read a blog on learning by breaking things, and it was one of those blogs where you can fully relate to what the author has said, because you have done it before but you know that you don't do it enough. So I built my own simple model of the design, then developed some test cases to break it. Strangely enough, it didn't break (see the profiler results). This now started worrying me for the following reasons:

  1. My profiler trace was not showing the same start times that I saw on production – several rows with the same start time
  2. Was there a faster way to run the data INSERTs?
  3. Production had to be fixed by 06:00 in the morning and I could not replicate the problem or break the design

At this stage of the evening, and of the week as well, it was time to braai and drink red wine. Typically I really enjoy this time-out, but my brain could not make sense of the problem. After a couple of glasses of red wine and a lovely time with my wife I couldn't face going back to my machine, and I went to bed. Bad mistake: by 01:00 my head would not let it go, and I realized that I had to try and emulate the load – several sessions running concurrently. So I developed Test Case 3, something I would never normally do, but it got closer to the way in which the .Net code was running.
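
In spirit, Test Case 3 was simply the same single-row INSERT loop pasted into several SSMS windows and started at the same time. A sketch along those lines, reusing the hypothetical table from earlier:

    -- Run this identical batch in 3 SSMS sessions at once; sooner or later
    -- two sessions compute the same DATETIME2 value for the Primary Key
    SET NOCOUNT ON;
    DECLARE @i INT = 0;

    WHILE @i < 100000
    BEGIN
        INSERT INTO dbo.DemoTransaction (fkTransactionRowId, BusinessKey, Amount)
        VALUES (@i, CONVERT(VARCHAR(36), NEWID()), 1.00);
        SET @i += 1;
    END;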

Figure 1: .Net INSERT code

As I started my machine and ran the test code it took forever, and I noticed that I had turned on the Actual Execution Plan. I switched off the Execution Plan and it failed immediately – what a relief!

Figure 2: Duplicate Key Error


Now I could start fixing the problem. The change to CPS included adding a TRIGGER (I know, bad news, but we couldn't change the application), but at the same time this meant that I had a code point from which to effect a change that was desperately needed.

First attempt: DATEADD(ns,A,SYSDATETIME()) – take the value of A and add it to the new "now" for DATETIME2. No luck, it still failed. This puzzled me, as I was calculating the value for the PK and it should have been unique. In researching DATEADD I found a very interesting clause: a DATETIME2(7) value is only accurate to 100 nanoseconds, so an increment of a few nanoseconds is simply rounded away.

This meant that A was not causing the required change. So I decided to go for broke.

Second attempt: DATEADD(ns,10000*A,SYSDATETIME()) and it worked! What a relief.
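
The difference between the two attempts is easy to demonstrate in isolation (a sketch: since DATETIME2(7) only resolves to 100-nanosecond steps, a tiny increment disappears while the 10,000-fold bump survives):

    DECLARE @now DATETIME2(7) = SYSDATETIME();

    SELECT @now                          AS BaseValue,
           DATEADD(ns, 7, @now)          AS Plus7ns,      -- rounds back to BaseValue
           DATEADD(ns, 10000 * 7, @now)  AS Plus70000ns;  -- 70 microseconds later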

Now for the volume test I had been dreaming about – 3 SSMS sessions running concurrently, but using different starting points for @A: 10,000; 20,000 and 30,000. After several seconds the first SSMS session failed with the duplicate key. Fortunately for me, this was something that we wanted to happen, but it does not vindicate the table design.

I applied a similar change to production: InsertDateTime = DATEADD(NANOSECOND,10000*I.fkCPSTransactionRowId,SYSDATETIME()).

The change worked and the CPS INSERTs completed successfully.
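
The fix lived in a trigger precisely so the .Net code could stay untouched. Here is a stripped-down sketch of the shape it took – the table and trigger names are hypothetical, and only the DATEADD expression reflects the actual production change (which used I.fkCPSTransactionRowId):

    CREATE TRIGGER trg_DemoTransaction_Insert
    ON dbo.DemoTransaction
    INSTEAD OF INSERT
    AS
    BEGIN
        SET NOCOUNT ON;

        -- Spread the keys out by 10,000 ns per row id, so rows arriving within
        -- the same clock tick still receive distinct DATETIME2 values
        INSERT INTO dbo.DemoTransaction
               (InsertDateTime, fkTransactionRowId, BusinessKey, Amount)
        SELECT DATEADD(NANOSECOND, 10000 * I.fkTransactionRowId, SYSDATETIME()),
               I.fkTransactionRowId,
               I.BusinessKey,
               I.Amount
        FROM   inserted AS I;
    END;

Conveniently, an INSTEAD OF trigger that inserts into its own table does not fire itself recursively, so the rewritten INSERT passes straight through.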

Questions that still need answering:

  1. How does this Duplicate Key happen?
    1. Can I really be doing multiple INSERTs at the same time (within the same nanosecond) and so get a DUPLICATE KEY?
  2. Why do I need to apply such a radical bump factor? I tried with 1000*A and it still failed

Table Design answer:

  1. Go back to an INT IDENTITY column for the Primary Key – yes, it is a waste, but it is more reliable
    1. Fix the CONCURRENCY by using the correct unique Business Key to prevent loading duplicates – we have done this, so we don't need to stress
  2. Use the InsertDateTime for the CLUSTERED INDEX to maintain the order of data storage (see the sketch after this list)
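
Translated into DDL, the revised design looks something like this (again a sketch with hypothetical names, mirroring Figure 3):

    CREATE TABLE dbo.DemoTransactionV2
    (
        TransactionId      INT IDENTITY(1,1) NOT NULL
            CONSTRAINT PK_DemoTransactionV2 PRIMARY KEY NONCLUSTERED,
        InsertDateTime     DATETIME2(7)  NOT NULL
            CONSTRAINT DF_DemoTransactionV2_InsertDateTime DEFAULT (SYSDATETIME()),
        BusinessKey        VARCHAR(36)   NOT NULL
            CONSTRAINT UQ_DemoTransactionV2_BusinessKey UNIQUE,
        Amount             DECIMAL(18,2) NOT NULL
    );

    -- The clustered index keeps rows in arrival order and no longer has to be
    -- unique: SQL Server silently adds a uniquifier when two rows share a value
    CREATE CLUSTERED INDEX CIX_DemoTransactionV2_InsertDateTime
        ON dbo.DemoTransactionV2 (InsertDateTime);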

 

Figure 3: Really simple table design

Figure 4: Test Case 1

Figure 5: SQL Profiler trace for Test Case 1

Figure 6: Test Case 2

Figure 7: SQL Profiler trace for Test Case 2

Figure 8: Test Case 3

Mar 04
You have to feel my pain … a look at customer development

Customer development is new to me; in fact, until Howard introduced me to the concept, I had never heard of it. He stumbled upon it at a startup conference hosted by Jozihub last year. Formulated specifically for startups, the concept is about researching the viability of a product prior to its development, and ascertaining whether anyone would be willing to pay good money for that awesome idea that will make us rich quickly. Howard saw it benefitting us hugely.

Researching customer development

I spent time researching and understanding this rather touchy-feely marketing approach. Interviews are carried out in a very broad manner: the interviewer doesn't lead the customer in any way, allowing the interviewee to talk about what he believes will make his business successful and the challenges that stand in the way of that success.

I learnt that it is an effective way to do market research on an intangible concept like software. We have no packaging, no pretty product, not even any cool features to advertise or market – as Veronica coined it, we sell vaporware.

Howard was keen for someone who wasn't a developer to do the customer development interviews because, as I do not have the capability to fix the problems the customer presents, I would really listen; unlike him, my head would not be searching for a solution while the interview was progressing. I would simply listen!

My script has various paths that I could take depending on the answers to my questions. I formulated a flow chart, as seen below, and began by finding out about 'the who' – where the interviewee is placed in the organisation – following on with questions about success and what it looks like. Subsequent questions pertained to data and how it challenges the organisation's idea of success. I back this up with a set of questions about the people within the organisation and how they were involved in the solution process. A common theme throughout the questioning is to repeatedly ask why – 'Why was that hard?', 'Why did your staff not understand the urgency?' – hoping that the questions will encourage the interviewee to probe the problem deeply. Every question should be followed up with a 'why?'

My final set of questions, the most challenging to me, pertained to what the interviewee would be willing to pay to make that pain go away.

With many service-based products, positioning oneself price-wise is one of the biggest challenges faced in software development, but I have to confess that asking that particular question is daunting for me.

The flow chart of questions that we formulated shows that the questions are open-ended and can be visited multiple times

We are good to go

Once we were happy with the questions, Howard set about finding a sympathetic first subject, deciding on his friend and colleague JG Cowper, CDO of Healthbridge. He felt that JG would be patient, but also able to give a solid interview. We sat in together for this interview, and these are my findings on the interview process:

  • It is important to set the context before the interview commences; otherwise the process can head off in a direction that will bring no returns. JG commented that it is important for us to hear his pain even if it is not directly connected to his data, which is true, but if I fail to keep steering the conversation back to data we will not get the information that we are looking for.

  • Customer development questions are ethereal for this very reason, and you have to resist the temptation to lead the conversation. I imagine that with each interview the questions will evolve as we work out what was successful and what was not.

  • More than one interview is probably needed: the first general and non-leading, the second more specific, as we discover parts of data governance that have not been covered and clarify the meaning of some answers.


Formulating the feedback

The infographic illustrating the findings of our customer development interview


While carrying out the interview we discovered that JG was very enthusiastic about plotting the findings of the interview against an industry-accepted standard such as DAMA. Customers are curious about where they are pegged in the industry, and will be far more open to giving us their valuable time if we can give them some feedback. Whilst Howard is not mad about getting too carried away with DAMA, or even IME, we need to come up with a compromise.

I transcribed the interview, and Howard and I plotted the results on a DAMA wheel, noting both the physical and emotive results. During this process we learnt that:

  • Compiling visual material is essential, and an exciting way to document the findings and explain the results to your interviewee. We designed a heat map and an infographic documenting the results.
  • The heat map was the most powerful visual tool, made up of emotional feedback (the amount of stress shown and how many times emotive words were used) and physical feedback (actual answers to questions) from the interview. Inadvertently, the emotional one drew the most reaction and became the focus of attention, leading us to re-evaluate; we decided that the physical one needs to be presented alone initially, and only later with the emotive overlay.

  • At this stage we went back for more, filling in the missing 11 segments of DAMA and presenting our findings so far.

Customer development is time consuming and labour intensive!

The heat map formulated from the customer development interview, plotted on a DAMA wheel with both the physical and emotive reactions

A final report was compiled, and Howard used the industry standard, Gartner BI, for the recommendations. Staffing issues were reported, as success is often about placing people in the correct positions – stop running and get others to do the running for you! Howard offered tips on building a successful BI platform that would facilitate the transition from being IT-Curated to User-Empowered. This is a fascinating journey that we are on, and one that we hope will give us a deeper understanding of data governance and the requirements of the industry.

 

Feb 02
Why should I be mentored?

There are two reasons that I feel are the most important to me:

  1. How can I mentor others if I am not being mentored?
  2. For me to really enjoy and excel at something I need to know "How to think"


To Mentor you need to be mentored


Many years ago I attended a financial planning conference in Cape Town. I went down expecting to learn all sorts of ways to get rich and plan for the future. What I wasn't expecting was to be challenged about my mentoring of the people in my business!


One of the talks was given by a man who was in his 80s and he said that he still had a mentor! I thought that by the time you got that old, there was not much people could teach you about life. He said something that has radically changed my thinking about mentoring:


"How can you mentor people if you are not being mentored yourself?"


I started my 29-year career after finishing my BSc Computer Science degree, working for Edgars, a clothing company in South Africa, as an Adabas Database Administrator on an IBM mainframe. The learning curve was very steep and I enjoyed every minute of the long climb. The technical support department was very lucky to have a man called Dennis Murray. He challenged everything I said, and I soon realized that I couldn't open my mouth before I knew what I was talking about.


Dennis was in charge of MVS and didn't really understand the world of data. To my surprise, he didn't need to! If I presented what I had researched and understood, and what I didn't understand, he could help me without fail. The process of preparing for a discussion with him and then watching him detect my incorrect understanding – or lack of it – was intense, mainly because I thought I had prepared properly. After spending 3 years with him I had grown a lot technically, and today I still attribute much of my success to the time spent with Dennis.


I learnt how to think and work through the issue facing me. The first point was always: do you really understand what you are talking about? If not, go back and find out where you lost the plot. This meant that I had to break down my understanding to the simplest level. Many years later I read about Occam's razor:


"The principle states that among competing hypotheses, the one with the fewest assumptions should be selected. Other, more complicated solutions may ultimately prove correct, but—in the absence of certainty—the fewer assumptions that are made, the better."


As I progressed through the ranks, the pool of people who could mentor me in the way that Dennis could got smaller and smaller. Living in South Africa made it harder and harder, and I stopped searching for a wise man. This is one of the bigger mistakes I have made in my life.


Last year, I received a call from my brother, who has a software development company, saying, "We really, really need help!" For some reason, I have always got involved in difficult and stretching engagements. This was no different; the learning curve was similar to the early days at Edgars. Fortunately, the internet was the fire hydrant that would provide the water I needed to drink. The #sqlfamily was HUGE, and for months I couldn't read enough about the "accidental DBA"! As my knowledge of the how and what of SQL Server progressed, I started to realize that my foundation was not deep enough.


I would listen to myself talk to the developers and constantly hear Dennis say – really? This pushed me to learn more and more, but I still feel that I don't really know how to think like the SQL Engine. And this is where I need help to get my thinking right.


I read a fascinating article on "Dashboard visualizations" which discussed and explained the thinking behind the dials in an aeroplane cockpit. An inexperienced person just sees hundreds of dials and wonders: what do I look at? The article explained that the main dials are always at eye level, and the other dials are positioned according to a well-defined story of events. This means that the next dial you look at depends on what the previous dial told you. This is a lovely analogy for what we should all strive for when building a feedback system for management and other readers.


Then something else struck me, words that another brother challenged me with: "If you can't explain something simply then you probably don't understand it well enough to teach me!"

   

I need to know "How to think"!


I spend a lot of time training people to "take over" from me, and it is really worrying that I am still not sure that I fully understand "how to think" about the SQL Engine. I need to be able to work things out from first principles and build onto the facts and hypotheses that I have collected and developed, knowing that I am building on stone and not sand.


I have been developing MS SQL Server software solutions for customers for the last 10 years, and I am constantly amazed at how I continue to learn on a daily basis. As the software development community progressed from COBOL and C to C++, I quickly realized that just learning the new C++ syntax was not good enough; I had to learn about Object-Oriented design and development. The change was radical, and I can see with T-SQL how my coding has changed from RBAR to "thinking in sets". I had to force myself to learn, and not just code to be "DONE"!


On-Time Delivery is important but I also had to be CORRECT for the world I was living in.


Now I find myself in the uncomfortable position of knowing that my thinking is not solid, and is flaky in some areas.


In summary


I am at an age now where the need for success is not as important as significance, and I would love to help people more and more. Whilst I am comfortable admitting that I don't really understand, I want to really understand so that I can help others around me. South Africa still suffers from a small knowledge base. This in no way means that we are not capable, just that we are limited in the people we can rub shoulders with to exchange ideas and understanding.


What we don't need is the sharing of "flawed thinking", and this bothers me. Fortunately, the #sqlfamily community is not lacking in the ability or desire to help in the areas where I fall short.