SQL Server– How to Encrypt Column Data
Encryption is one of the most secure way to protect your confidential data like Social Security Number, Date Of Birth , Patient clinical information etc. We need to encrypt certain data to meet the business requirements and sometime to meet certain compliance. Here we’ll see how can we encrypt column data using symmetric key encryption in SQL Server 2012.
DEMO
Lets first create a table and insert a record into it. Here in this demo we’ll encrypt the RegdNumber column data.
CREATE TABLE Employee (EmpID int, EmpName varchar(100), RegdNumber nvarchar(20), EncryptedRegdNumber varbinary(128))
GO
INSERT Employee (EmpID, EmpName, RegdNumber)
Values (1,‘Jackson’,N’Regd_AA1′)
Steps to encrypt column data
Step 1- Check if master key does not exist then create one
IF NOT EXISTS
(SELECT * FROM sys.symmetric_keys WHERE symmetric_key_id = 101)
CREATE MASTER KEY ENCRYPTION BY
PASSWORD = ‘abc12$$^7′
GO
| Note : You should back up the master key by using BACKUP MASTER KEY and store the backup in a secure, off-site location. |
Step 2 – Create certificate
CREATE CERTIFICATE EmployeeCert
WITH SUBJECT = ‘Employee RegdNumber’;
GO
Step 3 – Create symmetric key using the certificate
CREATE SYMMETRIC KEY EmployeeRegdNumberKey_1
WITH ALGORITHM = AES_256
ENCRYPTION BY CERTIFICATE EmployeeCert;
GO
Step 4 – Open the symmetric key using which the data will be encrypted.
OPEN SYMMETRIC KEY EmployeeRegdNumber
DECRYPTION BY CERTIFICATE EmployeeCert;
Step 5 – Encrypt the value in column RegdNumber using the symmetric key EmployeeRegdNumberKey_1.
UPDATE Employee
SET EncryptedRegdNumber = EncryptByKey(Key_GUID(‘EmployeeRegdNumber’)
,RegdNumber , 1, HashBytes(‘SHA1′, CONVERT( varbinary , EmpID)));
GO
Now query the original RegdNumber, encrypted RegdNumber and decrypted RegdNumber. If encryption is working fine the the original RegdNumber column value should match with DecryptedRegdNumbers
SELECT RegdNumber ,EncryptedRegdNumber AS ‘Encrypted RegdNumber’,
CONVERT(nvarchar, DecryptByKey(EncryptedRegdnumbet, 1 ,HashBytes(‘SHA1′, CONVERT(varbinary, empid)))) AS ‘Decrypted RegdNumber’
FROM Employee;
GO
The above script you can use in your local environment to check how you can encrypt column data.
Hope the above demo will help to understand how to encrypt column data.
If you liked this post, do like on Facebook at https://www.facebook.com/s4sql
SQL Server – Index Selection Fundamental
Index is one of most widely discussed topic among developers and DBAs. During query optimization Index is one of the important aspects. I came across multiple instances where Developer/ DBA creates Index but they are not sure whether that index is used by the optimizer or not. Also I received emails from developer asking How to design an Index? So Here I’ll try to explain the fundamental of Index Selection.
While designing an Index the basic idea is that the optimizer should use the index so that it will fetch the data using an optimal path with fewer disks I/O operation and less system resource usage.
Below are the important parameters for Optimizer to consider before it selects an Index to use
- Predicate from the WHERE clause
- Join Condition
- Other limiting option used in a query (Ex: Group by, Distinct etc.)
Using Index, optimizer can perform either Seek Operation or Scan Operation.
Seek Operation is used when fetching a single value or a range of value and in Scan operation it traverse through the Index pages either forward or backward.
All predicate used in where clause are not always used for Index selection. Predicate which can be converted into an Index operation are often called SARGgable or Search-ARGument-able and predicate that do not match the selected index are called non-sargable predicates which are used as filter predicate. SQL SERVER usually evaluated non-sargable predicates with in the seek/scan operator in the query tree.
Once you create an Index always checks with your query to determine whether it improves your query performance. If it is not improving your query performance then remove that index.
To understand SARGgable and Non SARGgable predicate, let me demonstrate it using a simple demo.
For this Demo I’m using AdventureWorks database which can be downloadable from here.
Step 1: create table Product_1 using Product table.
SELECT * INTO Product_1 FROM [Production].[Product]
Step 2: Create an UNIQUE Clustered Index on Product_1 for Column ProductID
CREATE UNIQUE CLUSTERED INDEX CIX_Product ON Product_1 (ProductID)
Step 3: Run the below query and check its actual execution plan. You can enable actual execution plan either by shortcut Ctrl+M or from the Query menu in SSMS.
SELECT * FROM Product_1 WHERE ProductID <100 and Makeflag=1
We can see below Optimizer uses the Index Seek operation to fetch the data. If you mouse over on Clustered Index Seek operator you will get the operator details. Here we can see that ProductID is used for Seek predicate as we have Index on that and column MakeFlag is used for filter predicate.
Actual Execution Plan:
If you liked this post, do like on Facebook at https://www.facebook.com/s4sql
SQL Server – New Analytical functions in SQL Server 2012 – Lead and Lag
LAG:
Using this function we can fetch the nth row before (Lag) the Current row without using self-join.
It is useful when we need to compare the current row values with values in a previous row.
Syntax:
LAG (scalar_expression [,offset] [,default])
OVER ( [ partition_by_clause ] order_by_clause )
EXAMPLE:
Create an Employee table and insert some records to it.
CREATE TABLE Employee (EmpID int, HikeDate date, Salary money)
GO
INSERT INTO employee values (1, '2009-04-01 00:00:00.000',10000),
(1, '2010-04-01 00:00:00.000',17000),
(1, '2011-04-01 00:00:00.000',23000)
GO
SELECT EmpID, HikeDate, Salary
FROM Employee
ORDER BY HikeDate
Output:
Now my requirement is to show what the previous Salary before current hike was?
Solution Using Lag function:
LEAD:
Using this function we can fetch the nth row after (LEAD) the current row without using self-join.
It is useful when we need to compare the current row values with values in a following row.
Syntax:
LEAD (scalar_expression [,offset] [,default])
OVER ( [ partition_by_clause ] order_by_clause )
Example:
My requirement is to show what my following year salary is?
Solution using Lead function:
Note: Default offset value is 1. We can specify offset value by a column or a sub query, or other expression that evaluates to a positive integer. Negative value or analytical function can’t be used as offset value.
If you liked this post, do like on Facebook at https://www.facebook.com/s4sql
SQL Server – Happy New Year 2013 and Happy First Anniversary to My Blog
Wish you a very Happy New Year 2013. Today is a special day for me to celebrate the New Year because today my blog post turned 1 year old. I can’t believe that I have been blogging for a year now. This blog was just created with an intension to share what I learned during my daily work. I’m happy that I initiated to contribute to the SQL Server community.
Thank you to Pinal Dave (B|T) and Vinod Kumar (B|T) for their guidance and encouragement. Thank you to my Wife for her support. Last but not least I thank you to all my readers.
I posted 24 blog posts in 2012 that is one post in every 2 weeks. Many have their New Year resolutions and I’m not the exception in this case. One of my New Year resolutions is to write at least one blog post per week.
SQL Server – Order of Column and Impact of equality operator on Index operation
Does order of Column in where clause matters?
Before you answer to this question I recommend all to read the blog post SQLAuthority.
Whenever I have any doubt, I always play around with it to clarify my doubt. While doing experiment with index operation, I learnt something and here I’m demonstrating what I learnt?
Let’s do some experiment,
In below experiment I’m using the AdevntureWorks database, which can be downloadable from Codeplex.
Demo – 1 (Order doesn’t matter)
Create Customer1 table from Sales.Customer table.
SELECT * INTO Customer1 FROM Sales.Customer
Now Create an Index on Column CustomerID,TerritoryID
CREATE INDEX IXNC_Customer1_1 ON Customer1 (CustomerID,TerritoryID)
Run below two queries, the only difference in the queries are the order of column in where clause.
–Query – 1
SELECT CustomerID, TerritoryID
FROM Customer1
WHERE CustomerID =1 AND TerritoryID =1
–Query – 2
SELECT CustomerID, TerritoryID
FROM Customer1
WHERE TerritoryID =1 and CustomerID = 1
Compare Execution plan for both the queries:
We can see in the above Demo (Fig-1) both queries uses Index Seek and query cost is same.
Now compare index seek operation for both of the queries, Seek predicates prefix CustomerID as the first column in both the cases and both CustomerID and TerritoryID used by SQL server as seek predicates.
Before proceed further, from Demo- 1 we can conclude that the order doesn’t matter in where clause.
Demo – 2 (It matters if other than first column of the index used in where clause)
Now let’s see what will happen if we use only one column in where clause?
Run below two queries. Here I’ve used CustomerID in first query and TerritoryID in second query respectively in where clause.
–Query – 3
SELECT CustomerID, TerritoryID
FROM Customer1
WHERE CustomerID = 1
–Query – 4
SELECT CustomerID, TerritoryID
FROM Customer1
WHERE TerritoryID = 1
In Demo – 2, We can see that when we use Column CustomerID in Where clause it uses Seek Predicate for CustomerID Column but in Query – 4, there is no seek predicate because it is using Index scan.
In Fig – 4, The actual query plan shows the actual number of row are more in Query – 4, so we may assume that in second query, the number of resultant rows are more so it uses Index Scan and It doesn’t matter whether we use first column or second column of the index in where clause. It does matter if we use only one column of the index.
To prove that, now let’s see if we have only one row for TerritoryID is equal to 1, which operation will be used by SQL Server.
Below query will delete all records from Customer1 table except one row where TerritoryID = 1.
DELETE FROM Customer1 WHERE CustomerID >1 AND TerritoryID = 1
Confirm it by running below query that we have only one record in Customer1 table where TerritoryID = 1
SELECT * FROM Customer1 WHERE TerritoryID = 1
Now let’s run the same query that is Query – 4
–Query – 5
SELECT CustomerID, TerritoryID
FROM Customer1
WHERE TerritoryID = 1
In above example we can see that even though there is only one row for TerritoryID = 1, It is not using TerritoryID column as seek predicate.
It is using Index scan because TerritoryID is not the first column of the index.
Demo – 3: The second column will participate in seek predicate Only If the first column of the index compares with equality operator irrespective of the order in where clause.
Note: As I’ve deleted some rows from the table. I’m dropping the table and recreating the Table and Index.
DROP TABLE Customer1
SELECT * INTO Customer1 FROM Sales.Customer
CREATE INDEX IXNC_Customer1_1 ON Customer1 (CustomerID,TerritoryID)
GO
Let’s run below two queries, where in first query, CustomerID is compared with equality operator and second query, CustomerID is compared with non-equality operator.
–Query – 6
SELECT CustomerID, TerritoryID
FROM Customer1
WHERE TerritoryID >1 AND CustomerID = 1
–Query – 7
SELECT CustomerID, TerritoryID
FROM Customer1
WHERE TerritoryID > 1 AND CustomerID > 1
In the above example, we can see that even though both the queries uses Index seek operation but in Query – 5, Both CustomerID and TerritoryID columns are participated in Seek Predicate and in Query – 6 only CustomerID is participated in Seek predicate and TerritoryID is used as filter predicate.
Conclusion:
- Order of column in where clause doesn’t matter.
- It matters if we use Columns except first column of the Index.
- If first column of the index compared with equality operator then only second column will participate in Seek predicate.
If you liked this post, do like on Facebook at https://www.facebook.com/s4sql.
SQL SERVER – Trace Flags
To know about a topic we usually ask questions. Here I used the same method to explain about trace flag of SQL Server.
Below are some questions related to trace flags.
What is the use of Trace Flag?
Trace Flags are used to enable or disable certain functionality of SQL server temporarily. It is used to diagnose performance issues or to debug stored procedures.
Where will I get all the trace flags available for SQL Server?
For SQL Server 2012 : – http://msdn.microsoft.com/en-us/library/ms188396.aspx
For SQL Server 2008 R2 : – http://msdn.microsoft.com/en-us/library/ms188396(v=sql.105).aspx
For SQL Server 2008 : – http://msdn.microsoft.com/en-us/library/ms188396(v=sql.100).aspx
For SQL Server 2005 : – http://msdn.microsoft.com/en-us/library/ms188396(v=sql.90).aspx
What are the types of Trace Flags?
SQL Server provides below two types of trace flag.
Session Level: Active and visible only to a specific connection where it is set. A session level trace flag never effect any other session.
Global Level: Active and visible at the server level (for all connections). Some trace flag can only be set at global level. If a global trace flag enable at session level there will be no effect of the flag on the server.
Note: Some of the trace flags can be enabled either at session level or at global level.
How to enable or disable a Trace Flag?
Using method DBCC TRACEON, we can enable a trace flag.
Ex: DBCC TRACEON (1224)
To set a trace flag at Global level use argument -1.
Ex: DBCC TRACEON (1224, -1)
Global level trace flag can also be set using –T start-up option. Using –T start up option we can’t enable any session level trace flag.
How to check a trace flag is enabled or disabled?
Using method DBCC TRACESTATUS, we can check the status of a trace flag.
Ex: DBCC TRACESTATUS (-1) will display list of trace flags enabled at global level
DBCC TRACESTATUS () will display list of trace flags enabled for current session
If you liked this post, do like on Facebook at https://www.facebook.com/s4sql.
SQL Server – Customize Database object Script
SSMS provides functionality to generate script for database objects. I’ve seen developer use this functionality to generate script for various objects like Tables, Procedures, Triggers and other database objects but after generating the script they manually modify the generated scripts. When I asked one of the developers why you need to modify the generated scripts? He said the generated script is not meeting his requirement.
Below are some of the requirements –
- Script should not start with USE <database name>.
- It should check for existence of the object prior to the create statement.
- It should include collation
- If the object is Table or view it should generate scripts for all the associated index and Triggers.
SSMS allow to customize our script as per our requirement. SSMS has options to customize it.
Follow below steps to customize your scripts.
Step 1 – Go to Tools – Option. It will open the Option Page
Step 2 – Click on SQL Server Object Explorer and select Scripting
Step 3 – On left side panel you can find various customization option grouped into below three categories.
- General scripting options
- Object scripting options
- Table and view options
You can set these options as per your requirement.
Example: If you don’t want Use <Database name> statement at the beginning of your script then Set Script USE <database> to false.
If you liked this post, do like on Facebook at https://www.facebook.com/s4sql
SQL Server– Unique Constraint VS Unique Index
Both Unique constraint and Unique Index enforces uniqueness of the column. When we define an Unique constraint, SQL server creates an Unique non clustered index on the column where Unique constraint is defined.
Even though both of them can be used for same purpose but consider below points before you decide which one to use.
1 - From SQL server 2008 onwards we have Filtered index , Include columns which can’t be define on index which is created while defining Unique constraint but we can define it while defining Unique Index. We can have better control over the Unique index.
2 - Derived from point 1, sometime we need our column should be unique but it should allow multiple NULL values in that case by implementing Unique filtered Index we can achieve it. Also we can avoid Key lookup by adding included columns in an Unique Index but not in Unique constraint.
3 - Unique constraint always creates non clustered index. Unique Index can be defined either Clustered or non Clustered.
4 - You can’t drop only the index associated with the Unique constraint.
Leave your comment, If you have any points which we should consider before we decide which one to use between Unique Constraint and Unique Index.
If you liked this post, do like on Facebook at https://www.facebook.com/s4sql
SQL Server– Template Explorer
Yesterday when I was about to leave office , I got a call on my mobile from one of my friend. He wanted to know how to write a recursive query?
I said use Template Explorer, Where you can get examples for recursive query but He was unaware of this feature of SQL server. So I thought to write on this.
SQL server provides number of templates which can be used to Create, Alter and Drop different database objects like Table, Index, Triggers, View and many more. With the help of Template Explorer you can do DDL operation on object as well as you can learn How to write recursive query too.
Follow below steps to open the recursive query example:
Step 1- Open SQL Server Management Studio
Step 2 – Use shortcut key Ctrl + Alt + T or You can find this option under View-> Template Explorer
Step 3: Expand Recursive Queries section
We can create our own Custom Template. Refer How to create custom template?. When first time the template explorer is opened, a copy of the templates are placed in the user’s folder in C:\Users, under <username>\AppData\Roaming\Microsoft\SQL Server Management Studio\11.0\Templates. So that all users can have their own copy of templates.
If you liked this post, do like on Facebook at https://www.facebook.com/s4sql
SQL Server – Join Algorithm
Today we’ll find how SQL Server works when we do join operation. SQL Server optimizer chooses one of the below physical operator to perform join operation.
- Hash Match
- Merge Join
- Nested Loop Join
As far as performance is concern let me tell you, we can’t say which one is suitable for all kind of scenarios. Each operator has its own advantages and disadvantages. Most of the time query optimizer picks the correct physical join operator to perform join operation. Here we’ll find out how these operators works and in which scenario which operator is suitable.
Two processes together makes hash join i.e. Build and Probe. Build process create the build input. Optimizer chooses the smaller input to create the build input. During build process , it creates a build hash table by computing a hash value for each row from the build input. Then from the probe input it creates the hash value for applicable rows using same hash function and looks in build hash table for match. In case of multiple join on the same join column, these operation are grouped into a hash team. There are 3 types of hash join i.e. in-memory hash join, grace hash join, and recursive hash join. To know more on Hash Join refer MSDN.
| Apart from join operation, this operator is used for other operations which includes intersection, union, difference, grouping , distinct. Hash join algorithm is suitable for large unsorted or non indexed input. |
Lets try the below example:
Create two tables, Table1 and Table1. Insert some records to both the tables.
Create Table Table1(Col1 int identity, Col2 varchar(20))
Go
Create Table Table2(Col1 int identity, Col2 varchar(20))
Go
Declare @i int =0
While(@i<10000)
Begin
Insert Table1 values( 'A'+cast(@i as varchar))
if(@i%2=0)
Insert Table2 values( 'A'+cast(@i as varchar))
Set @i+=1
End
Select T1.Col2
From Table1 T1
Inner Join Table2 T2 ON T1.Col1 = T2.Col1
Select T1.Col2
From Table1 T1
Inner Join Table2 T2 ON T1.Col1 = T2.Col1 Where T1.Col1 = 1
Most of the time optimizer chooses Merge Join operator during Join operation when both the inputs are large and sorted on the join column. If Optimizer uses Merge Join, it scans an index if exist on the join column else it applies a sort operator before Merge Join Operator. During Merge Join operation it reads row from each input and compare them. To Know more on Merge Join refer MSDN .
| Merge Join is possible when the inputs are sorted and not small . The Merge Join operator performs the inner join, left outer join, left semi join, left anti semi join, right outer join, right semi join, right anti semi join, and union logical operations. |
Lets try the below example:
Case 1: Now we know that Merge Join required both inputs should be sorted. Here we’ll apply Merge Join Hint to join Table1 and Table2 to see how optimizer applies sort operator if inputs are not sorted.
Select T1.Col2
From Table1 T1
Inner Merge Join Table2 T2 ON T1.Col1 = T2.Col1
In case you are using Inner Join Operation with out any join hint and if you see your plan is look like above then in that case it is advisable to create index on join columns which will enhance the performance of join operation.
Case 2: Lets create an Unique Clustered Index on join columns and then will try to join the tables.
Create Unique Clustered Index IX_Table1_Col1 ON Table1(Col1)
Go
Create Unique Clustered Index IX_Table2_Col1 ON Table2(Col1)
Go
Select T1.Col2
From Table1 T1
Inner Join Table2 T2 ON T1.Col1 = T2.Col1
Even though we are not using any hint in the above query, Optimizer will use Merge Join operator to perform join operation because both inputs are already sorted when we defined index on the join columns. See below the Actual Execution Plan for the above select query.
This Join is most suitable when outer input is small and inner input is large and the join column is indexed. In this kind of join operation it process each row from outer input and loop through all rows of inner input to search for matching row based on join column. To know more refer MSDN.
| Nested loops joins perform a search on the inner table for each row of the outer table, typically using an index. The Nested Loops operator performs the inner join, left outer join, left semi join, and left anti semi join logical operations |
Lets try the below example:
Here we reduce the input rows by applying where clause.
Select T1.Col2
From Table1 T1
Inner Join Table2 T2 ON T1.Col1 = T2.Col1 AND T1.Col1 between 1 AND 36
For the above query we can’t say which operator is efficient. Now lets reduce the number of input rows and will see how it perform.
Select T1.Col2
From Table1 T1
Inner Join Table2 T2 ON T1.Col1 = T2.Col1 AND T1.Col1 between 1 AND 15
Select T1.Col2
From Table1 T1
Inner Merge Join Table2 T2 ON T1.Col1 = T2.Col1 AND T1.Col1 between 1 AND 15
Below is the Actual Execution Plan for the above query.
In the above screenshot we can see when the number of input is small Nested Loop Join perform better.
All of the above join operators perform well based on the criteria. We can’t say which one is better.
Can you answer to the below question?
To demonstrate merge join Why I created Unique Clustered Index? What will happen If I create Clustered Index without Unique key word?
Reference : MSDN
If you liked this post, do like on Facebook at https://www.facebook.com/s4sql
-
Archives
- March 2013 (1)
- January 2013 (3)
- December 2012 (2)
- November 2012 (1)
- August 2012 (4)
- July 2012 (2)
- June 2012 (2)
- May 2012 (1)
- April 2012 (5)
- March 2012 (1)
- February 2012 (1)
- January 2012 (5)
-
Categories
-
RSS
Entries RSS
Comments RSS

