Transparent Data Encryption (TDE) protects SQL Server data by encrypting the physical files of a database. TDE encrypts the files at rest (the .mdf, .ldf and .bak files) rather than the data in memory or in transit. The database is encrypted with a Database Encryption Key (DEK), which in turn is protected by a certificate and a master key stored in the master database. For the data to be decrypted on the fly for a connecting user, the instance-level Service Master Key must be available, which keeps the security intact. Because the Service Master Key is scoped to the instance, the physical data files cannot simply be copied to another instance and decrypted; that is how the data stays protected.
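The setup described above can be sketched in T-SQL as follows. This is a minimal sketch: the database name, certificate name and password are placeholders, not details from the article.

```sql
-- Run in master: create the master key and a certificate to protect the DEK
USE master;
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'StrongP@ssw0rd!';
CREATE CERTIFICATE TDECert WITH SUBJECT = 'TDE certificate';
GO
-- In the user database: create the DEK and turn encryption on
USE MyDatabase;
CREATE DATABASE ENCRYPTION KEY
    WITH ALGORITHM = AES_256
    ENCRYPTION BY SERVER CERTIFICATE TDECert;
GO
ALTER DATABASE MyDatabase SET ENCRYPTION ON;
```

Back up the certificate and its private key immediately; without them the database cannot be restored on another instance.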
FileTable is a new feature in SQL Server 2012 that lets you store files and documents in special tables called FileTables, yet access them from Windows applications as if they were stored in the file system, without changing your client applications. A FileTable also handles directories: it stores the entire hierarchy under the directory configured as the root directory for that FileTable.
A FileTable represents a hierarchy of directories and files. It stores data related to all the nodes in that hierarchy, for both directories and the files they contain. This hierarchy starts from a root directory that you specify when you create the FileTable. When the database is configured for non-transactional access, the file and directory data stored in a FileTable is exposed through a Windows share for Windows API-based applications. To a Windows application, this looks like a normal share with its files and directories.
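A minimal sketch of creating a FileTable, assuming a hypothetical database named DocumentDB with FILESTREAM already enabled at the instance level:

```sql
-- Enable non-transactional access and name the database-level directory
ALTER DATABASE DocumentDB
    SET FILESTREAM (NON_TRANSACTED_ACCESS = FULL, DIRECTORY_NAME = N'DocumentDB');
GO
USE DocumentDB;
GO
-- A FileTable has a fixed, system-defined schema, so no column list is given
CREATE TABLE Documents AS FILETABLE
    WITH (FILETABLE_DIRECTORY = 'Documents');
```

After this, the share path `\\<server>\<instance-share>\DocumentDB\Documents` behaves like an ordinary folder.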
This article explains how to parse XML data in T-SQL using the nodes() method. In this tutorial we validate input XML data against its corresponding schema; upon successful validation, the data is extracted from it using the nodes() method. nodes() can be used in T-SQL to extract relational information from an XML data type.
At times, we have to validate XML input passed from different sources. In SQL Server, we can use an XML Schema Collection to define a schema and then use it to validate the corresponding XML inputs we receive. XML Schema declaration and consumption is a vast topic, with a lot of available tags and, at the same time, a lot of restrictions and limitations. In future articles I will cover the key concepts and tips to remember when handling XML in SQL Server.
In this tutorial, we are going to create a simple Student XML schema with specific declarations for the student's basic data, enrollment information, semester details and previous semesters' results. Our schema will comprise some basic validations such as data type checks and required elements.
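The flow can be sketched like this. The schema here is a heavily simplified, hypothetical stand-in for the Student schema the tutorial builds; the element names are placeholders.

```sql
-- A simplified, hypothetical Student schema collection
CREATE XML SCHEMA COLLECTION StudentSchema AS
N'<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:element name="Student">
      <xs:complexType>
        <xs:sequence>
          <xs:element name="Name" type="xs:string"/>
          <xs:element name="Semester" type="xs:int"/>
        </xs:sequence>
      </xs:complexType>
    </xs:element>
  </xs:schema>';
GO
-- Typing the variable with the collection validates the input on assignment;
-- invalid XML raises an error here
DECLARE @x XML(StudentSchema) =
    N'<Student><Name>Ann</Name><Semester>3</Semester></Student>';

-- nodes() shreds the XML into rows; value() pulls out relational columns
SELECT s.n.value('(Name)[1]',     'VARCHAR(50)') AS Name,
       s.n.value('(Semester)[1]', 'INT')         AS Semester
FROM @x.nodes('/Student') AS s(n);
```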
There are two types of DML triggers in SQL Server: AFTER (also written FOR) triggers and INSTEAD OF triggers. Triggers are extremely useful for logging database operations (INSERT/UPDATE/DELETE) into audit tables. Beyond auditing, triggers are helpful for applying contextual data from business operations, for example to maintain aggregations or statistics.
We must plan DML triggers around different tables carefully, because they can recurse indefinitely. For example, an INSTEAD OF trigger may cause an AFTER trigger to fire, and that AFTER trigger may in turn cause the INSTEAD OF trigger to fire again. This leads to recursion, and SQL Server raises an error once the trigger nesting limit (32 levels) is exceeded.
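The auditing use case mentioned above can be sketched with a simple AFTER trigger. The Employees table, audit table and column names are hypothetical:

```sql
-- Hypothetical audit table
CREATE TABLE EmployeeAudit
(
    EmployeeId INT,
    Action     VARCHAR(10),
    AuditedAt  DATETIME DEFAULT GETDATE()
);
GO
-- AFTER trigger: logs inserts and deletes via the inserted/deleted pseudo-tables
CREATE TRIGGER trg_Employees_Audit
ON Employees
AFTER INSERT, DELETE
AS
BEGIN
    INSERT INTO EmployeeAudit (EmployeeId, Action)
    SELECT EmployeeId, 'INSERT' FROM inserted
    UNION ALL
    SELECT EmployeeId, 'DELETE' FROM deleted;
END;
```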
To rename both the logical and physical files of a database, first set the database to single-user mode. Then change the logical file names, followed by the database name. The tricky part is renaming the physical files: that requires shell commands, so SQL Server must first be configured to run them using sp_configure. Afterwards, detach the database, run the RENAME command to change the physical file names, and re-attach the database to the engine under the new names. Finally, bring the renamed database back to multi-user mode.
There can be permission issues when renaming the physical files due to ownership conflicts. Check out the code in this tutorial, along with the code comments that address those permission issues.
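The steps above can be sketched end to end as follows. Database names, logical file names and the `C:\Data` path are placeholders; your logical names and file locations will differ, and enabling xp_cmdshell should be reverted afterwards.

```sql
-- 1. Single-user mode, then rename logical files and the database
ALTER DATABASE OldDb SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
ALTER DATABASE OldDb MODIFY FILE (NAME = OldDb_Data, NEWNAME = NewDb_Data);
ALTER DATABASE OldDb MODIFY FILE (NAME = OldDb_Log,  NEWNAME = NewDb_Log);
ALTER DATABASE OldDb MODIFY NAME = NewDb;
GO
-- 2. Allow shell commands, detach, rename the physical files, re-attach
EXEC sp_configure 'show advanced options', 1; RECONFIGURE;
EXEC sp_configure 'xp_cmdshell', 1;          RECONFIGURE;
EXEC sp_detach_db 'NewDb';
EXEC xp_cmdshell 'REN "C:\Data\OldDb.mdf" "NewDb.mdf"';
EXEC xp_cmdshell 'REN "C:\Data\OldDb_log.ldf" "NewDb_log.ldf"';
CREATE DATABASE NewDb
    ON (FILENAME = 'C:\Data\NewDb.mdf'),
       (FILENAME = 'C:\Data\NewDb_log.ldf')
    FOR ATTACH;
-- 3. Back to normal operation
ALTER DATABASE NewDb SET MULTI_USER;
```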
At times we might need to check all active connections made by processes to the SQL Server databases, or all active transactions in the engine. For these tasks there are two system-provided views: use 'sys.sysprocesses' to get processes and 'sys.dm_tran_active_transactions' to get transactions.
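Both views can be queried directly; a minimal sketch:

```sql
-- All active connections/processes (one row per session/SPID)
SELECT spid, loginame, dbid, status, cmd
FROM sys.sysprocesses;

-- All active transactions in the engine
SELECT transaction_id, name, transaction_begin_time
FROM sys.dm_tran_active_transactions;
```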
Sometimes we get this error when running ALTER statements on an existing database. It means the database is in use by many other user connections, and the ALTER statement needs an exclusive lock on the database in order to complete.
To fix this error, take the database into SINGLE USER mode, run the ALTER on the database, and finally bring the database back to MULTI USER mode.
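A minimal sketch of that sequence; the database name and the particular ALTER in the middle are placeholders:

```sql
-- Kick out other connections (rolling back their open transactions)
ALTER DATABASE MyDatabase SET SINGLE_USER WITH ROLLBACK IMMEDIATE;

-- Example ALTER that needs exclusive access
ALTER DATABASE MyDatabase SET READ_COMMITTED_SNAPSHOT ON;

-- Reopen the database to all users
ALTER DATABASE MyDatabase SET MULTI_USER;
```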
Sometimes we need a copy of a database running on a different server for testing purposes. In that scenario, creating a backup and restoring it on the other server helps. In this short tip, I show how to back up an existing database to a location in '.BAK' format and then restore that BAK file under another database name. The backup is done with 'BACKUP DATABASE' and the restore with 'RESTORE DATABASE'.
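A minimal sketch of the pair of operations; the database names, logical file names and paths are placeholders (check the actual logical names with `RESTORE FILELISTONLY`):

```sql
-- Back up the source database to a .BAK file
BACKUP DATABASE SourceDb
    TO DISK = 'C:\Backups\SourceDb.bak';
GO
-- Restore the same .BAK under a new database name, relocating the files
RESTORE DATABASE TestCopyDb
    FROM DISK = 'C:\Backups\SourceDb.bak'
    WITH MOVE 'SourceDb'     TO 'C:\Data\TestCopyDb.mdf',
         MOVE 'SourceDb_log' TO 'C:\Data\TestCopyDb_log.ldf';
```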
At times we need to search a string in T-SQL with the LIKE operator for a special character (%, _, [, ]). These characters have special meaning in pattern matching, so we must escape them to make them part of the search pattern. We can either use the ESCAPE keyword followed by an escape character of our choice, or wrap the special character in square brackets ([%], [_]) directly.
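Both techniques in a short sketch; the Products table and its columns are hypothetical:

```sql
-- Match values containing a literal '%' using the ESCAPE keyword
SELECT * FROM Products WHERE Discount LIKE '%!%%' ESCAPE '!';

-- The same match using square brackets, no ESCAPE needed
SELECT * FROM Products WHERE Discount LIKE '%[%]%';

-- Match a literal underscore
SELECT * FROM Products WHERE Code LIKE '%[_]%';
```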
In some cases we tend to write 'SELECT * …' in our queries. Besides the code maintenance issues with 'SELECT * …' (for example, if a new column is added to the schema, application code written against the query can break), the most important issue is performance. In this tutorial I show a sample that demonstrates how performance suffers with a SELECT * query compared to an explicit column list.
One could argue that SELECT * can still perform well if a covering index includes all the table's columns, making a seek possible; but in realistic scenarios we do not put every column into an index. More typically, we index the most frequently used predicate. So the right combination of a predicate and an explicit column list in the SELECT gives you good performance.
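A sketch of the contrast, using a hypothetical Orders table. Whether the optimizer actually chooses a seek depends on statistics and row counts; this only illustrates the principle.

```sql
-- Index on the predicate column, covering only the columns we actually select
CREATE NONCLUSTERED INDEX IX_Orders_CustomerId
    ON Orders (CustomerId)
    INCLUDE (OrderDate, Total);
GO
-- Likely an index seek: every needed column is in the index
SELECT OrderDate, Total FROM Orders WHERE CustomerId = 42;

-- Likely key lookups or a scan: SELECT * needs columns outside the index
SELECT * FROM Orders WHERE CustomerId = 42;
```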
Today there was an interesting conversation between me and a young SQL developer on how to get a custom order of a result set in T-SQL. We talked through different combinations, from table variables and temp tables to sub-queries. I could not agree with most of the preferred answers and wanted to solve the problem with a CTE, at least to maintain code readability and avoid the performance implications of those other approaches.
For the sake of simplicity, I have taken a sample Employees table with locations and salaries. I need to order my result set so that all employees from the Los Angeles area come first, in descending order of salary, and the remaining rows follow that group, also in descending order of salary.
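One CTE-based sketch of that ordering, assuming hypothetical column names Name, Location and Salary:

```sql
WITH Ranked AS
(
    SELECT Name, Location, Salary,
           -- 0 for Los Angeles rows, 1 for everything else
           CASE WHEN Location = 'Los Angeles' THEN 0 ELSE 1 END AS LocGroup
    FROM Employees
)
SELECT Name, Location, Salary
FROM Ranked
ORDER BY LocGroup, Salary DESC;
```

The CTE keeps the grouping logic in one named place; the same CASE expression could also be written directly in the ORDER BY.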
There are four major ranking functions in SQL Server: ROW_NUMBER(), RANK(), DENSE_RANK() and NTILE().
They are all different; in my own terms, they can be used to assign row versions from a programmatic point of view. That is, when we want precise control over every row of a group or partition, these functions give us that flexibility.
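All four functions side by side in one sketch, assuming a hypothetical Employees table:

```sql
SELECT Name, Department, Salary,
       ROW_NUMBER() OVER (PARTITION BY Department ORDER BY Salary DESC) AS RowNum,
       RANK()       OVER (PARTITION BY Department ORDER BY Salary DESC) AS Rnk,
       DENSE_RANK() OVER (PARTITION BY Department ORDER BY Salary DESC) AS DenseRnk,
       NTILE(2)     OVER (PARTITION BY Department ORDER BY Salary DESC) AS Half
FROM Employees;
```

With ties in Salary, RowNum keeps counting, Rnk leaves gaps, DenseRnk does not, and Half splits each department into two buckets.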
The ISNULL function replaces NULL with the specified value, while the COALESCE function returns the first non-NULL value among its arguments. ISNULL uses the data type of its first parameter; COALESCE uses the argument data type with the highest precedence. Both ISNULL and COALESCE can throw an invalid-cast exception when a value cannot be converted because of that precedence.
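A short sketch of the behavioral difference, including the type consequence of ISNULL taking its first parameter's type:

```sql
DECLARE @v VARCHAR(10) = NULL;

SELECT ISNULL(@v, 'default');          -- replacement value: 'default'
SELECT COALESCE(@v, NULL, 'default');  -- first non-NULL argument: 'default'

-- Type behavior: ISNULL keeps the first parameter's type (VARCHAR(10)),
-- so a longer replacement is silently truncated to 10 characters;
-- COALESCE uses data type precedence and returns the full string.
SELECT ISNULL(@v, 'a much longer string');    -- 'a much lon'
SELECT COALESCE(@v, 'a much longer string');  -- 'a much longer string'
```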
In this tutorial, we see how to retrieve records from a table using pagination. In traditional SQL we sometimes send unneeded data to the front-end UI even though it never gets displayed. Most modern UIs and collaborative systems are interested only in qualitative and quantitative contextual data. The idea behind pagination in T-SQL is to send only the data the front-end UI is expecting, rather than a lot of unnecessary rows. This improves both the performance and the scalability of the SQL Server database.
We discuss two ways of getting paginated data. The first uses a traditional CTE with ROW_NUMBER() and an OVER clause containing the required ORDER BY, then a SELECT on top of that CTE to get the results. The second approach, new in SQL Server 2012, uses OFFSET and FETCH NEXT … ROWS ONLY. The second approach is far more readable and easier to understand, but it is not available on legacy SQL Server versions because it was introduced only in 2012.
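Both approaches side by side in a sketch; the Articles table and its columns are placeholders:

```sql
DECLARE @PageNumber INT = 3, @PageSize INT = 20;

-- Approach 1: ROW_NUMBER() in a CTE (works on pre-2012 versions)
WITH Numbered AS
(
    SELECT Name, CreatedOn,
           ROW_NUMBER() OVER (ORDER BY CreatedOn DESC) AS RowNum
    FROM Articles
)
SELECT Name, CreatedOn
FROM Numbered
WHERE RowNum BETWEEN (@PageNumber - 1) * @PageSize + 1
                 AND @PageNumber * @PageSize;

-- Approach 2: OFFSET / FETCH (SQL Server 2012 and later)
SELECT Name, CreatedOn
FROM Articles
ORDER BY CreatedOn DESC
OFFSET (@PageNumber - 1) * @PageSize ROWS
FETCH NEXT @PageSize ROWS ONLY;
```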
This blog post is not planned ;-). I was going through some routine work at this late hour and needed to insert a large amount of data into a test table to test some index behaviors. I started off in SSMS by sending multiple batches. That technique is fine for small data, but for hundreds of thousands of rows it takes a lot of time, especially when you use simple INSERT statements in a batch and execute it multiple times with GO. So I came up with another solution using a recursive CTE, which is considerably faster than the previous one.
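The recursive-CTE idea sketched out; TestTable and its columns are placeholders, and OPTION (MAXRECURSION 0) is needed to lift the default 100-level recursion cap:

```sql
-- Generate 100,000 sequential rows with a recursive CTE
;WITH Numbers AS
(
    SELECT 1 AS n
    UNION ALL
    SELECT n + 1 FROM Numbers WHERE n < 100000
)
INSERT INTO TestTable (Id, Payload)
SELECT n, 'row ' + CAST(n AS VARCHAR(10))
FROM Numbers
OPTION (MAXRECURSION 0);  -- allow recursion beyond the default 100 levels
```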
We have already discussed in a previous tutorial how to delete duplicate records from a table using the IDENTITY column approach (add, delete and drop the identity column). The only problem with that approach is that the underlying schema must be modified and then reverted after the deletion.
In this tutorial, a much cleaner approach that does not touch the schema is demonstrated. We use a Common Table Expression (CTE), group the duplicate rows with OVER (PARTITION BY …), and number each row within its group using ROW_NUMBER(). Then we delete every row in each group except the first, i.e. the one with ROW_NUMBER() = 1. This removes all the duplicates.
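A sketch of the technique; the Customers table and the columns that define a "duplicate" (Name, Email) are placeholders. Deleting from the CTE deletes from the underlying table:

```sql
;WITH Duplicates AS
(
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY Name, Email
                              ORDER BY Name) AS RowNum
    FROM Customers
)
DELETE FROM Duplicates
WHERE RowNum > 1;  -- keep only the first row of each duplicate group
```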
A Common Table Expression (CTE) is a temporary result set scoped to the immediately following SELECT, INSERT, DELETE, UPDATE or CREATE VIEW statement. It differs from a derived table in that its scope persists only for the immediate query and it can be self-referencing.
In this small tutorial we will see how to use a recursive CTE to query hierarchical data. A recursive CTE is a CTE whose query runs repeatedly, feeding each iteration's results into the next.
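The classic employee-manager hierarchy illustrates the pattern; the Employees table and its columns are hypothetical. The anchor member selects the roots, and the recursive member joins back to the CTE to walk down one level at a time:

```sql
;WITH OrgChart AS
(
    -- Anchor: top-level employees (no manager)
    SELECT EmployeeId, Name, ManagerId, 0 AS Level
    FROM Employees
    WHERE ManagerId IS NULL

    UNION ALL

    -- Recursive member: direct reports of the previous level
    SELECT e.EmployeeId, e.Name, e.ManagerId, o.Level + 1
    FROM Employees AS e
    INNER JOIN OrgChart AS o ON e.ManagerId = o.EmployeeId
)
SELECT EmployeeId, Name, Level
FROM OrgChart
ORDER BY Level, Name;
```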
Sometimes we need to reference, in the WHERE clause of an UPDATE statement, a column value from another table. In that case the FROM clause is very helpful: it specifies a table, view or derived table source that provides the criteria for the update operation.
In the code sample below, we use an EmployeeRoles table to update the Employees table, but only for employees who are developers. We add a FROM clause to the update query to bring in the employee roles, check for the developer role in the WHERE clause, and update the Employees table accordingly.
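A sketch of that UPDATE … FROM pattern; the table names, column names and the 10% raise are placeholders:

```sql
UPDATE e
SET e.Salary = e.Salary * 1.10        -- example change for matching rows
FROM Employees AS e
INNER JOIN EmployeeRoles AS r
        ON r.EmployeeId = e.EmployeeId
WHERE r.RoleName = 'Developer';
```

Aliasing the target table (`UPDATE e`) keeps the statement unambiguous when the same table appears in the FROM clause.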
At times we may need the two most recently added employees for each department in an organization, or the latest news items in every category. Requirements like these are quite common, and a CTE (Common Table Expression) combined with ROW_NUMBER() OVER (PARTITION BY …) comes to the rescue.
In the following code snippet, different employees (employee IDs) are added with new roles, and using a CTE we get the two most recently inserted employees for each role in the roles table.
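The shape of that query, assuming a hypothetical EmployeeRoles table with an AssignedOn timestamp:

```sql
;WITH RankedRoles AS
(
    SELECT EmployeeId, RoleName, AssignedOn,
           ROW_NUMBER() OVER (PARTITION BY RoleName
                              ORDER BY AssignedOn DESC) AS RowNum
    FROM EmployeeRoles
)
SELECT EmployeeId, RoleName, AssignedOn
FROM RankedRoles
WHERE RowNum <= 2;  -- the two most recently added employees per role
```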
Imagine you have an Employee table with no identity column, and by mistake an ETL package was executed twice, landing the duplicate information twice. Now it is time to clean up the duplicate data. There are many ways to do this, say with a cursor or with another ETL package, but in this tutorial I show a simple T-SQL script that does the job. The idea is to add an identity column to the existing table, find the MAX identity value within each group of duplicate rows, delete every other row from the table, and finally drop the identity column to bring the table back to its original state.
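The steps above can be sketched as follows; the Employee table and the columns that define a duplicate (Name, Department) are placeholders:

```sql
-- 1. Add a temporary identity column
ALTER TABLE Employee ADD TempId INT IDENTITY(1, 1);
GO
-- 2. Keep only the row with the highest TempId in each duplicate group
DELETE FROM Employee
WHERE TempId NOT IN
(
    SELECT MAX(TempId)
    FROM Employee
    GROUP BY Name, Department   -- the columns that define a duplicate
);
GO
-- 3. Drop the helper column to restore the original schema
ALTER TABLE Employee DROP COLUMN TempId;
```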
Dirty reads and phantom reads are concurrency problems you can run into when transactions are not planned correctly. A dirty read occurs when a SELECT statement retrieves data that is still uncommitted; that data can be changed or lost if the other transaction subsequently commits different values or rolls back.
Phantom reads are reads that return different result sets for the same query within one transaction. That is, another transaction may change the data between the two executions, so the same query, run more than once in the same transaction, yields different results.
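A dirty read can be reproduced with two SSMS sessions; this sketch uses a hypothetical Accounts table, with the two sessions' statements shown together and labeled in comments:

```sql
-- Session 1: update inside an open transaction, then roll back later
BEGIN TRANSACTION;
UPDATE Accounts SET Balance = Balance - 100 WHERE AccountId = 1;
-- ...transaction still open, nothing committed...

-- Session 2: READ UNCOMMITTED sees the in-flight value (a dirty read)
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT Balance FROM Accounts WHERE AccountId = 1;

-- Session 1: roll back, so the value Session 2 just read never existed
ROLLBACK TRANSACTION;
```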