Search Results


  • SQL SERVER – Step by Step Guide to Beginning Data Quality Services in SQL Server 2012 – Introduction to DQS

    - by pinaldave
    Data Quality Services is a very important feature of SQL Server. I have recently started to explore it and I am really learning some good concepts. Here are two very important blog posts which one should go over before continuing with this blog post: Installing Data Quality Services (DQS) on SQL Server 2012 and Connecting Error to Data Quality Services (DQS) on SQL Server 2012. This article is an introduction to Data Quality Services for beginners. We will be using an Excel file as the sample data. In the first article we learned how to install DQS. In this article we will see how to build a Knowledge Base and use it to identify the quality of the data as well as to correct bad-quality data. Here are the two very important steps we will be learning in this tutorial: 1) Building a New Knowledge Base and 2) Creating a New Data Quality Project.

    Let us start building the Knowledge Base. Click on New Knowledge Base. In our project we will be using Excel as the knowledge base source. Here is the Excel file which we will be using. There are two columns: one is Colors and the other is Shade. They are independent columns and not related to each other. The point I am trying to show is that Column A contains unique data while Column B contains duplicate records. Clicking on New Knowledge Base will bring up the following screen. Enter the name of the new knowledge base. Clicking NEXT will bring up the following screen, which allows you to select the Excel file and also lets you select the source columns. I have selected both Colors and Shade as source columns.

    Creating a domain is very important. Here you can create individual domains or a composite domain built from Colors and Shade. As this is the first example, I will create individual domains – for Colors I will create the domain Colors and for Shade I will create the domain Shade. Here is how the screen looks after creating the domains. Clicking NEXT will bring you to the following screen where you can do the data discovery. Clicking on START will start processing the source data. The pre-processed data shows various information related to the source data. In our case it shows that the Colors column has unique data whereas Shade has non-unique data, with only two unique data rows. In the next screen you can add more rows as well as see the frequency of the data, as the unique values are listed. Clicking NEXT will publish the knowledge base which we have just created.

    Now the knowledge base is created. We will take some random data and attempt a DQS implementation over it. I am using another Excel sheet here for simplicity; in reality you can easily use a SQL Server table for the same purpose. Click on New Data Quality Project to start a DQS project. In the next screen it will ask which knowledge base to use. We will be using our Colors knowledge base which we have just created. In the Colors knowledge base we had two columns – 1) Colors and 2) Shade. In our case we will be using both of the mappings here. A user can select one or multiple column mappings over here.

    Now comes the most important phase of the complete project. Click on Start and it will run the cleansing process and show various results. In our case there were two columns to be processed, and it completed the task and reported the necessary information. It showed that in the Colors column it has not corrected any value by itself, but for the Shade column it has a suggestion.
    We can train DQS to correct values, but let us keep that subject for future blog posts. Now click NEXT and keep the domain Colors selected on the left side. It will demonstrate that there are two incorrect values which need to be corrected. This is the place where, once a value is corrected, it will be auto-corrected in the future. I manually corrected the values here and clicked on the Approve radio buttons. As soon as I click on the Approve buttons the rows disappear from this tab and move to the Corrected tab. If I had clicked Reject, it would have moved the rows to the Invalid tab instead. In this screen you can see how the two corrected rows are shown. You can click on the Correct tab and see the previously validated 6 rows which passed the DQS process.

    Now let us click on the Shade domain on the left side of the screen. This domain shows very interesting details, as the DQS system guessed the correct answer as Dark with a confidence level of 77%. That is quite a high confidence level, and manual observation also confirms that Dark is the correct answer. I clicked on Approve and the row moved to the Corrected tab. On the next screen DQS shows the summary of all the activities. It also demonstrates how the correction of the quality of the data was performed. The user can export their data to a SQL Server table, CSV file or Excel. The user also has an option to export either the data with all the associated cleansing info or the data only. I will select Data only for demonstration purposes. Clicking Export will generate the file. Let us open the generated file. It looks as follows, and it looks pretty complete and corrected.

    Well, we have successfully completed the DQS process. The process is indeed very easy. I suggest you try this out yourself and you will find it very easy to learn. In the future we will go over advanced concepts. Are you using this feature on your production server? If yes, would you please leave a comment with your environment and business need? It will be indeed interesting to see where it is implemented. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Business Intelligence, Data Warehousing, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology Tagged: Data Quality Services, DQS
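
    For readers who would rather start from a SQL Server table than an Excel sheet, here is a minimal, hedged T-SQL equivalent of the sample source described above (the table name and values are hypothetical, chosen only to mirror the Colors/Shade example):

        -- Hypothetical sample source for a DQS cleansing project:
        -- Colors holds unique values, Shade holds duplicates plus one
        -- misspelled value ('Drak') for DQS to flag with a suggestion.
        CREATE TABLE dbo.ColorSource
        (
            Colors VARCHAR(50) NOT NULL,
            Shade  VARCHAR(50) NOT NULL
        );

        INSERT INTO dbo.ColorSource (Colors, Shade)
        VALUES ('Red',   'Dark'),
               ('Blue',  'Light'),
               ('Green', 'Dark'),
               ('Black', 'Drak');  -- intentionally misspelled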

  • SQL SERVER – Guest Post – Architecting Data Warehouse – Niraj Bhatt

    - by pinaldave
    Niraj Bhatt works as an Enterprise Architect for a Fortune 500 company and has an innate passion for building and studying software systems. He is a top rated speaker at various technical forums including Tech·Ed, MCT Summit, Developer Summit, and Virtual Tech Days, among others. Having run a successful startup for four years, Niraj enjoys working on IT innovations that can impact an enterprise's bottom line, streamlining IT budgets through IT consolidation, architecture and integration of systems, performance tuning, and reviews of enterprise applications. He has received the Microsoft MVP award for ASP.NET, Connected Systems and, most recently, Windows Azure. When he is away from his laptop, you will find him taking deep dives into automobiles, pottery, rafting, photography, cooking and financial statements, though not necessarily in that order. He is also a manager/speaker at BDOTNET, Asia's largest .NET user group. Here is the guest post by Niraj Bhatt.

    As the data in your applications grows, it's the database that usually becomes a bottleneck. It's hard to scale a relational DB, and the preferred approach for large scale applications is to create separate databases for writes and reads. These databases are referred to as the transactional database and the reporting database. Though there are tools and techniques which can allow you to create a snapshot of your transactional database for reporting purposes, sometimes they don't quite fit the reporting requirements of an enterprise. These requirements typically are data analytics, an effective schema (for an information worker to self-service herself), historical data, better performance (flat data, no joins), etc. This is where the need for a data warehouse or an OLAP system arises. A key point to remember is that a data warehouse is mostly a relational database. It's built on top of the same concepts: tables, rows, columns, primary keys, foreign keys, etc.

    Before we talk about how data warehouses are typically structured, let's understand the key components that create a data flow between OLTP systems and OLAP systems. There are 3 major areas to it: a) The OLTP system should be capable of tracking its changes, as all these changes should go back to the data warehouse for historical recording. For example, if an OLTP transaction moves a customer from the silver to the gold category, the OLTP system needs to ensure that this change is tracked and sent to the data warehouse for reporting purposes. A report in this context could be how many customers, divided by geography, moved from the silver to the gold category. In data warehouse terminology this process is called Change Data Capture. There are quite a few systems that leverage database triggers to move these changes to corresponding tracking tables. There are also out-of-the-box features provided by some databases; e.g., SQL Server 2008 offers Change Data Capture and Change Tracking for addressing such requirements. b) After we make the OLTP system capable of tracking its changes, we need to provision a batch process that runs periodically, takes these changes from the OLTP system and dumps them into the data warehouse. There are many tools out there that can help you fill this gap – SQL Server Integration Services happens to be one of them. c) So we have an OLTP system that knows how to track its changes, and we have jobs that run periodically to move these changes to the warehouse. The question that remains is how the warehouse will record these changes. This structural aspect of the data warehouse arena is often covered under something called a Slowly Changing Dimension (SCD).
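
    As a small aside on point (a) above, here is a hedged sketch of what SQL Server's built-in Change Tracking looks like when switched on (the database, table and key names are hypothetical):

        -- Enable change tracking at the database level, then per table.
        ALTER DATABASE SalesOLTP
        SET CHANGE_TRACKING = ON (CHANGE_RETENTION = 2 DAYS, AUTO_CLEANUP = ON);

        ALTER TABLE dbo.Customer
        ENABLE CHANGE_TRACKING WITH (TRACK_COLUMNS_UPDATED = ON);

        -- A periodic ETL job can then pull the net changes since its last sync version.
        DECLARE @last_sync_version BIGINT = 0;  -- in practice, the version saved by the previous run
        SELECT ct.CustomerID, ct.SYS_CHANGE_OPERATION
        FROM CHANGETABLE(CHANGES dbo.Customer, @last_sync_version) AS ct;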
    While we will talk about dimensions in a while, SCD can be applied to pure relational tables too. SCD enables a database structure to capture historical data. This can create multiple records for a given entity in a relational database, and data warehouses prefer having their own primary key for such records, often known as a surrogate key. As I mentioned, a data warehouse is just a relational database, but the industry often attributes a specific schema style to data warehouses. These styles are the Star Schema and the Snowflake Schema. The motivation behind these styles is to create a flat database structure (as opposed to a normalized one), which is easy to understand and use, easy to query and easy to slice and dice. A star schema is a database structure made up of dimensions and facts. Facts are generally the numbers (sales, quantity, etc.) that you want to slice and dice. Fact tables hold these numbers and have references (foreign keys) to a set of tables that provide context around those facts. E.g., if you have recorded 10,000 USD in sales, that number would go into a sales fact table and could have foreign keys attached to it that refer to the sales agent responsible for the sale and to a time table which contains the dates between which that sale was made. These agent and time tables are called dimensions, and they provide context to the numbers stored in fact tables. This schema structure of a fact at the center surrounded by dimensions is called a Star schema. A similar structure, with the difference that the dimension tables are normalized, is called a Snowflake schema.

    This relational structure of facts and dimensions serves as an input for another analysis structure called a Cube. Though physically a cube is a special structure supported by commercial databases like SQL Server Analysis Services, logically it's a multidimensional structure where dimensions define the sides of the cube and facts define the content. Facts are often called Measures inside a cube. Dimensions often tend to form a hierarchy. E.g., Product may be broken into categories and categories in turn into individual items. Category and Item are often referred to as Levels and their constituents as Members, with the overall structure called a Hierarchy. Measures are rolled up as per the dimensional hierarchy, and these rolled-up measures are called Aggregates. Now this may seem like an overwhelming vocabulary to deal with, but don't worry, it will sink in as you start working with cubes.

    Let's see a few other terms that we run into while talking about data warehouses. ODS, or an Operational Data Store, is a frequently misused term. There will be a few users in your organization that want to report on the most current data and can't afford to miss a single transaction for their report. Then there is another set of users that typically don't care how current the data is – mostly senior level executives who are interested in trending, mining, forecasting, strategizing, etc., and don't care about that one specific transaction. This is where an ODS can come in handy. An ODS can use the same star schema and the OLAP cubes we saw earlier. The only difference is that the data inside an ODS is short-lived, i.e., kept for a few months, and the ODS syncs with the OLTP system every few minutes. The data warehouse can periodically sync with the ODS either daily or weekly depending on business drivers. Data marts are another frequently talked about topic in data warehousing. They are subject-specific data warehouses. Data warehouses that try to span an entire enterprise are normally too big to scope, build, manage, track, etc.
    Hence they are often scaled down to something called a Data mart that supports a specific segment of business like sales, marketing, or support. Data marts, too, are often designed using the star schema model discussed earlier. The industry is divided when it comes to the use of data marts. Some experts prefer having data marts along with a central data warehouse. The data warehouse here acts as an information staging and distribution hub, with the spokes being data marts connected via data feeds serving summarized data. Others eliminate the need for a centralized data warehouse, citing that most users want to report on detailed data. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Best Practices, Business Intelligence, Data Warehousing, Database, Pinal Dave, PostADay, Readers Contribution, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology
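
    To make the star schema description above concrete, here is a minimal, hedged sketch of a fact table surrounded by two dimension tables with surrogate keys (all names are illustrative and not from the guest post):

        -- Dimension tables carry surrogate keys and descriptive attributes.
        CREATE TABLE dbo.DimSalesAgent
        (
            SalesAgentKey INT IDENTITY(1,1) PRIMARY KEY,  -- surrogate key
            AgentName     VARCHAR(100) NOT NULL
        );

        CREATE TABLE dbo.DimDate
        (
            DateKey  INT PRIMARY KEY,  -- e.g. 20130315
            FullDate DATE NOT NULL
        );

        -- The fact table holds the measures plus foreign keys to the dimensions.
        CREATE TABLE dbo.FactSales
        (
            SalesAgentKey INT NOT NULL REFERENCES dbo.DimSalesAgent (SalesAgentKey),
            DateKey       INT NOT NULL REFERENCES dbo.DimDate (DateKey),
            SalesAmount   DECIMAL(18, 2) NOT NULL
        );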

  • LLBLGen Pro feature highlights: grouping model elements

    - by FransBouma
    (This post is part of a series of posts about features of the LLBLGen Pro system) When working with an entity model which has more than a few entities, it's often convenient to be able to group entities together if they belong to a semantic sub-model. For example, if your entity model has several entities which are about 'security', it would be practical to group them together under the 'security' moniker. This way, you could easily find them back, yet they can be left inside the complete entity model altogether so their relationships with entities outside the group are kept. In other situations your domain consists of semi-separate entity models which all target tables/views which are located in the same database. It then might be convenient to have a single project to manage the complete target database, yet have the entity models separate of each other and have them result in separate code bases. LLBLGen Pro can do both for you. This blog post will illustrate both situations. The feature is called group usage and is controllable through the project settings. This setting is supported on all supported O/R mapper frameworks. Situation one: grouping entities in a single model. This situation is common for entity models which are dense, so many relationships exist between all sub-models: you can't split them up easily into separate models (nor do you likely want to), however it's convenient to have them grouped together into groups inside the entity model at the project level. A typical example for this is the AdventureWorks example database for SQL Server. This database, which is a single catalog, has for each sub-group a schema, however most of these schemas are tightly connected with each other: adding all schemas together will give a model with entities which indirectly are related to all other entities. LLBLGen Pro's default setting for group usage is AsVisualGroupingMechanism which is what this situation is all about: we group the elements for visual purposes, it has no real meaning for the model nor the code generated. Let's reverse engineer AdventureWorks to an entity model. By default, LLBLGen Pro uses the target schema an element is in which is being reverse engineered, as the group it will be in. This is convenient if you already have categorized tables/views in schemas, like which is the case in AdventureWorks. Of course this can be switched off, or corrected on the fly. When reverse engineering, we'll walk through a wizard which will guide us with the selection of the elements which relational model data should be retrieved, which we can later on use to reverse engineer to an entity model. The first step after specifying which database server connect to is to select these elements. below we can see the AdventureWorks catalog as well as the different schemas it contains. We'll include all of them. After the wizard completes, we have all relational model data nicely in our catalog data, with schemas. So let's reverse engineer entities from the tables in these schemas. We select in the catalog explorer the schemas 'HumanResources', 'Person', 'Production', 'Purchasing' and 'Sales', then right-click one of them and from the context menu, we select Reverse engineer Tables to Entity Definitions.... This will bring up the dialog below. We check all checkboxes in one go by checking the checkbox at the top to mark them all to be added to the project. 
As you can see LLBLGen Pro has already filled in the group name based on the schema name, as this is the default and we didn't change the setting. If you want, you can select multiple rows at once and set the group name to something else using the controls on the dialog. We're fine with the group names chosen so we'll simply click Add to Project. This gives the following result:   (I collapsed the other groups to keep the picture small ;)). As you can see, the entities are now grouped. Just to see how dense this model is, I've expanded the relationships of Employee: As you can see, it has relationships with entities from three other groups than HumanResources. It's not doable to cut up this project into sub-models without duplicating the Employee entity in all those groups, so this model is better suited to be used as a single model resulting in a single code base, however it benefits greatly from having its entities grouped into separate groups at the project level, to make work done on the model easier. Now let's look at another situation, namely where we work with a single database while we want to have multiple models and for each model a separate code base. Situation two: grouping entities in separate models within the same project. To get rid of the entities to see the second situation in action, simply undo the reverse engineering action in the project. We still have the AdventureWorks relational model data in the catalog. To switch LLBLGen Pro to see each group in the project as a separate project, open the Project Settings, navigate to General and set Group usage to AsSeparateProjects. In the catalog explorer, select Person and Production, right-click them and select again Reverse engineer Tables to Entities.... Again check the checkbox at the top to mark all entities to be added and click Add to Project. We get two groups, as expected, however this time the groups are seen as separate projects. This means that the validation logic inside LLBLGen Pro will see it as an error if there's e.g. a relationship or an inheritance edge linking two groups together, as that would lead to a cyclic reference in the code bases. To see this variant of the grouping feature, seeing the groups as separate projects, in action, we'll generate code from the project with the two groups we just created: select from the main menu: Project -> Generate Source-code... (or press F7 ;)). In the dialog popping up, select the target .NET framework you want to use, the template preset, fill in a destination folder and click Start Generator (normal). This will start the code generator process. As expected the code generator has simply generated two code bases, one for Person and one for Production: The group name is used inside the namespace for the different elements. This allows you to add both code bases to a single solution and use them together in a different project without problems. Below is a snippet from the code file of a generated entity class. //... using System.Xml.Serialization; using AdventureWorks.Person; using AdventureWorks.Person.HelperClasses; using AdventureWorks.Person.FactoryClasses; using AdventureWorks.Person.RelationClasses; using SD.LLBLGen.Pro.ORMSupportClasses; namespace AdventureWorks.Person.EntityClasses { //... /// <summary>Entity class which represents the entity 'Address'.<br/><br/></summary> [Serializable] public partial class AddressEntity : CommonEntityBase //... 
The advantage of this is that you can have two code bases and work with them separately, yet have a single target database and maintain everything in a single location. If you decide to move to a single code base, you can do so with a change of one setting. It's also useful if you want to keep the groups as separate models (and code bases) yet want to add relationships to elements from another group using a copy of the entity: you can simply reverse engineer the target table to a new entity into a different group, effectively making a copy of the entity. As there's a single target database, changes made to that database are reflected in both models which makes maintenance easier than when you'd have a separate project for each group, with its own relational model data. Conclusion LLBLGen Pro offers a flexible way to work with entities in sub-models and control how the sub-models end up in the generated code.

  • SQL – Migrate Database from SQL Server to NuoDB – A Quick Tutorial

    - by Pinal Dave
    Data is growing exponentially and every organization with growing data is thinking about the next big innovation in the world of Big Data. Big data is indeed the future for every organization at some point in time. Just like every other next big thing, big data has its own challenges and issues. The biggest challenge associated with big data is to find the ideal platform which supports the scalability and growth of the data. If you are a regular reader of this blog, you must be familiar with NuoDB. I have been working with NuoDB for a while and their recent release is the best thus far. NuoDB is an elastically scalable SQL database that can run on localhost, datacenter and cloud-based resources. A key feature of the product is that it does not require sharding (read more here). Last week, I was able to install NuoDB in less than 90 seconds and have explored their Explorer and Admin sections. You can read about my experiences in these posts:

        SQL – Step by Step Guide to Download and Install NuoDB – Getting Started with NuoDB
        SQL – Quick Start with Admin Sections of NuoDB – Manage NuoDB Database
        SQL – Quick Start with Explorer Sections of NuoDB – Query NuoDB Database

    Many SQL Authority readers have been following me on my journey to evaluate NuoDB. One of the frequently asked questions I have received from you is whether there is any way to migrate data from SQL Server to NuoDB. The fact is that there is indeed a way to do so, and NuoDB provides a fantastic tool which can help users do it. NuoDB Migrator is a command line utility that supports the migration of Microsoft SQL Server, MySQL, Oracle, and PostgreSQL schemas and data to NuoDB. The migration to NuoDB is a three-step process: NuoDB Migrator generates a schema for the target NuoDB database, dumps the data from the source database, and loads that data into the target NuoDB database. Let's see how we can migrate our data from SQL Server to NuoDB using this simple three-step approach. But before we do that, we will create a sample database in MSSQL which we will later migrate to NuoDB.

    Setup Step 1: Build sample data

        CREATE DATABASE [Test];
        GO
        USE [Test];
        GO
        CREATE TABLE [Department](
            [DepartmentID] [smallint] NOT NULL,
            [Name] VARCHAR(100) NOT NULL,
            [GroupName] VARCHAR(100) NOT NULL,
            [ModifiedDate] [datetime] NOT NULL,
            CONSTRAINT [PK_Department_DepartmentID] PRIMARY KEY CLUSTERED ([DepartmentID] ASC)
        ) ON [PRIMARY];
        INSERT INTO Department
        SELECT * FROM AdventureWorks2012.HumanResources.Department;

    Note that I am using the SQL Server AdventureWorks database to build this sample table, but you can build this sample table any way you prefer.

    Setup Step 2: Install 64-bit Java

    Before you can begin the migration process to NuoDB, make sure you have 64-bit Java installed on your computer. This is due to the fact that the NuoDB Migrator tool is built in Java. You can download 64-bit Java for Windows, Mac OS X, or Linux from the following link: http://java.com/en/download/manual.jsp. One more thing to remember: make sure the JAVA_HOME variable in your environment settings points to your Java installation directory, or else the tool will not work. Here is how you can do it: Go to My Computer >> Right Click >> Select Properties >> Click on Advanced System Settings >> Click on Environment Variables >> Click on New and enter the following values. Variable Name: JAVA_HOME, Variable Value: C:\Program Files\Java\jre7. Make sure you enter your own Java installation directory in the Variable Value field.

    Setup Step 3: Install the JDBC driver for SQL Server.
    There are two JDBC drivers available for SQL Server. Select the one you prefer to use by following one of the two links below: the Microsoft JDBC Driver or the jTDS JDBC Driver. In this example we will be using the jTDS JDBC driver. Once you download the driver, move it to your NuoDB installation folder. In my case, I have moved the JAR file of the driver into the C:\Program Files\NuoDB\tools\migrator\jar folder, as this is my NuoDB installation directory. Now we are all set to start the three-step migration process from SQL Server to NuoDB.

    Migration Step 1: NuoDB Schema Generation

    Here is the command I use to generate a schema of my SQL Server database in NuoDB. First I go to the folder C:\Program Files\NuoDB\tools\migrator\bin and execute the nuodb-migrator.bat file. Note that my database name is 'test'. Additionally, my username and password are also 'test'. You can see that my SQL Server database is running on localhost on port 1433. Additionally, the schema of the table is 'dbo'.

        nuodb-migrator schema --source.driver=net.sourceforge.jtds.jdbc.Driver --source.url=jdbc:jtds:sqlserver://localhost:1433/ --source.username=test --source.password=test --source.catalog=test --source.schema=dbo --output.path=/tmp/schema.sql

    The above command will generate a schema for all my SQL Server tables and write it to the file C:\tmp\schema.sql. You can open the schema.sql file and execute it directly in your NuoDB instance. You can follow the link here to see how you can execute a SQL script in NuoDB. Please note that if you have not yet created the schema in the NuoDB database, you should create it before executing this step.

    Migration Step 2: Generate the Dump File of the Data

    Once you have recreated your schema in NuoDB from SQL Server, the next step is very easy. Here we create a CSV format dump file which will contain all the data from all the tables of the SQL Server database. The command to do so is very similar to the above command. Be aware that this step may take a bit of time based on your database size.

        nuodb-migrator dump --source.driver=net.sourceforge.jtds.jdbc.Driver --source.url=jdbc:jtds:sqlserver://localhost:1433/ --source.username=test --source.password=test --source.catalog=test --source.schema=dbo --output.type=csv --output.path=/tmp/dump.cat

    Once the above command is successfully executed you can find your CSV file in the C:\tmp\ folder. However, you do not have to do anything with it manually. The third and final step will take care of completing the migration process.

    Migration Step 3: Load the Data into NuoDB

    After building the schema and taking a dump of the data, this final, crucial step takes the CSV file and loads it into the NuoDB database.

        nuodb-migrator load --target.url=jdbc:com.nuodb://localhost:48004/mytest --target.schema=dbo --target.username=test --target.password=test --input.path=/tmp/dump.cat

    Please note that in the above command we are now targeting the NuoDB database, which we have already created with the name "mytest". If the database does not exist, create it manually before executing the above command. I have kept the username and password as "test", but please make sure that you create a more secure password for your database for security reasons.

    Voila! You're Done. That's it. You are done. It took 3 setup and 3 migration steps to migrate your SQL Server database to NuoDB. You can now start exploring the database and building excellent, scale-out applications.
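
    As a quick, purely illustrative sanity check (adjust names to your own setup), you can compare row counts on both sides once the load completes; NuoDB accepts standard SQL, so a query like the following should run against the source SQL Server database and the target NuoDB database alike:

        -- Row count of the sample table built earlier; compare source vs. target.
        SELECT COUNT(*) AS DepartmentRows
        FROM dbo.Department;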
    In this blog post, I have done my best to come up with a simple and easy process which you can follow to migrate your app from SQL Server to NuoDB. Download NuoDB: I strongly encourage you to download NuoDB and go through my 3-step migration tutorial from SQL Server to NuoDB. Additionally, here are two very important blog posts from NuoDB CTO Seth Proctor. He has written excellent blog posts on the concept of Administrative Domains. NuoDB has this concept of an Administrative Domain, which is a collection of hosts that can run one or multiple databases. Each database has its own TEs and SMs, but all are managed within the Admin Console for that particular domain. http://www.nuodb.com/techblog/2013/03/11/getting-started-provisioning-a-domain/ http://www.nuodb.com/techblog/2013/03/14/getting-started-running-a-database/ Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology Tagged: NuoDB

  • SQL SERVER – A Quick Look at Logging and Ideas around Logging

    - by pinaldave
    This blog post is written in response to the T-SQL Tuesday post on Logging. When someone talks about logging, I personally get lots of ideas about it. I have seen logging used as a very generic term. Let me ask you this question first before I continue writing about logging: what is the first thing that comes to your mind when you hear the word "Logging"? Now ask the same question to the person standing next to you. I am pretty confident that you will get a different answer from different people. I decided to do this activity and asked five SQL Server people the same question.

    Question: What is the first thing that comes to your mind when you hear the word "Logging"? Strangely enough, I got a different answer every single time. Let me just list the answers I got from my friends, and let us go over them one by one.

    Output Clause

    The very first person replied "OUTPUT clause". A pretty interesting answer to start with. I see exactly what he was thinking. SQL Server 2005 introduced the new OUTPUT clause. The OUTPUT clause has access to the inserted and deleted tables (virtual tables) just like triggers. The OUTPUT clause can be used to return values to the client. The OUTPUT clause can be used with INSERT, UPDATE, or DELETE to identify the actual rows affected by these statements. Here are some references for the OUTPUT clause:

        OUTPUT Clause Example and Explanation with INSERT, UPDATE, DELETE
        Reasons for Using Output Clause – Quiz
        Tips from the SQL Joes 2 Pros Development Series – Output Clause in Simple Examples

    Error Logs

    I was expecting someone to mention error logs when the topic is logging. The error log is the first place to look when there is any error, either with the application or with the operating system. I have a policy of checking my server's error log every day. The reason is simple – enough times in my career I have figured out that when I am looking at error logs I find something which I was not expecting. There are cases when I noticed errors in the error log and fixed them before the end user noticed them. Another common practice I always tell my DBA friends to follow is that when any error happens they should find the relevant entries in the error logs and document them. It is quite possible that they will see the same error in the error log again and will be able to fix it based on the knowledge base which they have created. There are many different kinds of error log files in SQL Server as well – 1) SQL Server Error Log 2) Windows Event Log 3) SQL Server Agent Log 4) SQL Server Profiler Log 5) SQL Server Setup Log etc. Here are some references for Error Logs:

        Recycle Error Log – Create New Log file without Server Restart
        SQL Error Messages

    Change Data Capture

    I was surprised by this answer. I think more than the answer itself I was surprised by the person who gave it. I always thought he was an expert in HTML and JavaScript, but I guess one should never make assumptions about others. Indeed, one of the cool logging features is Change Data Capture. Change Data Capture records INSERTs, UPDATEs, and DELETEs applied to SQL Server tables, and makes a record available of what changed, where, and when, in simple relational 'change tables' rather than in an esoteric chopped salad of XML. These change tables contain columns that reflect the column structure of the source table you have chosen to track, along with the metadata needed to understand the changes that have been made.
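
    For reference, enabling CDC comes down to two system stored procedure calls (a hedged sketch – the database and table names below are hypothetical):

        -- Enable CDC for the database, then for the table to be tracked.
        USE SalesDB;
        EXEC sys.sp_cdc_enable_db;

        EXEC sys.sp_cdc_enable_table
            @source_schema = N'dbo',
            @source_name   = N'Customer',
            @role_name     = NULL;  -- NULL = no gating role required to read the changes

        -- Changes then show up in the generated change table, e.g. cdc.dbo_Customer_CT.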
    Here are some references for Change Data Capture:

        Introduction to Change Data Capture (CDC) in SQL Server 2008
        Tuning the Performance of Change Data Capture in SQL Server 2008
        Download Script of Change Data Capture (CDC)
        CDC and TRUNCATE – Cannot truncate table because it is published for replication or enabled for Change Data Capture

    Dynamic Management View (DMV)

    I like this answer. If asked, I would not have come up with DMVs right away, but in the spirit of the original question, I think DMVs do log data. DMVs log, store or record various data and activity on the SQL Server. Dynamic management views return server state information that can be used to monitor the health of a server instance, diagnose problems, and tune performance. One can get a plethora of information from DMVs – high availability status, query execution details, SQL Server resource status, etc. Here are some references for Dynamic Management Views (DMV):

        SQL SERVER – Denali – DMV Enhancement – sys.dm_exec_query_stats – New Columns
        DMV – sys.dm_os_windows_info – Information about Operating System
        DMV – sys.dm_os_wait_stats Explanation – Wait Type – Day 3 of 28
        DMV sys.dm_exec_describe_first_result_set_for_object – Describes the First Result Metadata for the Module
        Transaction Log Impact Detection Using DMV – dm_tran_database_transactions

    Log Files

    I almost flipped at this final answer from my friend. This should probably have been the first answer. Yes, indeed, the log file logs the SQL Server activities. One can write infinite things about log files. SQL Server uses a log file with the extension .ldf to manage transactions and maintain database integrity. The log file ensures that valid data is written out to the database and that the system is in a consistent state. Log files are extremely useful in case of database failures, as with the help of a full backup the database can be brought to the desired state (point-in-time recovery is also possible). A SQL Server database has three recovery models – 1) Simple, 2) Full and 3) Bulk Logged. Each of these models uses the .ldf file for performing various activities. It is very important to take backups of the log files (along with full backups), as one never knows when a log file backup will come into action and save the day!

        How to Stop Growing Log File Too Big
        Reduce the Virtual Log Files (VLFs) from LDF file
        Log File Growing for Model Database – model Database Log File Grew Too Big
        master Database Log File Grew Too Big
        SHRINKFILE and TRUNCATE Log File in SQL Server 2008

    Can I just say I loved this month's T-SQL Tuesday question? It really provoked very interesting conversations around me. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, SQL, SQL Authority, SQL Optimization, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology
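
    As a small postscript to the DMV answer above, this is the kind of query that treats a DMV as a running log of server activity (sys.dm_os_wait_stats is a real DMV; the TOP count here is an arbitrary choice):

        -- Top wait types accumulated since the last restart or stats clear.
        SELECT TOP (10)
               wait_type,
               waiting_tasks_count,
               wait_time_ms
        FROM sys.dm_os_wait_stats
        ORDER BY wait_time_ms DESC;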

  • SQL SERVER – 3 Online SQL Courses at Pluralsight and Free Learning Resources

    - by pinaldave
    Usain Bolt is an inspiration to us all. He broke his own record multiple times because he wanted to do better! Read more about him on Wikipedia. He is great and indeed the fastest man on the planet. Usain Bolt – World's Fastest Man.

    "Can you teach me SQL Server Performance Tuning?" This is one of the most popular questions which I receive all the time. The answer is YES. I would love to do performance tuning training for anyone, anywhere. It is my favorite thing to do, and it is my favorite thing to train others in. If possible, I would love to do training 24 hours a day, 7 days a week, 365 days a year. To me, it doesn't feel like a job. Of course, as much as I would love to do performance tuning 24/7/365, obviously I am just one human being and can only be in one place at one time. It is also very difficult to train more than one person at a time, especially when the people are at different levels. I am also limited by geography. I live in India, and adjust to my own time zone. Trying to teach a live course from India to someone whose time zone is 12 or more hours off of mine is very difficult. If I am trying to teach at 2 am, I am sure I am not at my best!

    There was only one solution to scale – online training. I have built 3 different courses on SQL Server Performance Tuning with Pluralsight. Now I have no problem – I am 100% scalable and available 24/7 and 365. You can make me say the same things again and again until you get it right. I am on your mobile and your PC, as well as on the Xbox. This is why I am such a big fan of online courses. I have recorded many performance tuning classes and you can easily access them online, at your own time. And don't think that just because these aren't live classes you won't be able to get any feedback from me. I encourage all my viewers to go ahead and ask me questions by e-mail, Twitter, Facebook, or whatever way you can get a hold of me.

    Here are details of three of my courses with Pluralsight. I suggest you go over the description of each course. As the author of the courses, I have a few FREE codes for watching them. Please leave a comment with your valid email address and I will send a few of them to random winners.

    SQL Server Performance: Introduction to Query Tuning

    SQL Server performance tuning is an art to master – for developers and DBAs alike. This course takes a systematic approach to planning, analyzing, debugging and troubleshooting common query-related performance problems. This includes an introduction to understanding execution plans inside SQL Server. In this almost four-hour course we cover the following important concepts (the duration of each module is mentioned beside its name):

        Introduction 10:22
        Execution Plan Basics 45:59
        Essential Indexing Techniques 20:19
        Query Design for Performance 50:16
        Performance Tuning Tools 01:15:14
        Tips and Tricks 25:53
        Checklist: Performance Tuning 07:13

    SQL Server Performance: Indexing Basics

    This course teaches you how to master the art of performance tuning SQL Server by better understanding indexes. In this almost two-hour course we cover the following important concepts (the duration of each module is mentioned beside its name):

        Introduction 02:03
        Fundamentals of Indexing 22:21
        Practical Indexing Implementation Techniques 37:25
        Index Maintenance 16:33
        Introduction to ColumnstoreIndex 08:06
        Indexing Practical Performance Tips and Tricks 24:56
        Checklist: Index and Performance 07:29
    SQL Server Questions and Answers

    This course is designed to help you better understand how to use SQL Server effectively. The course presents many of the common misconceptions about SQL Server, and then carefully debunks those misconceptions with clear explanations and short but compelling demos, showing you how SQL Server really works. In this almost 2 hour and 15 minute course we cover the following important concepts (the duration of each module is mentioned beside its name):

        Introduction 00:54
        Retrieving IDENTITY value using @@IDENTITY 08:38
        Concepts Related to Identity Values 04:15
        Difference between WHERE and HAVING 05:52
        Order in WHERE clause 07:29
        Concepts Around Temporary Tables and Table Variables 09:03
        Are stored procedures pre-compiled? 05:09
        UNIQUE INDEX and NULLs problem 06:40
        DELETE VS TRUNCATE 06:07
        Locks and Duration of Transactions 15:11
        Nested Transaction and Rollback 09:16
        Understanding Date/Time Datatypes 07:40
        Differences between VARCHAR and NVARCHAR datatypes 06:38
        Precedence of DENY and GRANT security permissions 05:29
        Identify Blocking Process 06:37
        NULLS usage with Dynamic SQL 08:03
        Appendix Tips and Tricks with Tools 20:44

    You will have to log in and subscribe to the courses to view them.

    SQL in Sixty Seconds

    Here are my free video learning resources: SQL in Sixty Seconds. These are 60-second videos which I have built on various subjects related to SQL Server. Do let me know what you think about them. Here are three of my latest videos:

        Identify Most Resource Intensive Queries – SQL in Sixty Seconds #028
        Copy Column Headers from Resultset – SQL in Sixty Seconds #027
        Effect of Collation on Resultset – SQL in Sixty Seconds #026

    You can watch and learn at your own pace. Then you can easily ask me any questions you have. E-mail is easiest, but for really tough questions I'm willing to talk on Skype, Gtalk, or even Facebook chat. Please do watch and then talk with me, I am always available on the internet! Here is the video of the world's fastest man. Usain St. Leo Bolt inspires us to do better than our best. We can take our own records to the next level. We all can improve if we have the will and dedication. Watch the video from the 5:00 mark. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL in Sixty Seconds, SQL Performance, SQL Query, SQL Server, SQL Tips and Tricks, SQL Training, SQLServer, T SQL, Technology, Video

  • SQL SERVER – SSMS Automatically Generates TOP (100) PERCENT in Query Designer

    - by pinaldave
    Earlier this week, I was surfing various SQL forums to see what kind of help developers need in the SQL Server world. One of the questions indeed caught my attention. I am reproducing the complete question as well as the scenario here to illustrate the point in a precise manner. Additionally, I have added a second part to the question to give it completeness.

    Question: I am trying to create a view in Query Designer (not in the New Query Window). Every time I try to create a view, it always adds TOP (100) PERCENT automatically to the T-SQL script. No matter what I do, it always automatically adds the TOP (100) PERCENT to the script. I have attempted to copy-paste from Notepad, build a query, and a few other things – with no success. I am really not sure what I am doing wrong with Query Designer. Here is my query script (I use AdventureWorks as a sample database):

        SELECT Person.Address.AddressID
        FROM Person.Address
        INNER JOIN Person.AddressType ON Person.Address.AddressID = Person.AddressType.AddressTypeID
        ORDER BY Person.Address.AddressID

    This script is automatically replaced by the following query:

        SELECT TOP (100) PERCENT Person.Address.AddressID
        FROM Person.Address
        INNER JOIN Person.AddressType ON Person.Address.AddressID = Person.AddressType.AddressTypeID
        ORDER BY Person.Address.AddressID

    However, when I try to do the same from the New Query Window it works totally fine. However, when I attempt to create a view of the same query it gives the following error: Msg 1033, Level 15, State 1, Procedure myView, Line 6 The ORDER BY clause is invalid in views, inline functions, derived tables, subqueries, and common table expressions, unless TOP, OFFSET or FOR XML is also specified. It is pretty clear to me now that the script which I have written seems to need TOP (100) PERCENT, so Query Designer adds it. Why do I need it? Is there any workaround for this issue?

    I find this question particularly interesting as it really touches the fundamentals of T-SQL query writing. Please note that the query which is automatically changed is not in the New Query Editor but is opened from SSMS in the following way: Database >> Views >> Right Click >> New View (see the image below).

    Answer: The answer to the above question can be very long, but I will keep it simple and to the point. There are three things to discuss regarding the above script: 1) the reason for the error, 2) the reason for the auto-generated TOP (100) PERCENT and 3) potential solutions to the above error. Let us quickly see them in detail.

    1) Reason for Error

    The reason for the error is already given in the error message itself. ORDER BY is invalid in views and a few other objects; one has to use TOP or other keywords along with it. The semantics of the query are such that the optimizer only follows (honors) the ORDER BY in the same scope, i.e. the same SELECT/UPDATE/DELETE statement. The final resultset of the query always follows the final ORDER BY of the outer query, and for the same reason the optimizer honors the final order of the query and not that of the view (as the view will be used in another query for further processing, e.g. in a SELECT statement). Since one can always order the results again outside the scope of the view, the effort spent ordering inside the view would be wasted. For the same reason ORDER BY is not allowed in a view. For further accuracy and clear guidance I suggest you read this blog post by the Query Optimizer Team; they have explained the same subject very clearly.
    2) Reason for the Auto-Generated TOP (100) PERCENT

    One of the most popular workarounds to the above error is to use TOP (100) PERCENT in the view. TOP (100) PERCENT allows the user to use ORDER BY in the query, and thereby to overcome the error we just discussed. This gives the impression to the user that they have resolved the error and are successfully able to use ORDER BY in the view. Well, this is incorrect as well. The way this works is that when TOP (100) PERCENT is used, the ordering of the result is not guaranteed, and it is ignored in the query where the view is used. Here is the blog post on this subject: Interesting Observation – TOP 100 PERCENT and ORDER BY. So when you create a new view in SSMS and build a query with ORDER BY, to avoid the error it automatically adds the TOP (100) PERCENT. Here is the Connect item for the same issue. I am sure there will be more Connect items as well, but I could not find them.

    3) Potential Solutions

    If you have been reading this post from the beginning, it should be clear by now that ORDER BY should not be used in a view, as it does not serve any purpose unless there is a specific need for it. If you are going to use TOP 100 PERCENT with ORDER BY, there is absolutely no need for the ORDER BY – rather, avoid using it altogether. Here is another blog post of mine which describes the same subject: ORDER BY Does Not Work – Limitation of the Views Part 1. It is valid to use ORDER BY in a view if there is a clear business need for using TOP with any percentage lower than 100 (for example TOP 10 PERCENT or TOP 50 PERCENT, etc.). In most cases ORDER BY is not needed in the view, and it should be used in the outermost query to present the result in the desired order. Users can remove TOP 100 PERCENT and ORDER BY from the view before using the view in any query or procedure, and the outermost query should have an ORDER BY as per the business need. I think this sums up the concept in a few words. This is a very long topic and not easy to illustrate in a single blog post. I welcome your comments and suggestions. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Server Management Studio, SQL Tips and Tricks, SQL View, T SQL, Technology
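
    To make the recommendation above concrete, here is a hedged sketch – reusing the AdventureWorks tables from the question – of a view without ORDER BY, with the ordering applied in the outermost query where it actually takes effect:

        -- The view carries no ORDER BY; it only defines the result shape.
        CREATE VIEW dbo.myView
        AS
        SELECT Person.Address.AddressID
        FROM Person.Address
        INNER JOIN Person.AddressType
            ON Person.Address.AddressID = Person.AddressType.AddressTypeID;
        GO

        -- The ordering belongs to the outermost query that consumes the view.
        SELECT AddressID
        FROM dbo.myView
        ORDER BY AddressID;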

  • Thread placement policies on NUMA systems - update

    - by Dave
    In a prior blog entry I noted that Solaris used a "maximum dispersal" placement policy to assign nascent threads to their initial processors. The general idea is that threads should be placed as far away from each other as possible in the resource topology in order to reduce resource contention between concurrently running threads. This policy assumes that resource contention -- pipelines, memory channel contention, destructive interference in the shared caches, etc -- will likely outweigh (a) any potential communication benefits we might achieve by packing our threads more densely onto a subset of the NUMA nodes, and (b) benefits of NUMA affinity between memory allocated by one thread and accessed by other threads. We want our threads spread widely over the system and not packed together. Conceptually, when placing a new thread, the kernel picks the least loaded NUMA node (the node with the lowest aggregate load average), and then the least loaded core on that node, etc. Furthermore, the kernel places threads onto resources -- sockets, cores, pipelines, etc -- without regard to the thread's process membership. That is, initial placement is process-agnostic. Keep reading, though. This description is incorrect.

    On Solaris 10 on a SPARC T5440 with 4 x T2+ NUMA nodes, if the system is otherwise unloaded and we launch a process that creates 20 compute-bound concurrent threads, then typically we'll see a perfect balance with 5 threads on each node. We see similar behavior on an 8-node x86 x4800 system, where each node has 8 cores and each core is 2-way hyperthreaded. So far so good; this behavior seems in agreement with the policy I described in the 1st paragraph. I recently tried the same experiment on a 4-node T4-4 running Solaris 11. Both the T5440 and T4-4 are 4-node systems that expose 256 logical thread contexts. To my surprise, all 20 threads were placed onto just one NUMA node while the other 3 nodes remained completely idle. I checked the usual suspects such as processor sets inadvertently left around by colleagues, processors left offline, and power management policies, but the system was configured normally. I then launched multiple concurrent instances of the process, and, interestingly, all the threads from the 1st process landed on one node, all the threads from the 2nd process landed on another node, and so on. This happened even if I interleaved thread creation between the processes, so I was relatively sure the effect didn't relate to thread creation time, but rather that placement was a function of process membership.

    At this point I consulted the Solaris sources and talked with folks in the Solaris group. The new Solaris 11 behavior is intentional. The kernel is no longer using a simple maximum dispersal policy, and thread placement is process membership-aware. Now, even if other nodes are completely unloaded, the kernel will still try to pack new threads onto the home lgroup (socket) of the primordial thread until the load average of that node reaches 50%, after which it will pick the next least loaded node as the process's new favorite node for placement. On the T4-4 we have 64 logical thread contexts (strands) per socket (lgroup), so if we launch 48 concurrent threads we will find 32 placed on one node and 16 on some other node. If we launch 64 threads we'll find 32 and 32. That means we can end up with our threads clustered on a small subset of the nodes in a way that's quite different from what we've seen on Solaris 10.
So we have a policy that allows process-aware packing but reverts to spreading threads onto other nodes if a node becomes too saturated. It turns out this policy was enabled in Solaris 10, but certain bugs suppressed the mixed packing/spreading behavior. There are configuration variables in /etc/system that allow us to dial the affinity between nascent threads and their primordial thread up and down: see lgrp_expand_proc_thresh, specifically. In the OpenSolaris source code the key routine is mpo_update_tunables(). This method reads the /etc/system variables and sets up some global variables that will subsequently be used by the dispatcher, which calls lgrp_choose() in lgrp.c to place nascent threads. Lgrp_expand_proc_thresh controls how loaded an lgroup must be before we'll consider homing a process's threads to another lgroup. Tune this value lower to have it spread your process's threads out more. To recap, the 'new' policy is as follows. Threads from the same process are packed onto a subset of the strands of a socket (50% for T-series). Once that socket reaches the 50% threshold the kernel then picks another preferred socket for that process. Threads from unrelated processes are spread across sockets. More precisely, different processes may have different preferred sockets (lgroups). Beware that I've simplified and elided details for the purposes of explication. The truth is in the code. Remarks: It's worth noting that initial thread placement is just that. If there's a gross imbalance between the load on different nodes then the kernel will migrate threads to achieve a better and more even distribution over the set of available nodes. Once a thread runs and gains some affinity for a node, however, it becomes "stickier" under the assumption that the thread has residual cache residency on that node, and that memory allocated by that thread resides on that node given the default "first-touch" page-level NUMA allocation policy. Exactly how the various policies interact and which have precedence under what circumstances could the topic of a future blog entry. The scheduler is work-conserving. The x4800 mentioned above is an interesting system. Each of the 8 sockets houses an Intel 7500-series processor. Each processor has 3 coherent QPI links and the system is arranged as a glueless 8-socket twisted ladder "mobius" topology. Nodes are either 1 or 2 hops distant over the QPI links. As an aside the mapping of logical CPUIDs to physical resources is rather interesting on Solaris/x4800. On SPARC/Solaris the CPUID layout is strictly geographic, with the highest order bits identifying the socket, the next lower bits identifying the core within that socket, following by the pipeline (if present) and finally the logical thread context ("strand") on the core. But on Solaris on the x4800 the CPUID layout is as follows. [6:6] identifies the hyperthread on a core; bits [5:3] identify the socket, or package in Intel terminology; bits [2:0] identify the core within a socket. Such low-level details should be of interest only if you're binding threads -- a bad idea, the kernel typically handles placement best -- or if you're writing NUMA-aware code that's aware of the ambient placement and makes decisions accordingly. Solaris introduced the so-called critical-threads mechanism, which is expressed by putting a thread into the FX scheduling class at priority 60. The critical-threads mechanism applies to placement on cores, not on sockets, however. 
That is, it's an intra-socket policy, not an inter-socket policy. Solaris 11 introduces the Power Aware Dispatcher (PAD) which packs threads instead of spreading them out in an attempt to be able to keep sockets or cores at lower power levels. Maximum dispersal may be good for performance but is anathema to power management. PAD is off by default, but power management polices constitute yet another confounding factor with respect to scheduling and dispatching. If your threads communicate heavily -- one thread reads cache lines last written by some other thread -- then the new dense packing policy may improve performance by reducing traffic on the coherent interconnect. On the other hand if your threads in your process communicate rarely, then it's possible the new packing policy might result on contention on shared computing resources. Unfortunately there's no simple litmus test that says whether packing or spreading is optimal in a given situation. The answer varies by system load, application, number of threads, and platform hardware characteristics. Currently we don't have the necessary tools and sensoria to decide at runtime, so we're reduced to an empirical approach where we run trials and try to decide on a placement policy. The situation is quite frustrating. Relatedly, it's often hard to determine just the right level of concurrency to optimize throughput. (Understanding constructive vs destructive interference in the shared caches would be a good start. We could augment the lines with a small tag field indicating which strand last installed or accessed a line. Given that, we could augment the CPU with performance counters for misses where a thread evicts a line it installed vs misses where a thread displaces a line installed by some other thread.)

    Read the article

  • SQL SERVER – Weekly Series – Memory Lane – #048

    - by Pinal Dave
    Here is the list of selected articles of SQLAuthority.com across all these years. Instead of just listing all the articles I have selected a few of my most favorite articles and have listed them here with additional notes below it. Let me know which one of the following is your favorite article from memory lane. 2007 Order of Result Set of SELECT Statement on Clustered Indexed Table When ORDER BY is Not Used Above theory is true in most of the cases. However SQL Server does not use that logic when returning the resultset. SQL Server always returns the resultset which it can return fastest.In most of the cases the resultset which can be returned fastest is the resultset which is returned using clustered index. Effect of TRANSACTION on Local Variable – After ROLLBACK and After COMMIT One of the Jr. Developer asked me this question (What will be the Effect of TRANSACTION on Local Variable – After ROLLBACK and After COMMIT?) while I was rushing to an important meeting. I was getting late so I asked him to talk with his Application Tech Lead. When I came back from meeting both of them were looking for me. They said they are confused. I quickly wrote down following example for them. 2008 SQL SERVER – Guidelines and Coding Standards Complete List Download Coding standards and guidelines are very important for any developer on the path of a successful career. A coding standard is a set of guidelines, rules and regulations on how to write code. Coding standards should be flexible enough or should take care of the situation where they should not prevent best practices for coding. They are basically the guidelines that one should follow for better understanding. Download Guidelines and Coding Standards complete List Download Get Answer in Float When Dividing of Two Integer Many times we have requirements of some calculations amongst different fields in Tables. One of the software developers here was trying to calculate some fields having integer values and divide it which gave incorrect results in integer where accurate results including decimals was expected. Puzzle – Computed Columns Datatype Explanation SQL Server automatically does a cast to the data type having the highest precedence. So the result of INT and INT will be INT, but INT and FLOAT will be FLOAT because FLOAT has a higher precedence. If you want a different data type, you need to do an EXPLICIT cast. Renaming SP is Not Good Idea – Renaming Stored Procedure Does Not Update sys.procedures I have written many articles about renaming a tables, columns and procedures SQL SERVER – How to Rename a Column Name or Table Name, here I found something interesting about renaming the stored procedures and felt like sharing it with you all. The interesting fact is that when we rename a stored procedure using SP_Rename command, the Stored Procedure is successfully renamed. But when we try to test the procedure using sp_helptext, the procedure will be having the old name instead of new names. 2009 Insert Values of Stored Procedure in Table – Use Table Valued Function It is clear from the result set that , where I have converted stored procedure logic into the table valued function, is much better in terms of logic as it saves a large number of operations. However, this option should be used carefully. The performance of the stored procedure is “usually” better than that of functions. Interesting Observation – Index on Index View Used in Similar Query Recently, I was working on an optimization project for one of the largest organizations. 
While working on one of the queries, we came across a very interesting observation. We found that there was a query on the base table and when the query was run, it used the index, which did not exist in the base table. On careful examination, we found that the query was using the index that was on another view. This was very interesting as I have personally never experienced a scenario like this. In simple words, “Query on the base table can use the index created on the indexed view of the same base table.” Interesting Observation – Execution Plan and Results of Aggregate Concatenation Queries Working with SQL Server has never seemed to be monotonous – no matter how long one has worked with it. Quite often, I come across some excellent comments that I feel like acknowledging them as blog posts. Recently, I wrote an article on SQL SERVER – Execution Plan and Results of Aggregate Concatenation Queries Depend Upon Expression Location, which is well received in the community. 2010 I encourage all of you to go through complete series and write your own on the subject. If you write an article and send it to me, I will publish it on this blog with due credit to you. If you write on your own blog, I will update this blog post pointing to your blog post. SQL SERVER – ORDER BY Does Not Work – Limitation of the View 1 SQL SERVER – Adding Column is Expensive by Joining Table Outside View – Limitation of the View 2 SQL SERVER – Index Created on View not Used Often – Limitation of the View 3 SQL SERVER – SELECT * and Adding Column Issue in View – Limitation of the View 4 SQL SERVER – COUNT(*) Not Allowed but COUNT_BIG(*) Allowed – Limitation of the View 5 SQL SERVER – UNION Not Allowed but OR Allowed in Index View – Limitation of the View 6 SQL SERVER – Cross Database Queries Not Allowed in Indexed View – Limitation of the View 7 SQL SERVER – Outer Join Not Allowed in Indexed Views – Limitation of the View 8 SQL SERVER – SELF JOIN Not Allowed in Indexed View – Limitation of the View 9 SQL SERVER – Keywords View Definition Must Not Contain for Indexed View – Limitation of the View 10 SQL SERVER – View Over the View Not Possible with Index View – Limitations of the View 11 2011 Startup Parameters Easy to Configure If you are a regular reader of this blog, you must be aware that I have written about SQL Server Denali recently. Here is the quickest way to reach into the screen where we can change the startup parameters. Go to SQL Server Configuration Manager >> SQL Server Services >> Right Click on the Server >> Properties >> Startup Parameters 2012 Validating Unique Columnname Across Whole Database I sometimes come across very strange requirements and often I do not receive a proper explanation of the same. Here is the one of those examples. For example “Our business requirement is when we add new column we want it unique across current database.” Read the solution to this strange request in this blog post. Excel Losing Decimal Values When Value Pasted from SSMS ResultSet It is very common when users are coping the resultset to Excel, the floating point or decimals are missed. The solution is very much simple and it requires a small adjustment in the Excel. By default Excel is very smart and when it detects the value which is getting pasted is numeric it changes the column format to accommodate that. Basic Calculation and PEMDAS Order of Operation Read this interesting blog post for fantastic conversation about the subject. 
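The 2012 item above on validating a unique column name across the whole database lends itself to a quick illustration. The following is a minimal sketch of one way to run such a check against INFORMATION_SCHEMA.COLUMNS; the variable and the sample column name are assumptions for the example, not necessarily the exact script from the original post.

    DECLARE @ProposedColumn SYSNAME = 'CustomerID';  -- assumed name to validate

    -- Any rows returned mean the name is already in use somewhere in the database.
    SELECT c.TABLE_SCHEMA, c.TABLE_NAME, c.COLUMN_NAME
    FROM INFORMATION_SCHEMA.COLUMNS AS c
    WHERE c.COLUMN_NAME = @ProposedColumn;

If the query returns no rows, the proposed name is unique across the current database and the new column can be added.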
Copy Column Headers from Resultset – SQL in Sixty Seconds #027 – Video http://www.youtube.com/watch?v=x_-3tLqTRv0 Delete From Multiple Table – Update Multiple Table in Single Statement There are two questions which I get every single day, multiple times. In my Gmail, I have created a standard canned reply for them. Let us see the questions here. How do I delete from multiple tables in a single statement? How do I update multiple tables in a single statement? Read the answer in the blog post. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Memory Lane, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology
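On the delete/update "multiple tables in a single statement" question, SQL Server has no single DELETE that targets two tables at once. A common pattern -- not necessarily the exact answer given in the blog post -- is to wrap one statement per table in a single transaction; the table and key names below are assumptions:

    BEGIN TRANSACTION;

        DELETE FROM dbo.OrderDetails WHERE OrderID = 42;  -- child rows first
        DELETE FROM dbo.Orders       WHERE OrderID = 42;  -- then the parent row

    COMMIT TRANSACTION;

A foreign key declared with ON DELETE CASCADE achieves a similar effect declaratively.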

    Read the article

  • SQL SERVER – Thinking about Deprecated, Discontinued Features and Breaking Changes while Upgrading to SQL Server 2012 – Guest Post by Nakul Vachhrajani

    - by pinaldave
    Nakul Vachhrajani is a Technical Specialist and systems development professional with iGATE having a total IT experience of more than 7 years. Nakul is an active blogger with BeyondRelational.com (150+ blogs), and can also be found on forums at SQLServerCentral and BeyondRelational.com. Nakul has also been a guest columnist for SQLAuthority.com and SQLServerCentral.com. Nakul presented a webcast on the “Underappreciated Features of Microsoft SQL Server” at the Microsoft Virtual Tech Days Exclusive Webcast series (May 02-06, 2011) on May 06, 2011. He is also the author of a research paper on Database upgrade methodologies, which was published in a CSI journal, published nationwide. In addition to his passion about SQL Server, Nakul also contributes to the academia out of personal interest. He visits various colleges and universities as an external faculty to judge project activities being carried out by the students. Disclaimer: The opinions expressed herein are his own personal opinions and do not represent his employer’s view in anyway. Blog | LinkedIn | Twitter | Google+ Let us hear the thoughts of Nakul in first person - Those who have been following my blogs would be aware that I am recently running a series on the database engine features that have been deprecated in Microsoft SQL Server 2012. Based on the response that I have received, I was quite surprised to know that most of the audience found these to be breaking changes, when in fact, they were not! It was then that I decided to write a little piece on how to plan your database upgrade such that it works with the next version of Microsoft SQL Server. Please note that the recommendations made in this article are high-level markers and are intended to help you think over the specific steps that you would need to take to upgrade your database. Refer the documentation – Understand the terms Change is the only constant in this world. Therefore, whenever customer requirements, newer architectures and designs require software vendors to make a change to the keywords, functions, etc; they ensure that they provide their end users sufficient time to migrate over to the new standards before dropping off the old ones. Microsoft does that too with it’s Microsoft SQL Server product. Whenever a new SQL Server release is announced, it comes with a list of the following features: Breaking changes These are changes that would break your currently running applications, scripts or functionalities that are based on earlier version of Microsoft SQL Server These are mostly features whose behavior has been changed keeping in mind the newer architectures and designs Lesson: These are the changes that you need to be most worried about! Discontinued features These features are no longer available in the associated version of Microsoft SQL Server These features used to be “deprecated” in the prior release Lesson: Without these changes, your database would not be compliant/may not work with the version of Microsoft SQL Server under consideration Deprecated features These features are those that are still available in the current version of Microsoft SQL Server, but are scheduled for removal in a future version. 
These may be removed in either the next version or any other future version of Microsoft SQL Server The features listed for deprecation will compose the list of discontinued features in the next version of SQL Server Lesson: Plan to make necessary changes required to remove/replace usage of the deprecated features with the latest recommended replacements Once a feature appears on the list, it moves from bottom to the top, i.e. it is first marked as “Deprecated” and then “Discontinued”. We know of “Breaking change” comes later on in the product life cycle. What this means is that if you want to know what features would not work with SQL Server 2012 (and you are currently using SQL Server 2008 R2), you need to refer the list of breaking changes and discontinued features in SQL Server 2012. Use the tools! There are a lot of tools and technologies around us, but it is rarely that I find teams using these tools religiously and to the best of their potential. Below are the top two tools, from Microsoft, that I use every time I plan a database upgrade. The SQL Server Upgrade Advisor Ever since SQL Server 2005 was announced, Microsoft provides a small, very light-weight tool called the “SQL Server upgrade advisor”. The upgrade advisor analyzes installed components from earlier versions of SQL Server, and then generates a report that identifies issues to fix either before or after you upgrade. The analysis examines objects that can be accessed, such as scripts, stored procedures, triggers, and trace files. Upgrade Advisor cannot analyze desktop applications or encrypted stored procedures. Refer the links towards the end of the post to know how to get the Upgrade Advisor. The SQL Server Profiler Another great tool that you can use is the one most SQL Server developers & administrators use often – the SQL Server profiler. SQL Server Profiler provides functionality to monitor the “Deprecation” event, which contains: Deprecation announcement – equivalent to features to be deprecated in a future release of SQL Server Deprecation final support – equivalent to features to be deprecated in the next release of SQL Server You can learn more using the links towards the end of the post. A basic checklist There are a lot of finer points that need to be taken care of when upgrading your database. But, it would be worth-while to identify a few basic steps in order to make your database compliant with the next version of SQL Server: Monitor the current application workload (on a test bed) via the Profiler in order to identify usage of features marked as Deprecated If none appear, you are all set! (This almost never happens) Note down all the offending queries and feature usages Run analysis sessions using the SQL Server upgrade advisor on your database Based on the inputs from the analysis report and Profiler trace sessions, Incorporate solutions for the breaking changes first Next, incorporate solutions for the discontinued features Revisit and document the upgrade strategy for your deployment scenarios Revisit the fall-back, i.e. 
rollback strategies in case the upgrades fail Because some programming changes are dependent upon the SQL server version, this may need to be done in consultation with the development teams Before any other enhancements are incorporated by the development team, send out the database changes into QA QA strategy should involve a comparison between an environment running the old version of SQL Server against the new one Because minimal application changes have gone in (essential changes for SQL Server version compliance only), this would be possible As an ongoing activity, keep incorporating changes recommended as per the deprecated features list As a DBA, update your coding standards to ensure that the developers are using ANSI compliant code – this code will require a change only if the ANSI standard changes Remember this: Change management is a continuous process. Keep revisiting the product release notes and incorporate recommended changes to stay prepared for the next release of SQL Server. May the power of SQL Server be with you! Links Referenced in this post Breaking changes in SQL Server 2012: Link Discontinued features in SQL Server 2012: Link Get the upgrade advisor from the Microsoft Download Center at: Link Upgrade Advisor page on MSDN: Link Profiler: Review T-SQL code to identify objects no longer supported by Microsoft: Link Upgrading to SQL Server 2012 by Vinod Kumar: Link Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology Tagged: Upgrade
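To complement the Upgrade Advisor and Profiler approaches described above, the database engine also keeps running totals of deprecated feature usage in a performance counter object. The query below is a minimal sketch of how those counters can be read; treat it as a starting point for the workload review rather than a full audit:

    -- Non-zero counters indicate deprecated features exercised since the last service restart.
    SELECT instance_name AS deprecated_feature,
           cntr_value    AS usage_count
    FROM sys.dm_os_performance_counters
    WHERE object_name LIKE '%Deprecated Features%'
          AND cntr_value > 0
    ORDER BY cntr_value DESC;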

    Read the article

  • SQL SERVER – Core Concepts – Elasticity, Scalability and ACID Properties – Exploring NuoDB an Elastically Scalable Database System

    - by pinaldave
    I have been recently exploring Elasticity and Scalability attributes of databases. You can see that in my earlier blog posts about NuoDB where I wanted to look at Elasticity and Scalability concepts. The concepts are very interesting, and intriguing as well. I have discussed these concepts with my friend Joyti M and together we have come up with this interesting read. The goal of this article is to answer following simple questions What is Elasticity? What is Scalability? How ACID properties vary from NOSQL Concepts? What are the prevailing problems in the current database system architectures? Why is NuoDB  an innovative and welcome change in database paradigm? Elasticity This word’s original form is used in many different ways and honestly it does do a decent job in holding things together over the years as a person grows and contracts. Within the tech world, and specifically related to software systems (database, application servers), it has come to mean a few things - allow stretching of resources without reaching the breaking point (on demand). What are resources in this context? Resources are the usual suspects – RAM/CPU/IO/Bandwidth in the form of a container (a process or bunch of processes combined as modules). When it is about increasing resources the simplest idea which comes to mind is the addition of another container. Another container means adding a brand new physical node. When it is about adding a new node there are two questions which comes to mind. 1) Can we add another node to our software system? 2) If yes, does adding new node cause downtime for the system? Let us assume we have added new node, let us see what the new needs of the system are when a new node is added. Balancing incoming requests to multiple nodes Synchronization of a shared state across multiple nodes Identification of “downstate” and resolution action to bring it to “upstate” Well, adding a new node has its advantages as well. Here are few of the positive points Throughput can increase nearly horizontally across the node throughout the system Response times of application will increase as in-between layer interactions will be improved Now, Let us put the above concepts in the perspective of a Database. When we mention the term “running out of resources” or “application is bound to resources” the resources can be CPU, Memory or Bandwidth. The regular approach to “gain scalability” in the database is to look around for bottlenecks and increase the bottlenecked resource. When we have memory as a bottleneck we look at the data buffers, locks, query plans or indexes. After a point even this is not enough as there needs to be an efficient way of managing such large workload on a “single machine” across memory and CPU bound (right kind of scheduling)  workload. We next move on to either read/write separation of the workload or functionality-based sharing so that we still have control of the individual. But this requires lots of planning and change in client systems in terms of knowing where to go/update/read and for reporting applications to “aggregate the data” in an intelligent way. What we ideally need is an intelligent layer which allows us to do these things without us getting into managing, monitoring and distributing the workload. Scalability In the context of database/applications, scalability means three main things Ability to handle normal loads without pressure E.g. 
X users at the Y utilization of resources (CPU, Memory, Bandwidth) on the Z kind of hardware (4 processor, 32 GB machine with 15000 RPM SATA drives and 1 GHz Network switch) with T throughput Ability to scale up to expected peak load which is greater than normal load with acceptable response times Ability to provide acceptable response times across the system E.g. Response time in S milliseconds (or agreed upon unit of measure) – 90% of the time The Issue – Need of Scale In normal cases one can plan for the load testing to test out normal, peak, and stress scenarios to ensure specific hardware meets the needs. With help from Hardware and Software partners and best practices, bottlenecks can be identified and requisite resources added to the system. Unfortunately this vertical scale is expensive and difficult to achieve and most of the operational people need the ability to scale horizontally. This helps in getting better throughput as there are physical limits in terms of adding resources (Memory, CPU, Bandwidth and Storage) indefinitely. Today we have different options to achieve scalability: Read & Write Separation The idea here is to do actual writes to one store and configure slaves receiving the latest data with acceptable delays. Slaves can be used for balancing out reads. We can also explore functional separation or sharing as well. We can separate data operations by a specific identifier (e.g. region, year, month) and consolidate it for reporting purposes. For functional separation the major disadvantage is when schema changes or workload pattern changes. As the requirement grows one still needs to deal with scale need in manual ways by providing an abstraction in the middle tier code. Using NOSQL solutions The idea is to flatten out the structures in general to keep all values which are retrieved together at the same store and provide flexible schema. The issue with the stores is that they are compromising on mostly consistency (no ACID guarantees) and one has to use NON-SQL dialect to work with the store. The other major issue is about education with NOSQL solutions. Would one really want to make these compromises on the ability to connect and retrieve in simple SQL manner and learn other skill sets? Or for that matter give up on ACID guarantee and start dealing with consistency issues? Hybrid Deployment – Mac, Linux, Cloud, and Windows One of the challenges today that we see across On-premise vs Cloud infrastructure is a difference in abilities. Take for example SQL Azure – it is wonderful in its concepts of throttling (as it is shared deployment) of resources and ability to scale using federation. However, the same abilities are not available on premise. This is not a mistake, mind you – but a compromise of the sweet spot of workloads, customer requirements and operational SLAs which can be supported by the team. In today’s world it is imperative that databases are available across operating systems – which are a commodity and used by developers of all hues. An Ideal Database Ability List A system which allows a linear scale of the system (increase in throughput with reasonable response time) with the addition of resources A system which does not compromise on the ACID guarantees and require developers to learn new paradigms A system which does not force fit a new way interacting with database by learning Non-SQL dialect A system which does not force fit its mechanisms for providing availability across its various modules. 
Well NuoDB is the first database which has all of the above abilities and much more. In future articles I will cover my hands-on experience with it. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology Tagged: NuoDB

    Read the article

  • SQL SERVER – Weekly Series – Memory Lane – #032

    - by Pinal Dave
    Here is the list of selected articles of SQLAuthority.com across all these years. Instead of just listing all the articles I have selected a few of my most favorite articles and have listed them here with additional notes below it. Let me know which one of the following is your favorite article from memory lane. 2007 Complete Series of Database Coding Standards and Guidelines SQL SERVER Database Coding Standards and Guidelines – Introduction SQL SERVER – Database Coding Standards and Guidelines – Part 1 SQL SERVER – Database Coding Standards and Guidelines – Part 2 SQL SERVER Database Coding Standards and Guidelines Complete List Download Explanation and Example – SELF JOIN When all of the data you require is contained within a single table, but data needed to extract is related to each other in the table itself. Examples of this type of data relate to Employee information, where the table may have both an Employee’s ID number for each record and also a field that displays the ID number of an Employee’s supervisor or manager. To retrieve the data tables are required to relate/join to itself. Insert Multiple Records Using One Insert Statement – Use of UNION ALL This is very interesting question I have received from new developer. How can I insert multiple values in table using only one insert? Now this is interesting question. When there are multiple records are to be inserted in the table following is the common way using T-SQL. Function to Display Current Week Date and Day – Weekly Calendar Straight blog post with script to find current week date and day based on the parameters passed in the function.  2008 In my beginning years, I have almost same confusion as many of the developer had in their earlier years. Here are two of the interesting question which I have attempted to answer in my early year. Even if you are experienced developer may be you will still like to read following two questions: Order Of Column In Index Order of Conditions in WHERE Clauses Example of DISTINCT in Aggregate Functions Have you ever used DISTINCT with the Aggregation Function? Here is a simple example about how users can do it. Create a Comma Delimited List Using SELECT Clause From Table Column Straight to script example where I explained how to do something easy and quickly. Compound Assignment Operators SQL SERVER 2008 has introduced new concept of Compound Assignment Operators. Compound Assignment Operators are available in many other programming languages for quite some time. Compound Assignment Operators is operator where variables are operated upon and assigned on the same line. PIVOT and UNPIVOT Table Examples Here is a very interesting question – the answer to the question can be YES or NO both. “If we PIVOT any table and UNPIVOT that table do we get our original table?” Read the blog post to get the explanation of the question above. 2009 What is Interim Table – Simple Definition of Interim Table The interim table is a table that is generated by joining two tables and not the final result table. In other words, when two tables are joined they create an interim table as resultset but the resultset is not final yet. It may be possible that more tables are about to join on the interim table, and more operations are still to be applied on that table (e.g. Order By, Having etc). Besides, it may be possible that there is no interim table; sometimes final table is what is generated when the query is run. 
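The SELF JOIN explanation above (employee and manager stored in the same table) translates into a short query. The table and column names here are assumptions for illustration:

    -- Each employee row is joined back to the same table to pick up the manager's name.
    SELECT e.EmployeeID,
           e.Name AS EmployeeName,
           m.Name AS ManagerName
    FROM dbo.Employee AS e
    LEFT JOIN dbo.Employee AS m
           ON e.ManagerID = m.EmployeeID;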
2010 Stored Procedure and Transactions If Stored Procedure is transactional then, it should roll back complete transactions when it encounters any errors. Well, that does not happen in this case, which proves that Stored Procedure does not only provide just the transactional feature to a batch of T-SQL. Generate Database Script for SQL Azure When talking about SQL Azure the most common complaint I hear is that the script generated from stand-along SQL Server database is not compatible with SQL Azure. This was true for some time for sure but not any more. If you have SQL Server 2008 R2 installed you can follow the guideline below to generate a script which is compatible with SQL Azure. Convert IN to EXISTS – Performance Talk It is NOT necessary that every time when IN is replaced by EXISTS it gives better performance. However, in our case listed above it does for sure give better performance. You can read about this subject in the associated blog post. Subquery or Join – Various Options – SQL Server Engine Knows the Best Every single time whenever there is a performance tuning exercise, I hear the conversation from developer where some prefer subquery and some prefer join. In this two part blog post, I explain the same in the detail with examples. Part 1 | Part 2 Merge Operations – Insert, Update, Delete in Single Execution MERGE is a new feature that provides an efficient way to do multiple DML operations. In earlier versions of SQL Server, we had to write separate statements to INSERT, UPDATE, or DELETE data based on certain conditions; however, at present, by using the MERGE statement, we can include the logic of such data changes in one statement that even checks when the data is matched and then just update it, and similarly, when the data is unmatched, it is inserted. 2011 Puzzle – Statistics are not updated but are Created Once Here is the quick scenario about my setup. Create Table Insert 1000 Records Check the Statistics Now insert 10 times more 10,000 indexes Check the Statistics – it will be NOT updated – WHY? Question to You – When to use Function and When to use Stored Procedure Personally, I believe that they are both different things - they cannot be compared. I can say, it will be like comparing apples and oranges. Each has its own unique use. However, they can be used interchangeably at many times and in real life (i.e., production environment). I have personally seen both of these being used interchangeably many times. This is the precise reason for asking this question. 2012 In year 2012 I had two interesting series ran on the blog. If there is no fun in learning, the learning becomes a burden. For the same reason, I had decided to build a three part quiz around SEQUENCE. The quiz was to identify the next value of the sequence. I encourage all of you to take part in this fun quiz. Guess the Next Value – Puzzle 1 Guess the Next Value – Puzzle 2 Guess the Next Value – Puzzle 3 Guess the Next Value – Puzzle 4 Simple Example to Configure Resource Governor – Introduction to Resource Governor Resource Governor is a feature which can manage SQL Server Workload and System Resource Consumption. We can limit the amount of CPU and memory consumption by limiting /governing /throttling on the SQL Server. If there are different workloads running on SQL Server and each of the workload needs different resources or when workloads are competing for resources with each other and affecting the performance of the whole server resource governor is a very important task. 
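The MERGE item above is worth a compact example. This sketch assumes simple source and target tables and shows insert, update, and delete logic combined in one statement:

    MERGE dbo.TargetTable AS t
    USING dbo.SourceTable AS s
        ON t.ID = s.ID
    WHEN MATCHED THEN
        UPDATE SET t.Value = s.Value               -- existing rows are updated
    WHEN NOT MATCHED BY TARGET THEN
        INSERT (ID, Value) VALUES (s.ID, s.Value)  -- new rows are inserted
    WHEN NOT MATCHED BY SOURCE THEN
        DELETE;                                    -- rows missing from the source are removed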
Tricks to Replace SELECT * with Column Names – SQL in Sixty Seconds #017 – Video  Retrieves unnecessary columns and increases network traffic When a new columns are added views needs to be refreshed manually Leads to usage of sub-optimal execution plan Uses clustered index in most of the cases instead of using optimal index It is difficult to debug SQL SERVER – Load Generator – Free Tool From CodePlex The best part of this SQL Server Load Generator is that users can run multiple simultaneous queries again SQL Server using different login account and different application name. The interface of the tool is extremely easy to use and very intuitive as well. A Puzzle – Swap Value of Column Without Case Statement Let us assume there is a single column in the table called Gender. The challenge is to write a single update statement which will flip or swap the value in the column. For example if the value in the gender column is ‘male’ swap it with ‘female’ and if the value is ‘female’ swap it with ‘male’. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Memory Lane, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology
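For the swap-the-Gender-value puzzle, one possible trick -- not necessarily the solution published in the original post -- is to lean on string functions instead of a CASE expression. The table name is an assumption:

    -- STUFF removes LEN(Gender) characters from the front of 'malefemale',
    -- leaving 'female' for 'male' (length 4) and 'male' for 'female' (length 6).
    UPDATE dbo.SimpleTable
    SET Gender = STUFF('malefemale', 1, LEN(Gender), '');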

    Read the article

  • SQL SERVER – Weekly Series – Memory Lane – #050

    - by Pinal Dave
    Here is the list of selected articles of SQLAuthority.com across all these years. Instead of just listing all the articles I have selected a few of my most favorite articles and have listed them here with additional notes below it. Let me know which one of the following is your favorite article from memory lane. 2007 Executing Remote Stored Procedure – Calling Stored Procedure on Linked Server In this example we see two different methods of how to call Stored Procedures remotely.  Connection Property of SQL Server Management Studio SSMS A very simple example of the how to build connection properties for SQL Server with the help of SSMS. Sample Example of RANKING Functions – ROW_NUMBER, RANK, DENSE_RANK, NTILE SQL Server has a total of 4 ranking functions. Ranking functions return a ranking value for each row in a partition. All the ranking functions are non-deterministic. T-SQL Script to Add Clustered Primary Key Jr. DBA asked me three times in a day, how to create Clustered Primary Key. I gave him following sample example. That was the last time he asked “How to create Clustered Primary Key to table?” 2008 2008 – TRIM() Function – User Defined Function SQL Server does not have functions which can trim leading or trailing spaces of any string at the same time. SQL does have LTRIM() and RTRIM() which can trim leading and trailing spaces respectively. SQL Server 2008 also does not have TRIM() function. User can easily use LTRIM() and RTRIM() together and simulate TRIM() functionality. http://www.youtube.com/watch?v=1-hhApy6MHM 2009 Earlier I have written two different articles on the subject Remove Bookmark Lookup. This article is as part 3 of original article. Please read the first two articles here before continuing reading this article. Query Optimization – Remove Bookmark Lookup – Remove RID Lookup – Remove Key Lookup Query Optimization – Remove Bookmark Lookup – Remove RID Lookup – Remove Key Lookup – Part 2 Query Optimization – Remove Bookmark Lookup – Remove RID Lookup – Remove Key Lookup – Part 3 Interesting Observation – Query Hint – FORCE ORDER SQL Server never stops to amaze me. As regular readers of this blog already know that besides conducting corporate training, I work on large-scale projects on query optimizations and server tuning projects. In one of the recent projects, I have noticed that a Junior Database Developer used the query hint Force Order; when I asked for details, I found out that the basic concept was not properly understood by him. Queries Waiting for Memory Allocation to Execute In one of the recent projects, I was asked to create a report of queries that are waiting for memory allocation. The reason was that we were doubtful regarding whether the memory was sufficient for the application. The following query can be useful in similar cases. Queries that do not have to wait on a memory grant will not appear in the result set of following query. 2010 Quickest Way to Identify Blocking Query and Resolution – Dirty Solution As the title suggests, this is quite a dirty solution; it’s not as elegant as you expect. However, it works totally fine. Simple Explanation of Data Type Precedence While I was working on creating a question for SQL SERVER – SQL Quiz – The View, The Table and The Clustered Index Confusion, I had actually created yet another question along with this question. However, I felt that the one which is posted on the SQL Quiz is much better than this one because what makes that more challenging question is that it has a multiple answer. 
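The ranking functions item above is easiest to grasp with all four functions side by side. The Sales table and its columns are assumptions for the example:

    SELECT SalesPerson,
           SalesAmount,
           ROW_NUMBER() OVER (ORDER BY SalesAmount DESC) AS RowNum,
           RANK()       OVER (ORDER BY SalesAmount DESC) AS Rnk,
           DENSE_RANK() OVER (ORDER BY SalesAmount DESC) AS DenseRnk,
           NTILE(4)     OVER (ORDER BY SalesAmount DESC) AS Quartile
    FROM dbo.Sales;

Ties show the difference: RANK leaves gaps after tied rows, DENSE_RANK does not, and NTILE simply buckets the ordered rows into four groups.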
Encrypted Stored Procedure and Activity Monitor I recently had received questionable if any stored procedure is encrypted can we see its definition in Activity Monitor.Answer is - No. Let us do a quick test. Let us create following Stored Procedure and then launch the Activity Monitor and check the text. Indexed View always Use Index on Table A single table can have maximum 249 non clustered indexes and 1 clustered index. In SQL Server 2008, a single table can have maximum 999 non clustered indexes and 1 clustered index. It is widely believed that a table can have only 1 clustered index, and this belief is true. I have some questions for all of you. Let us assume that I am creating view from the table itself and then create a clustered index on it. In my view, I am selecting the complete table itself. 2011 Detecting Database Case Sensitive Property using fn_helpcollations() I received a question on how to determine the case sensitivity of the database. The quick answer to this is to identify the collation of the database and check the properties of the collation. I have previously written how one can identify database collation. Once you have figured out the collation of the database, you can put that in the WHERE condition of the following T-SQL and then check the case sensitivity from the description. Server Side Paging in SQL Server CE (Compact Edition) SQL Server Denali is coming up with new T-SQL of Paging. I have written about the same earlier.SQL SERVER – Server Side Paging in SQL Server Denali – A Better Alternative,  SQL SERVER – Server Side Paging in SQL Server Denali Performance Comparison, SQL SERVER – Server Side Paging in SQL Server Denali – Part2 What is very interesting is that SQL Server CE 4.0 have the same feature introduced. Here is the quick example of the same. To run the script in the example, you will have to do installWebmatrix 4.0 and download sample database. Once done you can run following script. Why I am Going to Attend PASS Summit Unite 2011 The four-day event will be marked by a lot of learning, sharing, and networking, which will help me increase both my knowledge and contacts. Every year, PASS Summit provides me a golden opportunity to build my network as well as to identify and meet potential customers or employees. 2012 Manage Help Settings – CTRL + ALT + F1 This is very interesting read as my daughter once accidently came across a screen in SQL Server Management Studio. It took me 2-3 minutes to figure out how she has created the same screen. Recover the Accidentally Renamed Table “I accidentally renamed table in my SSMS. I was scrolling very fast and I made mistakes. It was either because I double clicked or clicked on F2 (shortcut key for renaming). However, I have made the mistake and now I have no idea how to fix this. If you have renamed the table, I think you pretty much is out of luck. Here are few things which you can do which can give you an idea about what your table name can be if you are lucky. Identify Numbers of Non Clustered Index on Tables for Entire Database Here is the script which will give you numbers of non clustered indexes on any table in entire database. Identify Most Resource Intensive Queries – SQL in Sixty Seconds #029 – Video Here is the complete complete script which I have used in the SQL in Sixty Seconds Video. Thanks Harsh for important Tip in the comment. 
http://www.youtube.com/watch?v=3kDHC_Tjrns Advanced Data Quality Services with Melissa Data – Azure Data Market For the purposes of the review, I used a database I had in an Excel spreadsheet with name and address information. Upon a cursory inspection, there are miscellaneous problems with these records; some addresses are missing ZIP codes, others missing a city, and some records are slightly misspelled or have unparsed suites. With DQS, I can easily add a knowledge base to help standardize my values, such as for state abbreviations. But how do I know that my address is correct? Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Memory Lane, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology
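The case-sensitivity item above boils down to reading the collation description. A minimal sketch of that approach: look up the current database collation in fn_helpcollations() and check whether the description says case-sensitive or case-insensitive.

    SELECT name, description
    FROM fn_helpcollations()
    WHERE name = CONVERT(sysname, DATABASEPROPERTYEX(DB_NAME(), 'Collation'));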

    Read the article

  • SQL SERVER – Weekly Series – Memory Lane – #034

    - by Pinal Dave
    Here is the list of selected articles of SQLAuthority.com across all these years. Instead of just listing all the articles I have selected a few of my most favorite articles and have listed them here with additional notes below it. Let me know which one of the following is your favorite article from memory lane. 2007 UDF – User Defined Function to Strip HTML – Parse HTML – No Regular Expression The UDF used in the blog does fantastic task – it scans entire HTML text and removes all the HTML tags. It keeps only valid text data without HTML task. This is one of the quite commonly requested tasks many developers have to face everyday. De-fragmentation of Database at Operating System to Improve Performance Operating system skips MDF file while defragging the entire filesystem of the operating system. It is absolutely fine and there is no impact of the same on performance. Read the entire blog post for my conversation with our network engineers. Delay Function – WAITFOR clause – Delay Execution of Commands How do you delay execution of the commands in SQL Server – ofcourse by using WAITFOR keyword. In this blog post, I explain the same with the help of T-SQL script. Find Length of Text Field To measure the length of TEXT fields the function is DATALENGTH(textfield). Len will not work for text field. As of SQL Server 2005, developers should migrate all the text fields to VARCHAR(MAX) as that is the way forward. Retrieve Current Date Time in SQL Server CURRENT_TIMESTAMP, GETDATE(), {fn NOW()} There are three ways to retrieve the current datetime in SQL SERVER. CURRENT_TIMESTAMP, GETDATE(), {fn NOW()} Explanation and Comparison of NULLIF and ISNULL An interesting observation is NULLIF returns null if it comparison is successful, whereas ISNULL returns not null if its comparison is successful. In one way they are opposite to each other. Here is my question to you - How to create infinite loop using NULLIF and ISNULL? If this is even possible? 2008 Introduction to SERVERPROPERTY and example SERVERPROPERTY is a very interesting system function. It returns many of the system values. I use it very frequently to get different server values like Server Collation, Server Name etc. SQL Server Start Time We can use DMV to find out what is the start time of SQL Server in 2008 and later version. In this blog you can see how you can do the same. Find Current Identity of Table Many times we need to know what is the current identity of the column. I have found one of my developers using aggregated function MAX () to find the current identity. However, I prefer following DBCC command to figure out current identity. Create Check Constraint on Column Some time we just need to create a simple constraint over the table but I have noticed that developers do many different things to make table column follow rules than just creating constraint. I suggest constraint is a very useful concept and every SQL Developer should pay good attention to this subject. 2009 List Schema Name and Table Name for Database This is one of the blog post where I straight forward display script. One of the kind of blog posts, which I still love to read and write. Clustered Index on Separate Drive From Table Location A table devoid of primary key index is called heap, and here data is not arranged in a particular order, which gives rise to issues that adversely affect performance. Data must be stored in some kind of order. 
If we put clustered index on it then the order will be forced by that index and the data will be stored in that particular order. Understanding Table Hints with Examples Hints are options and strong suggestions specified for enforcement by the SQL Server query processor on DML statements. The hints override any execution plan the query optimizer might select for a query. 2010 Data Pages in Buffer Pool – Data Stored in Memory Cache One of my earlier year article, which I still read it many times and point developers to read it again. It is clear from the Resultset that when more than one index is used, datapages related to both or all of the indexes are stored in Memory Cache separately. TRANSACTION, DML and Schema Locks Can you create a situation where you can see Schema Lock? Well, this is a very simple question, however during the interview I notice over 50 candidates failed to come up with the scenario. In this blog post, I have demonstrated the situation where we can see the schema lock in database. 2011 Solution – Puzzle – Statistics are not updated but are Created Once In this example I have created following situation: Create Table Insert 1000 Records Check the Statistics Now insert 10 times more 10,000 indexes Check the Statistics – it will be NOT updated Auto Update Statistics and Auto Create Statistics for database is TRUE Now I have requested two things in the example 1) Why this is happening? 2) How to fix this issue? Selecting Domain from Email Address This is a straight to script blog post where I explain how to select only domain name from entire email address. Solution – Generating Zero Without using Any Numbers in T-SQL How to get zero digit without using any digit? This is indeed a very interesting question and the answer is even interesting. Try to come up with answer in next 10 minutes and if you can’t come up with the answer the blog post read this post for solution. 2012 Simple Explanation and Puzzle with SOUNDEX Function and DIFFERENCE Function In simple words - SOUNDEX converts an alphanumeric string to a four-character code to find similar-sounding words or names. DIFFERENCE function returns an integer value. The  integer returned is the number of characters in the SOUNDEX values that are the same. Read Only Files and SQL Server Management Studio (SSMS) I have come across a very interesting feature in SSMS related to “Read Only” files. I believe it is a little unknown feature as well so decided to write a blog about the same. Identifying Column Data Type of uniqueidentifier without Querying System Tables How do I know if any table has a uniqueidentifier column and what is its value without using any DMV or System Catalogues? Only information you know is the table name and you are allowed to return any kind of error if the table does not have uniqueidentifier column. Read the blog post to find the answer. Solution – User Not Able to See Any User Created Object in Tables – Security and Permissions Issue Interesting question – “When I try to connect to SQL Server, it lets me connect just fine as well let me open and explore the database. I noticed that I do not see any user created instances but when my colleague attempts to connect to the server, he is able to explore the database as well see all the user created tables and other objects. Can you help me fix it?” Importing CSV File Into Database – SQL in Sixty Seconds #018 – Video Here is interesting small 60 second video on how to import CSV file into Database. 
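The "Selecting Domain from Email Address" item above needs only a couple of string functions. The table and column names are assumptions, and the sketch assumes every row holds a well-formed address:

    SELECT Email,
           SUBSTRING(Email, CHARINDEX('@', Email) + 1, LEN(Email)) AS Domain
    FROM dbo.Contacts;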
ColumnStore Index – Batch Mode vs Row Mode Here is the logic behind when Columnstore Index uses Batch Mode and when it uses Row Mode. A batch typically represents about 1000 rows of data. Batch mode processing also uses algorithms that are optimized for the multicore CPUs and increased memory throughput. Follow up – Usage of $rowguid and $IDENTITY This is an excellent follow up blog post of my earlier blog post where I explain where to use $rowguid and $identity.  If you do not know the difference between them, this is a blog with a script example. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Memory Lane, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology
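As a hedged sketch of the $rowguid and $IDENTITY follow-up mentioned above: both are pseudo-column references that resolve to a table's identity column and ROWGUIDCOL column without naming them explicitly. The table below is invented purely for the demonstration:

    CREATE TABLE dbo.DemoRowGuid
    (
        ID      INT IDENTITY(1, 1),
        RowGuid UNIQUEIDENTIFIER ROWGUIDCOL DEFAULT NEWID(),
        Payload VARCHAR(50)
    );

    INSERT INTO dbo.DemoRowGuid (Payload) VALUES ('test row');

    -- $IDENTITY resolves to ID, $ROWGUID resolves to RowGuid.
    SELECT $IDENTITY AS IdentityValue,
           $ROWGUID  AS RowGuidValue
    FROM dbo.DemoRowGuid;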

    Read the article

  • SQL SERVER – Using expressor Composite Types to Enforce Business Rules

    - by pinaldave
    One of the features that distinguish the expressor Data Integration Platform from other products in the data integration space is its concept of composite types, which provide an effective and easily reusable way to clearly define the structure and characteristics of data within your application.  An important feature of the composite type approach is that it allows you to easily adjust the content of a record to its ultimate purpose.  For example, a record used to update a row in a database table is easily defined to include only the minimum set of columns, that is, a value for the key column and values for only those columns that need to be updated. Much like a class in higher level programming languages, you can also use the composite type as a way to enforce business rules onto your data by encapsulating a datum’s name, data type, and constraints (for example, maximum, minimum, or acceptable values) as a single entity, which ensures that your data can not assume an invalid value.  To what extent you use this functionality is a decision you make when designing your application; the expressor design paradigm does not force this approach on you. Let’s take a look at how these features are used.  Suppose you want to create a group of applications that maintain the employee table in your human resources database. Your table might have a structure similar to the HumanResources.Employee table in the AdventureWorks database.  This table includes two columns, EmployeID and rowguid, that are maintained by the relational database management system; you cannot provide values for these columns when inserting new rows into the table. Additionally, there are columns such as VacationHours and SickLeaveHours that you might choose to update for all employees on a monthly basis, which justifies creation of a dedicated application. By creating distinct composite types for the read, insert and update operations against this table, you can more easily manage this table’s content. When developing this application within expressor Studio, your first task is to create a schema artifact for the database table.  This process is completely driven by a wizard, only requiring that you select the desired database schema and table.  The resulting schema artifact defines the mapping of result set records to a record within the expressor data integration application.  The structure of the record within the expressor application is a composite type that is given the default name CompositeType1.  As you can see in the following figure, all columns from the table are included in the result set and mapped to an identically named attribute in the default composite type. If you are developing an application that needs to read this table, perhaps to prepare a year-end report of employees by department, you would probably not be interested in the data in the rowguid and ModifiedDate columns.  A typical approach would be to drop this unwanted data in a downstream operator.  But using an alternative composite type provides a better approach in which the unwanted data never enters your application. While working in expressor  Studio’s schema editor, simply create a second composite type within the same schema artifact, which you could name ReadTable, and remove the attributes corresponding to the unwanted columns. The value of an alternative composite type is even more apparent when you want to insert into or update the table.  
In the composite type used to insert rows, remove the attributes corresponding to the EmployeeID primary key and rowguid uniqueidentifier columns since these values are provided by the relational database management system. And to update just the VacationHours and SickLeaveHours columns, use a composite type that includes only the attributes corresponding to the EmployeeID, VacationHours, SickLeaveHours and ModifiedDate columns. By specifying this schema artifact and composite type in a Write Table operator, your upstream application need only deal with the four required attributes and there is no risk of unintentionally overwriting a value in a column that does not need to be updated. Now, what about the option to use the composite type to enforce business rules?  If you review the composition of the default composite type CompositeType1, you will note that the constraints defined for many of the attributes mirror the table column specifications.  For example, the maximum number of characters in the NationaIDNumber, LoginID and Title attributes is equivalent to the maximum width of the target column, and the size of the MaritalStatus and Gender attributes is limited to a single character as required by the table column definition.  If your application code leads to a violation of these constraints, an error will be raised.  The expressor design paradigm then allows you to handle the error in a way suitable for your application.  For example, a string value could be truncated or a numeric value could be rounded. Moreover, you have the option of specifying additional constraints that support business rules unrelated to the table definition. Let’s assume that the only acceptable values for marital status are S, M, and D.  Within the schema editor, double-click on the MaritalStatus attribute to open the Edit Attribute window.  Then click the Allowed Values checkbox and enter the acceptable values into the Constraint Value text box. The schema editor is updated accordingly. There is one more option that the expressor semantic type paradigm supports.  Since the MaritalStatus attribute now clearly specifies how this type of information should be represented (a single character limited to S, M or D), you can convert this attribute definition into a shared type, which will allow you to quickly incorporate this definition into another composite type or into the description of an output record from a transform operator. Again, double-click on the MaritalStatus attribute and in the Edit Attribute window, click Convert, which opens the Share Local Semantic Type window that you use to name this shared type.  There’s no requirement that you give the shared type the same name as the attribute from which it was derived.  You should supply a name that makes it obvious what the shared type represents. In this posting, I’ve overviewed the expressor semantic type paradigm and shown how it can be used to make your application development process more productive.  The beauty of this feature is that you choose when and to what extent you utilize the functionality, but I’m certain that if you opt to follow this approach your efforts will become more efficient and your work will progress more quickly.  As always, I encourage you to download and evaluate expressor Studio for your current and future data integration needs. 
Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: CodeProject, Pinal Dave, PostADay, SQL, SQL Authority, SQL Documentation, SQL Query, SQL Server, SQL Tips and Tricks, SQLServer, T SQL, Technology
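For readers who think in T-SQL rather than in the expressor designer, the business rule described above (MaritalStatus limited to a single character with allowed values S, M and D) has a familiar database-side counterpart. The sketch below uses a hypothetical employee table and constraint name; the point of the composite type is that the same rule is also enforced inside the data flow, before a row ever reaches the table:

    ALTER TABLE dbo.Employee
    ADD CONSTRAINT CK_Employee_MaritalStatus_Demo
        CHECK (MaritalStatus IN ('S', 'M', 'D'));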

    Read the article

  • Visiting the Fire Station in Coromandel

    Hm, I just tried to remember how we actually came up with this cool idea... but it's already too blurred and it doesn't really matter after all. Anyway, if I remember correctly (IIRC), it happened during one of the Linux meetups at Mugg & Bean, Bagatelle where Ajay and I brought our children along and we had a brief conversation about how cool it would be to check out one of the fire stations here in Mauritius. We both thought that it would be a great experience and adventure for the little ones. An idea takes shape And there we go, down the usual routine these... having an idea, checking out the options and discussing who's doing what. Except this time, it was all up to Ajay, and he did a fantastic job. End of August, he told me that he got in touch with one of his friends which actually works as a fire fighter at the station in Coromandel and that there could be an option to come and visit them (soon). A couple of days later - Confirmed! Be there, and in time... What time? Anyway, doesn't really matter... Everything was settled and arranged. I asked the kids on Friday afternoon if they might be interested to see the fire engines and what a fire fighter is doing. Of course, they were all in! Getting up early on Sunday morning isn't really a regular exercise for all of us but everything went smooth and after a short breakfast it was time to leave. Where are we going? Are we there yet? Now, we are in Bambous. Why do you go this way? The kids were so much into it. Absolutely amazing to see their excitement. Are we there yet? Well, we went through the sugar cane fields towards Chebel and then down into the industrial zone at Coromandel. Honestly, I had a clue where the fire station is located but having Google Maps in reach that shouldn't be a problem in case that we might get lost. But my worries were washed away when our children guided us... "There! Over there are the fire engines! We have to turn left, dad." - No comment, the kids were right! As we were there a little bit too early, we parked the car and the kids started to explore the area and outskirts of the fire station. Some minutes later, as if we had placed an order a unit of two cars had to go out for an alarm and the kids could witness them leaving as closely as possible. Sirens on and wow!!! Ladder truck L32 - MAN truck with Rosenbauer built-up and equipment by Metz Taking the tour Ajay arrived shortly after that and guided us finally inside the station to meet with his pal. The three guys were absolutely well-prepared and showed us around in the hall, explaining that there two units out at the moment. But the ladder truck (with max. 32m expandable height) was still around we all got a great insight into the technique and equipment on the vehicle. It was amazing to see all three kids listening to Mambo as give some figures about the truck and how the fire fighters are actually it. The children and 'our' fire fighters of the day had great fun with the various fire engines Absolutely fantastic that the children were allowed to experience this - we had so much fun! Ajay's son brought two of his toy fire engines along, shared them with ours, and they all played very well together. As a parent it was really amazing to see them at such an ease. Enough theory Shortly afterwards the ladder truck was moved outside, got stabilised and ready to go for 'real-life' exercising. With the additional equipment of safety helmets, security belts and so on, we all got a first-hand impression about how it could be as a fire-fighter. 
Actually, I was totally amazed by the curiosity and excitement of my BWE. She was really into it and asked lots of interesting questions - general ones, but also technical ones. And while our fire fighters were busy with Ajay and family, I gave her some more details and explanations about the truck, the expandable ladder, the safety cage at the top and the other equipment available. Safety first! No exceptions, and always be prepared for the worst case... Also, the equipment had been checked prior to the exercise - this is your life saver... Hooked up and ready to go... ...of course not too high. This was just a demonstration - and 32 metres above ground isn't for everyone. Well, after that it was me who had the asking looks on me, and I finally revealed to the local fire fighters that I had been in the auxiliary fire brigade, more precisely in the hazard department, for more than 10 years. So not a professional fire fighter, but at least a passionate and educated one like them.

Inside the station

Our fire fighters really took their time to explain their daily job to the kids, gave them access to the operator's seat on the ladder truck and showed how the truck cabin is actually equipped with the different radios and so on. It was really a great time. Later on we had a brief tour through the building itself, and again all of our questions were answered. We had great fun and started to joke about bits and pieces. For me it was also very interesting to see the comparison between the fire station here in Mauritius and the ones I have been to back in Germany.

Amazing to see them completely captivated in the play - the children had lots of fun!

We also learned that there are currently ten fire stations all over the island, plus two additional but private ones at the airport and at the harbour. The newest one is actually down in Black River on the west coast, because the drive from Quatre Bornes simply takes too long to have any chance of an effective response at all. IMHO a very good decision, as time is the most important factor in getting fire incidents under control. After all, it was a great experience for all of us, especially for the children, to see and understand that their toy trucks are only copies of the real thing and that the job of a (professional) fire fighter is very important in our society. Don't forget that those guys run into the danger zone while you're trying to get away from it as fast as possible.

Another unit just came back from a grass fire - and shortly after they went out again. No time to rest, too much to do! Mauritian fire fighters now and (maybe) in the future...

Thank you! It was an honour to be around! Thank you to Ajay for organising and arranging this Sunday morning event, and of course a big Thank You to the three guys who took some time off to have us at the Fire Station in Coromandel and guide us through their daily job! And remember to call 115 in case of emergencies!

    Read the article

  • NUMA-aware placement of communication variables

    - by Dave
    For classic NUMA-aware programming I'm typically most concerned about simple cold, capacity and compulsory misses and whether we can satisfy the miss by locally connected memory or whether we have to pull the line from its home node over the coherent interconnect -- we'd like to minimize channel contention and conserve interconnect bandwidth. That is, for this style of programming we're quite aware of where memory is homed relative to the threads that will be accessing it. Ideally, a page is collocated on the node with the thread that's expected to most frequently access the page, as simple misses on the page can be satisfied without resorting to transferring the line over the interconnect. The default "first touch" NUMA page placement policy tends to work reasonably well in this regard. When a virtual page is first accessed, the operating system will attempt to provision and map that virtual page to a physical page allocated from the node where the accessing thread is running. It's worth noting that the node-level memory interleaving granularity is usually a multiple of the page size, so we can say that a given page P resides on some node N. That is, the memory underlying a page resides on just one node. But when thinking about accesses to heavily-written communication variables we normally consider what caches the lines underlying such variables might be resident in, and in what states. We want to minimize coherence misses and cache probe activity and interconnect traffic in general. I don't usually give much thought to the location of the home NUMA node underlying such highly shared variables. On a SPARC T5440, for instance, which consists of 4 T2+ processors connected by a central coherence hub, the home node and placement of heavily accessed communication variables has very little impact on performance. The variables are frequently accessed, so they are likely in M-state in some cache, and the location of the home node is of little consequence because a requester can use cache-to-cache transfers to get the line. Or at least that's what I thought. Recently, though, I was exploring a simple shared memory point-to-point communication model where a client writes a request into a request mailbox and then busy-waits on a response variable. It's a simple example of delegation based on message passing. The server polls the request mailbox, and having fetched a new request value, performs some operation and then writes a reply value into the response variable. As noted above, on a T5440 performance is insensitive to the placement of the communication variables -- the request and response mailbox words. But on a Sun/Oracle X4800 I noticed that this was not the case and that NUMA placement of the communication variables was actually quite important. For background, an X4800 system consists of 8 Intel X7560 Xeons. Each package (socket) has 8 cores with 2 contexts per core, so the system is 8x8x2. Each package is also a NUMA node and has locally attached memory. Every package has 3 point-to-point QPI links for cache coherence, and the system is configured with a twisted ladder "mobius" topology. The cache coherence fabric is glueless -- there's no central arbiter or coherence hub. The maximum distance between any two nodes is just 2 hops over the QPI links. For any given node, 3 other nodes are 1 hop distant and the remaining 4 nodes are 2 hops distant.
Using a single request (client) thread and a single response (server) thread, a benchmark harness explored all permutations of NUMA placement for the two threads and the two communication variables, measuring the average round-trip-time and throughput rate between the client and server. In this benchmark the server acts as a simple transponder, writing the request value plus 1 back into the reply field, so there's no particular computation phase and we're only measuring communication overheads. In addition to varying the placement of communication variables over pairs of nodes, we also explored variations where both variables were placed on one page (and thus on one node) -- either on the same cache line or different cache lines -- while varying the node where the variables reside along with the placement of the threads. The key observation was that if the client and server threads were on different nodes, then the best placement of variables was to have the request variable (written by the client and read by the server) reside on the same node as the client thread, and to place the response variable (written by the server and read by the client) on the same node as the server. That is, if you have a variable that's to be written by one thread and read by another, it should be homed with the writer thread. For our simple client-server model that means using split request and response communication variables with unidirectional message flow on a given page. This can yield up to twice the throughput of less favorable placement strategies. Our X4800 uses the QPI 1.0 protocol with source-based snooping. Briefly, when node A needs to probe a cache line it fires off snoop requests to all the nodes in the system. Those recipients then forward their response not to the original requester, but to the home node H of the cache line. H waits for and collects the responses, adjudicates and resolves conflicts and ensures memory-model ordering, and then sends a definitive reply back to the original requester A. If some node B needs to transfer the line to A, it will do so by cache-to-cache transfer and let H know about the disposition of the cache line. A needs to wait for the authoritative response from H. So if a thread on node A wants to write a value to be read by a thread on node B, the latency is dependent on the distances between A, B, and H. We observe the best performance when the written-to variable is co-homed with the writer A. That is, we want H and A to be the same node, as the writer doesn't need the home to respond over a QPI link when the writer and the home reside on the very same node. With architecturally informed placement of communication variables we eliminate at least one QPI hop from the critical path. Newer Intel processors use the QPI 1.1 coherence protocol with home-based snooping. As noted above, under source-snooping a requester broadcasts snoop requests to all nodes. Those nodes send their response to the home node of the location, which provides memory ordering, reconciles conflicts, etc., and then posts a definitive reply to the requester. In home-based snooping the snoop probe goes directly to the home node and is not broadcast. The home node can consult snoop filters -- if present -- and send out requests to retrieve the line if necessary. The 3rd party owner of the line, if any, can respond either to the home or the original requester (or even to both) according to the protocol policies.
There are myriad variations that have been implemented, and unfortunately terminology doesn't always agree between vendors or with the academic taxonomy papers. The key is that home-snooping enables the use of a snoop filter to reduce interconnect traffic. And while home-snooping might have a longer critical path (latency) than source-based snooping, it also may require fewer messages and less overall bandwidth. It'll be interesting to reprise these experiments on a platform with home-based snooping. While collecting data I also noticed that there are placement concerns even in the seemingly trivial case when both threads and both variables reside on a single node. Internally, the cores on each X7560 package are connected by an internal ring. (Actually there are multiple contra-rotating rings.) And the last-level on-chip cache (LLC) is partitioned into banks or slices, with each slice associated with a core on the ring topology. A hardware hash function associates each physical address with a specific home bank. Thus we face distance and topology concerns even for intra-package communications, although the latencies are not nearly of the magnitude we see inter-package. I've not seen such communication distance artifacts on the T2+, where the cache banks are connected to the cores via a high-speed crossbar instead of a ring -- communication latencies seem more regular.

    Read the article

  • SQL SERVER – Weekly Series – Memory Lane – #031

    - by Pinal Dave
    Here is the list of selected articles of SQLAuthority.com across all these years. Instead of just listing all the articles, I have selected a few of my most favorite articles and listed them here with additional notes below each. Let me know which one of the following is your favorite article from memory lane. 2007 Find Table without Clustered Index – Find Table with no Primary Key A clustered index is a very important concept for any table; it impacts performance heavily. Here is a quick script to find tables without a clustered index. Replace TEXT with VARCHAR(MAX) – Stop using TEXT, NTEXT, IMAGE Data Types Question: "Is VARCHAR(MAX) big enough to store the TEXT field?" Answer: "Yes, VARCHAR(MAX) is big enough to accommodate a TEXT field. The TEXT, NTEXT and IMAGE data types of SQL Server 2000 will be deprecated in a future version of SQL Server; SQL Server 2005 provides backward compatibility to these data types, but it is recommended to use the new data types, which are VARCHAR(MAX), NVARCHAR(MAX) and VARBINARY(MAX)." Limiting Result Sets by Using TABLESAMPLE – Examples Introduced in SQL Server 2005, TABLESAMPLE allows you to extract a sampling of rows from a table in the FROM clause. The rows retrieved are random and they are not in any order. This sampling can be based on a percentage or a number of rows. You can use TABLESAMPLE when only a sampling of rows is necessary for the application instead of a full result set. User Defined Functions (UDF) Limitations UDFs have their own advantages and usage, but in this article we will see the limitations of UDFs: things a UDF cannot do, and why stored procedures are considered more flexible than UDFs. This blog post is a good read to learn what the limitations of UDFs are. Change Database Compatible Level – Backward Compatibility For a long time SQL Server stayed on compatibility level 80, which is that of SQL Server 2000. However, as soon as SQL Server 2005 was introduced, compatibility became quite a major issue. Since that time Microsoft has been releasing new versions every 2-3 years, so changing the compatibility level is an ever-popular topic. In this blog post, we learn how we can do the same using T-SQL. We can also do the same using SSMS, and here is the blog post for that: Change Database Compatible Level – Backward Compatibility – Part 2 – Management Studio. Constraint on VARCHAR(MAX) Field To Limit It Certain Length How can I limit a VARCHAR(MAX) field to a maximum length of 12,500 characters only? The question was valid, as the application allowed only 12,500 characters. First of all, this requirement is a bit strange, but if someone wants to do the same, they can do it as described in this blog post. 2008 UNPIVOT Table Example Understanding UNPIVOT can be very complicated at times. In this blog post, I have attempted to explain the same concept in very simple words. Create Default Constraint Over Table Column A simple, straight-to-the-script blog post – I still use this blog quite often for my own reference. UDF – Get the Day of the Week Function It took me 4 iterations to find this very simple function, which can immediately get the day of the week in a single line. 2009 Find Hostname and Current Logged In User Name There are two tricks listed in this blog post where users can find out the hostname and the currently logged-in user name immediately and very easily.
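Two of the items above lend themselves to a quick illustration. The following is a minimal, hedged sketch, not the exact scripts from the original posts, written for SQL Server 2005 or later, showing how one might find tables without a clustered index and retrieve the host name and currently logged-in user:

    -- Tables without a clustered index (a heap appears in sys.indexes with index_id = 0)
    SELECT s.name AS schema_name, t.name AS table_name
    FROM sys.tables AS t
    JOIN sys.schemas AS s ON s.schema_id = t.schema_id
    JOIN sys.indexes AS i ON i.object_id = t.object_id
    WHERE i.index_id = 0;

    -- Host name and currently logged-in user for the current session
    SELECT HOST_NAME()   AS client_host_name,
           SUSER_SNAME() AS login_name,
           CURRENT_USER  AS database_user;

The exact filters used in the original articles may differ (for example, around system tables), so treat this only as a starting point.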
Interesting Observation of Logon Trigger On All Servers When I was doing a project, I made an interesting observation of a logon trigger executing multiple times. It was absolutely unexpected for me! As I was logging in only once, naturally I was expecting the entry only once. However, it fired multiple times on different threads – indeed an eccentric phenomenon at first sight! Difference Between Candidate Keys and Primary Key One needs to be very careful in selecting the Primary Key, as an incorrect selection can adversely impact the database architecture and future normalization. For a Candidate Key to qualify as a Primary Key, it should be Non-NULL and unique in any domain. I have observed quite often that Primary Keys are seldom changed. I would like to have your feedback on not changing a Primary Key. Create Multiple Filegroup For Single Database Why should one create multiple filegroups for a database, and what are the advantages of doing so? In this blog post, I explain this in detail. List All Objects Created on All Filegroups in Database In this blog post we discuss the essential question – "How can I find which object belongs to which filegroup? Is there any way to know this?" 2010 DATE and TIME in SQL Server 2008 When DATE is converted to DATETIME it adds the time of midnight. When TIME is converted to DATETIME it adds the date of 1900, and this is something one wants to consider if you are going to run scripts from SQL Server 2008 against an earlier version with CONVERT. Disabled Index and Update Statistics If you do not need a nonclustered index, I suggest you drop it, as keeping it disabled is an overhead on your system. This is because every time the statistics are updated for the system, all the statistics for disabled indexes are also updated. Precision of SMALLDATETIME – A 1 Minute Precision The precision of the datatype SMALLDATETIME is 1 minute. It discards the seconds by rounding up or rounding down any seconds greater than zero. 2011 Getting Columns Headers without Result Data – SET FMTONLY ON SET FMTONLY ON returns only metadata to the client. It can be used to test the format of the response without actually running the query. When this setting is ON, the result set only has the headers of the results but no data. Copy Database from Instance to Another Instance – Copy Paste in SQL Server SQL Server has a feature which copies a database from one instance to another, and it can be automated as well using SSIS. Make sure you have SQL Server Agent turned on, as this feature will create a job. Puzzle – SELECT * vs SELECT COUNT(*) If you have ever wondered why SELECT * gives an error when executed alone but SELECT COUNT(*) does not, you should read this blog post. Creating All New Database with Full Recovery Model This blog post is based on a very interesting story where the user wants to do something by default for every single new database created. The model database is a secret weapon which should be used very carefully and with proper evaluation. If used carefully, it can be very beneficial when we need a newly created database to behave in a certain fashion. 2012 In 2012 I had two interesting series running on the blog. If there is no fun in learning, the learning becomes a burden. For that reason, I decided to build a three-part quiz around SEQUENCE. The quiz was to identify the next value of the sequence. I encourage all of you to take part in this fun quiz.
Guess the Next Value – Puzzle 1 Guess the Next Value – Puzzle 2 Guess the Next Value – Puzzle 3 Can anyone remember their final day of schooling? This is probably a silly question because – of course you can! Many people mark this as the most exciting, happiest day of their life. It marks the end of testing, the end of following rules set by teachers, and the beginning of finally being able to earn money and work in your chosen field. Read the five-part series on the subject of developer training: Developer Training - Importance and Significance - Part 1 Developer Training – Employee Morals and Ethics – Part 2 Developer Training – Difficult Questions and Alternative Perspective - Part 3 Developer Training – Various Options for Developer Training – Part 4 Developer Training – A Conclusive Summary – Part 5 Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Memory Lane, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

    Read the article

  • SQL SERVER – Shrinking NDF and MDF Files – Readers’ Opinion

    - by pinaldave
    Previously, I had written a blog post about SQL SERVER – Shrinking NDF and MDF Files – A Safe Operation. After that, I wrote the following blog post that talks about the advantages and disadvantages of Shrinking and why one should not be shrinking a file: SQL SERVER – SHRINKFILE and TRUNCATE Log File in SQL Server 2008. On this subject, SQL Server Expert Imran Mohammed left an excellent comment. I just feel that his comment is worth a big article in itself. For everybody to read his wonderful explanation, I am posting it in this blog post. Thanks Imran! Shrinking a database always creates performance degradation and increases fragmentation in the database. I suggest that you keep that in mind before you start reading the following comment. If you are going to say Shrinking Database is bad and evil, here I am saying it first and loud. Now, Imran's comment is written while keeping in mind only the process showing how the Shrinking Database operation works. Imran has already explained his understanding and requests further explanation. I have removed the Best Practices section from Imran's comments, as there are a few corrections. Comments from Imran - Before I explain to you the concept of Shrink Database, let us understand the concept of Database Files. When we create a new database inside SQL Server, it is typical that SQL Server creates two physical files in the Operating System: one with the .MDF extension, and another with the .LDF extension. .MDF is called the Primary Data File. .LDF is called the Transaction Log file. If you add one or more data files to a database, the physical file that will be created in the Operating System will have an extension of .NDF, which is called a Secondary Data File; whereas, when you add one or more log files to a database, the physical file that will be created in the Operating System will have the same extension, .LDF. The questions now are, "Why does a new data file have a different extension (.NDF)?", "Why is it called a secondary data file?" and, "Why is the .MDF file called a primary data file?" Answers: Note: The following explanation is based on my limited knowledge of SQL Server, so experts please do comment. A data file with a .MDF extension is called a Primary Data File, and the reason behind it is that it contains Database Catalogs. Catalogs mean Meta Data. Meta Data is "Data about Data". Examples of Meta Data include system objects that store information about other objects, except the data stored by the users. sysobjects stores information about all objects in that database. sysindexes stores information about all indexes and rows of every table in that database. syscolumns stores information about all columns that each table has in that database. sysusers stores how many users that database has. Although Meta Data stores information about other objects, it is not the transactional data that a user enters; rather, it's system data about the data. Because the Primary Data File (.MDF) contains important information about the database, it is treated as a special file. It is given the name Primary Data File because it contains the Database Catalogs. This file is present in the Primary File Group. You can always create additional objects (tables, indexes etc.) in the Primary data file (this file is present in the Primary File Group), by mentioning that you want to create this object under the Primary File Group.
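To make that last point concrete, here is a small, hedged T-SQL sketch (the file path, size and object names are invented for illustration and are not from Imran's comment). It shows a secondary filegroup with an .NDF file being added to the example MYDB database, and tables being placed explicitly on the primary and the secondary filegroup:

    -- Add a secondary filegroup and a secondary (.NDF) data file on a new disk
    ALTER DATABASE MYDB ADD FILEGROUP SecondaryFG;

    ALTER DATABASE MYDB
    ADD FILE ( NAME = MYDB_Data2,
               FILENAME = 'E:\Data\MYDB_Data2.ndf',
               SIZE = 100MB )
    TO FILEGROUP SecondaryFG;

    -- Place one table explicitly on the primary filegroup and another on the new filegroup
    CREATE TABLE dbo.OrdersCurrent (OrderID INT NOT NULL) ON [PRIMARY];
    CREATE TABLE dbo.OrdersArchive (OrderID INT NOT NULL) ON SecondaryFG;

A split like this is also what makes the selective, filegroup-level backups described below possible.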
Any additional data file that you add to the database will have only transactional data but no Meta Data, and that's why it is called the Secondary Data File. It is given the extension name .NDF so that the user can easily identify whether a specific data file is a Primary Data File or a Secondary Data File. There are many advantages to storing data in different files that are under different filegroups. You can put your read-only tables in one file (filegroup) and read-write tables in another file (filegroup), and take a backup of only the filegroup that has the read-write data, so that you can avoid taking a backup of read-only data that cannot be altered. Creating additional files on different physical hard disks also improves I/O performance. A real-time scenario where we use files could be this one: Let's say you have created a database called MYDB on the D-Drive, which has 50 GB of space. You also have 1 Database File (.MDF) and 1 Log File on the D-Drive, and suppose that all of that 50 GB of space has been used up and you do not have any free space left, but you still want to add additional space to the database. One easy option would be to add one more physical hard disk to the server, add a new data file to the MYDB database and create this new data file on the new hard disk, then move some of the objects from one file to another, and make the filegroup under which you added the new file the default filegroup, so that any new object that is created goes into the new files, unless specified otherwise. Now that we have a basic idea of what data files are, what type of data they store and why they are named the way they are, let's move on to the next topic, Shrinking. First of all, I disagree with the Microsoft terminology of naming this feature "Shrinking". Shrinking, in regular terms, means to reduce the size of a file by means of compressing it. BUT in SQL Server, Shrinking DOES NOT mean compressing. Shrinking in SQL Server means to remove empty space from database files and release the empty space either to the Operating System or to SQL Server. Let's examine this through an example. Let's say you have a database "MYDB" with a size of 50 GB that has free space of about 20 GB, which means 30 GB in the database is filled with data and 20 GB of space is free in the database because it is not currently utilized by SQL Server (the database); it is reserved and not yet in use. If you choose to shrink the database and release the empty space to the Operating System - and MIND YOU - you can only shrink the database down to 30 GB (in our example). You cannot shrink the database to a size less than what is filled with data. So, if you have a database that is full and has no empty space in the data file and log file (and you don't have extra disk space to set the Auto growth option ON), YOU CANNOT issue the SHRINK Database/File command, because of two reasons: There is no empty space to be released because the Shrink command does not compress the database; it only removes the empty space from the database files, and there is no empty space. Remember, the Shrink command is a logged operation. When we perform the Shrink operation, this information is logged in the log file. If there is no empty space in the log file, SQL Server cannot write to the log file and you cannot shrink a database. Now answering your questions: (1) Q: What are the USEDPAGES & ESTIMATEDPAGES that appear on the Results Pane after using DBCC SHRINKDATABASE (NorthWind, 10)?
A: According to Books Online (for SQL Server 2000): UsedPages: the number of 8-KB pages currently used by the file. EstimatedPages: the number of 8-KB pages that SQL Server estimates the file could be shrunk down to. Important Note: Before asking any question, make sure you go through Books Online or search on Google first. Doing so has many advantages: 1. If someone else has already had this question before, the chances that it is already answered are more than 50%. 2. It reduces your waiting time for the answer. (2) Q: What is the difference between shrinking the database using a DBCC command like the one above and shrinking it from the Enterprise Manager Console by right-clicking the database, going to TASKS and then selecting the SHRINK option, in a SQL Server 2000 environment? A: As far as my knowledge goes, there is no difference; both work the same way. One advantage of using the command from Query Analyzer is that your console won't freeze, and you can perform your regular activities using Enterprise Manager. (3) Q: What is this .NDF file that is discussed above? I have never heard of it. What is it used for? Is it used by end-users, DBAs or the SERVER/SYSTEM itself? A: An .NDF file is a secondary data file. You may never have heard of it because, when a database is created, SQL Server by default creates it with only 1 data file (.MDF) and 1 log file (.LDF) - or however your model database has been set up, because the model database is a template used every time you create a new database using the CREATE DATABASE command. Unless you have added an extra data file, you will not see it. This file is used by SQL Server to store data saved by the users. Hope this information helps. I would like to ask the experts to please comment if what I understand is not what the Microsoft folks meant. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Readers Contribution, Readers Question, SQL, SQL Authority, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, T SQL, Technology

    Read the article

  • SQL SERVER – Guest Post – Jonathan Kehayias – Wait Type – Day 16 of 28

    - by pinaldave
    Jonathan Kehayias (Blog | Twitter) is an MCITP Database Administrator and Developer, who got started in SQL Server in 2004 as a database developer and report writer in the natural gas industry. After spending two and a half years working in TSQL, in late 2006, he transitioned to the role of SQL Database Administrator. His primary passion is performance tuning, where he frequently rewrites queries for better performance and performs in-depth analysis of index implementation and usage. Jonathan blogs regularly on SQLBlog, and was a coauthor of Professional SQL Server 2008 Internals and Troubleshooting. On a personal note, I think Jonathan is an extremely positive person. In every conversation with him I have found that he is always eager to help and encourage. Every time he finds something that needs to be improved, he has contacted me without hesitation and guided me to improve, change and learn. During all this time, he has not lost his focus on helping the larger community. I am honored that he has agreed to provide his views on the complex subject of Wait Types and Queues. Currently I am reading his series on Extended Events. Here is the guest blog post by Jonathan: SQL Server troubleshooting is all about correlating related pieces of information together to identify where exactly the root cause of a problem lies. In my daily work as a DBA, I generally get phone calls like, "So and so application is slow, what's wrong with the SQL Server." One of the funny things about the letters DBA is that they go so well with Default Blame Acceptor, and I really wish that I knew exactly who the first person was that pointed that out to me, because it really fits at times. A lot of times when I get this call, the problem isn't related to SQL Server at all, but every now and then in my initial quick checks, something pops up that makes me start looking at things further. The SQL Server is slow: we see a number of tasks waiting on ASYNC_IO_COMPLETION, IO_COMPLETION, or PAGEIOLATCH_* waits in sys.dm_exec_requests and sys.dm_exec_waiting_tasks. These are also some of the highest wait types in sys.dm_os_wait_stats for the server, so it would appear that we have a disk I/O bottleneck on the machine. A quick check of sys.dm_io_virtual_file_stats() and tempdb shows a high write stall rate, while our user databases show high read stall rates on the data files. A quick check of some performance counters shows Page Life Expectancy on the server bouncing up and down in the 50-150 range, the Free Pages counter consistently hitting zero, and the Free List Stalls/sec counter repeatedly jumping over 10, but Buffer Cache Hit Ratio is 98-99%. Where exactly is the problem? In this case, which happens to be based on a real scenario I faced a few years back, the problem may not be a disk bottleneck at all; it may very well be a memory pressure issue on the server. A quick check of the system specs: it is a dual duo core server with 8GB RAM running SQL Server 2005 SP1 x64 on Windows Server 2003 R2 x64. Max Server memory is configured at 6GB and we think that this should be enough to handle the workload; or is it? This is a unique scenario because there are a couple of things happening inside of this system, and they all relate to what the root cause of the performance problem is on the system.
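As a rough sketch of the kind of quick checks described above (these are not Jonathan's exact queries; the column choices are only illustrative), the wait statistics and the per-file stall rates can be pulled from the DMVs like this:

    -- Top waits on the instance since the last restart or stats clear
    SELECT TOP (10) wait_type, waiting_tasks_count, wait_time_ms, signal_wait_time_ms
    FROM sys.dm_os_wait_stats
    ORDER BY wait_time_ms DESC;

    -- Average read and write stalls per database file
    SELECT DB_NAME(vfs.database_id) AS database_name,
           vfs.file_id,
           vfs.io_stall_read_ms  / NULLIF(vfs.num_of_reads, 0)  AS avg_read_stall_ms,
           vfs.io_stall_write_ms / NULLIF(vfs.num_of_writes, 0) AS avg_write_stall_ms
    FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs;

High average stalls here are what initially point at the disks, which is exactly the trap this scenario goes on to describe.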
If we were to query sys.dm_exec_query_stats for the TOP 10 queries, by max_physical_reads, max_logical_reads, and max_worker_time, we may be able to find some queries that were using excessive I/O and possibly CPU against the system in their worst single execution. We can also CROSS APPLY to sys.dm_exec_sql_text() and see the statement text, and also CROSS APPLY sys.dm_exec_query_plan() to get the execution plan stored in cache. Ok, quick check: the plans are pretty big, I see some large index seeks that estimate 2.8GB of data movement between operators, but everything looks like it is optimized the best it can be. Nothing really stands out in the code, the indexing looks correct, and I should have enough memory to handle this in cache, so it must be a disk I/O problem, right? Not exactly! If we were to look at how much memory the plan cache is taking by querying sys.dm_os_memory_clerks for the CACHESTORE_SQLCP and CACHESTORE_OBJCP clerks we might be surprised at what we find. In SQL Server 2005 RTM and SP1, the plan cache was allowed to take up to 75% of the memory under 8GB. I'll give you a second to go back and read that again. Yes, you read it correctly, it says 75% of the memory under 8GB, but you don't have to take my word for it, you can validate this by reading Changes in Caching Behavior between SQL Server 2000, SQL Server 2005 RTM and SQL Server 2005 SP2. In this scenario the application uses an entirely adhoc workload against SQL Server and this leads to plan cache bloat, and up to 4.5GB of our 6GB of memory for SQL can be consumed by the plan cache in SQL Server 2005 SP1. This in turn reduces the size of the buffer cache to just 1.5GB, causing our 2.8GB of data movement in this expensive plan to cause complete flushing of the buffer cache, not just once initially, but then another time during the query's execution, resulting in excessive physical I/O from disk. Keep in mind that this is not the only query executing at the time this occurs. Remember the output of sys.dm_io_virtual_file_stats() showed high read stalls on the data files for our user databases versus higher write stalls for tempdb? The memory pressure is also forcing heavier use of tempdb to handle sorting and hashing in the environment as well. The real clue here is the Memory counters for the instance: Page Life Expectancy, Free List Pages, and Free List Stalls/sec. The fact that Page Life Expectancy is fluctuating between 50 and 150 constantly is a sign that the buffer cache is experiencing constant churn of data, once every minute to two and a half minutes. If you add to the Page Life Expectancy counter the consistent bottoming out of Free List Pages, along with Free List Stalls/sec consistently spiking over 10, you have the perfect memory pressure scenario. All of a sudden it may not be that our disk subsystem is the problem; instead, it is an innocent bystander and victim. Side Note: The Page Life Expectancy counter dropping briefly and then returning to normal operating values intermittently is not necessarily a sign that the server is under memory pressure. The Books Online and a number of other references will tell you that this counter should remain on average above 300, which is the time in seconds a page will remain in cache before being flushed or aged out. This number, which equates to just five minutes, is incredibly low for modern systems, and most published documents pre-date the predominance of 64-bit computing and the easy availability of larger amounts of memory in SQL Servers.
As food for thought, consider that my personal laptop has more memory in it than most SQL Servers did at the time those numbers were posted. I would argue that today, a system churning the buffer cache every five minutes is in need of some serious tuning or a hardware upgrade. Back to our problem and its investigation: There are two things really wrong with this server: first, the plan cache is excessively consuming memory and is bloated in size, and we need to look at that; second, we need to evaluate upgrading the memory to accommodate the workload being performed. In the case of the server I was working on, there were a lot of single-use plans found in sys.dm_exec_cached_plans (where usecounts=1). Single-use plans waste space in the plan cache, especially when they are adhoc plans for statements with concatenated filter criteria that are not likely to reoccur with any frequency. SQL Server 2005 doesn't natively have a way to evict a single plan from cache like SQL Server 2008 does, but MVP Kalen Delaney showed a hack to evict a single plan by creating a plan guide for the statement and then dropping that plan guide in her blog post Geek City: Clearing a Single Plan from Cache. We could put that hack in place in a job to automate cleaning out all the single-use plans periodically, minimizing the size of the plan cache, but a better solution would be to fix the application so that it uses proper parameterized calls to the database. You didn't write the app, and you can't change its design? Ok, well you could try to force parameterization to occur by creating and keeping plan guides in place, or we can try forcing parameterization at the database level by using ALTER DATABASE <dbname> SET PARAMETERIZATION FORCED and that might help. If neither of these helps, we could periodically dump the plan cache for that database, as discussed as being a problem in Kalen's blog post referenced above; not an ideal scenario. The other option is to increase the memory on the server to 16GB or 32GB, if the hardware allows it, which will increase the size of the plan cache as well as the buffer cache. In SQL Server 2005 SP1, on a system with 16GB of memory, if we set max server memory to 14GB the plan cache could use at most 9GB [(8GB*.75)+(6GB*.5)=(6+3)=9GB], leaving 5GB for the buffer cache. If we went to 32GB of memory and set max server memory to 28GB, the plan cache could use at most 16GB [(8*.75)+(20*.5)=(6+10)=16GB], leaving 12GB for the buffer cache. Thankfully we have SQL Server 2005 Service Pack 2, 3, and 4 these days, which include the changes in plan cache sizing discussed in the Changes to Caching Behavior between SQL Server 2000, SQL Server 2005 RTM and SQL Server 2005 SP2 blog post. In real life, when I was troubleshooting this problem, I spent a week trying to chase down the cause of the disk I/O bottleneck with our Server Admin and SAN Admin, and there wasn't much that could be done immediately there, so I finally asked if we could increase the memory on the server to 16GB, which did fix the problem. It wasn't until I had this same problem occur on another system that I actually figured out how to really troubleshoot this down to the root cause. I couldn't believe the size of the plan cache on the server with 16GB of memory when I actually learned about this and went back to look at it. SQL Server is constantly telling a story to anyone that will listen.
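One way to listen to part of that story, shown here only as a minimal sketch that assumes SQL Server 2005 or later and is not the exact query used at the time, is to check how much of the plan cache is tied up in single-use plans:

    -- Plan cache space consumed by plans that have only ever been used once
    SELECT cp.objtype,
           COUNT(*) AS single_use_plans,
           SUM(CAST(cp.size_in_bytes AS BIGINT)) / 1024 / 1024 AS cache_mb
    FROM sys.dm_exec_cached_plans AS cp
    WHERE cp.usecounts = 1
    GROUP BY cp.objtype
    ORDER BY cache_mb DESC;

A large total for adhoc plans with usecounts = 1 is the kind of plan cache bloat described above.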
As the DBA, you have to sit back and listen to all that it's telling you and then evaluate the big picture and how all the data you can gather from SQL about performance relates to each other. One of the greatest tools out there is actually free, in the form of the Diagnostic Scripts for SQL Server 2005 and 2008 created by MVP Glenn Alan Berry. Glenn's scripts collect a majority of the information that SQL has to offer for rapid troubleshooting of problems, and he includes a lot of notes about what the outputs of each individual query might be telling you. When I read Pinal's blog post SQL SERVER – ASYNC_IO_COMPLETION – Wait Type – Day 11 of 28, I noticed that he referenced Checking Memory Related Performance Counters in his post, but there was no real explanation about why checking memory counters is so important when looking at an I/O related wait type. I thought I'd chat with him briefly on Google Talk/Twitter DM and point this out, and offer a couple of other points I noted, so that he could add the information to his blog post if he found it useful. Instead he asked that I write a guest blog for this. I am honored to be a guest blogger, and to be able to share this kind of information with the community. The information contained in this blog post is a glimpse at how I do troubleshooting almost every day of the week in my own environment. SQL Server provides us with a lot of information about how it is running, and where it may be having problems; it is up to us to play detective and find out how all that information comes together to tell us what the problem really is. This blog post is written by Jonathan Kehayias (Blog | Twitter). Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: MVP, Pinal Dave, PostADay, Readers Contribution, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, SQL Wait Stats, SQL Wait Types, T SQL, Technology

    Read the article

  • SQL – NuoDB and Third Party Explorer – SQuirreL SQL Client, SQL Workbench/J and DbVisualizer

    - by Pinal Dave
    I recently wrote a four-part series on how I started to learn about and begin my journey with NuoDB. Big Data is indeed a big world, and learning Big Data is like spaghetti – no one really knows where to start – so I decided to learn it with the help of NuoDB. You can download NuoDB and continue your journey with me as well. Part 1 – Install NuoDB in 90 Seconds Part 2 – Manage NuoDB Installation Part 3 – Explore NuoDB Database Part 4 – Migrate from SQL Server to NuoDB …and in this blog post we will try to answer the most frequently asked question about NuoDB. "I like the NuoDB Explorer, but can I connect to NuoDB from my preferred Graphical User Interface?" Honestly, I did not expect this question to be asked of me so many times, but from the question it is clear that we developers absolutely want to learn new things, and along with that we want to continue to use our most efficient developer tools. Now here is the answer to the question: "Absolutely, you can continue to use any of the following most popular SQL clients." NuoDB supports the three most popular 3rd-party SQL clients. In all leading development environments there is always more than one database installed, and managing each of them with a different tool is often a very difficult task. Developers like to use one tool which can control most of the databases. Once developers are familiar with one database tool, it is very difficult for them to switch to another tool. This is particularly difficult when we developers find that tool to be the key reason for our efficiency. Let us see how to install each of the NuoDB-supported 3rd-party tools, along with a quick tutorial on how to go about using them. SQuirreL SQL Client First download the SQuirreL Universal SQL client. On the Windows platform you can double-click on the file and it will install the SQuirreL client. Once it is installed, open the application and it will bring up the following screen. Now go to the Drivers tab on the left side and scroll down. You will find NuoDB mentioned there. Now right-click over it and click on Modify Driver. Now here is where you need to make sure that you make proper entries, or your client will not work with the database. Enter the following values: Name: NuoDB Example URL: jdbc:com:nuodb://localhost:48004/test Website URL: http://www.nuodb.com Now click on the Extra Class Path tab and add the location of the nuodbjdbc.jar file. If you are following my blog posts and have installed NuoDB in the default location, you will find the default path is C:\Program Files\NuoDB\jar\nuodbjdbc.jar. The class name of the driver is automatically populated. Once you click OK you will see that there is a small icon displayed to the left of NuoDB, which shows that you have successfully configured and installed the NuoDB driver. Now click on the Alias tab and you will notice that it is empty. Now click on the big Plus icon and it will open the screen for adding an alias. "Alias" means nothing more than adding a database to your system. The database name of the original installation can be anything and, if you wish, you can register the database with any other alternative name. Here are the details you should fill into the Alias screen below.
Name: Test (or your preferred alias) Driver: NuoDB URL: jdbc:com:nuodb://localhost:48004/test (This is for the test database) User Name: dba (This is the username which I entered for the test database) Password: goalie (This is the password which I entered for the test database) Check Auto Logon and Connect at Startup and click on OK. That's it! You are done. On the right side you will see a table name and on the left side you will see various tabs with all the relevant details for the respective table. You can see various metadata, schemas, data types and other information for the table. In addition, you can also generate scripts and do various important database-related tasks. You can see how easy it is to configure NuoDB with the SQuirreL Client and get going with it immediately. SQL Workbench/J This is another wonderful client tool, which works very well with NuoDB. The best part is that in the Driver dropdown you will see NuoDB mentioned. Click here to download the SQL Workbench/J Universal SQL client. The download process is straightforward and the installation of SQL Workbench/J is very easy. As soon as you open the client, you will notice the NuoDB driver on the following screen when selecting a New Connection Profile. Select NuoDB from the drop down and click on OK. In the driver information, enter the following details: Driver: NuoDB (com.nuodb.jdbc.Driver) URL: jdbc:com.nuodb://localhost/test Username: dba Password: goalie When you click OK, it will bring up the following pop-up. Click Yes to edit the driver information. Click on OK and it will bring you to the following screen. This is the screen where you can perform various tasks. You can write any SQL query you want and it will instantly show you the results. Now click on the database icon, which you see right on the left side of the word User=dba. Once you click on Database Explorer, you can perform various database-related tasks. As a developer, one of my favorite tasks is to look at the source of the table, as it gives me a proper view of the structure of the database. I find SQL Workbench/J very efficient in doing the same. DbVisualizer DbVisualizer is another great tool, which helps you connect to NuoDB and retrieve database information in your desired format. A developer who is familiar with DbVisualizer will find this client very easy to work with. The installation of DbVisualizer is pretty straightforward. When we open the client, it will bring us to the following screen. As a first step we need to set up the driver. Go to Tools >> Driver Manager. It will bring up the following screen where we set up the driver. Click on Create Driver and it will open up the driver settings on the right side. On the right side of the area where it displays Driver Settings, please enter the following values: Name: NuoDB URL Format: jdbc:com.nuodb://localhost:48004/test Now under the driver path, click on the folder icon and it will ask for the location of the jar file. Provide the path C:\Program Files\NuoDB\jar\nuodbjdbc.jar and click OK. You will notice there is a green button displayed at the bottom right corner. This means the driver is configured properly. Once the driver is configured properly, we can go to Create Database Connection and create a connection. If the pop-up for the Wizard shows up, click on No Wizard and continue to enter the settings manually. Here is the Database Connection screen. This screen can be a bit tricky. Here are the settings you need to remember to enter.
Name: NuoDB Database Type: Generic Driver: NuoDB Database URL: jdbc:com.nuodb://localhost:48004/test Database Userid: dba Database Password: goalie Once you enter the values, click on Connect. Once Connect is pressed, it will change the button value to Reconnect if the connection is successfully established, and it will show the connection details on the left side. When we further explore NuoDB, we can see the various tables created in our test application. We can further click on the right-side screen and see various details of the table. If you click on the Data tab, it will display the entire data of the table. The Tools menu also has some very interesting and cool features like Driver Manager, Data Monitor and SQL History. Summary Well, this was a relatively long post, but I find it essential to cover all three important clients which we developers use in our daily database development. Here is my question to you: which one of the following is your favorite NuoDB 3rd-party database client? (Pick one) SQuirreL SQL Client SQL Workbench/J DbVisualizer I am very eager to read about your experience with NuoDB. You can download NuoDB from here. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology Tagged: NuoDB

    Read the article

  • SQL SERVER – Import CSV into Database – Transferring File Content into a Database Table using CSVexpress

    - by pinaldave
    One of the most common data integration tasks I run into is a desire to move data from a file into a database table. Generally the user is familiar with his data, the structure of the file, and the database table, but is unfamiliar with data integration tools and therefore views this task as something that is difficult. What these users really need is a point and click approach that minimizes the learning curve for the data integration tool. This is what CSVexpress (www.CSVexpress.com) is all about! It is based on expressor Studio, a data integration tool I've been reviewing over the last several months. With CSVexpress, moving data between data sources can be as simple as providing the database connection details, describing the structure of the incoming and outgoing data and then connecting two pre-programmed operators. There's no need to learn the intricacies of the data integration tool or to write code. Let's look at an example. Suppose I have a comma separated value data file with data similar to the following, which is a listing of terminated employees that includes their hiring and termination date, department, job description, and final salary.

EMP_ID,STRT_DATE,END_DATE,JOB_ID,DEPT_ID,SALARY
102,13-JAN-93,24-JUL-98 17:00,Programmer,60,"$85,000"
101,21-SEP-89,27-OCT-93 17:00,Account Representative,110,"$65,000"
103,28-OCT-93,15-MAR-97 17:00,Account Manager,110,"$75,000"
304,17-FEB-96,19-DEC-99 17:00,Marketing,20,"$45,000"
333,24-MAR-98,31-DEC-99 17:00,Data Entry Clerk,50,"$35,000"
100,17-SEP-87,17-JUN-93 17:00,Administrative Assistant,90,"$40,000"
334,24-MAR-98,31-DEC-98 17:00,Sales Representative,80,"$40,000"
400,01-JAN-99,31-DEC-99 17:00,Sales Manager,80,"$55,000"

Notice the concise format used for the date values, the fact that the termination date includes both date and time information, and that the salary is clearly identified as money by the dollar sign and digit grouping. In moving this data to a database table I want to express the dates using a format that includes the century since it's obvious that this listing could include employees who left the company in both the 20th and 21st centuries, and I want the salary to be stored as a decimal value without the currency symbol and grouping character. Most data integration tools would require coding within a transformation operation to effect these changes, but not expressor Studio. Directives for these modifications are included in the description of the incoming data. Besides starting the expressor Studio tool and opening a project, the first step is to create connection artifacts, which describe to expressor where data is stored. For this example, two connection artifacts are required: a file connection, which encapsulates the file system location of my file; and a database connection, which encapsulates the database connection information. With expressor Studio, I use wizards to create these artifacts. First click New Connection > File Connection in the Home tab of expressor Studio's ribbon bar, which starts the File Connection wizard. In the first window, I enter the path to the directory that contains the input file. Note that the file connection artifact only specifies the file system location, not the name of the file. Then I click Next and enter a meaningful name for this connection artifact; clicking Finish closes the wizard and saves the artifact.
To create the Database Connection artifact, I must know the location, or instance name, of the target database and have the credentials of an account with sufficient privileges to write to the target table. To use expressor Studio's features to the fullest, this account should also have the authority to create a table. I click New Connection > Database Connection in the Home tab of expressor Studio's ribbon bar, which starts the Database Connection wizard. expressor Studio includes high-performance drivers for many relational database management systems, so I can simply make a selection from the "Supplied database drivers" drop down control. If my desired RDBMS isn't listed, I can optionally use an existing ODBC DSN by selecting the "Existing DSN" radio button. In the following window, I enter the connection details. With Microsoft SQL Server, I may choose to use Windows Authentication rather than account credentials. After clicking Next, I enter a meaningful name for this connection artifact, and clicking Finish closes the wizard and saves the artifact. Now I create a schema artifact, which describes the structure of the file data. When expressor reads a file, all data fields are typed as strings. In some use cases this may be exactly what is needed and there is no need to edit the schema artifact. But in this example, editing the schema artifact will be used to specify how the data should be transformed; that is, reformat the dates to include century designations, change the employee and department IDs to integers, and convert the salary to a decimal value. Again a wizard is used to create the schema artifact. I click New Schema > Delimited Schema in the Home tab of expressor Studio's ribbon bar, which starts the schema wizard. In the first window, I click Get Data from File, which then displays a listing of the file connections in the project. When I click on the file connection I previously created, a browse window opens to this file system location; I then select the file and click Open, which imports 10 lines from the file into the wizard. I now view the file's content and confirm that the appropriate delimiter characters are selected in the "Field Delimiter" and "Record Delimiter" drop down controls; then I click Next. Since the input file includes a header row, I can easily indicate that fields in the file should be identified through the corresponding header value by clicking "Set All Names from Selected Row." Alternatively, I could enter a different identifier into the Field Details > Name text box. I click Next and enter a meaningful name for this schema artifact; clicking Finish closes the wizard and saves the artifact. Now I open the schema artifact in the schema editor. When I first view the schema's content, I note that the types of all attributes in the Semantic Type (the right-hand panel) are strings and that the attribute names are the same as the field names in the data file. To change an attribute's name and type, I highlight the attribute and click Edit in the Attributes grouping on the Schema > Edit tab of the editor's ribbon bar. This opens the Edit Attribute window; I can change the attribute name and select the desired type from the "Data type" drop down control. In this example, I change the name of each attribute to the name of the corresponding database table column (EmployeeID, StartingDate, TerminationDate, JobDescription, DepartmentID, and FinalSalary).
Then for the EmployeeID and DepartmentID attributes, I select Integer as the data type, for the StartingDate and TerminationDate attributes, I select Datetime as the data type, and for the FinalSalary attribute, I select the Decimal type. But I can do much more in the schema editor.  For the datetime attributes, I can set a constraint that ensures that the data adheres to some predetermined specifications; a starting date must be later than January 1, 1980 (the date on which the company began operations) and a termination date must be earlier than 11:59 PM on December 31, 1999.  I simply select the appropriate constraint and enter the value (1980-01-01 00:00 as the starting date and 1999-12-31 11:59 as the termination date). As a last step in setting up these datetime conversions, I edit the mapping, describing the format of each datetime type in the source file. I highlight the mapping line for the StartingDate attribute and click Edit Mapping in the Mappings grouping on the Schema > Edit tab of the editor’s ribbon bar.  This opens the Edit Mapping window in which I either enter, or select, a format that describes how the datetime values are represented in the file.  Note the use of Y01 as the syntax for the year.  This syntax is the indicator to expressor Studio to derive the century by setting any year later than 01 to the 20th century and any year before 01 to the 21st century.  As each datetime value is read from the file, the year values are transformed into century and year values. For the TerminationDate attribute, my format also indicates that the datetime value includes hours and minutes. And now to the Salary attribute. I open its mapping and in the Edit Mapping window select the Currency tab and the “Use currency” check box.  This indicates that the file data will include the dollar sign (or in Europe the Pound or Euro sign), which should be removed. And on the Grouping tab, I select the “Use grouping” checkbox and enter 3 into the “Group size” text box, a comma into the “Grouping character” text box, and a decimal point into the “Decimal separator” character text box. These entries allow the string to be properly converted into a decimal value. By making these entries into the schema that describes my input file, I’ve specified how I want the data transformed prior to writing to the database table and completely removed the requirement for coding within the data integration application itself. Assembling the data integration application is simple.  Onto the canvas I drag the Read File and Write Table operators, connecting the output of the Read File operator to the input of the Write Table operator. Next, I select the Read File operator and its Properties panel opens on the right-hand side of expressor Studio.  For each property, I can select an appropriate entry from the corresponding drop down control.  Clicking on the button to the right of the “File name” text box opens the file system location specified in the file connection artifact, allowing me to select the appropriate input file.  I indicate also that the first row in the file, the header row, should be skipped, and that any record that fails one of the datetime constraints should be skipped. I then select the Write Table operator and in its Properties panel specify the database connection, normal for the “Mode,” and the “Truncate” and “Create Missing Table” options.  
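For reference, and purely as a hedged sketch (the post never shows any DDL; the table name is invented, and the column types and nullability are inferred from the attribute and column descriptions in this post), the target table on SQL Server might look roughly like this:

    CREATE TABLE dbo.TerminatedEmployees
    (
        EmployeeID      INT         NOT NULL,
        StartingDate    DATETIME    NOT NULL,  -- century derived via the Y01 mapping
        TerminationDate DATETIME    NOT NULL,  -- includes hours and minutes
        JobDescription  VARCHAR(40) NULL,      -- width limited to 40 characters
        DepartmentID    INT         NOT NULL,
        FinalSalary     MONEY       NULL       -- currency symbol and grouping removed on load
    );

The wizard-generated table described next should end up broadly equivalent to this.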
If my target table does not yet exist, expressor will create the table using the information encapsulated in the schema artifact assigned to the operator. The last task needed to complete the application is to create the schema artifact used by the Write Table operator. This is extremely easy, as another wizard is capable of using the schema artifact assigned to the Read File operator to create a schema artifact for the Write Table operator. In the Write Table Properties panel, I click the drop down control to the right of the “Schema” property and select “New Table Schema from Upstream Output…” from the drop down menu. The wizard first displays the table description and in its second screen asks me to select the database connection artifact that specifies the RDBMS in which the target table will exist. The wizard then connects to the RDBMS and retrieves a list of database schemas from which I make a selection. The fourth screen gives me the opportunity to fine-tune the table’s description. In this example, I set the width of the JobDescription column to a maximum of 40 characters and select money as the type of the FinalSalary column. I also provide the name for the table. This completes development of the application. The entire application was created through the use of wizards, and the required data transformations were specified through simple constraints and specifications rather than through coding. To develop this application, I only needed a basic understanding of expressor Studio, a level of expertise that can be gained by working through a few introductory tutorials. expressor Studio is as close to a point-and-click data integration tool as one could want, and I urge you to try this product if you have a need to move data between files or from files to database tables. Check out CSVexpress in more detail. It offers a few basic video tutorials and a preview of expressor Studio 3.5, which will support reading and writing data to Salesforce.com. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, SQL, SQL Authority, SQL Documentation, SQL Download, SQL Query, SQL Server, SQL Tips and Tricks, SQLServer, T SQL, Technology
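For readers who want to picture the end result in T-SQL terms, here is a minimal sketch of the kind of target table the Write Table operator could create from the schema artifact described above. The table name dbo.EmployeeHistory is made up for illustration, and the exact DDL that expressor generates may differ.

-- Hypothetical target table matching the example schema; expressor's actual DDL may differ.
CREATE TABLE dbo.EmployeeHistory
(
    EmployeeID      INT         NOT NULL,
    StartingDate    DATETIME    NOT NULL,
    TerminationDate DATETIME    NULL,
    JobDescription  VARCHAR(40) NULL,  -- width capped at 40 characters in the wizard
    DepartmentID    INT         NULL,
    FinalSalary     MONEY       NULL   -- money selected as the column type in the wizard
);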

    Read the article

  • SQL SERVER – Weekly Series – Memory Lane – #038

    - by Pinal Dave
    Here is the list of selected articles of SQLAuthority.com across all these years. Instead of just listing all the articles I have selected a few of my most favorite articles and have listed them here with additional notes below them. Let me know which one of the following is your favorite article from memory lane. 2007 CASE Statement in ORDER BY Clause – ORDER BY using Variable This article is per a request from the Application Development Team Leader of my company. His team encountered code where the application was preparing a string for the ORDER BY clause of the SELECT statement. The application was passing this string as a variable to a Stored Procedure (SP), and the SP was using EXEC to execute the SQL string. This is not good for performance, as the Stored Procedure has to recompile every time due to EXEC. sp_executesql can do the same task, but it is still not the best for performance. SSMS – View/Send Query Results to Text/Grid/Files Results to Text – CTRL + T Results to Grid – CTRL + D Results to File – CTRL + SHIFT + F 2008 Introduction to SPARSE Columns Part 2 I wrote about Introduction to SPARSE Columns Part 1. Let us understand the concept of the SPARSE column in more detail. I suggest you read the first part before continuing reading this article. All SPARSE columns are stored as one XML column in the database. Let us see some of the advantages and disadvantages of the SPARSE column. Deferred Name Resolution How come when the table name is incorrect the SP can be created successfully, but when an incorrect column is used the SP cannot be created? 2009 Backup Timeline and Understanding of Database Restore Process in Full Recovery Model In general, backups of a database in the full recovery model are taken in three different kinds of backup files: Full Database Backup, Differential Database Backup, and Log Backup. Restore Sequence and Understanding NORECOVERY and RECOVERY While doing a RESTORE operation, if you are restoring database files, always use the NORECOVERY option, as that will keep the database in a state where more backup files can be restored. This will also keep the database offline to prevent any changes, which could create integrity issues. Once all backup files are restored, run the RESTORE command with the RECOVERY option to bring the database online and operational. Four Different Ways to Find Recovery Model for Database Perhaps the best thing about the technical domain is that most things can be executed in more than one way. It is always useful to know about the various methods of performing a single task. Two Methods to Retrieve List of Primary Keys and Foreign Keys of Database When Information Schema is used, we will not be able to discern between primary key and foreign key; we will have both the keys together. In the case of the sys schema, we can query the data in our preferred way and can join this table to another table, which can retrieve additional data. Get Last Running Query Based on SPID SPID returns the session ID of the current user process. The acronym SPID comes from the name of its earlier version, Server Process ID. 2010 SELECT * FROM dual – Dual Equivalent Dual is a table that is created by Oracle together with the data dictionary. It consists of exactly one column named “dummy” and one record. The value of that record is X. You can check the content of the DUAL table using the following syntax. SELECT * FROM dual Identifying Statistics Used by Query Someone asked this question in my query optimization and performance tuning training class.
“Can I know which statistics were used by my query?” 2011 SQL SERVER – Interview Questions and Answers – Frequently Asked Questions – Day 14 of 31 What are the basic functions for master, msdb, model, tempdb and resource databases? What is the Maximum Number of Index per Table? Explain Few of the New Features of SQL Server 2008 Management Studio Explain IntelliSense for Query Editing Explain MultiServer Query Explain Query Editor Regions Explain Object Explorer Enhancements Explain Activity Monitors SQL SERVER – Interview Questions and Answers – Frequently Asked Questions – Day 15 of 31 What is Service Broker? Where are SQL server Usernames and Passwords Stored in the SQL server? What is Policy Management? What is Database Mirroring? What are Sparse Columns? What does TOP Operator Do? What is CTE? What is MERGE Statement? What is Filtered Index? Which are the New Data Types Introduced in SQL SERVER 2008? SQL SERVER – Interview Questions and Answers – Frequently Asked Questions – Day 16 of 31 What are the Advantages of Using CTE? How can we Rewrite Sub-Queries into Simple Select Statements or with Joins? What is CLR? What are Synonyms? What is LINQ? What are Isolation Levels? What is Use of EXCEPT Clause? What is XPath? What is NOLOCK? What is the Difference between Update Lock and Exclusive Lock? SQL SERVER – Interview Questions and Answers – Frequently Asked Questions – Day 17 of 31 How will you Handle Error in SQL SERVER 2008? What is RAISEERROR? How to Rebuild the Master Database? What is the XML Datatype? What is Data Compression? What is Use of DBCC Commands? How to Copy the Tables, Schema and Views from one SQL Server to Another? How to Find Tables without Indexes? SQL SERVER – Interview Questions and Answers – Frequently Asked Questions – Day 18 of 31 How to Copy Data from One Table to Another Table? What is Catalog Views? What is PIVOT and UNPIVOT? What is a Filestream? What is SQLCMD? What do you mean by TABLESAMPLE? What is ROW_NUMBER()? What are Ranking Functions? What is Change Data Capture (CDC) in SQL Server 2008? SQL SERVER – Interview Questions and Answers – Frequently Asked Questions – Day 19 of 31 How can I Track the Changes or Identify the Latest Insert-Update-Delete from a Table? What is the CPU Pressure? How can I Get Data from a Database on Another Server? What is the Bookmark Lookup and RID Lookup? What is Difference between ROLLBACK IMMEDIATE and WITH NO_WAIT during ALTER DATABASE? What is Difference between GETDATE and SYSDATETIME in SQL Server 2008? How can I Check whether Automatic Statistic Update is Enabled or not? How to Find Index Size for Each Index on Table? What is the Difference between Seek Predicate and Predicate? What are Basics of Policy Management? What are the Advantages of Policy Management? SQL SERVER – Interview Questions and Answers – Frequently Asked Questions – Day 20 of 31 What are Policy Management Terms? What is the ‘FILLFACTOR’? Where in MS SQL Server is ’100’ equal to ‘0’? What are Points to Remember while Using the FILLFACTOR Argument? What is a ROLLUP Clause? What are Various Limitations of the Views? What is a Covered index? When I Delete any Data from a Table, does the SQL Server reduce the size of that table? What are Wait Types? How to Stop Log File Growing too Big? If any Stored Procedure is Encrypted, then can we see its definition in Activity Monitor?
2012 Example of Width Sensitive and Width Insensitive Collation Width Sensitive Collation: when a character represented as a single-byte (half-width) character and the same character represented as a double-byte (full-width) character compare as not equal, the collation is width sensitive. In this example we have one table with two columns. One column has a width sensitive collation and the second column has a width insensitive collation. Find Column Used in Stored Procedure – Search Stored Procedure for Column Name A very interesting conversation about how to find a column used in a stored procedure. There are two different characters in the story, and both are having a conversation about how to find the column in the stored procedure. Here is the two-part story: Part 1 | Part 2 SQL SERVER – 2012 Functions – FORMAT() and CONCAT() – An Interesting Usage Generate Script for Schema and Data – SQL in Sixty Seconds #021 – Video In simple words, in many cases the database moves from one place to another. It is not always possible to back up and restore databases. There are possibilities when only part of the database (with schema and data) has to be moved. In this video we learn that we can easily generate a script for the schema and data and move it from one server to another. INFORMATION_SCHEMA.COLUMNS and Value Character Maximum Length -1 I often see the value -1 in the CHARACTER_MAXIMUM_LENGTH column of the INFORMATION_SCHEMA.COLUMNS table. I understand that the length of any column can be between 0 and a large number, but I do not get it when I see a negative value (i.e. -1). Any insight on this subject? Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Memory Lane, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology
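To see the -1 value from the last question above for yourself, here is a minimal sketch; the table dbo.LengthDemo and its columns are made up purely for illustration. The (MAX) types report -1 because they have no fixed character limit.

-- Hypothetical table: (MAX) columns report -1 in CHARACTER_MAXIMUM_LENGTH
CREATE TABLE dbo.LengthDemo
(
    FixedName  VARCHAR(100),
    BigNote    VARCHAR(MAX),
    BigComment NVARCHAR(MAX)
);

SELECT COLUMN_NAME, DATA_TYPE, CHARACTER_MAXIMUM_LENGTH
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'LengthDemo';
-- FixedName shows 100, while BigNote and BigComment show -1.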

    Read the article

  • SQL SERVER – Weekly Series – Memory Lane – #049

    - by Pinal Dave
    Here is the list of selected articles of SQLAuthority.com across all these years. Instead of just listing all the articles I have selected a few of my most favorite articles and have listed them here with additional notes below them. Let me know which one of the following is your favorite article from memory lane. 2007 Two Connections Related Global Variables Explained – @@CONNECTIONS and @@MAX_CONNECTIONS @@CONNECTIONS Returns the number of attempted connections, either successful or unsuccessful, since SQL Server was last started. @@MAX_CONNECTIONS Returns the maximum number of simultaneous user connections allowed on an instance of SQL Server. The number returned is not necessarily the number currently configured. Query Editor – Microsoft SQL Server Management Studio This post may be very simple for most of the users of SQL Server 2005. Earlier this year, I received one question many times – Where is Query Analyzer in SQL Server 2005? I wrote a small post about it and pointed many users to that post – SQL SERVER – 2005 Query Analyzer – Microsoft SQL SERVER Management Studio. Recently I have been receiving a similar question. OUTPUT Clause Example and Explanation with INSERT, UPDATE, DELETE SQL Server 2005 has a new OUTPUT clause, which is quite useful. The OUTPUT clause has access to the inserted and deleted tables (virtual tables) just like triggers. The OUTPUT clause can be used to return values to the client. The OUTPUT clause can be used with INSERT, UPDATE, or DELETE to identify the actual rows affected by these statements. The OUTPUT clause can populate a table variable, a permanent table, or a temporary table. Even though @@IDENTITY still works with SQL Server 2005, I find the OUTPUT clause very easy and powerful to use. Let us understand the OUTPUT clause using an example. Find Name of The SQL Server Instance Based on the database server, stored procedures have to run different logic. We came up with two different solutions. 1) When the database schema was very much changed, we wrote a completely new stored procedure and deprecated the older version once it was not needed. 2) When the logic depended on the server name, we used the global variable @@SERVERNAME. It was very convenient while writing a migration script that depended on the server name for the same database. Explanation of TRY…CATCH and ERROR Handling With RAISEERROR Function One of the developers at my company thought that we could not use the RAISEERROR function in the new SQL Server 2005 TRY…CATCH feature. When asked for an explanation, he pointed to the SQL SERVER – 2005 Explanation of TRY…CATCH and ERROR Handling article as an excuse, suggesting that I did not give an example of RAISEERROR with TRY…CATCH. We all thought it was funny. Just to keep the record straight, TRY…CATCH can certainly use the RAISEERROR function. Different Types of Cache Objects Several kinds of objects can be stored in the procedure cache: Compiled Plans: When the query optimizer finishes compiling a query plan, the principal output is the compiled plan. Execution contexts: While executing a compiled plan, SQL Server has to keep track of information about the state of execution. Cursors: Cursors track the execution state of server-side cursors, including the cursor’s current location within a resultset. Algebrizer trees: The Algebrizer’s job is to produce an algebrizer tree, which represents the logical structure of a query. Open SSMS From Command Prompt – sqlwb.exe Example This article is written by the request and suggestion of a Sr. Web Developer at my organization.
Due to the nature of this article, most of the content is referenced from Books Online. sqlwb is a command prompt utility which opens SQL Server Management Studio. The sqlwb command does not run queries from the command prompt; the sqlcmd utility runs queries from the command prompt. Read the article for more information. 2008 Puzzle – Solution – Computed Columns Datatype Explanation Just a day before, I wrote the article SQL SERVER – Puzzle – Computed Columns Datatype Explanation, which was inspired by SQL Server MVP Jacob Sebastian. I suggest that before continuing this article you read the original puzzle question SQL SERVER – Puzzle – Computed Columns Datatype Explanation. The question was: if the computed column is of datatype TINYINT, how do you create a computed column of datatype INT? 2008 – Find If Index is Being Used in Database Very often I get a query about how to find whether any index is being used in the database or not. If any database has many indexes and not all indexes are used, it can adversely affect performance. A higher number of indexes slows down INSERT / UPDATE / DELETE operations but speeds up SELECT operations. It is recommended to drop any unused indexes from the table to improve performance. 2009 Interesting Observation – Execution Plan and Results of Aggregate Concatenation Queries If you want to see what’s going on here, I think you need to shift your point of view from an implementation-centric view to an ANSI point of view. ANSI does not guarantee processing order. Figure 2 is interesting, but it will be potentially misleading if you don’t understand the ANSI rule-set SQL Server operates under in most cases. Implementation thinking can certainly be useful at times when you really need that multi-million row query to finish before the backup fires off, but in this case, it’s counterproductive to understanding what is going on. SQL Server Management Studio and Client Statistics Client Statistics are very important. Many a time, people relate a query's execution plan to query cost. This is not a good comparison. Both parameters are different, and they are not always related. It is possible that the query cost of a statement is low, but the amount of data returned is considerably larger, which causes the query to run slowly. How do we know if a query is retrieving a large amount of data or very little data? 2010 I encourage all of you to go through the complete series and write your own articles on the subject. If you write an article and send it to me, I will publish it on this blog with due credit to you. If you write on your own blog, I will update this blog post pointing to your blog post.
SQL SERVER – ORDER BY Does Not Work – Limitation of the View 1 SQL SERVER – Adding Column is Expensive by Joining Table Outside View – Limitation of the View 2 SQL SERVER – Index Created on View not Used Often – Limitation of the View 3 SQL SERVER – SELECT * and Adding Column Issue in View – Limitation of the View 4 SQL SERVER – COUNT(*) Not Allowed but COUNT_BIG(*) Allowed – Limitation of the View 5 SQL SERVER – UNION Not Allowed but OR Allowed in Index View – Limitation of the View 6 SQL SERVER – Cross Database Queries Not Allowed in Indexed View – Limitation of the View 7 SQL SERVER – Outer Join Not Allowed in Indexed Views – Limitation of the View 8 SQL SERVER – SELF JOIN Not Allowed in Indexed View – Limitation of the View 9 SQL SERVER – Keywords View Definition Must Not Contain for Indexed View – Limitation of the View 10 SQL SERVER – View Over the View Not Possible with Index View – Limitations of the View 11 SQL SERVER – Get Query Running in Session I was recently looking for the syntax to get the query running in any particular session. I always remembered the syntax and had actually written it down before, but somehow it was not coming to mind quickly this time. I searched online and I ended up on my own article written last year, SQL SERVER – Get Last Running Query Based on SPID. I felt that I was getting old because I forgot this really simple syntax. Find Total Number of Transaction on Interval In one of my recent Performance Tuning assignments I was asked how someone can know how many transactions are happening on a server during a certain interval. I had a handy script for the same. The following script displays the transactions that happened on the server at an interval of one minute. You can change the WAITFOR DELAY to any other interval and it should work. 2011 Here are two DMVs which are newly introduced in SQL Server 2012 and provide vital information about SQL Server. DMV – sys.dm_os_volume_stats – Information about operating system volume DMV – sys.dm_os_windows_info – Information about Operating System SQL Backup and FTP – A Quick and Handy Tool I have used this tool extensively since 2009 on numerous occasions and found it to be very impressive. What separates it from the crowd the most is its apparent simplicity and speed. When I install SQLBackupAndFTP and configure backups, all in 1 or 2 minutes, my clients are always impressed. Quick Note about JOIN – Common Questions and Simple Answers In this blog post we are going to talk about join and lots of things related to the JOIN. I recently started office hours to answer questions and issues of the community. I receive so many questions that are related to JOIN. I will share a few of the same over here. Most of them are basic, but note that the basics are of great importance. 2012 Importance of User Without Login Question: “In recent version of SQL Server we can create user without login. What is the use of it?” Great question indeed. Let me first attempt to answer this question, but after reading my answer I need your help. I want you to help him as well by adding more value to it. Preserve Leading Zero While Copying to Excel from SSMS Earlier I wrote two articles about how to efficiently copy data from SSMS to Excel. Since I wrote those posts there has been plenty of interest generated in this subject. There are a few questions I keep on getting on this subject. One of the questions is how to get the leading zero preserved while copying the data from SSMS to Excel.
Well, it is almost the same as in my earlier post SQL SERVER – Excel Losing Decimal Values When Value Pasted from SSMS ResultSet. The key here is in EXCEL and not in SQL Server. Solution – 2 T-SQL Puzzles – Display Star and Shortest Code to Display 1 Earlier on this blog we asked two puzzles. The response from all of you was nothing but amazing. I have received 350+ responses. Many are valid, and many were indeed something I had not thought about. I strongly suggest you read all the puzzles and their answers here; trust me, if you start reading the comments you will not stop till you read every single comment. Seriously, trust me on it. Personally I have learned a lot from it. Identify Most Resource Intensive Queries – SQL in Sixty Seconds #028 – Video http://www.youtube.com/watch?v=TvlYy-TGaaA Importance of User Without Login – T-SQL Demo Script Earlier I wrote a blog post about SQL SERVER – Importance of User Without Login, and my friend and SQL Expert Vinod Kumar has written an excellent follow-up blog post about Contained Databases inside SQL Server 2012. Now lots of people asked me if I can also explain the same concept again, so here is a small demonstration of it. Let me show you how a user without login can help. Before we continue on this subject, I strongly recommend that you read my earlier blog post here. In the following demo I am going to demonstrate the following situations: Login using the System Admin account, Create a user without login, Checking Access, Impersonate the user without login, Checking Access, Revert Impersonation, Give Permission to the user without login, Impersonate the user without login, Checking Access, Revert Impersonation, Clean up. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Memory Lane, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology
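For reference, here is a rough T-SQL sketch of the demo steps listed above. The user name NoLoginUser and the table dbo.SomeTable are placeholders for illustration and are not the names from the original demo script.

-- Create a user that has no corresponding server login
CREATE USER NoLoginUser WITHOUT LOGIN;

-- Impersonate the user and check the current security context
EXECUTE AS USER = 'NoLoginUser';
SELECT SUSER_SNAME() AS LoginName, USER_NAME() AS UserName;
-- A SELECT on a table would fail here until permission is granted
REVERT;

-- Grant a permission, then impersonate again to confirm access
GRANT SELECT ON dbo.SomeTable TO NoLoginUser;   -- dbo.SomeTable is a placeholder table
EXECUTE AS USER = 'NoLoginUser';
SELECT COUNT(*) FROM dbo.SomeTable;
REVERT;

-- Clean up
DROP USER NoLoginUser;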

    Read the article

  • SQL SERVER – Weekly Series – Memory Lane – #039

    - by Pinal Dave
    Here is the list of selected articles of SQLAuthority.com across all these years. Instead of just listing all the articles I have selected a few of my most favorite articles and have listed them here with additional notes below them. Let me know which one of the following is your favorite article from memory lane. 2007 FQL – Facebook Query Language Facebook lists the following advantages of FQL: Condensed XML reduces bandwidth and parsing costs. More complex requests can reduce the number of requests necessary. Provides a single consistent, unified interface for all of your data. It’s fun! UDF – Get the Day of the Week Function The day of the week can be retrieved in SQL Server by using the DatePart function. The value returned by the function is between 1 (Sunday) and 7 (Saturday). To convert this to a string representing the day of the week, use a CASE statement. UDF – Function to Get Previous And Next Work Day – Exclude Saturday and Sunday While reading Ben Nadel's ColdFusion blog post Getting the Previous Day In ColdFusion, Excluding Saturday And Sunday, I realized that I use a similar function in my SQL Server database. This function excludes the weekends (Saturday and Sunday), and it gets the previous as well as the next work day. Complete Series of SQL Server Interview Questions and Answers Data Warehousing Interview Questions and Answers – Introduction Data Warehousing Interview Questions and Answers – Part 1 Data Warehousing Interview Questions and Answers – Part 2 Data Warehousing Interview Questions and Answers – Part 3 Data Warehousing Interview Questions and Answers Complete List Download 2008 Introduction to Log Viewer In SQL Server all the Windows event logs can be seen along with the SQL Server logs. The interface for all the logs is the same and can be launched from the same place. This log can be exported and filtered as well. DBCC SHRINKFILE Takes Long Time to Run If you are a DBA involved with database maintenance and file group maintenance, you have probably experienced that DBCC SHRINKFILE operations often take a long time while other database operations are relatively quicker. mssqlsystemresource – Resource Database The purpose of the resource database is to facilitate upgrading to a new version of SQL Server without any hassle. In previous versions, whenever the version of SQL Server was upgraded, all the previous version's system objects needed to be dropped and the new version's system objects created. 2009 Puzzle – Write Script to Generate Primary Key and Foreign Key In SQL Server Management Studio (SSMS), there is no option to script all the keys. If one is required to script keys, each key has to be scripted manually, one at a time. If a database has many tables, generating one key at a time can be a very intricate task. I want to throw a question out to all of you: do any of you have scripts for the same purpose? Maximizing View of SQL Server Management Studio – Full Screen – New Screen I had explained the following two different methods: 1) Open Results in Separate Tab - This is a very interesting method, as the result pane shows up in a different tab instead of splitting the screen horizontally. 2) Open SSMS in Full Screen - This always works and works best. Not many people are aware of this method; hence, very few people use it to enhance productivity. 2010 Find Queries using Parallelism from Cached Plan This T-SQL script gets all the queries and their execution plans where parallelism operations kick in.
Note that TOP 10 is used; if you have lots of transactional operations, I suggest that you change TOP 10 to TOP 50. This is the list of all the articles in the computed columns series. SQL SERVER – Computed Column – PERSISTED and Storage This article talks about how computed columns are created and why they take more storage space than before. SQL SERVER – Computed Column – PERSISTED and Performance This article talks about how PERSISTED columns give better performance than non-persisted columns. SQL SERVER – Computed Column – PERSISTED and Performance – Part 2 This article talks about how non-persisted columns give better performance than PERSISTED columns. SQL SERVER – Computed Column and Performance – Part 3 This article talks about how an index improves the performance of Computed Columns. SQL SERVER – Computed Column – PERSISTED and Storage – Part 2 This article talks about how creating an index on a computed column does not grow the row length of the table. SQL SERVER – Computed Columns – Index and Performance This article summarizes all the articles related to computed columns. 2011 SQL SERVER – Interview Questions and Answers – Frequently Asked Questions – Data Warehousing Concepts – Day 21 of 31 What is Data Warehousing? What is Business Intelligence (BI)? What is a Dimension Table? What is Dimensional Modeling? What is a Fact Table? What are the Fundamental Stages of Data Warehousing? What are the Different Methods of Loading Dimension tables? Describe the Foreign Key Columns in the Fact Table and Dimension Table. What is Data Mining? What is the Difference between a View and a Materialized View? SQL SERVER – Interview Questions and Answers – Frequently Asked Questions – Data Warehousing Concepts – Day 22 of 31 What is OLTP? What is OLAP? What is the Difference between OLTP and OLAP? What is ODS? What is ER Diagram? SQL SERVER – Interview Questions and Answers – Frequently Asked Questions – Data Warehousing Concepts – Day 23 of 31 What is ETL? What is VLDB? Is the OLTP Database Design Optimal for a Data Warehouse? If denormalizing improves Data Warehouse Processes, then why is the Fact Table in the Normal Form? What are Lookup Tables? What are Aggregate Tables? What is Real-Time Data-Warehousing? What are Conformed Dimensions? What is a Conformed Fact? How do you Load the Time Dimension? What is a Level of Granularity of a Fact Table? What are Non-Additive Facts? What is a Factless Fact Table? What are Slowly Changing Dimensions (SCD)? SQL SERVER – Interview Questions and Answers – Frequently Asked Questions – Data Warehousing Concepts – Day 24 of 31 What is a Hybrid Slowly Changing Dimension? What is a BUS Schema? What is a Star Schema? What is a Snowflake Schema? What are the Differences between the Star and Snowflake Schema? What is the Difference between ER Modeling and Dimensional Modeling? What is a Degenerate Dimension Table? Why is Data Modeling Important? What is a Surrogate Key? What is a Junk Dimension? What is a Data Mart? What is the Difference between OLAP and Data Warehouse? What is a Cube and Linked Cube with Reference to Data Warehouse? What is Snapshot with Reference to Data Warehouse? What is Active Data Warehousing? What is the Difference between Data Warehousing and Business Intelligence? What is MDS? Explain the Paradigm of Bill Inmon and Ralph Kimball. SQL SERVER – Azure Interview Questions and Answers – Guest Post by Paras Doshi – Day 25 of 31 Paras Doshi has submitted 21 interesting questions and answers for SQL Azure. 1. What is SQL Azure? 2. What is cloud computing?
3. How is SQL Azure different from SQL Server? 4. How many replicas are maintained for each SQL Azure database? 5. How can we migrate from SQL Server to SQL Azure? 6. Which tools are available to manage SQL Azure databases and servers? 7. Tell me something about security and SQL Azure. 8. What is SQL Azure Firewall? 9. What is the difference between web edition and business edition? 10. How do we synchronize On Premise SQL Server with SQL Azure? 11. How do we Backup SQL Azure Data? 12. What is the current pricing model of SQL Azure? 13. What is the current limitation of the size of SQL Azure DB? 14. How do you handle datasets larger than 50 GB? 15. What happens when the SQL Azure database reaches Max Size? 16. How many databases can we create in a single server? 17. How many servers can we create in a single subscription? 18. How do you improve the performance of a SQL Azure Database? 19. What is code near application topology? 20. What were the latest updates to SQL Azure service? 21. When does a workload on SQL Azure get throttled? SQL SERVER – Interview Questions and Answers – Guest Post by Malathi Mahadevan – Day 26 of 31 Malathi had asked a simple question which has several answers. Each answer makes you think and ponder about the reality of the IT world. Look at the simple question – ‘What is the toughest challenge you have faced in your present job and how did you handle it?’ – and its various answers. Each answer has its own story. SQL SERVER – Interview Questions and Answers – Guest Post by Rick Morelan – Day 27 of 31 Rick Morelan of Joes2Pros has written an excellent blog post on the subject of how to find top N values. Most people are fully aware of how the TOP keyword works with a SELECT statement. After years preparing so many students to pass the SQL Certification I noticed they were pretty well prepared for job interviews too. Yes, they would do well in the interview but not great. There seemed to be a few questions that would come up repeatedly for almost everyone. Rick addresses similar questions with his lucid writing style. 2012 Observation of Top with Index and Order of Resultset SQL Server has lots of things to learn and share. It is amazing to see how people evaluate and understand different techniques and styles differently when implementing. The real reason may be absolutely different, but we may blame something totally different for the incorrect results. Read the blog post to learn more. How do I Record Video and Webcast How to Convert Hex to Decimal or INT Earlier I was asked a question about how to convert Hex to Decimal. I promised that I would post an answer with due credit to the author but never got around to posting a blog post about it. Read the original post over here: SQL SERVER – Question – How to Convert Hex to Decimal. Query to Get Unique Distinct Data Based on Condition – Eliminate Duplicate Data from Resultset The natural reaction will be to suggest DISTINCT or GROUP BY. However, not all the questions can be solved by DISTINCT or GROUP BY. Let us see the following example, where a user wanted only the latest records to be displayed. Let us see the example to understand further. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Memory Lane, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology
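For the last item above, one common pattern beyond DISTINCT or GROUP BY is to number the rows per key and keep only the newest one. Here is a minimal sketch; the table dbo.Orders and its columns are made up for illustration and are not from the original example.

-- Keep only the most recent row per CustomerID; table and column names are illustrative
WITH Ranked AS
(
    SELECT  CustomerID,
            OrderDate,
            OrderAmount,
            ROW_NUMBER() OVER (PARTITION BY CustomerID
                               ORDER BY OrderDate DESC) AS rn
    FROM dbo.Orders
)
SELECT CustomerID, OrderDate, OrderAmount
FROM Ranked
WHERE rn = 1;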

    Read the article
