Search Results

Search found 124721 results on 4989 pages for 'internal server error'.

Page 165/4989 | < Previous Page | 161 162 163 164 165 166 167 168 169 170 171 172  | Next Page >

  • SQL SERVER – Example of Performance Tuning for Advanced Users with DB Optimizer

    - by Pinal Dave
    Performance tuning is such a subject that everyone wants to master it. In beginning everybody is at a novice level and spend lots of time learning how to master the art of performance tuning. However, as we progress further the tuning of the system keeps on getting very difficult. I have understood in my early career there should be no need of ego in the technology field. There are always better solutions and better ideas out there and we should not resist them. Instead of resisting the change and new wave I personally adopt it. Here is a similar example, as I personally progress to the master level of performance tuning, I face that it is getting harder to come up with optimal solutions. In such scenarios I rely on various tools to teach me how I can do things better. Once I learn about tools, I am often able to come up with better solutions when I face the similar situation next time. A few days ago I had received a query where the user wanted to tune it further to get the maximum out of the performance. I have re-written the similar query with the help of AdventureWorks sample database. SELECT * FROM HumanResources.Employee e INNER JOIN HumanResources.EmployeeDepartmentHistory edh ON e.BusinessEntityID = edh.BusinessEntityID INNER JOIN HumanResources.Shift s ON edh.ShiftID = s.ShiftID; User had similar query to above query was used in very critical report and wanted to get best out of the query. When I looked at the query – here were my initial thoughts Use only column in the select statements as much as you want in the application Let us look at the query pattern and data workload and find out the optimal index for it Before I give further solutions I was told by the user that they need all the columns from all the tables and creating index was not allowed in their system. He can only re-write queries or use hints to further tune this query. Now I was in the constraint box – I believe * was not a great idea but if they wanted all the columns, I believe we can’t do much besides using *. Additionally, if I cannot create a further index, I must come up with some creative way to write this query. I personally do not like to use hints in my application but there are cases when hints work out magically and gives optimal solutions. Finally, I decided to use Embarcadero’s DB Optimizer. It is a fantastic tool and very helpful when it is about performance tuning. I have previously explained how it works over here. First open DBOptimizer and open Tuning Job from File >> New >> Tuning Job. Once you open DBOptimizer Tuning Job follow the various steps indicates in the following diagram. Essentially we will take our original script and will paste that into Step 1: New SQL Text and right after that we will enable Step 2 for Generating Various cases, Step 3 for Detailed Analysis and Step 4 for Executing each generated case. Finally we will click on Analysis in Step 5 which will generate the report detailed analysis in the result pan. The detailed pan looks like. It generates various cases of T-SQL based on the original query. It applies various hints and available hints to the query and generate various execution plans of the query and displays them in the resultant. You can clearly notice that original query had a cost of 0.0841 and logical reads about 607 pages. Whereas various options which are just following it has different execution cost as well logical read. There are few cases where we have higher logical read and there are few cases where as we have very low logical read. If we pay attention the very next row to original query have Merge_Join_Query in description and have lowest execution cost value of 0.044 and have lowest Logical Reads of 29. This row contains the query which is the most optimal re-write of the original query. Let us double click over it. Here is the query: SELECT * FROM HumanResources.Employee e INNER JOIN HumanResources.EmployeeDepartmentHistory edh ON e.BusinessEntityID = edh.BusinessEntityID INNER JOIN HumanResources.Shift s ON edh.ShiftID = s.ShiftID OPTION (MERGE JOIN) If you notice above query have additional hint of Merge Join. With the help of this Merge Join query hint this query is now performing much better than before. The entire process takes less than 60 seconds. Please note that it the join hint Merge Join was optimal for this query but it is not necessary that the same hint will be helpful in all the queries. Additionally, if the workload or data pattern changes the query hint of merge join may be no more optimal join. In that case, we will have to redo the entire exercise once again. This is the reason I do not like to use hints in my queries and I discourage all of my users to use the same. However, if you look at this example, this is a great case where hints are optimizing the performance of the query. It is humanly not possible to test out various query hints and index options with the query to figure out which is the most optimal solution. Sometimes, we need to depend on the efficiency tools like DB Optimizer to guide us the way and select the best option from the suggestion provided. Let me know what you think of this article as well your experience with DB Optimizer. Please leave a comment. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Joins, SQL Optimization, SQL Performance, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

    Read the article

  • SQL SERVER – Weekend Project – Experimenting with ACID Transactions, SQL Compliant, Elastically Scalable Database

    - by pinaldave
    Database technology is huge and big world. I like to explore always beyond what I know and share the learning. Weekend is the best time when I sit around download random software on my machine which I like to call as a lab machine (it is a pretty old laptop, hardly a quality as lab machine) and experiment it. There are so many free betas available for download that it’s hard to keep track and even harder to find the time to play with very many of them.  This blog is about one you shouldn’t miss if you are interested in the learning various relational databases. NuoDB just released their Beta 7.  I had already downloaded their Beta 6 and yesterday did the same for 7.   My impression is that they are onto something very very interesting.  In fact, it might be something really promising in terms of database elasticity, scale and operational cost reduction. The folks at NuoDB say they are working on the world’s first “emergent” database which they tout as a brand new transitional database that is intended to dramatically change what’s possible with OLTP.  It is SQL compliant, guarantees ACID transactions, yet scales elastically on heterogeneous and decentralized cloud-based resources. Interesting note for sure, making me explore more. Based on what I’ve seen so far, they are solving the architectural challenge that exists between elastic, cloud-based compute infrastructures designed to scale out in response to workload requirements versus the traditional relational database management system’s architecture of central control. Here’s my experience with the NuoDB Beta 6 so far: First they pretty much threw away all the features you’d associate with existing RDBMS architectures except the SQL and ACID transactions which they were smart to keep.  It looks like they have incorporated a number of the big ideas from various algorithms, systems and techniques to achieve maximum DB scalability. From a user’s perspective, the NuoDB Beta software behaves like any other traditional SQL database and seems to offer all the benefits users have come to expect from standards-based SQL solutions. One of the interesting feature is that one can run a transactional node and a storage node on my Windows laptop as well on other platforms – indeed interesting for sure. It’s quite amazing to see a database elastically scale across machine boundaries. So, one of the basic NuoDB concepts is that as you need to scale out, you can easily use more inexpensive hardware when/where you need it.  This is unlike what we have traditionally done to scale a database for an application – we replace the hardware with something more powerful (faster CPU and Disks). This is where I started to feel like NuoDB is on to something that has the potential to elastically scale on commodity hardware while reducing operational expense for a big OLTP database to a degree we’ve never seen before. NuoDB is able to fully leverage the cloud in an asynchronous and highly decentralized manner – while providing both SQL compliance and ACID transactions. Basically what NuoDB is doing is so new that it is all hard to believe until you’ve experienced it in action.  I will keep you up to date as I test the NuoDB Beta 7 but if you are developing a web-scale application or have an on-premise app you are thinking of moving to the cloud, testing this beta is worth your time. If you do try it, let me know what you think.  Before I say anything more, I am going to do more experiments and more test on this product and compare it with other existing similar products. For me it was a weekend worth spent on learning something new. I encourage you to download Beta 7 version and share your opinions here. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Documentation, SQL Download, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

    Read the article

  • SQL SERVER – Fundamentals of Columnstore Index

    - by pinaldave
    There are two kind of storage in database. Row Store and Column Store. Row store does exactly as the name suggests – stores rows of data on a page – and column store stores all the data in a column on the same page. These columns are much easier to search – instead of a query searching all the data in an entire row whether the data is relevant or not, column store queries need only to search much lesser number of the columns. This means major increases in search speed and hard drive use. Additionally, the column store indexes are heavily compressed, which translates to even greater memory and faster searches. I am sure this looks very exciting and it does not mean that you convert every single index from row store to column store index. One has to understand the proper places where to use row store or column store indexes. Let us understand in this article what is the difference in Columnstore type of index. Column store indexes are run by Microsoft’s VertiPaq technology. However, all you really need to know is that this method of storing data is columns on a single page is much faster and more efficient. Creating a column store index is very easy, and you don’t have to learn new syntax to create them. You just need to specify the keyword “COLUMNSTORE” and enter the data as you normally would. Keep in mind that once you add a column store to a table, though, you cannot delete, insert or update the data – it is READ ONLY. However, since column store will be mainly used for data warehousing, this should not be a big problem. You can always use partitioning to avoid rebuilding the index. A columnstore index stores each column in a separate set of disk pages, rather than storing multiple rows per page as data traditionally has been stored. The difference between column store and row store approaches is illustrated below: In case of the row store indexes multiple pages will contain multiple rows of the columns spanning across multiple pages. In case of column store indexes multiple pages will contain multiple single columns. This will lead only the columns needed to solve a query will be fetched from disk. Additionally there is good chance that there will be redundant data in a single column which will further help to compress the data, this will have positive effect on buffer hit rate as most of the data will be in memory and due to same it will not need to be retrieved. Let us see small example of how columnstore index improves the performance of the query on a large table. As a first step let us create databaseset which is large enough to show performance impact of columnstore index. The time taken to create sample database may vary on different computer based on the resources. USE AdventureWorks GO -- Create New Table CREATE TABLE [dbo].[MySalesOrderDetail]( [SalesOrderID] [int] NOT NULL, [SalesOrderDetailID] [int] NOT NULL, [CarrierTrackingNumber] [nvarchar](25) NULL, [OrderQty] [smallint] NOT NULL, [ProductID] [int] NOT NULL, [SpecialOfferID] [int] NOT NULL, [UnitPrice] [money] NOT NULL, [UnitPriceDiscount] [money] NOT NULL, [LineTotal] [numeric](38, 6) NOT NULL, [rowguid] [uniqueidentifier] NOT NULL, [ModifiedDate] [datetime] NOT NULL ) ON [PRIMARY] GO -- Create clustered index CREATE CLUSTERED INDEX [CL_MySalesOrderDetail] ON [dbo].[MySalesOrderDetail] ( [SalesOrderDetailID]) GO -- Create Sample Data Table -- WARNING: This Query may run upto 2-10 minutes based on your systems resources INSERT INTO [dbo].[MySalesOrderDetail] SELECT S1.* FROM Sales.SalesOrderDetail S1 GO 100 Now let us do quick performance test. I have kept STATISTICS IO ON for measuring how much IO following queries take. In my test first I will run query which will use regular index. We will note the IO usage of the query. After that we will create columnstore index and will measure the IO of the same. -- Performance Test -- Comparing Regular Index with ColumnStore Index USE AdventureWorks GO SET STATISTICS IO ON GO -- Select Table with regular Index SELECT ProductID, SUM(UnitPrice) SumUnitPrice, AVG(UnitPrice) AvgUnitPrice, SUM(OrderQty) SumOrderQty, AVG(OrderQty) AvgOrderQty FROM [dbo].[MySalesOrderDetail] GROUP BY ProductID ORDER BY ProductID GO -- Table 'MySalesOrderDetail'. Scan count 1, logical reads 342261, physical reads 0, read-ahead reads 0. -- Create ColumnStore Index CREATE NONCLUSTERED COLUMNSTORE INDEX [IX_MySalesOrderDetail_ColumnStore] ON [MySalesOrderDetail] (UnitPrice, OrderQty, ProductID) GO -- Select Table with Columnstore Index SELECT ProductID, SUM(UnitPrice) SumUnitPrice, AVG(UnitPrice) AvgUnitPrice, SUM(OrderQty) SumOrderQty, AVG(OrderQty) AvgOrderQty FROM [dbo].[MySalesOrderDetail] GROUP BY ProductID ORDER BY ProductID GO It is very clear from the results that query is performance extremely fast after creating ColumnStore Index. The amount of the pages it has to read to run query is drastically reduced as the column which are needed in the query are stored in the same page and query does not have to go through every single page to read those columns. If we enable execution plan and compare we can see that column store index performance way better than regular index in this case. Let us clean up the database. -- Cleanup DROP INDEX [IX_MySalesOrderDetail_ColumnStore] ON [dbo].[MySalesOrderDetail] GO TRUNCATE TABLE dbo.MySalesOrderDetail GO DROP TABLE dbo.MySalesOrderDetail GO In future posts we will see cases where Columnstore index is not appropriate solution as well few other tricks and tips of the columnstore index. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, SQL, SQL Authority, SQL Index, SQL Optimization, SQL Performance, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, T SQL, Technology

    Read the article

  • Date and Time Support in SQL Server 2008

    - by Aamir Hasan
      Using the New Date and Time Data Types Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin-top:0in; mso-para-margin-right:0in; mso-para-margin-bottom:10.0pt; mso-para-margin-left:0in; line-height:115%; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;} 1.       The new date and time data types in SQL Server 2008 offer increased range and precision and are ANSI SQL compatible. 2.       Separate date and time data types minimize storage space requirements for applications that need only date or time information. Moreover, the variable precision of the new time data type increases storage savings in exchange for reduced accuracy. 3.       The new data types are mostly compatible with the original date and time data types and use the same Transact-SQL functions. 4.       The datetimeoffset data type allows you to handle date and time information in global applications that use data that originates from different time zones. SELECT c.name, p.* FROM politics pJOIN country cON p.country = c.codeWHERE YEAR(Independence) < 1753ORDER BY IndependenceGO8.    Highlight the SELECT statement and click Execute ( ) to show the use of some of the date functions.T-SQLSELECT c.name AS [Country Name],        CONVERT(VARCHAR(12), p.Independence, 107) AS [Independence Date],       DATEDIFF(YEAR, p.Independence, GETDATE()) AS [Years Independent (appox)],       p.GovernmentFROM politics pJOIN country cON p.country = c.codeWHERE YEAR(Independence) < 1753ORDER BY IndependenceGO10.    Select the SET DATEFORMAT statement and click Execute ( ) to change the DATEFORMAT to day-month-year.T-SQLSET DATEFORMAT dmyGO11.    Select the DECLARE and SELECT statements and click Execute ( ) to show how the datetime and datetime2 data types interpret a date literal.T-SQLSET DATEFORMAT dmyDECLARE @dt datetime = '2008-12-05'DECLARE @dt2 datetime2 = '2008-12-05'SELECT MONTH(@dt) AS [Month-Datetime], DAY(@dt)     AS [Day-Datetime]SELECT MONTH(@dt2) AS [Month-Datetime2], DAY(@dt2)     AS [Day-Datetime2]GO12.    Highlight the DECLARE and SELECT statements and click Execute ( ) to use integer arithmetic on a datetime variable.T-SQLDECLARE @dt datetime = '2008-12-05'SELECT @dt + 1GO13.    Highlight the DECLARE and SELECT statements and click Execute ( ) to show how integer arithmetic is not allowed for datetime2 variables.T-SQLDECLARE @dt2 datetime = '2008-12-05'SELECT @dt2 + 1GO14.    Highlight the DECLARE and SELECT statements and click Execute ( ) to show how to use DATE functions to do simple arithmetic on datetime2 variables.T-SQLDECLARE @dt2 datetime2(7) = '2008-12-05'SELECT DATEADD(d, 1, @dt2)GO15.    Highlight the DECLARE and SELECT statements and click Execute ( ) to show how the GETDATE function can be used with both datetime and datetime2 data types.T-SQLDECLARE @dt datetime = GETDATE();DECLARE @dt2 datetime2(7) = GETDATE();SELECT @dt AS [GetDate-DateTime], @dt2 AS [GetDate-DateTime2]GO16.    Draw attention to the values returned for both columns and how they are equal.17.    Highlight the DECLARE and SELECT statements and click Execute ( ) to show how the SYSDATETIME function can be used with both datetime and datetime2 data types.T-SQLDECLARE @dt datetime = SYSDATETIME();DECLARE @dt2 datetime2(7) = SYSDATETIME();SELECT @dt AS [Sysdatetime-DateTime], @dt2     AS [Sysdatetime-DateTime2]GO18.    Draw attention to the values returned for both columns and how they are different.Programming Global Applications with DateTimeOffset 2.    If you have not previously created the SQLTrainingKitDB database while completing another demo in this training kit, highlight the CREATE DATABASE statement and click Execute ( ) to do so now.T-SQLCREATE DATABASE SQLTrainingKitDBGO3.    Select the USE and CREATE TABLE statements and click Execute ( ) to create table datetest in the SQLTrainingKitDB database.T-SQLUSE SQLTrainingKitDBGOCREATE TABLE datetest (  id integer IDENTITY PRIMARY KEY,  datetimecol datetimeoffset,  EnteredTZ varchar(40)); Reference:http://www.microsoft.com/downloads/details.aspx?FamilyID=E9C68E1B-1E0E-4299-B498-6AB3CA72A6D7&displaylang=en   

    Read the article

  • SQL SERVER – master Database Log File Grew Too Big

    - by pinaldave
    Couple of the days ago, I received following email and I find this email very interesting and I feel like sharing with all of you. Note: Please read the whole email before providing your suggestions. “Hi Pinal, If you can share these details on your blog, it will help many. We understand the value of the master database and we take its regular back up (everyday midnight). Yesterday we noticed that our master database log file has grown very large. This is very first time that we have encountered such an issue. The master database is in simple recovery mode; so we assumed that it will never grow big; however, we now have a big log file. We ran the following command USE [master] GO DBCC SHRINKFILE (N'mastlog' , 0, TRUNCATEONLY) GO We know this command will break the chains of LSN but as per our understanding; it should not matter as we are in simple recovery model.     After running this, the log file becomes very small. Just to be cautious, we took full backup of the master database right away. We totally understand that this is not the normal practice; so if you are going to tell us the same, we are aware of it. However, here is the question for you? What operation in master database would have caused our log file to grow too large? Thanks, [name and company name removed as per request]“ Here was my response to them: “Hi [name removed], It is great that you are aware of all the right steps and method. Taking full backup when you are not sure is always a good practice. Regarding your question what could have caused your master database log to grow larger, let me try to guess what could have happened. Do you have any user table in the master database? If yes, this is not recommended and also NOT a good practice. If have user tables in master database and you are doing any long operation (may be lots of insert, update, delete or rebuilding them), then it can cause this situation. You have made me curious about your scenario; do revert back. Kind Regards, Pinal” Within few minutes I received reply: “That was it Pinal. We had one of the maintenance task log tables created in the master table, which had many long transactions during the night. We moved it to newly created database named ‘maintenance’, and we will keep you updated.” I was very glad to receive the email. I do not suggest that any user table should be created in the master database. It should be left alone from user objects. Now here is the question for you – can you think of any other reason for master log file growth? Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, SQL, SQL Authority, SQL Backup and Restore, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, T SQL, Technology

    Read the article

  • SQL SERVER – OVER clause with FIRST _VALUE and LAST_VALUE – Analytic Functions Introduced in SQL Server 2012 – ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING

    - by pinaldave
    Yesterday I had discussed two analytical functions FIRST_VALUE and LAST_VALUE. After reading the blog post I received very interesting question. “Don’t you think there is bug in your first example where FIRST_VALUE is remain same but the LAST_VALUE is changing every line. I think the LAST_VALUE should be the highest value in the windows or set of result.” I find this question very interesting because this is very commonly made mistake. No there is no bug in the code. I think what we need is a bit more explanation. Let me attempt that first. Before you do that I suggest you read yesterday’s blog post as this question is related to that blog post. Now let’s have fun following query: USE AdventureWorks GO SELECT s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty, FIRST_VALUE(SalesOrderDetailID) OVER (ORDER BY SalesOrderDetailID) FstValue, LAST_VALUE(SalesOrderDetailID) OVER (ORDER BY SalesOrderDetailID) LstValue FROM Sales.SalesOrderDetail s WHERE SalesOrderID IN (43670, 43669, 43667, 43663) ORDER BY s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty GO The above query will give us the following result: As per the reader’s question the value of the LAST_VALUE function should be always 114 and not increasing as the rows are increased. Let me re-write the above code once again with bit extra T-SQL Syntax. Please pay special attention to the ROW clause which I have added in the above syntax. USE AdventureWorks GO SELECT s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty, FIRST_VALUE(SalesOrderDetailID) OVER (ORDER BY SalesOrderDetailID ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) FstValue, LAST_VALUE(SalesOrderDetailID) OVER (ORDER BY SalesOrderDetailID ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) LstValue FROM Sales.SalesOrderDetail s WHERE SalesOrderID IN (43670, 43669, 43667, 43663) ORDER BY s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty GO Now once again check the result of the above query. The result of both the query is same because in OVER clause the default ROWS selection is always UNBOUNDED PRECEDING AND CURRENT ROW. If you want the maximum value of the windows with OVER clause you need to change the syntax to UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING for ROW clause. Now run following query and pay special attention to ROW clause again. USE AdventureWorks GO SELECT s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty, FIRST_VALUE(SalesOrderDetailID) OVER (PARTITION BY SalesOrderID ORDER BY SalesOrderDetailID ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) FstValue, LAST_VALUE(SalesOrderDetailID) OVER (PARTITION BY SalesOrderID ORDER BY SalesOrderDetailID ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) LstValue FROM Sales.SalesOrderDetail s WHERE SalesOrderID IN (43670, 43669, 43667, 43663) ORDER BY s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty GO Here is the resultset of the above query which is what questioner was asking. So in simple word, there is no bug but there is additional syntax needed to add to get your desired answer. The same logic also applies to PARTITION BY clause when used. Here is quick example of how we can further partition the query by SalesOrderDetailID with this new functions. USE AdventureWorks GO SELECT s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty, FIRST_VALUE(SalesOrderDetailID) OVER (PARTITION BY SalesOrderID ORDER BY SalesOrderDetailID ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) FstValue, LAST_VALUE(SalesOrderDetailID) OVER (PARTITION BY SalesOrderID ORDER BY SalesOrderDetailID ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) LstValue FROM Sales.SalesOrderDetail s WHERE SalesOrderID IN (43670, 43669, 43667, 43663) ORDER BY s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty GO Above query will give us windowed resultset on SalesOrderDetailsID as well give us FIRST and LAST value for the windowed resultset. There are lots to discuss for this two functions and we have just explored tip of the iceberg. In future post I will discover it further deep. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, SQL, SQL Authority, SQL Function, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, T SQL, Technology

    Read the article

  • SQL SERVER – Introduction to Big Data – Guest Post

    - by pinaldave
    BIG Data – such a big word – everybody talks about this now a days. It is the word in the database world. In one of the conversation I asked my friend Jasjeet Sigh the same question – what is Big Data? He instantly came up with a very effective write-up.  Jasjeet is working as a Technical Manager with Koenig Solutions. He leads the SQL domain, and holds rich IT industry experience. Talking about Koenig, it is a 19 year old IT training company that offers several certification choices. Some of its courses include SharePoint Training, Project Management certifications, Microsoft Trainings, Business Intelligence programs, Web Design and Development courses etc. Big Data, as the name suggests, is about data that is BIG in nature. The data is BIG in terms of size, and it is difficult to manage such enormous data with relational database management systems that are quite popular these days. Big Data is not just about being large in size, it is also about the variety of the data that differs in form or type. Some examples of Big Data are given below : Scientific data related to weather and atmosphere, Genetics etc Data collected by various medical procedures, such as Radiology, CT scan, MRI etc Data related to Global Positioning System Pictures and Videos Radio Frequency Data Data that may vary very rapidly like stock exchange information Apart from difficulties in managing and storing such data, it is difficult to query, analyze and visualize it. The characteristics of Big Data can be defined by four Vs: Volume: It simply means a large volume of data that may span Petabyte, Exabyte and so on. However it also depends organization to organization that what volume of data they consider as Big Data. Variety: As discussed above, Big Data is not limited to relational information or structured Data. It can also include unstructured data like pictures, videos, text, audio etc. Velocity:  Velocity means the speed by which data changes. The higher is the velocity, the more efficient should be the system to capture and analyze the data. Missing any important point may lead to wrong analysis or may even result in loss. Veracity: It has been recently added as the fourth V, and generally means truthfulness or adherence to the truth. In terms of Big Data, it is more of a challenge than a characteristic. It is difficult to ascertain the truth out of the enormous amount of data and the one that has high velocity. There are always chances of having un-precise and uncertain data. It is a challenging task to clean such data before it is analyzed. Big Data can be considered as the next big thing in the IT sector in terms of innovation and development. If appropriate technologies are developed to analyze and use the information, it can be the driving force for almost all industrial segments. These include Retail, Manufacturing, Service, Finance, Healthcare etc. This will help them to automate business decisions, increase productivity, and innovate and develop new products. Thanks Jasjeet Singh for an excellent write up.  Jasjeet Sign is working as a Technical Manager with Koenig Solutions. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Database, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology Tagged: Big Data

    Read the article

  • SQL SERVER – Asynchronous Update and Timestamp – Check if Row Values are Changed Since Last Retrieve

    - by pinaldave
    Here is the question received just this morning. “Pinal, Our application is much different than other application you might have come across. In simple words, I would like to call it Asynchronous Updated Application. We need your quick opinion about one of the situation which we are facing. From business side: We have bidding system (similar to eBay but not exactly) and where multiple parties bid on one item, during the last few minutes of bidding many parties try to bid at the same time with the same price. When they hit submit, we would like to check if the original data which they retrieved is changed or not. If the original data which they have retrieved is the same, we will accept their new proposed price. If original data are changed, they will have to resubmit the data with new price. From technical side: We have a row which we retrieve in our application. Multiple users are retrieving the same row. Some of the users will update the value of the row and submit. However, only the very first user should be allowed to update the row and remaining all the users will have to re-fetch the row and updated it once again. We do not want to lock any record as that will create other problems. Do you have any solution for this kind of situation?” Fantastic Question. I believe there is good chance that we can use timestamp datatype in this kind of application. Before we continue let us see following simple example. USE tempdb GO CREATE TABLE SampleTable (ID INT, Col1 VARCHAR(100), TimeStampCol TIMESTAMP) GO INSERT INTO SampleTable (ID, Col1) VALUES (1, 'FirstVal') GO SELECT ID, Col1, TimeStampCol FROM SampleTable st GO UPDATE SampleTable SET Col1 = 'NextValue' GO SELECT ID, Col1, TimeStampCol FROM SampleTable st GO DROP TABLE SampleTable GO Now let us see the resultset. Here is the simple explanation of the scenario. We created a table with simple column with TIMESTAMP datatype. When we inserted a very first value the timestamp was generated. When we updated any value in that row, the timestamp was updated with the new value. Every single time when we update any value in the row, it will generate new timestamp value. Now let us apply this in an original question’s scenario. In that case multiple users are retrieving the same row. Everybody will have the same now same TimeStamp with them. Before any user update any value they should once again retrieve the timestamp from the table and compare with the timestamp they have with them. If both of the timestamp have the same value – the original row has not been updated and we can safely update the row with the new value. After initial update, now the row will contain a new timestamp. Any subsequent update to the same row should also go to the same process of checking the value of the timestamp they have in their memory. In this case, the timestamp from memory will be different from the timestamp in the row. This indicates that row in the table has changed and new updates should not be allowed. I believe timestamp can be very very useful in this kind of scenario. Is there any better alternative? Please leave a comment with the suggestion and I will post on the blog with due credit. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

    Read the article

  • SQL SERVER – Renaming Index – Index Naming Conventions

    - by pinaldave
    If you are regular reader of this blog, you must be aware of that there are two kinds of blog posts 1) I share what I learn recently 2) I share what I learn and request your participation. Today’s blog post is where I need your opinion to make this blog post a good reference for future. Background Story Recently I came across system where users have changed the name of the few of the table to match their new standard naming convention. The name of the table should be self explanatory and they should have explain their purpose without either opening it or reading documentations. Well, not every time this is possible but again this should be the goal of any database modeler. Well, I no way encourage the name of the tables to be too long like ‘ContainsDetailsofNewInvoices’. May be the name of the table should be ‘Invoices’ and table should contain a column with New/Processed bit filed to indicate if the invoice is processed or not (if necessary). Coming back to original story, the database had several tables of which the name were changed. Story Continues… To continue the story let me take simple example. There was a table with the name  ’ReceivedInvoices’, it was changed to new name as ‘TblInvoices’. As per their new naming standard they had to prefix every talbe with the words ‘Tbl’ and prefix every view with the letters ‘Vw’. Personally I do not see any need of the prefix but again, that issue is not here to discuss.  Now after changing the name of the table they faced very interesting situation. They had few indexes on the table which had name of the table. Let us take an example. Old Name of Table: ReceivedInvoice Old Name of Index: Index_ReceivedInvoice1 Here is the new names New Name of Table: TblInvoices New Name of Index: ??? Well, their dilemma was what should be the new naming convention of the Indexes. Here is a quick proposal of the Index naming convention. Do let me know your opinion. If Index is Primary Clustered Index: PK_TableName If Index is  Non-clustered Index: IX_TableName_ColumnName1_ColumnName2… If Index is Unique Non-clustered Index: UX_TableName_ColumnName1_ColumnName2… If Index is Columnstore Non-clustered Index: CL_TableName Here ColumnName is the column on which index is created. As there can be only one Primary Key Index and Columnstore Index per table, they do not require ColumnName in the name of the index. The purpose of this new naming convention is to increase readability. When any user come across this index, without opening their properties or definition, user can will know the details of the index. T-SQL script to Rename Indexes Here is quick T-SQL script to rename Indexes EXEC sp_rename N'SchemaName.TableName.IndexName', N'New_IndexName', N'INDEX'; GO Your Contribute Please Well, the organization has already defined above four guidelines, personally I follow very similar guidelines too. I have seen many variations like adding prefixes CL for Clustered Index and NCL for Non-clustered Index. I have often seen many not using UX prefix for Unique Index but rather use generic IX prefix only. Now do you think if they have missed anything in the coding standard. Is NCI and CI prefixed required to additionally describe the index names. I have once received suggestion to even add fill factor in the index name – which I do not recommend at all. What do you think should be ideal name of the index, so it explains all the most important properties? Additionally, you are welcome to vote if you believe changing the name of index is just waste of time and energy.  Note: The purpose of the blog post is to encourage all to participate with their ideas. I will write follow up blog posts in future compiling all the suggestions. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Index, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

    Read the article

  • SQL SERVER – Finding Different ColumnName From Almost Identitical Tables

    - by pinaldave
    I have mentioned earlier on this blog that I love social media – Facebook and Twitter. I receive so many interesting questions that sometimes I wonder how come I never faced them in my real life scenario. Well, let us see one of the similar situation. Here is one of the questions which I received on my social media handle. “Pinal, I have a large database. I did not develop this database but I have inherited this database. In our database we have many tables but all the tables are in pairs. We have one archive table and one current table. Now here is interesting situation. For a while due to some reason our organization has stopped paying attention to archive data. We did not archive anything for a while. If this was not enough we  even changed the schema of current table but did not change the corresponding archive table. This is now becoming a huge huge problem. We know for sure that in current table we have added few column but we do not know which ones. Is there any way we can figure out what are the new column added in the current table and does not exist in the archive tables? We cannot use any third party tool. Would you please guide us?” Well here is the interesting example of how we can use sys.column catalogue views and get the details of the newly added column. I have previously written about EXCEPT over here which is very similar to MINUS of Oracle. In following example we are going to create two tables. One of the tables has extra column. In our resultset we will get the name of the extra column as we are comparing the catalogue view of the column name. USE AdventureWorks2012 GO CREATE TABLE ArchiveTable (ID INT, Col1 VARCHAR(10), Col2 VARCHAR(100), Col3 VARCHAR(100)); CREATE TABLE CurrentTable (ID INT, Col1 VARCHAR(10), Col2 VARCHAR(100), Col3 VARCHAR(100), ExtraCol INT); GO -- Columns in ArchiveTable but not in CurrentTable SELECT name ColumnName FROM sys.columns WHERE OBJECT_NAME(OBJECT_ID) = 'ArchiveTable' EXCEPT SELECT name ColumnName FROM sys.columns WHERE OBJECT_NAME(OBJECT_ID) = 'CurrentTable' GO -- Columns in CurrentTable but not in ArchiveTable SELECT name ColumnName FROM sys.columns WHERE OBJECT_NAME(OBJECT_ID) = 'CurrentTable' EXCEPT SELECT name ColumnName FROM sys.columns WHERE OBJECT_NAME(OBJECT_ID) = 'ArchiveTable' GO DROP TABLE ArchiveTable; DROP TABLE CurrentTable; GO The above query will return us following result. I hope this solves the problems. It is not the most elegant solution ever possible but it works. Here is the puzzle back to you – what native T-SQL solution would you have provided in this situation? Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL System Table, SQL Tips and Tricks, T SQL, Technology

    Read the article

  • Another Call to a member function get_segment() on a non-object question

    - by hogofwar
    I get the above error when calling this code: <? class Test1 extends Core { function home(){ ?> This is the INDEX of test1 <? } function test2(){ echo $this->uri->get_segment(1); //this is where the error comes from ?> This is the test2 of test1 testing URI <? } } ?> I get the error where commentated. This class extends this class: <?php class Core { public function start() { require("funk/funks/libraries/uri.php"); $this->uri = new uri(); require("funk/core/loader.php"); $this->load = new loader(); if($this->uri->get_segment(1) != "" and file_exists("funk/pages/".$this->uri->get_segment(1).".php")){ include("funk/pages/". $this->uri->get_segment(1).".php"); $var = $this->uri->get_segment(2); if ($var != ""){ $home= $this->uri->get_segment(1); $Index= new $home(); $Index->$var(); }else{ $home= $this->uri->get_segment(1); $Index = new $home(); $Index->home(); } }elseif($this->uri->get_segment(1) and ! file_exists("funk/pages/".$this->uri->get_segment(1).".php")){ echo "404 Error"; }else{ include("funk/pages/index.php"); $Index = new Index(); $Index->home(); //$this->Index->index(); echo "<!--This page was created with FunkyPHP!-->"; } } } ?> And here is the contents of uri.php: <?php class uri { private $server_path_info = ''; private $segment = array(); private $segments = 0; public function __construct() { $segment_temp = array(); $this->server_path_info = preg_replace("/\?/", "", $_SERVER["PATH_INFO"]); $segment_temp = explode("/", $this->server_path_info); foreach ($segment_temp as $key => $seg) { if (!preg_match("/([a-zA-Z0-9\.\_\-]+)/", $seg) || empty($seg)) unset($segment_temp[$key]); } foreach ($segment_temp as $k => $value) { $this->segment[] = $value; } unset($segment_temp); $this->segments = count($this->segment); } public function segment_exists($id = 0) { $id = (int)$id; if (isset($this->segment[$id])) return true; else return false; } public function get_segment($id = 0) { $id--; $id = (int)$id; if ($this->segment_exists($id) === true) return $this->segment[$id]; else return false; } } ?> i have asked a similar question to this before but the answer does not apply here. I have rewritten my code 3 times to KILL and Delimb this godforsaken error! but nooooooo....

    Read the article

  • Over 200 active requests like "OPTIONS * HTTP/1.0" 200 - "-" "Apache (internal dummy connection)"

    - by Stefan Lasiewski
    Some details: Webserver: Apache/2.2.13 (FreeBSD) mod_ssl/2.2.13 OpenSSL/0.9.8e OS: FreeBSD 7.2-RELEASE This is a FreeBSD Jail. I believe I use the Apache 'prefork' MPM (I run the default for FreeBSD). I use the default values for MaxClients (256) I have enabled mod_status, with "ExtendedStatus On". When I view /server-status , I see a handful of regular requests. I also see over 230 requests from the 'localhost', like these: 37-0 - 0/0/1 . 0.00 1510 0 0.0 0.00 0.00 127.0.0.2 www.example.gov OPTIONS * HTTP/1.0 38-0 - 0/0/1 . 0.00 1509 0 0.0 0.00 0.00 127.0.0.2 www.example.gov OPTIONS * HTTP/1.0 39-0 - 0/0/3 . 0.00 1482 0 0.0 0.00 0.00 127.0.0.2 www.example.gov OPTIONS * HTTP/1.0 40-0 - 0/0/6 . 0.00 1445 0 0.0 0.00 0.00 127.0.0.2 www.example.gov OPTIONS * HTTP/1.0 I also see about 2417 requests yesterday from the localhost, like these: Apr 14 11:16:40 192.168.16.127 httpd[431]: www.example.gov 127.0.0.2 - - [15/Apr/2010:11:16:40 -0700] "OPTIONS * HTTP/1.0" 200 - "-" "Apache (internal dummy connection)" The page at http://wiki.apache.org/httpd/InternalDummyConnection says "These requests are perfectly normal and you do not, in general, need to worry about them", but I'm not so sure. Why are there over 230 of these? Are these active connections? If I have "MaxClients 256", and over 230 of these connections, it seems that my webserver is dangerously close to running out of available connections. It also seems like Apache should only need a handful of these "internal dummy connections" We actually had two unexplained outages last night, and I am wondering if these "internal dummy connection" caused us to run out of available connections.

    Read the article

  • Overriding some DNS entries in BIND for internal networks

    - by Remy Blank
    I have an internal network with a DNS server running BIND, connected to the internet through a single gateway. My domain "example.com" is managed by an external DNS provider. Some of the entries in that domain, say "host1.example.com" and "host2.example.com", as well as the top-level entry "example.com", point to the public IP address of the gateway. I would like hosts located on the internal network to resolve "host1.example.com", "host2.example.com" and "example.com" to internal IP addresses instead of that of the gateway. Other hosts like "otherhost.example.com" should still be resolved by the external DNS provider. I have succeeded in doing that for the host1 and host2 entries, by defining two single-entry zones in BIND for "host1.example.com" and "host2.example.com". However, if I add a zone for "example.com", all queries for that domain are resolved by my local DNS server, and e.g. querying "otherhost.example.com" results in an error. Is it possible to configure BIND to override only some entries of a domain, and to resolve the rest recursively?

    Read the article

  • Anonymizing OpenVPN Allow SSH Access to Internal Server

    - by Lionel
    I'm using an anonymizing VPN, but want SSH access to internal computer. How do I access my internal computer through SSH? When I do ssh 98.123.45.6, the connection times out. IP address from cable provider: 98.123.45.6 Anonymous IP through VPN: 50.1.2.3 Internal computer: 192.168.1.123 When searching around, I found recommendations to either set up iptables rules, routing rules, or to add ListenAddress to sshd_config. Which of these applies to my case? Here is my route: Kernel IP routing table Destination Gateway Genmask Flags Metric Ref Use Iface 10.115.81.1 10.115.81.9 255.255.255.255 UGH 0 0 0 tun0 10.115.81.9 * 255.255.255.255 UH 0 0 0 tun0 50.1.2.3-sta ddwrt 255.255.255.255 UGH 0 0 0 eth0 192.168.1.0 * 255.255.255.0 U 202 0 0 eth0 169.254.0.0 * 255.255.0.0 U 204 0 0 vboxnet0 loopback * 255.0.0.0 U 0 0 0 lo default 10.115.81.9 128.0.0.0 UG 0 0 0 tun0 128.0.0.0 10.115.81.9 128.0.0.0 UG 0 0 0 tun0 default ddwrt 0.0.0.0 UG 202 0 0 eth0

    Read the article

  • How to create a simple server/client application using boost.asio?

    - by the_drow
    I was going over the examples of boost.asio and I am wondering why there isn't an example of a simple server/client example that prints a string on the server and then returns a response to the client. I tried to modify the echo server but I can't really figure out what I'm doing at all. Can anyone find me a template of a client and a template of a server? I would like to eventually create a server/client application that receives binary data and just returns an acknowledgment back to the client that the data is received.

    Read the article

  • Will .NET 4.0 apps work on Win 2008 R2 Server Core?

    - by markus
    When Windows Server 2008 R2 was launched, the "server core" edition started to become useful to me, because it lets me deploy .NET background applications isolated on their own virtual machine instance with only a small fraction of all the disk space overhead of a default Windows Server installation, and very few Windows Updates. It comes with a subset of .NET 3.5 SP1 integrated (as an optional feature). Now that .NET 4.0 is released, the redistributables explicitly state that it's not support on Server Core. Any chance that there will be a separate download available for Server Core (e. g. without WPF) any time soon, has anybody heard about it?

    Read the article

  • SQL SERVER – Core Concepts – Elasticity, Scalability and ACID Properties – Exploring NuoDB an Elastically Scalable Database System

    - by pinaldave
    I have been recently exploring Elasticity and Scalability attributes of databases. You can see that in my earlier blog posts about NuoDB where I wanted to look at Elasticity and Scalability concepts. The concepts are very interesting, and intriguing as well. I have discussed these concepts with my friend Joyti M and together we have come up with this interesting read. The goal of this article is to answer following simple questions What is Elasticity? What is Scalability? How ACID properties vary from NOSQL Concepts? What are the prevailing problems in the current database system architectures? Why is NuoDB  an innovative and welcome change in database paradigm? Elasticity This word’s original form is used in many different ways and honestly it does do a decent job in holding things together over the years as a person grows and contracts. Within the tech world, and specifically related to software systems (database, application servers), it has come to mean a few things - allow stretching of resources without reaching the breaking point (on demand). What are resources in this context? Resources are the usual suspects – RAM/CPU/IO/Bandwidth in the form of a container (a process or bunch of processes combined as modules). When it is about increasing resources the simplest idea which comes to mind is the addition of another container. Another container means adding a brand new physical node. When it is about adding a new node there are two questions which comes to mind. 1) Can we add another node to our software system? 2) If yes, does adding new node cause downtime for the system? Let us assume we have added new node, let us see what the new needs of the system are when a new node is added. Balancing incoming requests to multiple nodes Synchronization of a shared state across multiple nodes Identification of “downstate” and resolution action to bring it to “upstate” Well, adding a new node has its advantages as well. Here are few of the positive points Throughput can increase nearly horizontally across the node throughout the system Response times of application will increase as in-between layer interactions will be improved Now, Let us put the above concepts in the perspective of a Database. When we mention the term “running out of resources” or “application is bound to resources” the resources can be CPU, Memory or Bandwidth. The regular approach to “gain scalability” in the database is to look around for bottlenecks and increase the bottlenecked resource. When we have memory as a bottleneck we look at the data buffers, locks, query plans or indexes. After a point even this is not enough as there needs to be an efficient way of managing such large workload on a “single machine” across memory and CPU bound (right kind of scheduling)  workload. We next move on to either read/write separation of the workload or functionality-based sharing so that we still have control of the individual. But this requires lots of planning and change in client systems in terms of knowing where to go/update/read and for reporting applications to “aggregate the data” in an intelligent way. What we ideally need is an intelligent layer which allows us to do these things without us getting into managing, monitoring and distributing the workload. Scalability In the context of database/applications, scalability means three main things Ability to handle normal loads without pressure E.g. X users at the Y utilization of resources (CPU, Memory, Bandwidth) on the Z kind of hardware (4 processor, 32 GB machine with 15000 RPM SATA drives and 1 GHz Network switch) with T throughput Ability to scale up to expected peak load which is greater than normal load with acceptable response times Ability to provide acceptable response times across the system E.g. Response time in S milliseconds (or agreed upon unit of measure) – 90% of the time The Issue – Need of Scale In normal cases one can plan for the load testing to test out normal, peak, and stress scenarios to ensure specific hardware meets the needs. With help from Hardware and Software partners and best practices, bottlenecks can be identified and requisite resources added to the system. Unfortunately this vertical scale is expensive and difficult to achieve and most of the operational people need the ability to scale horizontally. This helps in getting better throughput as there are physical limits in terms of adding resources (Memory, CPU, Bandwidth and Storage) indefinitely. Today we have different options to achieve scalability: Read & Write Separation The idea here is to do actual writes to one store and configure slaves receiving the latest data with acceptable delays. Slaves can be used for balancing out reads. We can also explore functional separation or sharing as well. We can separate data operations by a specific identifier (e.g. region, year, month) and consolidate it for reporting purposes. For functional separation the major disadvantage is when schema changes or workload pattern changes. As the requirement grows one still needs to deal with scale need in manual ways by providing an abstraction in the middle tier code. Using NOSQL solutions The idea is to flatten out the structures in general to keep all values which are retrieved together at the same store and provide flexible schema. The issue with the stores is that they are compromising on mostly consistency (no ACID guarantees) and one has to use NON-SQL dialect to work with the store. The other major issue is about education with NOSQL solutions. Would one really want to make these compromises on the ability to connect and retrieve in simple SQL manner and learn other skill sets? Or for that matter give up on ACID guarantee and start dealing with consistency issues? Hybrid Deployment – Mac, Linux, Cloud, and Windows One of the challenges today that we see across On-premise vs Cloud infrastructure is a difference in abilities. Take for example SQL Azure – it is wonderful in its concepts of throttling (as it is shared deployment) of resources and ability to scale using federation. However, the same abilities are not available on premise. This is not a mistake, mind you – but a compromise of the sweet spot of workloads, customer requirements and operational SLAs which can be supported by the team. In today’s world it is imperative that databases are available across operating systems – which are a commodity and used by developers of all hues. An Ideal Database Ability List A system which allows a linear scale of the system (increase in throughput with reasonable response time) with the addition of resources A system which does not compromise on the ACID guarantees and require developers to learn new paradigms A system which does not force fit a new way interacting with database by learning Non-SQL dialect A system which does not force fit its mechanisms for providing availability across its various modules. Well NuoDB is the first database which has all of the above abilities and much more. In future articles I will cover my hands-on experience with it. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology Tagged: NuoDB

    Read the article

  • Fraud Detection with the SQL Server Suite Part 2

    - by Dejan Sarka
    This is the second part of the fraud detection whitepaper. You can find the first part in my previous blog post about this topic. My Approach to Data Mining Projects It is impossible to evaluate the time and money needed for a complete fraud detection infrastructure in advance. Personally, I do not know the customer’s data in advance. I don’t know whether there is already an existing infrastructure, like a data warehouse, in place, or whether we would need to build one from scratch. Therefore, I always suggest to start with a proof-of-concept (POC) project. A POC takes something between 5 and 10 working days, and involves personnel from the customer’s site – either employees or outsourced consultants. The team should include a subject matter expert (SME) and at least one information technology (IT) expert. The SME must be familiar with both the domain in question as well as the meaning of data at hand, while the IT expert should be familiar with the structure of data, how to access it, and have some programming (preferably Transact-SQL) knowledge. With more than one IT expert the most time consuming work, namely data preparation and overview, can be completed sooner. I assume that the relevant data is already extracted and available at the very beginning of the POC project. If a customer wants to have their people involved in the project directly and requests the transfer of knowledge, the project begins with training. I strongly advise this approach as it offers the establishment of a common background for all people involved, the understanding of how the algorithms work and the understanding of how the results should be interpreted, a way of becoming familiar with the SQL Server suite, and more. Once the data has been extracted, the customer’s SME (i.e. the analyst), and the IT expert assigned to the project will learn how to prepare the data in an efficient manner. Together with me, knowledge and expertise allow us to focus immediately on the most interesting attributes and identify any additional, calculated, ones soon after. By employing our programming knowledge, we can, for example, prepare tens of derived variables, detect outliers, identify the relationships between pairs of input variables, and more, in only two or three days, depending on the quantity and the quality of input data. I favor the customer’s decision of assigning additional personnel to the project. For example, I actually prefer to work with two teams simultaneously. I demonstrate and explain the subject matter by applying techniques directly on the data managed by each team, and then both teams continue to work on the data overview and data preparation under our supervision. I explain to the teams what kind of results we expect, the reasons why they are needed, and how to achieve them. Afterwards we review and explain the results, and continue with new instructions, until we resolve all known problems. Simultaneously with the data preparation the data overview is performed. The logic behind this task is the same – again I show to the teams involved the expected results, how to achieve them and what they mean. This is also done in multiple cycles as is the case with data preparation, because, quite frankly, both tasks are completely interleaved. A specific objective of the data overview is of principal importance – it is represented by a simple star schema and a simple OLAP cube that will first of all simplify data discovery and interpretation of the results, and will also prove useful in the following tasks. The presence of the customer’s SME is the key to resolving possible issues with the actual meaning of the data. We can always replace the IT part of the team with another database developer; however, we cannot conduct this kind of a project without the customer’s SME. After the data preparation and when the data overview is available, we begin the scientific part of the project. I assist the team in developing a variety of models, and in interpreting the results. The results are presented graphically, in an intuitive way. While it is possible to interpret the results on the fly, a much more appropriate alternative is possible if the initial training was also performed, because it allows the customer’s personnel to interpret the results by themselves, with only some guidance from me. The models are evaluated immediately by using several different techniques. One of the techniques includes evaluation over time, where we use an OLAP cube. After evaluating the models, we select the most appropriate model to be deployed for a production test; this allows the team to understand the deployment process. There are many possibilities of deploying data mining models into production; at the POC stage, we select the one that can be completed quickly. Typically, this means that we add the mining model as an additional dimension to an existing DW or OLAP cube, or to the OLAP cube developed during the data overview phase. Finally, we spend some time presenting the results of the POC project to the stakeholders and managers. Even from a POC, the customer will receive lots of benefits, all at the sole risk of spending money and time for a single 5 to 10 day project: The customer learns the basic patterns of frauds and fraud detection The customer learns how to do the entire cycle with their own people, only relying on me for the most complex problems The customer’s analysts learn how to perform much more in-depth analyses than they ever thought possible The customer’s IT experts learn how to perform data extraction and preparation much more efficiently than they did before All of the attendees of this training learn how to use their own creativity to implement further improvements of the process and procedures, even after the solution has been deployed to production The POC output for a smaller company or for a subsidiary of a larger company can actually be considered a finished, production-ready solution It is possible to utilize the results of the POC project at subsidiary level, as a finished POC project for the entire enterprise Typically, the project results in several important “side effects” Improved data quality Improved employee job satisfaction, as they are able to proactively contribute to the central knowledge about fraud patterns in the organization Because eventually more minds get to be involved in the enterprise, the company should expect more and better fraud detection patterns After the POC project is completed as described above, the actual project would not need months of engagement from my side. This is possible due to our preference to transfer the knowledge onto the customer’s employees: typically, the customer will use the results of the POC project for some time, and only engage me again to complete the project, or to ask for additional expertise if the complexity of the problem increases significantly. I usually expect to perform the following tasks: Establish the final infrastructure to measure the efficiency of the deployed models Deploy the models in additional scenarios Through reports By including Data Mining Extensions (DMX) queries in OLTP applications to support real-time early warnings Include data mining models as dimensions in OLAP cubes, if this was not done already during the POC project Create smart ETL applications that divert suspicious data for immediate or later inspection I would also offer to investigate how the outcome could be transferred automatically to the central system; for instance, if the POC project was performed in a subsidiary whereas a central system is available as well Of course, for the actual project, I would repeat the data and model preparation as needed It is virtually impossible to tell in advance how much time the deployment would take, before we decide together with customer what exactly the deployment process should cover. Without considering the deployment part, and with the POC project conducted as suggested above (including the transfer of knowledge), the actual project should still only take additional 5 to 10 days. The approximate timeline for the POC project is, as follows: 1-2 days of training 2-3 days for data preparation and data overview 2 days for creating and evaluating the models 1 day for initial preparation of the continuous learning infrastructure 1 day for presentation of the results and discussion of further actions Quite frequently I receive the following question: are we going to find the best possible model during the POC project, or during the actual project? My answer is always quite simple: I do not know. Maybe, if we would spend just one hour more for data preparation, or create just one more model, we could get better patterns and predictions. However, we simply must stop somewhere, and the best possible way to do this, according to my experience, is to restrict the time spent on the project in advance, after an agreement with the customer. You must also never forget that, because we build the complete learning infrastructure and transfer the knowledge, the customer will be capable of doing further investigations independently and improve the models and predictions over time without the need for a constant engagement with me.

    Read the article

  • RHEL - blocked FC remote port time out: saving binding

    - by Dev G
    My Server went into a faulty state since the database could not write on the partition. I found out that the partition went into Read Only mode. Finally to fix it, I had to do a hard reboot. Linux 2.6.18-164.el5PAE #1 SMP Tue Aug 18 15:59:11 EDT 2009 i686 i686 i386 GNU/Linux /var/log/messages Oct 31 00:56:45 ota3g1 Had[17275]: VCS ERROR V-16-1-10214 Concurrency Violation:CurrentCount increased above 1 for failover group sg_network Oct 31 00:57:05 ota3g1 Had[17275]: VCS CRITICAL V-16-1-50086 CPU usage on ota3g1.mtsallstream.com is 100% Oct 31 01:01:47 ota3g1 Had[17275]: VCS ERROR V-16-1-10214 Concurrency Violation:CurrentCount increased above 1 for failover group sg_network Oct 31 01:06:50 ota3g1 Had[17275]: VCS ERROR V-16-1-10214 Concurrency Violation:CurrentCount increased above 1 for failover group sg_network Oct 31 01:11:52 ota3g1 Had[17275]: VCS ERROR V-16-1-10214 Concurrency Violation:CurrentCount increased above 1 for failover group sg_network Oct 31 01:12:10 ota3g1 kernel: lpfc 0000:29:00.1: 1:1305 Link Down Event x2 received Data: x2 x20 x80000 x0 x0 Oct 31 01:12:10 ota3g1 kernel: lpfc 0000:29:00.1: 1:1303 Link Up Event x3 received Data: x3 x1 x10 x1 x0 x0 0 Oct 31 01:12:12 ota3g1 kernel: lpfc 0000:29:00.1: 1:1305 Link Down Event x4 received Data: x4 x20 x80000 x0 x0 Oct 31 01:12:40 ota3g1 kernel: rport-8:0-0: blocked FC remote port time out: saving binding Oct 31 01:12:40 ota3g1 kernel: lpfc 0000:29:00.1: 1:(0):0203 Devloss timeout on WWPN 20:25:00:a0:b8:74:f5:65 NPort x0000e4 Data: x0 x7 x0 Oct 31 01:12:40 ota3g1 kernel: sd 8:0:0:4: SCSI error: return code = 0x00010000 Oct 31 01:12:40 ota3g1 kernel: end_request: I/O error, dev sdi, sector 38617577 Oct 31 01:12:40 ota3g1 kernel: sd 8:0:0:4: SCSI error: return code = 0x00010000 Oct 31 01:12:40 ota3g1 kernel: end_request: I/O error, dev sdi, sector 283532153 Oct 31 01:12:40 ota3g1 kernel: sd 8:0:0:4: SCSI error: return code = 0x00010000 Oct 31 01:12:40 ota3g1 kernel: end_request: I/O error, dev sdi, sector 90825 Oct 31 01:12:40 ota3g1 kernel: Aborting journal on device dm-16. Oct 31 01:12:40 ota3g1 kernel: sd 8:0:0:4: SCSI error: return code = 0x00010000 Oct 31 01:12:40 ota3g1 kernel: end_request: I/O error, dev sdi, sector 868841 Oct 31 01:12:40 ota3g1 kernel: sd 8:0:0:4: SCSI error: return code = 0x00010000 Oct 31 01:12:40 ota3g1 kernel: Aborting journal on device dm-10. Oct 31 01:12:40 ota3g1 kernel: end_request: I/O error, dev sdi, sector 37759889 Oct 31 01:12:40 ota3g1 kernel: sd 8:0:0:4: SCSI error: return code = 0x00010000 Oct 31 01:12:40 ota3g1 kernel: end_request: I/O error, dev sdi, sector 283349449 Oct 31 01:12:40 ota3g1 kernel: printk: 6 messages suppressed. Oct 31 01:12:40 ota3g1 kernel: Aborting journal on device dm-12. Oct 31 01:12:40 ota3g1 kernel: EXT3-fs error (device dm-12) in ext3_reserve_inode_write: Journal has aborted Oct 31 01:12:40 ota3g1 kernel: Buffer I/O error on device dm-16, logical block 1545 Oct 31 01:12:40 ota3g1 kernel: lost page write due to I/O error on dm-16 Oct 31 01:12:40 ota3g1 kernel: sd 8:0:0:4: SCSI error: return code = 0x00010000 Oct 31 01:12:40 ota3g1 kernel: end_request: I/O error, dev sdi, sector 12745 Oct 31 01:12:40 ota3g1 kernel: Buffer I/O error on device dm-10, logical block 1545 Oct 31 01:12:40 ota3g1 kernel: EXT3-fs error (device dm-16) in ext3_reserve_inode_write: Journal has aborted Oct 31 01:12:40 ota3g1 kernel: lost page write due to I/O error on dm-10 Oct 31 01:12:40 ota3g1 kernel: sd 8:0:0:4: SCSI error: return code = 0x00010000 Oct 31 01:12:40 ota3g1 kernel: end_request: I/O error, dev sdi, sector 37749121 Oct 31 01:12:40 ota3g1 kernel: Buffer I/O error on device dm-12, logical block 0 Oct 31 01:12:40 ota3g1 kernel: lost page write due to I/O error on dm-12 Oct 31 01:12:40 ota3g1 kernel: sd 8:0:0:4: SCSI error: return code = 0x00010000 Oct 31 01:12:40 ota3g1 kernel: EXT3-fs error (device dm-12) in ext3_dirty_inode: Journal has aborted Oct 31 01:12:40 ota3g1 kernel: end_request: I/O error, dev sdi, sector 37757897 Oct 31 01:12:40 ota3g1 kernel: Buffer I/O error on device dm-12, logical block 1097 Oct 31 01:12:40 ota3g1 kernel: lost page write due to I/O error on dm-12 Oct 31 01:12:40 ota3g1 kernel: sd 8:0:0:4: SCSI error: return code = 0x00010000 Oct 31 01:12:40 ota3g1 kernel: end_request: I/O error, dev sdi, sector 283337089 Oct 31 01:12:40 ota3g1 kernel: Buffer I/O error on device dm-16, logical block 0 Oct 31 01:12:40 ota3g1 kernel: lost page write due to I/O error on dm-16 Oct 31 01:12:40 ota3g1 kernel: sd 8:0:0:4: SCSI error: return code = 0x00010000 Oct 31 01:12:40 ota3g1 kernel: EXT3-fs error (device dm-16) in ext3_dirty_inode: Journal has aborted Oct 31 01:12:40 ota3g1 kernel: end_request: I/O error, dev sdi, sector 37749121 Oct 31 01:12:40 ota3g1 kernel: Buffer I/O error on device dm-12, logical block 0 Oct 31 01:12:41 ota3g1 kernel: lost page write due to I/O error on dm-12 Oct 31 01:12:41 ota3g1 kernel: sd 8:0:0:4: SCSI error: return code = 0x00010000 Oct 31 01:12:41 ota3g1 kernel: end_request: I/O error, dev sdi, sector 283337089 Oct 31 01:12:41 ota3g1 kernel: Buffer I/O error on device dm-16, logical block 0 Oct 31 01:12:41 ota3g1 kernel: lost page write due to I/O error on dm-16 Oct 31 01:12:41 ota3g1 kernel: sd 8:0:0:4: SCSI error: return code = 0x00010000 df -h Filesystem Size Used Avail Use% Mounted on /dev/mapper/cciss-root 4.9G 730M 3.9G 16% / /dev/mapper/cciss-home 9.7G 1.2G 8.1G 13% /home /dev/mapper/cciss-var 9.7G 494M 8.8G 6% /var /dev/mapper/cciss-usr 15G 2.6G 12G 19% /usr /dev/mapper/cciss-tmp 3.9G 153M 3.6G 5% /tmp /dev/sda1 996M 43M 902M 5% /boot tmpfs 5.9G 0 5.9G 0% /dev/shm /dev/mapper/cciss-product 25G 16G 7.4G 68% /product /dev/mapper/cciss-opt 20G 4.5G 14G 25% /opt /dev/mapper/dg_db1-vol_db1_system 18G 2.2G 15G 14% /database/OTADB/sys /dev/mapper/dg_db1-vol_db1_undo 18G 5.8G 12G 35% /database/OTADB/undo /dev/mapper/dg_db1-vol_db1_redo 8.9G 4.3G 4.2G 51% /database/OTADB/redo /dev/mapper/dg_db1-vol_db1_sgbd 8.9G 654M 7.8G 8% /database/OTADB/admin /dev/mapper/dg_db1-vol_db1_arch 98G 24G 69G 26% /database/OTADB/arch /dev/mapper/dg_db1-vol_db1_indexes 240G 14G 214G 6% /database/OTADB/index /dev/mapper/dg_db1-vol_db1_data 275G 47G 215G 18% /database/OTADB/data /dev/mapper/dg_dbrman-vol_db_rman 8.9G 351M 8.1G 5% /database/RMAN /dev/mapper/dg_app1-vol_app1 151G 113G 31G 79% /files/ota /etc/fstab /dev/cciss/root / ext3 defaults 1 1 /dev/cciss/home /home ext3 defaults 1 2 /dev/cciss/var /var ext3 defaults 1 2 /dev/cciss/usr /usr ext3 defaults 1 2 /dev/cciss/tmp /tmp ext3 defaults 1 2 LABEL=/boot /boot ext3 defaults 1 2 tmpfs /dev/shm tmpfs defaults 0 0 devpts /dev/pts devpts gid=5,mode=620 0 0 sysfs /sys sysfs defaults 0 0 proc /proc proc defaults 0 0 /dev/cciss/swap swap swap defaults 0 0 /dev/cciss/product /product ext3 defaults 1 2 /dev/cciss/opt /opt ext3 defaults 1 2 /dev/dg_db1/vol_db1_system /database/OTADB/sys ext3 defaults 1 2 /dev/dg_db1/vol_db1_undo /database/OTADB/undo ext3 defaults 1 2 /dev/dg_db1/vol_db1_redo /database/OTADB/redo ext3 defaults 1 2 /dev/dg_db1/vol_db1_sgbd /database/OTADB/admin ext3 defaults 1 2 /dev/dg_db1/vol_db1_arch /database/OTADB/arch ext3 defaults 1 2 /dev/dg_db1/vol_db1_indexes /database/OTADB/index ext3 defaults 1 2 /dev/dg_db1/vol_db1_data /database/OTADB/data ext3 defaults 1 2 /dev/dg_dbrman/vol_db_rman /database/RMAN ext3 defaults 1 2 /dev/dg_app1/vol_app1 /files/ota ext3 defaults 1 2 Thanks for all the help.

    Read the article

  • Cannot enable network discovery on Windows Server 2008 R2

    - by dariom
    I'm trying to enable the Network Discovery feature on a newly installed Windows Server 2008 R2 instance. The network connection is in the Home or Work profile (it is not domain joined). These are the steps I've followed: Within the Network and Sharing Center I select Change advanced sharing settings Then I select the Turn on network discovery option for the current network profile (Home or Work) I then click Save changes If I then go back to the Advanced sharing settings screen the Turn off network discovery option is selected and the machine is not visible to others within the Network node in Windows Explorer. Things I've checked: I can ping the server and connect to it using the machine name/IP address. The Windows Firewall has exceptions for Network Discovery for both Private and Public networks. File and Printer sharing is enabled and I can transfer files to/from the server by connecting to the server using a UNC path. What am I missing here?

    Read the article

  • openVPN GUI does not run error about error opening registry for reading HKLM\SOFTWARE\OpenVPN

    - by Coder
    I'm trying to run OpenVPN as a portable application and to that effect i have installed it on a Windows 7 machine, copied the files to another windows 7 machine and manually restored the registry settings using a .reg file. Whenever i try to run open vpn GUI i get the following error error opening registry for reading HKLM\SOFTWARE\OpenVPN I have verified that the key mentioned is indeed in the registry at the correct location with the correct values yet the GUI still complains. I have tried running the gui as an administrator (i'm logged in as an administrator) and also the compatibility modes but none helped. I have also tried openVPN portable "OpenVPNPortable_1.6.6.paf.exe" and it has the same problem. Can anybody help me with this issue?

    Read the article

  • Install server 2012 on HP ML110 G7 with B110i controller, no disks found

    - by Molotch
    I'm trying to install Server 2012 on a HP ML110 G7 with a B110i controller and four non hot-swappable SATA drives. I just can't get the Server 2012 boot disk PE environment to find any disks. I have downloaded the latest SPP (Service Pack for Proliant 2012.10) and flashed the BIOS. I have tried two different HP drivers for B110i and Windows X64, 6.18.0.64 and 6.18.2.64 to no avail. I have tried setting the controller to both AHCI and legacy mode in the BIOS, no difference. HP:s SmartStart disc for G7 servers only support installation of up to Windows Server 2008R2. HP:s installation instructions for Server 2012 Essentials says boot from the windows disk and use the storage drivers found on the SPP (I can't find any storage drivers on the SPP disk).

    Read the article

  • IIS7.5 Domain Account Application Pool Identity for SQL Server Authentication

    - by Gareth Hill
    In Windows Server 2003/IIS6 land we typically create an app pool that runs as the identity of an AD account created with minimal privileges simply for that purpose. This same domain user would also be granted access to SQL Server so that any ASP.NET application in that app pool would be able to connect to SQL Server with Integrated Security=SSPI. We are making a brave move to the world of Windows Server 2008 R2/IIS7.5 and are looking to replicate this model, but I am struggling with how to make the application pool in IIS7.5 run as the identity of an AD account? I know this sounds simple and hopefully it is, but my attempts so far have been fruitless. Should the application pool identity be a 'Custom account' for a domain account? Does the domain account need to be added to any groups?

    Read the article

  • Windows Backup to network share (Server 2008)

    - by Joe
    I'm trying to setup Windows Backup on a Server 2008 machine to backup to a network share. When I run the wizard to setup the backup I get an error message "The user name being used for accessing the remote share folder is not recognized by the local computer". I have no idea what this means. Help? The server with the network share is a domain controller (also server 2008). The server I am trying to back up is not and is not part of the domain.

    Read the article

  • Problem with Email Notifications in VisualSVN Server

    - by emzero
    Hey guys! I have a dedicated server running windows 2003 server and Visual SVN Server 2.0.8. I'm trying to configure it to send email notifications on commit. So I found this article on Visual SVN site. It says I have to edit the Post-commit hook and set it to the following: "%VISUALSVN_SERVER%\bin\VisualSVNServerHooks.exe" ^ commit-notification "%1" -r %2 ^ --from <from-email> --to <to-email> ^ --smtp-server <smtp-server> Of course I've replaced the variables there. The problem is when someone commits something, the svn client throws the following error: post-commit hook failed (exit code 1) with no output. The commit process runs with no problems, I mean it does commit the files. But it won't send any email notification. If I remove the post-commit hook, then I don't get the error (and of course I don't get any notification). Could you help me out with it? The error doesn't tell too much =S Thank you!

    Read the article

< Previous Page | 161 162 163 164 165 166 167 168 169 170 171 172  | Next Page >