Daily Archives

Articles indexed Thursday February 17 2011

Page 2/12 | < Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >

SQL SERVER – WRITELOG – Wait Type – Day 17 of 28

- by pinaldave

WRITELOG is one of the most interesting wait types. So far we have seen a lot of different wait types, but this log type is associated with log file which makes it interesting to deal with. From Book On-Line: WRITELOG Occurs while waiting for a log flush to complete. Common operations that cause log flushes are checkpoints and transaction commits. WRITELOG Explanation: This wait type is usually seen in the heavy transactional database. When data is modified, it is written both on the log cache and buffer cache. This wait type occurs when data in the log cache is flushing to the disk. During this time, the session has to wait due to WRITELOG. I have recently seen this wait type’s persistence at my client’s place, where one of the long-running transactions was stopped by the user causing it to roll back. In the future, I will see if I could re-create this situation once again on my machine to validate the relation. Reducing WRITELOG wait: There are several suggestions to reduce this wait stats: Move Transaction Log to Separate Disk from mdf and other files. Avoid cursor-like coding methodology and frequent committing of statements. Find the most active file based on IO stall time based on the script written over here. You can also use fn_virtualfilestats to find IO-related issues using the script mentioned over here. Check the IO-related counters (PhysicalDisk:Avg.Disk Queue Length, PhysicalDisk:Disk Read Bytes/sec and PhysicalDisk :Disk Write Bytes/sec) for additional details. Read about them over here. There are two excellent resources by Paul Randal, I suggest you understand the subject from those videos. The links to videos are here and here. Note: The information presented here is from my experience and there is no way that I claim it to be accurate. I suggest reading Book OnLine for further clarification. All the discussion of Wait Stats in this blog is generic and varies from system to system. It is recommended that you test this on a development server before implementing it to a production server. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, SQL, SQL Authority, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, SQL Wait Stats, SQL Wait Types, T SQL, Technology

Read the article
MCM Preperations - how it's going

- by NeilHambly

Since the announcement in November 2010 that the MCM SQL Server 2008 Training program had been revamped, read more on that here http://www.sqlskills.com/BLOGS/PAUL/post/Big-changes-to-the-MCM-program-and-how-SQLskills-can-help-you.aspx Experienced SQL Professionals now have more opportunity to undertake this advanced certification, Where they previously might not have been able to undertake for a variety of reasons, {time, money, location etc..} With a few announcements of those who has recently...(read more)

Read the article
Sponsor sessions - why should you attend?

- by Testas

At the Manchester SQL Server User Group we have had a number of sponser sessions, likewise at SQLBits too You may think that it would be an hour promoting the software that that a particular vendor has to offer. This is often not the case. many session spend time focusing on the tools, native to SQL Server that can be used for performance tuning and finish off by providing an overview of vendors software and how it can make it easier to perform performance tuning operations on your SQL Server. Many of you will be attending SQLBits this April. Many of the sponsors will perform a lunchtime lecture surrounding many areas of SQL Server. Event sponsors play a very important role in supporting events such as SQLBits and some of the SQL Server User group events Based on the presentations I have seen, I would recommend attending one of the lunchtime sessions at SQLBits. I have no doubt you will pick up golden nuggets of information that will help you in your work. I know I have Chris

Read the article
Using SQL Source Control and Vault Professional Part 4

- by Ajarn Mark Caldwell

Two weeks ago I upgraded our installation of Fortress to the latest version, which is now named Vault Professional. This is the version of Vault (i.e. Vault Standard 5.1 / Vault Professional 5.1) that will be officially supported with Red-Gate SQL Source Control 2.1. While the folks at Red-Gate did a fantastic job of working with me to get SQL Source Control to work with the older Fortress version, we weren’t going to just sit on that. There are a couple of things that Vault Professional cleaned up for us, such as improved integration with Visual Studio 2010, so it was a win all around. Shortly after that upgrade, I received notice from Red-Gate that they had a new Early Access version of SQL Source Control available that included the ability to source control static data. The idea here is that you probably have a few fairly static lookup tables in your system, and those data values are similar in concept to source code, and should be versioned in your source control management system also. I agree with this, but please be wise…somebody out there is bound to try to use this feature as their disaster recovery for their entire database, and that is NOT the purpose. First off, you should never have your PROD (or LIVE, whatever you call it) system attached to source control. Source Control is for development, not for PROD systems. Second, use the features that are intended for this purpose, such as BACKUP and RESTORE. Laying that tangent aside, it is great that now you can include these critical values in your repository and make them part of a deployment process. As you would guess, SQL Source Control uses SQL Data Compare to create the data change scripts just like it uses SQL Compare to create the schema change scripts. Once again, they did a very good job with the integration to their other products. At this point we are really starting to see some good payback on our investment in the full SQL Developer Bundle. Those products were worth the investment back when we only used them sporadically for troubleshooting and DBA analysis, but now with SQL Source Control, they are becoming everyday-use products for the development team. I like this software (SQL Source Control) so much that I am about to break my own rules and distribute it to my team to use even though it is still in beta. This is the first time that I have approved the use of any beta software in a production scenario (actively building our next versions of internal software) but I predict that the usability and productivity gain of using SQL Source Control over manual scripting is worth the risk. Of course, I have also put this beta software through its paces pretty well to be comfortable with it, and Red-Gate has proven their responsiveness to issues that came up in my early beta testing, and so I am willing to bet on their continued support. Likewise, SourceGear, the maker of Vault Professional, has proven itself to me as well, and so the combination of SQL Source Control with Vault Professional is the new standard for my development team.

Read the article
Solaris 11 Express:????????&Web????????

- by Yusuke.Yamamoto

????????????????? Solaris 11 Express ?? 2011?????????? Oracle Solaris 11 ???????????????????????OS Solaris 11??2011?1?19????????? ???????????????????????????????????????????????? ?????????????OS Solaris 11?:???????? ?????????????OS Solaris 11?:?? ?????17?13?30????Solaris 11 Express ??????????????????????????????? ??????????????? ??? ???? ???? 2011?02?17?(?)13:30~14:30 ?? ?????? Solaris 11 Express ???? ???? Oracle Solaris|??????????? Oracle Solaris Solaris ZFS|??????????? ?????Solaris 11??????????????? - ???? Watch

Read the article
??W-1????????????

- by Masa.Sasaki

??(2?16?)?14?WebLogic Server@????????WebLogic Server??????????W-1???????????????????????????????????????????????????????????????????? ???????????????WebLogic Server???????????????????????????????WebLogic Server???·??·?????????????????????????????????????????????? ???????JVM??????JDBC??????????????????????????????????4?????????????????????????????????????????????????? ??????W-1????????????????????Tiwitter4J??????(12????10?)??WebLogic Server 11g??????????????????????(???????????????????????????????) ????9?WebLogic Server???@???3?9????15?WebLogic Server???@??(??????????)?3?16???? ???W-1?????(?????)??????????????????????????????????

Read the article
Wanna be a Rock Star?

- by lydia.smyers

We just launched the Oracle PartnerNetwork Specialized Partner Premier site, a new site specifically targeted to customers, partners and Oracle employees. The site is all about "You", our partners, who have achieved one or more of the over 50 specializations available today. Specialized. Recognized. Preferred. Yes, that's you - the chosen vendors. And just for you we're going to produce videos starring "You" and your stories which will available everywhere Oracle customers, partners and employees are. Sundance may be over but we're just getting started. This is just one of the new benefits, want more? Industry Analyst Firm, IDC offers kudos and says Oracle's premier "approach to putting top partners front and center with customers and prospects" has "raised the bar." We'll be advertising this site all over Oracle's major websites to give our specialized partners their deserved recognition. Bring on the buttered popcorn! Lights! Camera! What are you waiting for? http://www.oracle.com/specialized. Need to get specialized? Check out how: http://www.oracle.com/partners/en/opn-program/specialize/index.html

Read the article
Neighboring Siblings?

- by Ramkumar Menon

Found an Interesting observation on C.M.Spielberg McQueen’s Blog – XPath 1.0 describes, amongst other axes, ones that allow access to immediate parent and immediate child nodes, as well as access to ancestor and descendant node-sets, but does not provide for immediate siblings – The only way to access these are via predicates – preceding-sibling::*[1] or following-sibling::*[1], and not explicit next-sibling and a previous-sibling axes.

Read the article
Seeking on a Heap, and Two Useful DMVs

- by Paul White

So far in this mini-series on seeks and scans, we have seen that a simple ‘seek’ operation can be much more complex than it first appears. A seek can contain one or more seek predicates – each of which can either identify at most one row in a unique index (a singleton lookup) or a range of values (a range scan). When looking at a query plan, we will often need to look at the details of the seek operator in the Properties window to see how many operations it is performing, and what type of operation each one is. As you saw in the first post in this series, the number of hidden seeking operations can have an appreciable impact on performance. Measuring Seeks and Scans I mentioned in my last post that there is no way to tell from a graphical query plan whether you are seeing a singleton lookup or a range scan. You can work it out – if you happen to know that the index is defined as unique and the seek predicate is an equality comparison, but there’s no separate property that says ‘singleton lookup’ or ‘range scan’. This is a shame, and if I had my way, the query plan would show different icons for range scans and singleton lookups – perhaps also indicating whether the operation was one or more of those operations underneath the covers. In light of all that, you might be wondering if there is another way to measure how many seeks of either type are occurring in your system, or for a particular query. As is often the case, the answer is yes – we can use a couple of dynamic management views (DMVs): sys.dm_db_index_usage_stats and sys.dm_db_index_operational_stats. Index Usage Stats The index usage stats DMV contains counts of index operations from the perspective of the Query Executor (QE) – the SQL Server component that is responsible for executing the query plan. It has three columns that are of particular interest to us: user_seeks – the number of times an Index Seek operator appears in an executed plan user_scans – the number of times a Table Scan or Index Scan operator appears in an executed plan user_lookups – the number of times an RID or Key Lookup operator appears in an executed plan An operator is counted once per execution (generating an estimated plan does not affect the totals), so an Index Seek that executes 10,000 times in a single plan execution adds 1 to the count of user seeks. Even less intuitively, an operator is also counted once per execution even if it is not executed at all. I will show you a demonstration of each of these things later in this post. Index Operational Stats The index operational stats DMV contains counts of index and table operations from the perspective of the Storage Engine (SE). It contains a wealth of interesting information, but the two columns of interest to us right now are: range_scan_count – the number of range scans (including unrestricted full scans) on a heap or index structure singleton_lookup_count – the number of singleton lookups in a heap or index structure This DMV counts each SE operation, so 10,000 singleton lookups will add 10,000 to the singleton lookup count column, and a table scan that is executed 5 times will add 5 to the range scan count. The Test Rig To explore the behaviour of seeks and scans in detail, we will need to create a test environment. The scripts presented here are best run on SQL Server 2008 Developer Edition, but the majority of the tests will work just fine on SQL Server 2005. A couple of tests use partitioning, but these will be skipped if you are not running an Enterprise-equivalent SKU. Ok, first up we need a database: USE master; GO IF DB_ID('ScansAndSeeks') IS NOT NULL DROP DATABASE ScansAndSeeks; GO CREATE DATABASE ScansAndSeeks; GO USE ScansAndSeeks; GO ALTER DATABASE ScansAndSeeks SET ALLOW_SNAPSHOT_ISOLATION OFF ; ALTER DATABASE ScansAndSeeks SET AUTO_CLOSE OFF, AUTO_SHRINK OFF, AUTO_CREATE_STATISTICS OFF, AUTO_UPDATE_STATISTICS OFF, PARAMETERIZATION SIMPLE, READ_COMMITTED_SNAPSHOT OFF, RESTRICTED_USER ; Notice that several database options are set in particular ways to ensure we get meaningful and reproducible results from the DMVs. In particular, the options to auto-create and update statistics are disabled. There are also three stored procedures, the first of which creates a test table (which may or may not be partitioned). The table is pretty much the same one we used yesterday: The table has 100 rows, and both the key_col and data columns contain the same values – the integers from 1 to 100 inclusive. The table is a heap, with a non-clustered primary key on key_col, and a non-clustered non-unique index on the data column. The only reason I have used a heap here, rather than a clustered table, is so I can demonstrate a seek on a heap later on. The table has an extra column (not shown because I am too lazy to update the diagram from yesterday) called padding – a CHAR(100) column that just contains 100 spaces in every row. It’s just there to discourage SQL Server from choosing table scan over an index + RID lookup in one of the tests. The first stored procedure is called ResetTest: CREATE PROCEDURE dbo.ResetTest @Partitioned BIT = 'false' AS BEGIN SET NOCOUNT ON ; IF OBJECT_ID(N'dbo.Example', N'U') IS NOT NULL BEGIN DROP TABLE dbo.Example; END ; -- Test table is a heap -- Non-clustered primary key on 'key_col' CREATE TABLE dbo.Example ( key_col INTEGER NOT NULL, data INTEGER NOT NULL, padding CHAR(100) NOT NULL DEFAULT SPACE(100), CONSTRAINT [PK dbo.Example key_col] PRIMARY KEY NONCLUSTERED (key_col) ) ; IF @Partitioned = 'true' BEGIN -- Enterprise, Trial, or Developer -- required for partitioning tests IF SERVERPROPERTY('EngineEdition') = 3 BEGIN EXECUTE (' DROP TABLE dbo.Example ; IF EXISTS ( SELECT 1 FROM sys.partition_schemes WHERE name = N''PS'' ) DROP PARTITION SCHEME PS ; IF EXISTS ( SELECT 1 FROM sys.partition_functions WHERE name = N''PF'' ) DROP PARTITION FUNCTION PF ; CREATE PARTITION FUNCTION PF (INTEGER) AS RANGE RIGHT FOR VALUES (20, 40, 60, 80, 100) ; CREATE PARTITION SCHEME PS AS PARTITION PF ALL TO ([PRIMARY]) ; CREATE TABLE dbo.Example ( key_col INTEGER NOT NULL, data INTEGER NOT NULL, padding CHAR(100) NOT NULL DEFAULT SPACE(100), CONSTRAINT [PK dbo.Example key_col] PRIMARY KEY NONCLUSTERED (key_col) ) ON PS (key_col); '); END ELSE BEGIN RAISERROR('Invalid SKU for partition test', 16, 1); RETURN; END; END ; -- Non-unique non-clustered index on the 'data' column CREATE NONCLUSTERED INDEX [IX dbo.Example data] ON dbo.Example (data) ; -- Add 100 rows INSERT dbo.Example WITH (TABLOCKX) ( key_col, data ) SELECT key_col = V.number, data = V.number FROM master.dbo.spt_values AS V WHERE V.[type] = N'P' AND V.number BETWEEN 1 AND 100 ; END; GO The second stored procedure, ShowStats, displays information from the Index Usage Stats and Index Operational Stats DMVs: CREATE PROCEDURE dbo.ShowStats @Partitioned BIT = 'false' AS BEGIN -- Index Usage Stats DMV (QE) SELECT index_name = ISNULL(I.name, I.type_desc), scans = IUS.user_scans, seeks = IUS.user_seeks, lookups = IUS.user_lookups FROM sys.dm_db_index_usage_stats AS IUS JOIN sys.indexes AS I ON I.object_id = IUS.object_id AND I.index_id = IUS.index_id WHERE IUS.database_id = DB_ID(N'ScansAndSeeks') AND IUS.object_id = OBJECT_ID(N'dbo.Example', N'U') ORDER BY I.index_id ; -- Index Operational Stats DMV (SE) IF @Partitioned = 'true' SELECT index_name = ISNULL(I.name, I.type_desc), partitions = COUNT(IOS.partition_number), range_scans = SUM(IOS.range_scan_count), single_lookups = SUM(IOS.singleton_lookup_count) FROM sys.dm_db_index_operational_stats ( DB_ID(N'ScansAndSeeks'), OBJECT_ID(N'dbo.Example', N'U'), NULL, NULL ) AS IOS JOIN sys.indexes AS I ON I.object_id = IOS.object_id AND I.index_id = IOS.index_id GROUP BY I.index_id, -- Key I.name, I.type_desc ORDER BY I.index_id; ELSE SELECT index_name = ISNULL(I.name, I.type_desc), range_scans = SUM(IOS.range_scan_count), single_lookups = SUM(IOS.singleton_lookup_count) FROM sys.dm_db_index_operational_stats ( DB_ID(N'ScansAndSeeks'), OBJECT_ID(N'dbo.Example', N'U'), NULL, NULL ) AS IOS JOIN sys.indexes AS I ON I.object_id = IOS.object_id AND I.index_id = IOS.index_id GROUP BY I.index_id, -- Key I.name, I.type_desc ORDER BY I.index_id; END; The final stored procedure, RunTest, executes a query written against the example table: CREATE PROCEDURE dbo.RunTest @SQL VARCHAR(8000), @Partitioned BIT = 'false' AS BEGIN -- No execution plan yet SET STATISTICS XML OFF ; -- Reset the test environment EXECUTE dbo.ResetTest @Partitioned ; -- Previous call will throw an error if a partitioned -- test was requested, but SKU does not support it IF @@ERROR = 0 BEGIN -- IO statistics and plan on SET STATISTICS XML, IO ON ; -- Test statement EXECUTE (@SQL) ; -- Plan and IO statistics off SET STATISTICS XML, IO OFF ; EXECUTE dbo.ShowStats @Partitioned; END; END; The Tests The first test is a simple scan of the heap table: EXECUTE dbo.RunTest @SQL = 'SELECT * FROM Example'; The top result set comes from the Index Usage Stats DMV, so it is the Query Executor’s (QE) view. The lower result is from Index Operational Stats, which shows statistics derived from the actions taken by the Storage Engine (SE). We see that QE performed 1 scan operation on the heap, and SE performed a single range scan. Let’s try a single-value equality seek on a unique index next: EXECUTE dbo.RunTest @SQL = 'SELECT key_col FROM Example WHERE key_col = 32'; This time we see a single seek on the non-clustered primary key from QE, and one singleton lookup on the same index by the SE. Now for a single-value seek on the non-unique non-clustered index: EXECUTE dbo.RunTest @SQL = 'SELECT data FROM Example WHERE data = 32'; QE shows a single seek on the non-clustered non-unique index, but SE shows a single range scan on that index – not the singleton lookup we saw in the previous test. That makes sense because we know that only a single-value seek into a unique index is a singleton seek. A single-value seek into a non-unique index might retrieve any number of rows, if you think about it. The next query is equivalent to the IN list example seen in the first post in this series, but it is written using OR (just for variety, you understand): EXECUTE dbo.RunTest @SQL = 'SELECT data FROM Example WHERE data = 32 OR data = 33'; The plan looks the same, and there’s no difference in the stats recorded by QE, but the SE shows two range scans. Again, these are range scans because we are looking for two values in the data column, which is covered by a non-unique index. I’ve added a snippet from the Properties window to show that the query plan does show two seek predicates, not just one. Now let’s rewrite the query using BETWEEN: EXECUTE dbo.RunTest @SQL = 'SELECT data FROM Example WHERE data BETWEEN 32 AND 33'; Notice the seek operator only has one predicate now – it’s just a single range scan from 32 to 33 in the index – as the SE output shows. For the next test, we will look up four values in the key_col column: EXECUTE dbo.RunTest @SQL = 'SELECT key_col FROM Example WHERE key_col IN (2,4,6,8)'; Just a single seek on the PK from the Query Executor, but four singleton lookups reported by the Storage Engine – and four seek predicates in the Properties window. On to a more complex example: EXECUTE dbo.RunTest @SQL = 'SELECT * FROM Example WITH (INDEX([PK dbo.Example key_col])) WHERE key_col BETWEEN 1 AND 8'; This time we are forcing use of the non-clustered primary key to return eight rows. The index is not covering for this query, so the query plan includes an RID lookup into the heap to fetch the data and padding columns. The QE reports a seek on the PK and a lookup on the heap. The SE reports a single range scan on the PK (to find key_col values between 1 and 8), and eight singleton lookups on the heap. Remember that a bookmark lookup (RID or Key) is a seek to a single value in a ‘unique index’ – it finds a row in the heap or cluster from a unique RID or clustering key – so that’s why lookups are always singleton lookups, not range scans. Our next example shows what happens when a query plan operator is not executed at all: EXECUTE dbo.RunTest @SQL = 'SELECT key_col FROM Example WHERE key_col = 8 AND @@TRANCOUNT < 0'; The Filter has a start-up predicate which is always false (if your @@TRANCOUNT is less than zero, call CSS immediately). The index seek is never executed, but QE still records a single seek against the PK because the operator appears once in an executed plan. The SE output shows no activity at all. This next example is 2008 and above only, I’m afraid: EXECUTE dbo.RunTest @SQL = 'SELECT * FROM Example WHERE key_col BETWEEN 1 AND 30', @Partitioned = 'true'; This is the first example to use a partitioned table. QE reports a single seek on the heap (yes – a seek on a heap), and the SE reports two range scans on the heap. SQL Server knows (from the partitioning definition) that it only needs to look at partitions 1 and 2 to find all the rows where key_col is between 1 and 30 – the engine seeks to find the two partitions, and performs a range scan seek on each partition. The final example for today is another seek on a heap – try to work out the output of the query before running it! EXECUTE dbo.RunTest @SQL = 'SELECT TOP (2) WITH TIES * FROM Example WHERE key_col BETWEEN 1 AND 50 ORDER BY $PARTITION.PF(key_col) DESC', @Partitioned = 'true'; Notice the lack of an explicit Sort operator in the query plan to enforce the ORDER BY clause, and the backward range scan. © 2011 Paul White email: [email protected] twitter: @SQL_Kiwi

Read the article
The SSIS tuning tip that everyone misses

- by Rob Farley

I know that everyone misses this, because I’m yet to find someone who doesn’t have a bit of an epiphany when I describe this. When tuning Data Flows in SQL Server Integration Services, people see the Data Flow as moving from the Source to the Destination, passing through a number of transformations. What people don’t consider is the Source, getting the data out of a database. Remember, the source of data for your Data Flow is not your Source Component. It’s wherever the data is, within your database, probably on a disk somewhere. You need to tune your query to optimise it for SSIS, and this is what most people fail to do. I’m not suggesting that people don’t tune their queries – there’s plenty of information out there about making sure that your queries run as fast as possible. But for SSIS, it’s not about how fast your query runs. Let me say that again, but in bolder text: The speed of an SSIS Source is not about how fast your query runs. If your query is used in a Source component for SSIS, the thing that matters is how fast it starts returning data. In particular, those first 10,000 rows to populate that first buffer, ready to pass down the rest of the transformations on its way to the Destination. Let’s look at a very simple query as an example, using the AdventureWorks database: We’re picking the different Weight values out of the Product table, and it’s doing this by scanning the table and doing a Sort. It’s a Distinct Sort, which means that the duplicates are discarded. It'll be no surprise to see that the data produced is sorted. Obvious, I know, but I'm making a comparison to what I'll do later. Before I explain the problem here, let me jump back into the SSIS world... If you’ve investigated how to tune an SSIS flow, then you’ll know that some SSIS Data Flow Transformations are known to be Blocking, some are Partially Blocking, and some are simply Row transformations. Take the SSIS Sort transformation, for example. I’m using a larger data set for this, because my small list of Weights won’t demonstrate it well enough. Seven buffers of data came out of the source, but none of them could be pushed past the Sort operator, just in case the last buffer contained the data that would be sorted into the first buffer. This is a blocking operation. Back in the land of T-SQL, we consider our Distinct Sort operator. It’s also blocking. It won’t let data through until it’s seen all of it. If you weren’t okay with blocking operations in SSIS, why would you be happy with them in an execution plan? The source of your data is not your OLE DB Source. Remember this. The source of your data is the NCIX/CIX/Heap from which it’s being pulled. Picture it like this... the data flowing from the Clustered Index, through the Distinct Sort operator, into the SELECT operator, where a series of SSIS Buffers are populated, flowing (as they get full) down through the SSIS transformations. Alright, I know that I’m taking some liberties here, because the two queries aren’t the same, but consider the visual. The data is flowing from your disk and through your execution plan before it reaches SSIS, so you could easily find that a blocking operation in your plan is just as painful as a blocking operation in your SSIS Data Flow. Luckily, T-SQL gives us a brilliant query hint to help avoid this. OPTION (FAST 10000) This hint means that it will choose a query which will optimise for the first 10,000 rows – the default SSIS buffer size. And the effect can be quite significant. First let’s consider a simple example, then we’ll look at a larger one. Consider our weights. We don’t have 10,000, so I’m going to use OPTION (FAST 1) instead. You’ll notice that the query is more expensive, using a Flow Distinct operator instead of the Distinct Sort. This operator is consuming 84% of the query, instead of the 59% we saw from the Distinct Sort. But the first row could be returned quicker – a Flow Distinct operator is non-blocking. The data here isn’t sorted, of course. It’s in the same order that it came out of the index, just with duplicates removed. As soon as a Flow Distinct sees a value that it hasn’t come across before, it pushes it out to the operator on its left. It still has to maintain the list of what it’s seen so far, but by handling it one row at a time, it can push rows through quicker. Overall, it’s a lot more work than the Distinct Sort, but if the priority is the first few rows, then perhaps that’s exactly what we want. The Query Optimizer seems to do this by optimising the query as if there were only one row coming through: This 1 row estimation is caused by the Query Optimizer imagining the SELECT operation saying “Give me one row” first, and this message being passed all the way along. The request might not make it all the way back to the source, but in my simple example, it does. I hope this simple example has helped you understand the significance of the blocking operator. Now I’m going to show you an example on a much larger data set. This data was fetching about 780,000 rows, and these are the Estimated Plans. The data needed to be Sorted, to support further SSIS operations that needed that. First, without the hint. ...and now with OPTION (FAST 10000): A very different plan, I’m sure you’ll agree. In case you’re curious, those arrows in the top one are 780,000 rows in size. In the second, they’re estimated to be 10,000, although the Actual figures end up being 780,000. The top one definitely runs faster. It finished several times faster than the second one. With the amount of data being considered, these numbers were in minutes. Look at the second one – it’s doing Nested Loops, across 780,000 rows! That’s not generally recommended at all. That’s “Go and make yourself a coffee” time. In this case, it was about six or seven minutes. The faster one finished in about a minute. But in SSIS-land, things are different. The particular data flow that was consuming this data was significant. It was being pumped into a Script Component to process each row based on previous rows, creating about a dozen different flows. The data flow would take roughly ten minutes to run – ten minutes from when the data first appeared. The query that completes faster – chosen by the Query Optimizer with no hints, based on accurate statistics (rather than pretending the numbers are smaller) – would take a minute to start getting the data into SSIS, at which point the ten-minute flow would start, taking eleven minutes to complete. The query that took longer – chosen by the Query Optimizer pretending it only wanted the first 10,000 rows – would take only ten seconds to fill the first buffer. Despite the fact that it might have taken the database another six or seven minutes to get the data out, SSIS didn’t care. Every time it wanted the next buffer of data, it was already available, and the whole process finished in about ten minutes and ten seconds. When debugging SSIS, you run the package, and sit there waiting to see the Debug information start appearing. You look for the numbers on the data flow, and seeing operators going Yellow and Green. Without the hint, I’d sit there for a minute. With the hint, just ten seconds. You can imagine which one I preferred. By adding this hint, it felt like a magic wand had been waved across the query, to make it run several times faster. It wasn’t the case at all – but it felt like it to SSIS.

Read the article
My boss is feuding with his boss. My workload is expanding What should I do?

- by steve

These two have always had a somewhat shaky relationship when they were on the same level. The other guy was recently promoted to director and now my boss reports to him. On the surface, they appear to get along when they get together, but my boss despises the man and badmouths him every chance that he gets (to peers, subordinates, etc). He believe that the director is setting him up to fail. The Director and upper management is holding my boss responsible for the not-so-great performance by the team as of late. He's been playing games to make my boss look bad. Due to lay offs, we don't have the manpower to deliever the results that we did before...but expectations have not lowered...and my boss is taking the heat for it. Now he's on the warpath and starting to micromanage. He's giving everyone more work. He's forcing us midlevel guys to take responsibility for the level one techs' performance. I'm spending less and less time coding....and more time babysitting vendors, techs, etc. I'm not so sure that's a bad thing because I'm sorta burnt out on coding, but I don't really care for the idea of having to be responsible for others poor performance....isn't that the manager's job? Anyway, do you guys have any suggestions on dealing with the situation?

Read the article
What issues carry the highest risk in a software project?

- by Mehrdad

Clearly, software projects are different from other industries in terms of many things like for instance, quality assurance, project progress measurement, and many other things. Unique characteristics of software projects also makes the risk management process unique. Lots of issues in a project might lead it to unacceptable delay or failure to deliver business value. They might even make a complete disaster in the project. What are the deadliest risk factors in a software project? How to analyze, prevent and handle them? Particularly, I'm interested in the issues that you can detect from the beginning and you should keep an eye on (for example, you might be told about a third-party API that the current application uses and lacks documentation). Please share your experiences if they are relevant.

Read the article
What design patters are the worst or most narrowly defined?

- by Akku

For every programming project, Managers with past programming experience try to shine when they recommend some design patterns for your project. I like design patterns when they make sense or if you need a scalbale solution. I've used Proxies, Observers and Command patterns in a positive way for example, and do so every day. But I'm really hesitant to use say a Factory pattern if there's only one way to create an object, as a factory might make it all easier in the future, but complicates the code and is pure overhead. So, my question is in respect to my future career and my answer to manager types throwing random pattern-names around: Which design patterns did you use, that threw you back overall? Which are the worst design patterns, that you shouldn't have a look at if it's not that only single situation where it makes sense (read: which design patterns are very narrowly defined)? (It's like I was looking for the negative reviews of an overall good product of amazon to see what bugged people most in using design patterns). And I'm not talking about Anti-Patterns here, but about Patterns that are usually thought of as "good" patterns.

Read the article
How to prepare for an online job interview (maybe through Skype)?

- by phunehehe

I'm applying for a company far away and if I get an interview it will probably be done remotely. I have been searching for advices regarding this but all tips seem to be directed at face-to-face meetings (things like "shake hands firmly"). What are the differences? How can I make the best out of those differences? Update: This is a software developer position, so there's also something about technical questions (such as, I can Google anything that they ask ;) This question also applies to any freelancers who are dealing with customers, or recruiters who are interviewing remotely. I hope that makes it relevant to this site. It may also help if you keep answers programers-related.

Read the article
Developer Compensation System

- by Graviton

Joel wrote excellent articles on developer compensation system used at Fogcreek. As a team lead and business owner, I would like to device a system that would work best for my team. And here's the catch: I have no experience in managing a team before, and I don't know what works and what doesn't. So I would like to get as many references as I can on this matter. Is there other developer compensation systems that you find is working for you and your company?

Read the article
How long was Microsoft working on .NET before they released it?

- by Richard DesLonde

With the whole CLI, CTS, CLS, etc., not only did they release a powerful platform/infrastructure, but they released all the specs that describe it etc. It supports potentially infinite myriad languages, platforms, etc. This seems like an insane amount of work, even for a behemoth like Microsoft - especially since it turns out they did a damn good job. How long were they working on this before releasing it (.NET 1.0)?

Read the article
Python interview questions

- by Andy

I am going to interview within two weeks for an internship that would involve Python programming. Can anyone suggest what possible areas should I polish? I am looking for commonly asked stuff in interviews for Python openings. Apart from the fact that I have already been doing the language for over a year now, I fail to perceive what they can ask me. Like for a C or C++ interview, there are lots of questions ranging from reversing of strings to building linked lists, but for a Python interview, I am clueless. Personal experiences and/ or suggestions are welcomed.

Read the article
Do you have a plan for your digital assets after you die?

- by pablo

After reading this question I remembered of a news article about some websites that manage your online identity after you pass away. Have you planned what to do with your digital assets once you go? I'd imagine that your online footprint is as important as anything you leave of material value. I mean, what would be the difference of that open-source project that you created to the money and savings that you had? How would you like to have your identity managed after you pass away? Would you prefer to go "off the grid"? It's a sensitive topic and I never met anyone who prepared for it.

Read the article
A list of pros and cons to giving developers “Local Admin” privileges to their machines? [closed]

- by Boden

Possible Duplicate: Is local “User” rights enough or do developers need Local Administrator or Power User while coding? I currently work for a large utilities company which currently does not grant “Local Admin” access to developers. This is causing a lot of grief as anything that requires elevated privileges needs to be done by the Desktop Support/Server Teams. In some cases this can take several days and requires our developers to have to show why they need this access. I personally think that all developers should have local administration rights and are currently fighting with management to achieve this but I would like to know what other people think about this. To achieve this I would like to hear what people believe are the pros and cons of letting developers have local admin access to their machines. Here are some I have come up with: Pros Loss time is keep low as developers can resolve issues that would normally require Local Admin Evaluation of tools and software are possible to improve productivity Desktop support time not wasted installing services and software on developers PC Cons Developers install software on local PC that could be malicious to others or inappropriate in a business environment Desktop Support required to support a PC that is not the norm Development done with admin access that then fails when promoted to another environment that does not have the same access level

Read the article
What's your experience with female programmers?

- by Rachel

Let me start by saying I'm female, but every single other female programmer I've known has been pretty terrible. The extent of their knowledge seems to be copy/paste and modify some values. Quite often they don't even try to learn new concepts, or understand what they're doing. I'm not saying good female programmers aren't out there, just that the ratio of good/bad programmers seems much worse then males. Perhaps its because everyone feels they have to give female programmers a chance to prove they are not biased? Or is this just me?? What has your experiences been with them? UPDATE: Just want to say thanks for all the responses. I've learned some interesting things and am happy to know that female programmers have such support :) My experience has been very limited with them but all bad, and I agree that it is probably due to my small sample size (around 5). I wasn't trying to be sexist with such a question, I just wanted to find out if it was really that abnormal to be a female programmer. I'm abnormal about a lot of things you'd expect from a female... I play video games in most of my spare time, I liked Math so much I completed my entire math book during christmas break one year (What can I say, I found the subject interesting), I'm not very social, I dislike shopping, I only have 2 pairs of shoes, my significant other doesn't work but does all the housework/laundry/etc... but anyways, thanks :)

Read the article
Would you make your website's source code public?

- by Karpie

Back story: My best friend is a self-taught coder for a community art site, written in PHP. Some time ago he mentioned he wanted to make the source code of the site public, to which my response was total horror - surely it was going to be full of security holes waiting to be found, and it was going to lead to hacking and errors on a huge scale. He never ended up doing it. Current story: I'm starting development of a community website built in Rails, and for ease of use I was going to use Github for version control. Then I realized it was pretty much exactly the same thing as my friend making his source code public - which made me stop and think. Would you make your website's completely-custom source code public? Or is this a case of open source gone too far? (note: I don't think this applies to people who run things like Wordpress. Or does it?)

Read the article
what are the duties of a software control management (scm) engineer in a large company?

- by Alex. S.

I'm curious about what are the canonical responsibilities of such specialized role. Normally, I expected this to be part of the tasks of a normal developer, but in large companies I know this role is to be fulfilled by an engineer in his own. In my current company, there is a possibility for a new opening in a SCM position, so I could apply, but first I would like to hear about what, in your experience, characterize best this role.

Read the article
Should I choose Doctrine 2 or Propel 1.5/1.6, and why?

- by Billy ONeal

I'd like to hear from those who have used Doctrine 2 (or later) and Propel 1.5 (or later). Most comparisons between these two object relational mappers are based on old versions -- Doctrine 1 versus Propel 1.3/1.4, and both ORMs went through significant redesigns in their recent revisions. For example, most of the criticism of Propel seems to center around the "ModelName Peer" classes, which are deprecated in 1.5 in any case. Here's what I've accumulated so far (And I've tried to make this list as balanced as possible...): Propel Pros Extremely IDE friendly, because actual code is generated, instead of relying on PHP magic methods. This means IDE features like code completion are actually helpful. Fast (In terms of database usage -- no runtime introspection is done on the database) Clean migration between schema versions (at least in the 1.6 beta) Can generate PHP 5.3 models (i.e. namespaces) Easy to chain a lot of things into a single database query with things like useXxx methods. (See the "code completion" video above) Cons Requires an extra build step, namely building the model classes. Generated code needs rebuilt whenever Propel version is changed, a setting is changed, or the schema changes. This might be unintuitive to some and custom methods applied to the model are lost. (I think?) Some useful features (i.e. version behavior, schema migrations) are in beta status. Doctrine Pros More popular Doctrine Query Language can express potentially more complicated relationships between data than easily possible with Propel's ActiveRecord strategy. Easier to add reusable behaviors when compared with Propel. DocBlock based commenting for building the schema is embedded in the actual PHP instead of a separate XML file. Uses PHP 5.3 Namespaces everywhere Cons Requires learning an entirely new programming language (Doctrine Query Language) Implemented in terms of "magic methods" in several places, making IDE autocomplete worthless. Requires database introspection and thus is slightly slower than Propel by default; caching can remove this but the caching adds considerable complexity. Fewer behaviors are included in the core codebase. Several features Propel provides out of the box (such as Nested Set) are available only through extensions. Freakin' HUGE :) This I have gleaned though only through reading the documentation available for both tools -- I've not actually built anything yet. I'd like to hear from those who have used both tools though, to share their experience on pros/cons of each library, and what their recommendation is at this point :)

Read the article
Add Matlab to main menu

- by Tim

I was trying to add the installed matlab to the menu of Applications under Ubuntu 10.10. I clicked System-Preference-Main Menu - Programming - New Item, where I input the Matlab file .../MatlabR2010b/bin/matlab as the command, and selected the type to be "Application". Then I finished. But when i click the item in the menu of Applications, the Matlab icon shows up a few seconds and then nothing else happens. If I select the type to be "Application in Terminal" in the last step of adding Matlab to the menu of Applications, then when I click the item in the menu of Applications, there will be firstly a terminal window and then the Matlab command window. So I was wondering how to solve the problem of Matlab not starting when the type has been selected to be "Application"? Also is there a way to eliminate the terminal appearing when the type has been selected to be "Application in Terminal"? Thanks!

Read the article
Music player with a few specific requirements

- by Jordan Uggla

I am looking for a music player with a few specific requirements: Must have a search function that whittles down results as you type, searching the entire library. Must start playing a song when double clicked, and not continue to another song when that song finishes. Must be approachable and immediately usable by people completely unfamiliar with the program. I think this is mostly covered by the first two requirements being met. I've tried many players but unfortunately every one has failed to meet at least one of the requirements. Rhythmbox meets 1 and 3, but continues to the next search result after the song which was double clicked ends. Banshee is basically the same as Rhythmbox. While it has an option to "Stop when finished" this cannot (as far as I can tell) be made the default when double clicking a song. Audacious (as far as I can tell) fails at 1. Muine meets requirements 1 and 2, but unfortunately I couldn't make the search dialog always shown like it is with Rhythmbox / Banshee which, despite its very simple interface, made Muine incomprehensible to people trying to use it for the first time. Amarok I could not configure to meet requirement 1, but I think it's likely I was just missing something, and with its configurability I'm confident that I can set it up to meet requirements 2 and 3.

Read the article

< Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >