Search Results

Search found 296 results on 12 pages for 'optimizer'.

Page 7/12 | < Previous Page | 3 4 5 6 7 8 9 10 11 12  | Next Page >

  • Performance Enhancement in Full-Text Search Query

    - by Calvin Sun
    Ever since its first release, we are continuing consolidating and developing InnoDB Full-Text Search feature. There is one recent improvement that worth blogging about. It is an effort with MySQL Optimizer team that simplifies some common queries’ Query Plans and dramatically shorted the query time. I will describe the issue, our solution and the end result by some performance numbers to demonstrate our efforts in continuing enhancement the Full-Text Search capability. The Issue: As we had discussed in previous Blogs, InnoDB implements Full-Text index as reversed auxiliary tables. The query once parsed will be reinterpreted into several queries into related auxiliary tables and then results are merged and consolidated to come up with the final result. So at the end of the query, we’ll have all matching records on hand, sorted by their ranking or by their Doc IDs. Unfortunately, MySQL’s optimizer and query processing had been initially designed for MyISAM Full-Text index, and sometimes did not fully utilize the complete result package from InnoDB. Here are a couple examples: Case 1: Query result ordered by Rank with only top N results: mysql> SELECT FTS_DOC_ID, MATCH (title, body) AGAINST ('database') AS SCORE FROM articles ORDER BY score DESC LIMIT 1; In this query, user tries to retrieve a single record with highest ranking. It should have a quick answer once we have all the matching documents on hand, especially if there are ranked. However, before this change, MySQL would almost retrieve rankings for almost every row in the table, sort them and them come with the top rank result. This whole retrieve and sort is quite unnecessary given the InnoDB already have the answer. In a real life case, user could have millions of rows, so in the old scheme, it would retrieve millions of rows' ranking and sort them, even if our FTS already found there are two 3 matched rows. Apparently, the million ranking retrieve is done in vain. In above case, it should just ask for 3 matched rows' ranking, all other rows' ranking are 0. If it want the top ranking, then it can just get the first record from our already sorted result. Case 2: Select Count(*) on matching records: mysql> SELECT COUNT(*) FROM articles WHERE MATCH (title,body) AGAINST ('database' IN NATURAL LANGUAGE MODE); In this case, InnoDB search can find matching rows quickly and will have all matching rows. However, before our change, in the old scheme, every row in the table was requested by MySQL one by one, just to check whether its ranking is larger than 0, and later comes up a count. In fact, there is no need for MySQL to fetch all rows, instead InnoDB already had all the matching records. The only thing need is to call an InnoDB API to retrieve the count The difference can be huge. Following query output shows how big the difference can be: mysql> select count(*) from searchindex_inno where match(si_title, si_text) against ('people')  +----------+ | count(*) | +----------+ | 666877 | +----------+ 1 row in set (16 min 17.37 sec) So the query took almost 16 minutes. Let’s see how long the InnoDB can come up the result. In InnoDB, you can obtain extra diagnostic printout by turning on “innodb_ft_enable_diag_print”, this will print out extra query info: Error log: keynr=2, 'people' NL search Total docs: 10954826 Total words: 0 UNION: Searching: 'people' Processing time: 2 secs: row(s) 666877: error: 10 ft_init() ft_init_ext() keynr=2, 'people' NL search Total docs: 10954826 Total words: 0 UNION: Searching: 'people' Processing time: 3 secs: row(s) 666877: error: 10 Output shows it only took InnoDB only 3 seconds to get the result, while the whole query took 16 minutes to finish. So large amount of time has been wasted on the un-needed row fetching. The Solution: The solution is obvious. MySQL can skip some of its steps, optimize its plan and obtain useful information directly from InnoDB. Some of savings from doing this include: 1) Avoid redundant sorting. Since InnoDB already sorted the result according to ranking. MySQL Query Processing layer does not need to sort to get top matching results. 2) Avoid row by row fetching to get the matching count. InnoDB provides all the matching records. All those not in the result list should all have ranking of 0, and no need to be retrieved. And InnoDB has a count of total matching records on hand. No need to recount. 3) Covered index scan. InnoDB results always contains the matching records' Document ID and their ranking. So if only the Document ID and ranking is needed, there is no need to go to user table to fetch the record itself. 4) Narrow the search result early, reduce the user table access. If the user wants to get top N matching records, we do not need to fetch all matching records from user table. We should be able to first select TOP N matching DOC IDs, and then only fetch corresponding records with these Doc IDs. Performance Results and comparison with MyISAM The result by this change is very obvious. I includes six testing result performed by Alexander Rubin just to demonstrate how fast the InnoDB query now becomes when comparing MyISAM Full-Text Search. These tests are base on the English Wikipedia data of 5.4 Million rows and approximately 16G table. The test was performed on a machine with 1 CPU Dual Core, SSD drive, 8G of RAM and InnoDB_buffer_pool is set to 8 GB. Table 1: SELECT with LIMIT CLAUSE mysql> SELECT si_title, match(si_title, si_text) against('family') as rel FROM si WHERE match(si_title, si_text) against('family') ORDER BY rel desc LIMIT 10; InnoDB MyISAM Times Faster Time for the query 1.63 sec 3 min 26.31 sec 127 You can see for this particular query (retrieve top 10 records), InnoDB Full-Text Search is now approximately 127 times faster than MyISAM. Table 2: SELECT COUNT QUERY mysql>select count(*) from si where match(si_title, si_text) against('family‘); +----------+ | count(*) | +----------+ | 293955 | +----------+ InnoDB MyISAM Times Faster Time for the query 1.35 sec 28 min 59.59 sec 1289 In this particular case, where there are 293k matching results, InnoDB took only 1.35 second to get all of them, while take MyISAM almost half an hour, that is about 1289 times faster!. Table 3: SELECT ID with ORDER BY and LIMIT CLAUSE for selected terms mysql> SELECT <ID>, match(si_title, si_text) against(<TERM>) as rel FROM si_<TB> WHERE match(si_title, si_text) against (<TERM>) ORDER BY rel desc LIMIT 10; Term InnoDB (time to execute) MyISAM(time to execute) Times Faster family 0.5 sec 5.05 sec 10.1 family film 0.95 sec 25.39 sec 26.7 Pizza restaurant orange county California 0.93 sec 32.03 sec 34.4 President united states of America 2.5 sec 36.98 sec 14.8 Table 4: SELECT title and text with ORDER BY and LIMIT CLAUSE for selected terms mysql> SELECT <ID>, si_title, si_text, ... as rel FROM si_<TB> WHERE match(si_title, si_text) against (<TERM>) ORDER BY rel desc LIMIT 10; Term InnoDB (time to execute) MyISAM(time to execute) Times Faster family 0.61 sec 41.65 sec 68.3 family film 1.15 sec 47.17 sec 41.0 Pizza restaurant orange county california 1.03 sec 48.2 sec 46.8 President united states of america 2.49 sec 44.61 sec 17.9 Table 5: SELECT ID with ORDER BY and LIMIT CLAUSE for selected terms mysql> SELECT <ID>, match(si_title, si_text) against(<TERM>) as rel  FROM si_<TB> WHERE match(si_title, si_text) against (<TERM>) ORDER BY rel desc LIMIT 10; Term InnoDB (time to execute) MyISAM(time to execute) Times Faster family 0.5 sec 5.05 sec 10.1 family film 0.95 sec 25.39 sec 26.7 Pizza restaurant orange county califormia 0.93 sec 32.03 sec 34.4 President united states of america 2.5 sec 36.98 sec 14.8 Table 6: SELECT COUNT(*) mysql> SELECT count(*) FROM si_<TB> WHERE match(si_title, si_text) against (<TERM>) LIMIT 10; Term InnoDB (time to execute) MyISAM(time to execute) Times Faster family 0.47 sec 82 sec 174.5 family film 0.83 sec 131 sec 157.8 Pizza restaurant orange county califormia 0.74 sec 106 sec 143.2 President united states of america 1.96 sec 220 sec 112.2  Again, table 3 to table 6 all showing InnoDB consistently outperform MyISAM in these queries by a large margin. It becomes obvious the InnoDB has great advantage over MyISAM in handling large data search. Summary: These results demonstrate the great performance we could achieve by making MySQL optimizer and InnoDB Full-Text Search more tightly coupled. I think there are still many cases that InnoDB’s result info have not been fully taken advantage of, which means we still have great room to improve. And we will continuously explore the area, and get more dramatic results for InnoDB full-text searches. Jimmy Yang, September 29, 2012

    Read the article

  • Gathering statistics for an Oracle WebCenter Content Database

    - by Nicolas Montoya
    Have you ever heard: "My Oracle WebCenter Content instance is running slow. I checked the memory and CPU usage of the application server and it has plenty of resources. What could be going wrong?An Oracle WebCenter Content instance runs on an application server and relies on a database server on the back end. If your application server tier is running fine, chances are that your database server tier may host the root of the problem. While many things could cause performance problems, on active Enterprise Content Management systems, keeping database statistics updated is extremely important.The Oracle Database have a set of built-in optimizer utilities that can help make database queries more efficient. It is strongly recommended to update or re-create the statistics about the physical characteristics of a table and the associated indexes in order to maximize the efficiency of optimizers. These physical characteristics include: Number of records Number of pages Average record length The frequency with which you need to update statistics depends on how quickly the data is changing. Typically, statistics should be updated when the number of new items since the last update is greater than ten percent of the number of items when the statistics were last updated. If a large amount of documents are being added or removed from the system, the a post step should be added to gather statistics upon completion of this massive data change. In some cases, you may need to collect statistics in the middle of the data processing to expedite its execution. These proceses include but are not limited to: data migration, bootstrapping of a new system, records management disposition processing (typically at the end of the calendar year), etc. A DOCUMENTS table with a ten million rows will often generate a very different plan than a table with just a thousand.A quick check of the statistics for the WebCenter Content (WCC) Database could be performed via the below query:SELECT OWNER, TABLE_NAME, NUM_ROWS, BLOCKS, AVG_ROW_LEN,TO_CHAR(LAST_ANALYZED, 'MM/DD/YYYY HH24:MI:SS')FROM DBA_TABLESWHERE TABLE_NAME='DOCUMENTS';OWNER                          TABLE_NAME                       NUM_ROWS------------------------------ ------------------------------ ----------    BLOCKS AVG_ROW_LEN TO_CHAR(LAST_ANALYZ---------- ----------- -------------------ATEAM_OCS                      DOCUMENTS                            4172        46          61 04/06/2012 11:17:51This output will return not only the date when the WCC table DOCUMENTS was last analyzed, but also it will return the <DATABASE SCHEMA OWNER> for this table in the form of <PREFIX>_OCS.This database username could later on be used to check on other objects owned by the WCC <DATABASE SCHEMA OWNER> as shown below:SELECT OWNER, TABLE_NAME, NUM_ROWS, BLOCKS, AVG_ROW_LEN,TO_CHAR(LAST_ANALYZED, 'MM/DD/YYYY HH24:MI:SS')FROM DBA_TABLESWHERE OWNER='ATEAM_OCS'ORDER BY NUM_ROWS ASC;...OWNER                          TABLE_NAME                       NUM_ROWS------------------------------ ------------------------------ ----------    BLOCKS AVG_ROW_LEN TO_CHAR(LAST_ANALYZ---------- ----------- -------------------ATEAM_OCS                      REVISIONS                            2051        46         141 04/09/2012 22:00:22ATEAM_OCS                      DOCUMENTS                            4172        46          61 04/06/2012 11:17:51ATEAM_OCS                      ARCHIVEHISTORY                       4908       244         218 04/06/2012 11:17:49OWNER                          TABLE_NAME                       NUM_ROWS------------------------------ ------------------------------ ----------    BLOCKS AVG_ROW_LEN TO_CHAR(LAST_ANALYZ---------- ----------- -------------------ATEAM_OCS                      DOCUMENTHISTORY                      5865       110          72 04/06/2012 11:17:50ATEAM_OCS                      SCHEDULEDJOBSHISTORY                10131       244         131 04/06/2012 11:17:54ATEAM_OCS                      SCTACCESSLOG                        10204       496         268 04/06/2012 11:17:54...The Oracle Database allows to collect statistics of many different kinds as an aid to improving performance. The DBMS_STATS package is concerned with optimizer statistics only. The database sets automatic statistics collection of this kind on by default, DBMS_STATS package is intended for only specialized cases.The following subprograms gather certain classes of optimizer statistics:GATHER_DATABASE_STATS Procedures GATHER_DICTIONARY_STATS Procedure GATHER_FIXED_OBJECTS_STATS Procedure GATHER_INDEX_STATS Procedure GATHER_SCHEMA_STATS Procedures GATHER_SYSTEM_STATS Procedure GATHER_TABLE_STATS ProcedureThe DBMS_STATS.GATHER_SCHEMA_STATS PL/SQL Procedure gathers statistics for all objects in a schema.DBMS_STATS.GATHER_SCHEMA_STATS (    ownname          VARCHAR2,    estimate_percent NUMBER   DEFAULT to_estimate_percent_type                                                 (get_param('ESTIMATE_PERCENT')),    block_sample     BOOLEAN  DEFAULT FALSE,    method_opt       VARCHAR2 DEFAULT get_param('METHOD_OPT'),   degree           NUMBER   DEFAULT to_degree_type(get_param('DEGREE')),    granularity      VARCHAR2 DEFAULT GET_PARAM('GRANULARITY'),    cascade          BOOLEAN  DEFAULT to_cascade_type(get_param('CASCADE')),    stattab          VARCHAR2 DEFAULT NULL,    statid           VARCHAR2 DEFAULT NULL,    options          VARCHAR2 DEFAULT 'GATHER',    objlist          OUT      ObjectTab,   statown          VARCHAR2 DEFAULT NULL,    no_invalidate    BOOLEAN  DEFAULT to_no_invalidate_type (                                     get_param('NO_INVALIDATE')),  force             BOOLEAN DEFAULT FALSE);There are several values for the OPTIONS parameter that we need to know about: GATHER reanalyzes the whole schema     GATHER EMPTY only analyzes tables that have no existing statistics GATHER STALE only reanalyzes tables with more than 10 percent modifications (inserts, updates,   deletes) GATHER AUTO will reanalyze objects that currently have no statistics and objects with stale statistics. Using GATHER AUTO is like combining GATHER STALE and GATHER EMPTY. Example:exec dbms_stats.gather_schema_stats( -   ownname          => '<PREFIX>_OCS', -   options          => 'GATHER AUTO' -);

    Read the article

  • More SQL Smells

    - by Nick Harrison
    Let's continue exploring some of the SQL Smells from Phil's list. He has been putting together. Datatype mis-matches in predicates that rely on implicit conversion.(Plamen Ratchev) This is a great example poking holes in the whole theory of "If it works it's not broken" Queries will this probably will generally work and give the correct response. In fact, without careful analysis, you probably may be completely oblivious that there is even a problem. This subtle little problem will needlessly complicate queries and slow them down regardless of the indexes applied. Consider this example: CREATE TABLE [dbo].[Page](     [PageId] [int] IDENTITY(1,1) NOT NULL,     [Title] [varchar](75) NOT NULL,     [Sequence] [int] NOT NULL,     [ThemeId] [int] NOT NULL,     [CustomCss] [text] NOT NULL,     [CustomScript] [text] NOT NULL,     [PageGroupId] [int] NOT NULL;  CREATE PROCEDURE PageSelectBySequence ( @sequenceMin smallint , @sequenceMax smallint ) AS BEGIN SELECT [PageId] , [Title] , [Sequence] , [ThemeId] , [CustomCss] , [CustomScript] , [PageGroupId] FROM [CMS].[dbo].[Page] WHERE Sequence BETWEEN @sequenceMin AND @SequenceMax END  Note that the Sequence column is defined as int while the sequence parameter is defined as a small int. The problem is that the database may have to do a lot of type conversions to evaluate the query. In some cases, this may even negate the indexes that you have in place. Using Correlated subqueries instead of a join   (Dave_Levy/ Plamen Ratchev) There are two main problems here. The first is a little subjective, since this is a non-standard way of expressing the query, it is harder to understand. The other problem is much more objective and potentially problematic. You are taking much of the control away from the optimizer. Written properly, such a query may well out perform a corresponding query written with traditional joins. More likely than not, performance will degrade. Whenever you assume that you know better than the optimizer, you will most likely be wrong. This is the fundmental problem with any hint. Consider a query like this:  SELECT Page.Title , Page.Sequence , Page.ThemeId , Page.CustomCss , Page.CustomScript , PageEffectParams.Name , PageEffectParams.Value , ( SELECT EffectName FROM dbo.Effect WHERE EffectId = dbo.PageEffects.EffectId ) AS EffectName FROM Page INNER JOIN PageEffect ON Page.PageId = PageEffects.PageId INNER JOIN PageEffectParam ON PageEffects.PageEffectId = PageEffectParams.PageEffectId  This can and should be written as:  SELECT Page.Title , Page.Sequence , Page.ThemeId , Page.CustomCss , Page.CustomScript , PageEffectParams.Name , PageEffectParams.Value , EffectName FROM Page INNER JOIN PageEffect ON Page.PageId = PageEffects.PageId INNER JOIN PageEffectParam ON PageEffects.PageEffectId = PageEffectParams.PageEffectId INNER JOIN dbo.Effect ON dbo.Effects.EffectId = dbo.PageEffects.EffectId  The correlated query may just as easily show up in the where clause. It's not a good idea in the select clause or the where clause. Few or No comments. This one is a bit more complicated and controversial. All comments are not created equal. Some comments are helpful and need to be included. Other comments are not necessary and may indicate a problem. I tend to follow the rule of thumb that comments that explain why are good. Comments that explain how are bad. Many people may be shocked to hear the idea of a bad comment, but hear me out. If a comment is needed to explain what is going on or how it works, the logic is too complex and needs to be simplified. Comments that explain why are good. Comments may explain why the sql is needed are good. Comments that explain where the sql is used are good. Comments that explain how tables are related should not be needed if the sql is well written. If they are needed, you need to consider reworking the sql or simplify your data model. Use of functions in a WHERE clause. (Anil Das) Calling a function in the where clause will often negate the indexing strategy. The function will be called for every record considered. This will often a force a full table scan on the tables affected. Calling a function will not guarantee that there is a full table scan, but there is a good chance that it will. If you find that you often need to write queries using a particular function, you may need to add a column to the table that has the function already applied.

    Read the article

  • wamp cannot load mysqli extension

    - by localhost
    WAMP installed fine, no problems, BUT... When going to phpMyAdmin, I get the error from phpMyAdmin as follows: "Cannot load mysqli extension. Please check your PHP configuration". Also, phpMyAdmin documentation explains this error message as follows: "To connect to a MySQL server, PHP needs a set of MySQL functions called "MySQL extension". This extension may be part of the PHP distribution (compiled-in), otherwise it needs to be loaded dynamically. Its name is probably mysql.so or php_mysql.dll. phpMyAdmin tried to load the extension but failed. Usually, the problem is solved by installing a software package called "PHP-MySQL" or something similar." Finally, the apache_error.log file has the following PHP warnings (see the mySQL warning): PHP Warning: Zend Optimizer does not support this version of PHP - please upgrade to the latest version of Zend Optimizer in Unknown on line 0 PHP Warning: Zend Platform does not support this version of PHP - please upgrade to the latest version of Zend Platform in Unknown on line 0 PHP Warning: Zend Debug Server does not support this version of PHP - please upgrade to the latest version of Zend Debug Server in Unknown on line 0 PHP Warning: gd wrapper does not support this version of PHP - please upgrade to the latest version of gd wrapper in Unknown on line 0 PHP Warning: java wrapper does not support this version of PHP - please upgrade to the latest version of java wrapper in Unknown on line 0 PHP Warning: mysql wrapper does not support this version of PHP - please upgrade to the latest version of mysql wrapper in Unknown on line 0 So, for some reason PHP is not recognizing the mysql extension. Anyone know why? Any solution or workaround?

    Read the article

  • Sybase stored procedure - how do I create an index on a #table?

    - by DVK
    I have a stored procedure which creates and works with a temporary #table Some of the queries would be tremendously optimized if that temporary #table would have an index created on it. However, creating an index within the stored procedure fails: create procedure test1 as SELECT f1, f2, f3 INTO #table1 FROM main_table WHERE 1 = 2 -- insert rows into #table1 create index my_idx on #table1 (f1) SELECT f1, f2, f3 FROM #table1 (index my_idx) WHERE f1 = 11 -- "QUERY X" When I call the above, the query plan for "QUERY X" shows a table scan. If I simply run the code above outside the stored procedure, the messages show the following warning: Index 'my_idx' specified as optimizer hint in the FROM clause of table '#table1' does not exist. Optimizer will choose another index instead. This can be resolved when running ad-hoc (outside the stored procedure) by splitting the code above in two batches by addding "go" after index creation: create index my_idx on #table1 (f1) go Now, "QUERY X" query plan shows the use of index "my_idx". QUESTION: How do I mimique running the "create index" in a separate batch when it's inside the stored procedure? I can't insert a "go" there like I do with the ad-hoc copy above. P.S. If it matters, this is on Sybase 12.

    Read the article

  • Beginner Question: For extract a large subset of a table from MySQL, how does Indexing, order of tab

    - by chongman
    Sorry if this is too simple, but thanks in advance for helping. This is for MySQL but might be relevant for other RDMBSs tblA has 4 columns: colA, colB, colC, mydata, A_id It has about 10^9 records, with 10^3 distinct values for colA, colB, colC. tblB has 3 columns: colA, colB, B_id It has about 10^4 records. I want all the records from tblA (except the A_id) that have a match in tblB. In other words, I want to use tblB to describe the subset that I want to extract and then extract those records from tblA. Namely: SELECT a.colA, a.colB, a.colC, a.mydata FROM tblA as a INNER JOIN tblB as b ON a.colA=b.colA a.colB=b.colB ; It's taking a really long time (more than an hour) on a newish computer (4GB, Core2Quad, ubuntu), and I just want to check my understanding of the following optimization steps. ** Suppose this is the only query I will ever run on these tables. So ignore the need to run other queries. Now my questions: 1) What indexes should I create to optimize this query? I think I just need a multiple index on (colA, colB) for both tables. I don't think I need separate indexes for colA and colB. Another stack overflow article (that I can't find) mentioned that when adding new indexes, it is slower when there are existing indexes, so that might be a reason to use the multiple index. 2) Is INNER JOIN correct? I just want results where a match is found. 3) Is it faster if I join (tblA to tblB) or the other way around, (tblB to tblA)? This previous answer says that the optimizer should take care of that. 4) Does the order of the part after ON matter? This previous answer say that the optimizer also takes care of the execution order.

    Read the article

  • How to downgrade a certain ubuntu 10.04 package (php) back to karmic

    - by Eugene
    Hi! I've updated from 9.10 to 10.04 but unfortunately the PHP provided with 10.04 is not yet supported by zend optimizer. As far as I understand I need to somehow replace the PHP 5.3 package provided under 10.04 with an older PHP 5.2 package provided under 9.10. However I am not sure whether this is the right way to downgrade PHP and if yes, I don't know how to replace the 10.04 package with 9.10 package. Could you please help me with that?

    Read the article

  • Postgresql Internals - Documentation

    - by NogginTheNog
    I'm looking for some up-to-date information about postgresql internals, specifically the query optimizer. I've found this link (referred to in the "Further Reading" section of the 8.4 docs):- http://db.cs.berkeley.edu//papers/UCB-MS-zfong.pdf but it seems quite old. That in itself is not a problem, but I wanted to be sure that I have information that is relevant. Is this the best resource for understanding how postgresql processes queries (using plans, statistics etc.) or are there others?

    Read the article

  • How to downgrade a certain package (php) back to karmic

    - by Eugene
    Hi! I've updated from 9.10 to 10.04 but unfortunately the PHP provided with 10.04 is not yet supported by zend optimizer. As far as I understand I need to somehow replace the PHP 5.3 package provided under 10.04 with an older PHP 5.2 package provided under 9.10. However I am not sure whether this is the right way to downgrade PHP and if yes, I don't know how to replace the 10.04 package with 9.10 package. Could you please help me with that?

    Read the article

  • SQL SERVER – PAGEIOLATCH_DT, PAGEIOLATCH_EX, PAGEIOLATCH_KP, PAGEIOLATCH_SH, PAGEIOLATCH_UP – Wait Type – Day 9 of 28

    - by pinaldave
    It is very easy to say that you replace your hardware as that is not up to the mark. In reality, it is very difficult to implement. It is really hard to convince an infrastructure team to change any hardware because they are not performing at their best. I had a nightmare related to this issue in a deal with an infrastructure team as I suggested that they replace their faulty hardware. This is because they were initially not accepting the fact that it is the fault of their hardware. But it is really easy to say “Trust me, I am correct”, while it is equally important that you put some logical reasoning along with this statement. PAGEIOLATCH_XX is such a kind of those wait stats that we would directly like to blame on the underlying subsystem. Of course, most of the time, it is correct – the underlying subsystem is usually the problem. From Book On-Line: PAGEIOLATCH_DT Occurs when a task is waiting on a latch for a buffer that is in an I/O request. The latch request is in Destroy mode. Long waits may indicate problems with the disk subsystem. PAGEIOLATCH_EX Occurs when a task is waiting on a latch for a buffer that is in an I/O request. The latch request is in Exclusive mode. Long waits may indicate problems with the disk subsystem. PAGEIOLATCH_KP Occurs when a task is waiting on a latch for a buffer that is in an I/O request. The latch request is in Keep mode. Long waits may indicate problems with the disk subsystem. PAGEIOLATCH_SH Occurs when a task is waiting on a latch for a buffer that is in an I/O request. The latch request is in Shared mode. Long waits may indicate problems with the disk subsystem. PAGEIOLATCH_UP Occurs when a task is waiting on a latch for a buffer that is in an I/O request. The latch request is in Update mode. Long waits may indicate problems with the disk subsystem. PAGEIOLATCH_XX Explanation: Simply put, this particular wait type occurs when any of the tasks is waiting for data from the disk to move to the buffer cache. ReducingPAGEIOLATCH_XX wait: Just like any other wait type, this is again a very challenging and interesting subject to resolve. Here are a few things you can experiment on: Improve your IO subsystem speed (read the first paragraph of this article, if you have not read it, I repeat that it is easy to say a step like this than to actually implement or do it). This type of wait stats can also happen due to memory pressure or any other memory issues. Putting aside the issue of a faulty IO subsystem, this wait type warrants proper analysis of the memory counters. If due to any reasons, the memory is not optimal and unable to receive the IO data. This situation can create this kind of wait type. Proper placing of files is very important. We should check file system for the proper placement of files – LDF and MDF on separate drive, TempDB on separate drive, hot spot tables on separate filegroup (and on separate disk), etc. Check the File Statistics and see if there is higher IO Read and IO Write Stall SQL SERVER – Get File Statistics Using fn_virtualfilestats. It is very possible that there are no proper indexes on the system and there are lots of table scans and heap scans. Creating proper index can reduce the IO bandwidth considerably. If SQL Server can use appropriate cover index instead of clustered index, it can significantly reduce lots of CPU, Memory and IO (considering cover index has much lesser columns than cluster table and all other it depends conditions). You can refer to the two articles’ links below previously written by me that talk about how to optimize indexes. Create Missing Indexes Drop Unused Indexes Updating statistics can help the Query Optimizer to render optimal plan, which can only be either directly or indirectly. I have seen that updating statistics with full scan (again, if your database is huge and you cannot do this – never mind!) can provide optimal information to SQL Server optimizer leading to efficient plan. Checking Memory Related Perfmon Counters SQLServer: Memory Manager\Memory Grants Pending (Consistent higher value than 0-2) SQLServer: Memory Manager\Memory Grants Outstanding (Consistent higher value, Benchmark) SQLServer: Buffer Manager\Buffer Hit Cache Ratio (Higher is better, greater than 90% for usually smooth running system) SQLServer: Buffer Manager\Page Life Expectancy (Consistent lower value than 300 seconds) Memory: Available Mbytes (Information only) Memory: Page Faults/sec (Benchmark only) Memory: Pages/sec (Benchmark only) Checking Disk Related Perfmon Counters Average Disk sec/Read (Consistent higher value than 4-8 millisecond is not good) Average Disk sec/Write (Consistent higher value than 4-8 millisecond is not good) Average Disk Read/Write Queue Length (Consistent higher value than benchmark is not good) Note: The information presented here is from my experience and there is no way that I claim it to be accurate. I suggest reading Book OnLine for further clarification. All of the discussions of Wait Stats in this blog is generic and varies from system to system. It is recommended that you test this on a development server before implementing it to a production server. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, SQL, SQL Authority, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, SQL Wait Stats, SQL Wait Types, T SQL, Technology

    Read the article

  • Design Book–Fourth(last) Section (Physical Abstraction Optimization)

    - by drsql
    In this last section of the book, we will shift focus to the physical abstraction layer optimization. By this I mean the little bits and pieces of the design that is specifically there for performance and are actually part of the relational engine (read: the part of the SQL Server experience that ideally is hidden from you completely, but in 2010 reality it isn’t quite so yet.  This includes all of the data structures like database, files, etc; the optimizer; some coding, etc. In my mind, this...(read more)

    Read the article

  • SQL SERVER – Guest Post by Sandip Pani – SQL Server Statistics Name and Index Creation

    - by pinaldave
    Sometimes something very small or a common error which we observe in daily life teaches us new things. SQL Server Expert Sandip Pani (winner of Joes 2 Pros Contests) has come across similar experience. Sandip has written a guest post on an error he faced in his daily work. Sandip is working for QSI Healthcare as an Associate Technical Specialist and have more than 5 years of total experience. He blogs at SQLcommitted.com and contribute in various forums. His social media hands are LinkedIn, Facebook and Twitter. Once I faced following error when I was working on performance tuning project and attempt to create an Index. Mug 1913, Level 16, State 1, Line 1 The operation failed because an index or statistics with name ‘Ix_Table1_1′ already exists on table ‘Table1′. The immediate reaction to the error was that I might have created that index earlier and when I researched it further I found the same as the index was indeed created two times. This totally makes sense. This can happen due to many reasons for example if the user is careless and executes the same code two times as well, when he attempts to create index without checking if there was index already on the object. However when I paid attention to the details of the error, I realize that error message also talks about statistics along with the index. I got curious if the same would happen if I attempt to create indexes with the same name as statistics already created. There are a few other questions also prompted in my mind. I decided to do a small demonstration of the subject and build following demonstration script. The goal of my experiment is to find out the relation between statistics and the index. Statistics is one of the important input parameter for the optimizer during query optimization process. If the query is nontrivial then only optimizer uses statistics to perform a cost based optimization to select a plan. For accuracy and further learning I suggest to read MSDN. Now let’s find out the relationship between index and statistics. We will do the experiment in two parts. i) Creating Index ii) Creating Statistics We will be using the following T-SQL script for our example. IF (OBJECT_ID('Table1') IS NOT NULL) DROP TABLE Table1 GO CREATE TABLE Table1 (Col1 INT NOT NULL, Col2 VARCHAR(20) NOT NULL) GO We will be using following two queries to check if there are any index or statistics on our sample table Table1. -- Details of Index SELECT OBJECT_NAME(OBJECT_ID) AS TableName, Name AS IndexName, type_desc FROM sys.indexes WHERE OBJECT_NAME(OBJECT_ID) = 'table1' GO -- Details of Statistics SELECT OBJECT_NAME(OBJECT_ID) TableName, Name AS StatisticsName FROM sys.stats WHERE OBJECT_NAME(OBJECT_ID) = 'table1' GO When I ran above two scripts on the table right after it was created it did not give us any result which was expected. Now let us begin our test. 1) Create an index on the table Create following index on the table. CREATE NONCLUSTERED INDEX Ix_Table1_1 ON Table1(Col1) GO Now let us use above two scripts and see their results. We can see that when we created index at the same time it created statistics also with the same name. Before continuing to next set of demo – drop the table using following script and re-create the table using a script provided at the beginning of the table. DROP TABLE table1 GO 2) Create a statistic on the table Create following statistics on the table. CREATE STATISTICS Ix_table1_1 ON Table1 (Col1) GO Now let us use above two scripts and see their results. We can see that when we created statistics Index is not created. The behavior of this experiment is different from the earlier experiment. Clean up the table setup using the following script: DROP TABLE table1 GO Above two experiments teach us very valuable lesson that when we create indexes, SQL Server generates the index and statistics (with the same name as the index name) together. Now due to the reason if we have already had statistics with the same name but not the index, it is quite possible that we will face the error to create the index even though there is no index with the same name. A Quick Check To validate that if we create statistics first and then index after that with the same name, it will throw an error let us run following script in SSMS. Make sure to drop the table and clean up our sample table at the end of the experiment. -- Create sample table CREATE TABLE TestTable (Col1 INT NOT NULL, Col2 VARCHAR(20) NOT NULL) GO -- Create Statistics CREATE STATISTICS IX_TestTable_1 ON TestTable (Col1) GO -- Create Index CREATE NONCLUSTERED INDEX IX_TestTable_1 ON TestTable(Col1) GO -- Check error /*Msg 1913, Level 16, State 1, Line 2 The operation failed because an index or statistics with name 'IX_TestTable_1' already exists on table 'TestTable'. */ -- Clean up DROP TABLE TestTable GO While creating index it will throw the following error as statistics with the same name is already created. In simple words – when we create index the name of the index should be different from any of the existing indexes and statistics. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Error Messages, SQL Index, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology Tagged: SQL Statistics

    Read the article

  • Data Warehouse Best Practices

    - by jean-pierre.dijcks
    In our quest to share our endless wisdom (ahem…) one of the things we figured might be handy is recording some of the best practices for data warehousing. And so we did. And, we did some more… We now have recreated our websites on Oracle Technology Network and have a separate page for best practices, parallelism and other cool topics related to data warehousing. But the main topic of this post is the set of recorded best practices. Here is what is available (and it is a series that ties together but can be read independently), applicable for almost any database version: Partitioning 3NF schema design for a data warehouse Star schema design Data Loading Parallel Execution Optimizer and Stats management The best practices page has a lot of other useful information so have a look here.

    Read the article

  • MySQL Connect and OurSQL Interview

    - by Keith Larson
    In the latest episode of our "Meet The MySQL Experts" podcast, I had the pleasure of being able to interview the hosts of the OurSQL podcast, Sheeri Cabral of Mozilla and Gerry Narvaja of Tokutek, about the upcoming MySQL Connect Conference.  Enjoy the podcast ! MySQL Connect Blog posts: MySQL Connect: New Keynote Announced MySQL Connect: Sessions From Users and Customers MySQL Connect: Some Fun Stuff! MySQL Connect: Replication Sessions MySQL Connect: Optimizer Sessions MySQL Connect: Focus on InnoDB Sessions Interview with Ronald Bradford about MySQL Connect Interview with Sarah Novotny about MySQL Connect Interview with Giuseppe Maxia "the datacharmer" about MySQL Connect Interview with Lenz Grimmer about MySQL Connect Plan Your MySQL Connect Conference With Schedule Builder You can check out the full program here as well as in the September edition of the MySQL newsletter. Not registered yet? You can still save US$ 300 over the on-site fee – Register Now!

    Read the article

  • Joins in single-table queries

    - by Rob Farley
    Tables are only metadata. They don’t store data. I’ve written something about this before, but I want to take a viewpoint of this idea around the topic of joins, especially since it’s the topic for T-SQL Tuesday this month. Hosted this time by Sebastian Meine (@sqlity), who has a whole series on joins this month. Good for him – it’s a great topic. In that last post I discussed the fact that we write queries against tables, but that the engine turns it into a plan against indexes. My point wasn’t simply that a table is actually just a Clustered Index (or heap, which I consider just a special type of index), but that data access always happens against indexes – never tables – and we should be thinking about the indexes (specifically the non-clustered ones) when we write our queries. I described the scenario of looking up phone numbers, and how it never really occurs to us that there is a master list of phone numbers, because we think in terms of the useful non-clustered indexes that the phone companies provide us, but anyway – that’s not the point of this post. So a table is metadata. It stores information about the names of columns and their data types. Nullability, default values, constraints, triggers – these are all things that define the table, but the data isn’t stored in the table. The data that a table describes is stored in a heap or clustered index, but it goes further than this. All the useful data is going to live in non-clustered indexes. Remember this. It’s important. Stop thinking about tables, and start thinking about indexes. So let’s think about tables as indexes. This applies even in a world created by someone else, who doesn’t have the best indexes in mind for you. I’m sure you don’t need me to explain Covering Index bit – the fact that if you don’t have sufficient columns “included” in your index, your query plan will either have to do a Lookup, or else it’ll give up using your index and use one that does have everything it needs (even if that means scanning it). If you haven’t seen that before, drop me a line and I’ll run through it with you. Or go and read a post I did a long while ago about the maths involved in that decision. So – what I’m going to tell you is that a Lookup is a join. When I run SELECT CustomerID FROM Sales.SalesOrderHeader WHERE SalesPersonID = 285; against the AdventureWorks2012 get the following plan: I’m sure you can see the join. Don’t look in the query, it’s not there. But you should be able to see the join in the plan. It’s an Inner Join, implemented by a Nested Loop. It’s pulling data in from the Index Seek, and joining that to the results of a Key Lookup. It clearly is – the QO wouldn’t call it that if it wasn’t really one. It behaves exactly like any other Nested Loop (Inner Join) operator, pulling rows from one side and putting a request in from the other. You wouldn’t have a problem accepting it as a join if the query were slightly different, such as SELECT sod.OrderQty FROM Sales.SalesOrderHeader AS soh JOIN Sales.SalesOrderDetail as sod on sod.SalesOrderID = soh.SalesOrderID WHERE soh.SalesPersonID = 285; Amazingly similar, of course. This one is an explicit join, the first example was just as much a join, even thought you didn’t actually ask for one. You need to consider this when you’re thinking about your queries. But it gets more interesting. Consider this query: SELECT SalesOrderID FROM Sales.SalesOrderHeader WHERE SalesPersonID = 276 AND CustomerID = 29522; It doesn’t look like there’s a join here either, but look at the plan. That’s not some Lookup in action – that’s a proper Merge Join. The Query Optimizer has worked out that it can get the data it needs by looking in two separate indexes and then doing a Merge Join on the data that it gets. Both indexes used are ordered by the column that’s indexed (one on SalesPersonID, one on CustomerID), and then by the CIX key SalesOrderID. Just like when you seek in the phone book to Farley, the Farleys you have are ordered by FirstName, these seek operations return the data ordered by the next field. This order is SalesOrderID, even though you didn’t explicitly put that column in the index definition. The result is two datasets that are ordered by SalesOrderID, making them very mergeable. Another example is the simple query SELECT CustomerID FROM Sales.SalesOrderHeader WHERE SalesPersonID = 276; This one prefers a Hash Match to a standard lookup even! This isn’t just ordinary index intersection, this is something else again! Just like before, we could imagine it better with two whole tables, but we shouldn’t try to distinguish between joining two tables and joining two indexes. The Query Optimizer can see (using basic maths) that it’s worth doing these particular operations using these two less-than-ideal indexes (because of course, the best indexese would be on both columns – a composite such as (SalesPersonID, CustomerID – and it would have the SalesOrderID column as part of it as the CIX key still). You need to think like this too. Not in terms of excusing single-column indexes like the ones in AdventureWorks2012, but in terms of having a picture about how you’d like your queries to run. If you start to think about what data you need, where it’s coming from, and how it’s going to be used, then you will almost certainly write better queries. …and yes, this would include when you’re dealing with regular joins across multiples, not just against joins within single table queries.

    Read the article

  • Interview with Ronald Bradford about MySQL Connect

    - by Keith Larson
    Ronald Bradford,  an Oracle ACE Director has been busy working with  database consulting, book writing (EffectiveMySQL) while traveling and speaking around the world in support of MySQL. I was able to take some of his time to get an interview on this thoughts about theMySQL Connect conference. Keith Larson: What where your thoughts when you heard that Oracle was going to provide the community the MySQL Conference ?Ronald Bradford: Oracle has already been providing various different local community events including OTN Tech Days and  MySQL community days. These are great for local regions both in the US and abroad.  In previous years there has been an increase of content at Oracle Open World, however that benefits the Oracle community far more then the MySQL community.  It is good to see that Oracle is realizing the benefit in providing a large scale dedicated event for the MySQL community that includes speakers from the MySQL development teams, invested companies in the ecosystem and other community evangelists.I fully expect a successful event and look forward to hopefully seeing MySQL Connect at the upcoming Brazil and Japan OOW conferences and perhaps an event on the East Coast.Keith Larson: Since you are part of the content committee, what did you think of the submissions that were received during call for papers?Ronald Bradford: There was a large number of quality submissions to the number of available presentation sessions. As with the previous years as a committee member for the annual MySQL conference, there is always a large variety of common cornerstone MySQL features as well as new products and upcoming companies sharing their MySQL experiences. All of the usual major players in the ecosystem will in presenting at MySQL Connect including Facebook, Twitter, Yahoo, Continuent, Percona, Tokutek, Sphinx and Amazon to name a few.  This is ensuring the event will have a large number of quality speakers and a difficult time in choosing what to attend. Keith Larson: What sessions do you look forwarding to attending? Ronald Bradford: As with most quality conferences you can only be in one place at one time, so with multiple tracks per session it is always difficult to decide. The continued work and success with MySQL Cluster, and with a number of sessions I am sure will be popular. The features that interest me the most are around the optimizer, where there are several sessions on new features, and on the importance of backups. There are three presentations in this area to choose from.Keith Larson: Are you going to cover any of the content in your books at your MySQL Connect sessions?Ronald Bradford: I will be giving two presentations at MySQL Connect. The first will include the techniques available for creating better indexes where I will be touching on some aspects of the first Effective MySQL book on Optimizing SQL Statements.  In my second presentation from experiences of managing 500+ AWS MySQL instances, I will be touching on areas including SQL tuning, backup and recovery and scale out with replication.   These are the key topics of the initial books in the Effective MySQL series that focus on performance, scalability and business continuity.  The books however cover a far greater amount of detail then can be presented in a 1 hour session. Keith Larson: What features of MySQL 5.6 do you look forward to the most ?Ronald Bradford: I am very impressed with the optimizer trace feature. The ability to see exposed information is invaluable not just for MySQL 5.6, but to also apply information discerned for optimizing SQL statements in earlier versions of MySQL.  Not everybody understands that it is easy to deploy a MySQL 5.6 slave into an existing topology running an older version if MySQL for evaluation of many new features.  You can use the new mysqlbinlog streaming feature for duplicating master binary logs on an older version with a MySQL 5.6 slave.  The improvements in instrumentation in the Performance Schema are exciting.   However, as with my upcoming Replication Techniques in Depth title, that will be available for sale at MySQL Connect, there are numerous replication features, some long overdue with provide significant management benefits. Crash Save Slaves, Global transaction Identifiers (GTID)  and checksums just to mention a few.Keith Larson: You have been to numerous conferences, what would you recommend for people at the conference? Ronald Bradford: Make the time to meet and introduce yourself to the speakers that cover the topics that most interest you. The MySQL ecosystem has a very strong community.  The relationships you build with presenters, developers and architects in MySQL can be invaluable, however they are created over time. Get to know these people, interact with them over time.  This is the opportunity to learn more then just the content from a 1 hour session. Keith Larson: Any additional tips to handling the long hours ? Ronald Bradford: Conferences can be hard, especially with all the post event drinking.  This is a two day event and I am sure will include additional events on Friday and Saturday night so come well prepared, and leave work behind. Take the time to learn something new.   You can always catchup on sleep later. Keith Larson: Thank you so much for taking some time to do this I look forward to seeing you at the MySQL Connect conference.  Please stay tuned here for more updates on MySQL. 

    Read the article

  • Oracle Expert Live Virtual Seminars - Learn the tricks that only the expert know

    - by rituchhibber
    Oracle University Expert Seminars are exclusive events delivered by top Oracle experts with years of experience in working with Oracle products.         Introduction into ADF & BPM with Markus Grünewald - 11-12 December 2012 ADF/WebCenter 11g Development in Depth with Andrejus Baranovskis - 13-14 December 2012 Beating the Optimizer with Jonathan Lewis - Online - 17 January 2013 RAC Performance Tuning On-Line with Arup Nanda - 25 January 2013 Mastering Oracle Parallel Execution with Randolf Geist - 30 January 2013 Minimize Downtime with Rolling Upgrade using Data Guard with Uwe Hesse - 8 February 2013 For a full list of Oracle Expert Seminars near you or on line click here. Remember that your OPN discount is applied to the standard prices shown on the website.For more information, assistance in booking and to request new dates, contact your local Oracle University Service Desk.

    Read the article

  • NEW CERTIFICATION: Oracle Certified Expert, Oracle Database 11g Release 2 SQL Tuning

    - by Brandye Barrington
    Oracle Certification announces the release of the new Oracle Certified Expert, Oracle Database 11g Release 2 SQL Tuning certification. This certification is designed forDevelopers, Database Administrators and SQL developers who are proficient at tuning efficient SQL statements. This certification covers topics on core elements such as: identifying and tuning inefficient SQL statements, using automatic SQL tuning, managing optimizer statistics on database objects, implementing partitioning and analyizing queries. Beta testing for the Oracle Database 11g Release 2: SQL Tuning exam (1Z1-117) is now underway and thus is available at the greatly discounted rate of $50 USD. Visit pearsonvue.com/oracle and register for exam 1Z1-117. You can get all preparation details on the Oracle Certification website, including exam objectives, number of questions, time allotments, and pricing. QUICK LINKS: Certification Track: Oracle Certified Expert, Oracle Database 11g Release 2 SQL Tuning Certification Exam: Oracle Database 11g Release 2: SQL Tuning (1Z0-117) Certification Website: About Beta Exams Register Now: Pearson VUE

    Read the article

  • Developing Schema Compare for Oracle (Part 6): 9i Query Performance

    - by Simon Cooper
    All throughout the EAP and beta versions of Schema Compare for Oracle, our main request was support for Oracle 9i. After releasing version 1.0 with support for 10g and 11g, our next step was then to get version 1.1 of SCfO out with support for 9i. However, there were some significant problems that we had to overcome first. This post will concentrate on query execution time. When we first tested SCfO on a 9i server, after accounting for various changes to the data dictionary, we found that database registration was taking a long time. And I mean a looooooong time. The same database that on 10g or 11g would take a couple of minutes to register would be taking upwards of 30 mins on 9i. Obviously, this is not ideal, so a poke around the query execution plans was required. As an example, let's take the table population query - the one that reads ALL_TABLES and joins it with a few other dictionary views to get us back our list of tables. On 10g, this query takes 5.6 seconds. On 9i, it takes 89.47 seconds. The difference in execution plan is even more dramatic - here's the (edited) execution plan on 10g: -------------------------------------------------------------------------------| Id | Operation | Name | Bytes | Cost |-------------------------------------------------------------------------------| 0 | SELECT STATEMENT | | 108K| 939 || 1 | SORT ORDER BY | | 108K| 939 || 2 | NESTED LOOPS OUTER | | 108K| 938 ||* 3 | HASH JOIN RIGHT OUTER | | 103K| 762 || 4 | VIEW | ALL_EXTERNAL_LOCATIONS | 2058 | 3 ||* 20 | HASH JOIN RIGHT OUTER | | 73472 | 759 || 21 | VIEW | ALL_EXTERNAL_TABLES | 2097 | 3 ||* 34 | HASH JOIN RIGHT OUTER | | 39920 | 755 || 35 | VIEW | ALL_MVIEWS | 51 | 7 || 58 | NESTED LOOPS OUTER | | 39104 | 748 || 59 | VIEW | ALL_TABLES | 6704 | 668 || 89 | VIEW PUSHED PREDICATE | ALL_TAB_COMMENTS | 2025 | 5 || 106 | VIEW | ALL_PART_TABLES | 277 | 11 |------------------------------------------------------------------------------- And the same query on 9i: -------------------------------------------------------------------------------| Id | Operation | Name | Bytes | Cost |-------------------------------------------------------------------------------| 0 | SELECT STATEMENT | | 16P| 55G|| 1 | SORT ORDER BY | | 16P| 55G|| 2 | NESTED LOOPS OUTER | | 16P| 862M|| 3 | NESTED LOOPS OUTER | | 5251G| 992K|| 4 | NESTED LOOPS OUTER | | 4243M| 2578 || 5 | NESTED LOOPS OUTER | | 2669K| 1440 ||* 6 | HASH JOIN OUTER | | 398K| 302 || 7 | VIEW | ALL_TABLES | 342K| 276 || 29 | VIEW | ALL_MVIEWS | 51 | 20 ||* 50 | VIEW PUSHED PREDICATE | ALL_TAB_COMMENTS | 2043 | ||* 66 | VIEW PUSHED PREDICATE | ALL_EXTERNAL_TABLES | 1777K| ||* 80 | VIEW PUSHED PREDICATE | ALL_EXTERNAL_LOCATIONS | 1744K| ||* 96 | VIEW | ALL_PART_TABLES | 852K| |------------------------------------------------------------------------------- Have a look at the cost column. 10g's overall query cost is 939, and 9i is 55,000,000,000 (or more precisely, 55,496,472,769). It's also having to process far more data. What on earth could be causing this huge difference in query cost? After trawling through the '10g New Features' documentation, we found item 1.9.2.21. Before 10g, Oracle advised that you do not collect statistics on data dictionary objects. From 10g, it advised that you do collect statistics on the data dictionary; for our queries, Oracle therefore knows what sort of data is in the dictionary tables, and so can generate an efficient execution plan. On 9i, no statistics are present on the system tables, so Oracle has to use the Rule Based Optimizer, which turns most LEFT JOINs into nested loops. If we force 9i to use hash joins, like 10g, we get a much better plan: -------------------------------------------------------------------------------| Id | Operation | Name | Bytes | Cost |-------------------------------------------------------------------------------| 0 | SELECT STATEMENT | | 7587K| 3704 || 1 | SORT ORDER BY | | 7587K| 3704 ||* 2 | HASH JOIN OUTER | | 7587K| 822 ||* 3 | HASH JOIN OUTER | | 5262K| 616 ||* 4 | HASH JOIN OUTER | | 2980K| 465 ||* 5 | HASH JOIN OUTER | | 710K| 432 ||* 6 | HASH JOIN OUTER | | 398K| 302 || 7 | VIEW | ALL_TABLES | 342K| 276 || 29 | VIEW | ALL_MVIEWS | 51 | 20 || 50 | VIEW | ALL_PART_TABLES | 852K| 104 || 78 | VIEW | ALL_TAB_COMMENTS | 2043 | 14 || 93 | VIEW | ALL_EXTERNAL_LOCATIONS | 1744K| 31 || 106 | VIEW | ALL_EXTERNAL_TABLES | 1777K| 28 |------------------------------------------------------------------------------- That's much more like it. This drops the execution time down to 24 seconds. Not as good as 10g, but still an improvement. There are still several problems with this, however. 10g introduced a new join method - a right outer hash join (used in the first execution plan). The 9i query optimizer doesn't have this option available, so forcing a hash join means it has to hash the ALL_TABLES table, and furthermore re-hash it for every hash join in the execution plan; this could be thousands and thousands of rows. And although forcing hash joins somewhat alleviates this problem on our test systems, there's no guarantee that this will improve the execution time on customers' systems; it may even increase the time it takes (say, if all their tables are partitioned, or they've got a lot of materialized views). Ideally, we would want a solution that provides a speedup whatever the input. To try and get some ideas, we asked some oracle performance specialists to see if they had any ideas or tips. Their recommendation was to add a hidden hook into the product that allowed users to specify their own query hints, or even rewrite the queries entirely. However, we would prefer not to take that approach; as well as a lot of new infrastructure & a rewrite of the population code, it would have meant that any users of 9i would have to spend some time optimizing it to get it working on their system before they could use the product. Another approach was needed. All our population queries have a very specific pattern - a base table provides most of the information we need (ALL_TABLES for tables, or ALL_TAB_COLS for columns) and we do a left join to extra subsidiary tables that fill in gaps (for instance, ALL_PART_TABLES for partition information). All the left joins use the same set of columns to join on (typically the object owner & name), so we could re-use the hash information for each join, rather than re-hashing the same columns for every join. To allow us to do this, along with various other performance improvements that could be done for the specific query pattern we were using, we read all the tables individually and do a hash join on the client. Fortunately, this 'pure' algorithmic problem is the kind that can be very well optimized for expected real-world situations; as well as storing row data we're not using in the hash key on disk, we use very specific memory-efficient data structures to store all the information we need. This allows us to achieve a database population time that is as fast as on 10g, and even (in some situations) slightly faster, and a memory overhead of roughly 150 bytes per row of data in the result set (for schemas with 10,000 tables in that means an extra 1.4MB memory being used during population). Next: fun with the 9i dictionary views.

    Read the article

  • Joining on NULLs

    - by Dave Ballantyne
    A problem I see on a fairly regular basis is that of dealing with NULL values.  Specifically here, where we are joining two tables on two columns, one of which is ‘optional’ ie is nullable.  So something like this: i.e. Lookup where all the columns are equal, even when NULL.   NULL’s are a tricky thing to initially wrap your mind around.  Statements like “NULL is not equal to NULL and neither is it not not equal to NULL, it’s NULL” can cause a serious brain freeze and leave you a gibbering wreck and needing your mummy. Before we plod on, time to setup some data to demo against. Create table #SourceTable ( Id integer not null, SubId integer null, AnotherCol char(255) not null ) go create unique clustered index idxSourceTable on #SourceTable(id,subID) go with cteNums as ( select top(1000) number from master..spt_values where type ='P' ) insert into #SourceTable select Num1.number,nullif(Num2.number,0),'SomeJunk' from cteNums num1 cross join cteNums num2 go Create table #LookupTable ( Id integer not null, SubID integer null ) go insert into #LookupTable Select top(100) id,subid from #SourceTable where subid is not null order by newid() go insert into #LookupTable Select top(3) id,subid from #SourceTable where subid is null order by newid() If that has run correctly, you will have 1 million rows in #SourceTable and 103 rows in #LookupTable.  We now want to join one to the other. First attempt – Lets just join select * from #SourceTable join #LookupTable on #LookupTable.id = #SourceTable.id and #LookupTable.SubID = #SourceTable.SubID OK, that’s a fail.  We had 100 rows back,  we didn’t correctly account for the 3 rows that have null values.  Remember NULL <> NULL and the join clause specifies SUBID=SUBID, which for those rows is not true. Second attempt – Lets deal with those pesky NULLS select * from #SourceTable join #LookupTable on #LookupTable.id = #SourceTable.id and isnull(#LookupTable.SubID,0) = isnull(#SourceTable.SubID,0) OK, that’s the right result, well done and 99.9% of the time that is where its left. It is a relatively trivial CPU overhead to wrap ISNULL around both columns and compare that result, so no problems.  But, although that’s true, this a relational database we are using here, not a procedural language.  SQL is a declarative language, we are making a request to the engine to get the results we want.  How we ask for them can make a ton of difference. Lets look at the plan for our second attempt, specifically the clustered index seek on the #SourceTable   There are 2 predicates. The ‘seek predicate’ and ‘predicate’.  The ‘seek predicate’ describes how SQLServer has been able to use an Index.  Here, it has been able to navigate the index to resolve where ID=ID.  So far so good, but what about the ‘predicate’ (aka residual probe) ? This is a row-by-row operation.  For each row found in the index matching the Seek Predicate, the leaf level nodes have been scanned and tested using this logical condition.  In this example [Expr1007] is the result of the IsNull operation on #LookupTable and that is tested for equality with the IsNull operation on #SourceTable.  This residual probe is quite a high overhead, if we can express our statement slightly differently to take full advantage of the index and make the test part of the ‘Seek Predicate’. Third attempt – X is null and Y is null So, lets state the query in a slightly manner: select * from #SourceTable join #LookupTable on #LookupTable.id = #SourceTable.id and ( #LookupTable.SubID = #SourceTable.SubID or (#LookupTable.SubID is null and #SourceTable.SubId is null) ) So its slightly wordier and may not be as clear in its intent to the human reader, that is what comments are for, but the key point is that it is now clearer to the query optimizer what our intention is. Let look at the plan for that query, again specifically the index seek operation on #SourceTable No ‘predicate’, just a ‘Seek Predicate’ against the index to resolve both ID and SubID.  A subtle difference that can be easily overlooked.  But has it made a difference to the performance ? Well, yes , a perhaps surprisingly high one. Clever query optimizer well done. If you are using a scalar function on a column, you a pretty much guaranteeing that a residual probe will be used.  By re-wording the query you may well be able to avoid this and use the index completely to resolve lookups. In-terms of performance and scalability your system will be in a much better position if you can.

    Read the article

  • How would i rank my keywords in Yahoo search engine?

    - by user1430715
    I am working as search engine optimizer team lead in a company and facing problem in a project which name is http://www.Prooftech.com.sg... Problem :- The Website has 10 keywords for which my client wanted the top 10 Ranking in Yahoo Singapore search engine. I have got top 10 ranking for the following 7 keywords Waterproofing, RC Roof ,Wall Leakages ,Ceiling Leakages , Water Leakages ,Roof Tile Coating ,Roof Tiles Repair in my 3 months work but still i am not getting the listing positions for Roof ,Concrete Repair ,Grouting .... I have Done lot of Bookmarking ,Blog Commenting ,Blog Creations ,Press Release,Classified Ads to get these 3 keywords in listing but there is no changes in the results.... Can any help me out from this problem so i can get Good rankings for Roof ,Concrete Repair ,Grouting

    Read the article

  • New "How do I ..." series

    - by Maria Colgan
    Over the last year or so the Optimizer development team has presented at a number of conferences and we got a lot of questions that start with "How do I ...". Where people were looking for a specific command or set of steps to fix a problem they had encountered. So we thought it would be a good idea to create a series of small posts that deal with these "How do I" question directly. We will use a simple example each time, that shows exactly what commands and procedures should be used to address a given problem. If you have an interesting "How do I .." question you would like to see us answer on the blog please email me and we will do our best to answer them! Watch out for the first post in this series which addresses the problem of "How do I deal with a third party application that has embedded hints that result in a sub-optimal execution plan in my environment?"

    Read the article

  • Web optimization

    - by hmloo
    1. CSS Optimization Organize your CSS code Good CSS organization helps with future maintainability of the site, it helps you and your team member understand the CSS more quickly and jump to specific styles. Structure CSS code For small project, you can break your CSS code in separate blocks according to the structure of the page or page content. for example you can break your CSS document according the content of your web page(e.g. Header, Main Content, Footer) Structure CSS file For large project, you may feel having too much CSS code in one place, so it's the best to structure your CSS into more CSS files, and use a master style sheet to import these style sheets. this solution can not only organize style structure, but also reduce server request./*--------------Master style sheet--------------*/ @import "Reset.css"; @import "Structure.css"; @import "Typography.css"; @import "Forms.css"; Create index for your CSS Another important thing is to create index at the beginning of your CSS file, index can help you quickly understand the whole CSS structure./*---------------------------------------- 1. Header 2. Navigation 3. Main Content 4. Sidebar 5. Footer ------------------------------------------*/ Writing efficient CSS selectors keep in mind that browsers match CSS selectors from right to left and the order of efficiency for selectors 1. id (#myid) 2. class (.myclass) 3. tag (div, h1, p) 4. adjacent sibling (h1 + p) 5. child (ul > li) 6. descendent (li a) 7. universal (*) 8. attribute (a[rel="external"]) 9. pseudo-class and pseudo element (a:hover, li:first) the rightmost selector is called "key selector", so when you write your CSS code, you should choose more efficient key selector. Here are some best practice: Don't tag-qualify Never do this:div#myid div.myclass .myclass#myid IDs are unique, classes are more unique than a tag so they don't need a tag. Doing so makes the selector less efficient. Avoid overqualifying selectors for example#nav a is more efficient thanul#nav li a Don't repeat declarationExample: body {font-size:12px;}h1 {font-size:12px;font-weight:bold;} since h1 is already inherited from body, so you don't need to repeate atrribute. Using 0 instead of 0px Always using #selector { margin: 0; } There’s no need to include the px after 0, removing all those superfluous px can reduce the size of your CSS file. Group declaration Example: h1 { font-size: 16pt; } h1 { color: #fff; } h1 { font-family: Arial, sans-serif; } it’s much better to combine them:h1 { font-size: 16pt; color: #fff; font-family: Arial, sans-serif; } Group selectorsExample: h1 { color: #fff; font-family: Arial, sans-serif; } h2 { color: #fff; font-family: Arial, sans-serif; } it would be much better if setup as:h1, h2 { color: #fff; font-family: Arial, sans-serif; } Group attributeExample: h1 { color: #fff; font-family: Arial, sans-serif; } h2 { color: #fff; font-family: Arial, sans-serif; font-size: 16pt; } you can set different rules for specific elements after setting a rule for a grouph1, h2 { color: #fff; font-family: Arial, sans-serif; } h2 { font-size: 16pt; } Using Shorthand PropertiesExample: #selector { margin-top: 8px; margin-right: 4px; margin-bottom: 8px; margin-left: 4px; }Better: #selector { margin: 8px 4px 8px 4px; }Best: #selector { margin: 8px 4px; } a good diagram illustrated how shorthand declarations are interpreted depending on how many values are specified for margin and padding property. instead of using:#selector { background-image: url(”logo.png”); background-position: top left; background-repeat: no-repeat; } is used:#selector { background: url(logo.png) no-repeat top left; } 2. Image Optimization Image Optimizer Image Optimizer is a free Visual Studio2010 extension that optimizes PNG, GIF and JPG file sizes without quality loss. It uses SmushIt and PunyPNG for the optimization. Just right click on any folder or images in Solution Explorer and choose optimize images, then it will automatically optimize all PNG, GIF and JPEG files in that folder. CSS Image Sprites CSS Image Sprites are a way to combine a collection of images to a single image, then use CSS background-position property to shift the visible area to show the required image, many images can take a long time to load and generates multiple server requests, so Image Sprite can reduce the number of server requests and improve site performance. You can use many online tools to generate your image sprite and CSS, and you can also try the Sprite and Image Optimization framework released by The ASP.NET team.

    Read the article

  • Continuation - Viewing FIRST_ROWS before query completes.

    - by Frank Developer
    I have identified the query constructs my users normally use. would it make sense for me to create composite indexes to support those constructs and provide FIRST_ROWS capability? If I migrate from SE to IDS, I will loose the ability to write low-level functions with c-isam calls, but gain FIRST_ROWS along with other goodies like: SET-READS for index scans (onconfig USE_[KO]BATCHEDREAD), optimizer directives, parallel queries, etc?

    Read the article

  • SQL join: where clause vs. on clause

    - by BCS
    After reading it, this is not a duplicate of Explicit vs Implicit SQL Joins. The answer may be related (or even the same) but the question is different. What is the difference and what should go in each? If I understand the theory correctly, the query optimizer should be able to use both interchangeably.

    Read the article

< Previous Page | 3 4 5 6 7 8 9 10 11 12  | Next Page >