database performance - Page 10

SQL SERVER – Speed Up! – Parallel Processes and Unparalleled Performance – TechEd 2012 India

- by pinaldave

TechEd India 2012 is just around the corner and I will be presenting there on two different session. SQL Server Performance Tuning is a very challenging subject that requires expertise in Database Administration and Database Development. I always have enjoyed talking about SQL Server Performance tuning subject. Just like doctors I like to call my every attempt to improve the performance of SQL Server queries and database server as a practice too. I have been working with SQL Server for more than 8 years and I believe that many of the performance tuning concept I have mastered. However, performance tuning is not a simple subject. However there are occasions when I feel stumped, there are occasional when I am not sure what should be the next step. When I face situation where I cannot figure things out easily, it makes me most happy because I clearly see this as a learning opportunity. I have been presenting in TechEd India for last three years. This is my fourth time opportunity to present a technical session on SQL Server. Just like every other year, I decided to present something different, something which I have spend years of learning. This time, I am going to present about parallel processes. It is widely believed that more the CPU will improve performance of the server. It is true in many cases. However, there are cases when limiting the CPU usages have improved overall health of the server. I will be presenting on the subject of Parallel Processes and its effects. I have spent more than a year working on this subject only. After working on various queries on multi-CPU systems I have personally learned few things. In coming TechEd session, I am going to share my experience with parallel processes and performance tuning. Session Details Title: Speed Up! – Parallel Processes and Unparalleled Performance (Add to Calendar) Abstract: “More CPU More Performance” – A very common understanding is that usage of multiple CPUs can improve the performance of the query. To get maximum performance out of any query – one has to master various aspects of the parallel processes. In this deep dive session, we will explore this complex subject with a very simple interactive demo. An attendee will walk away with proper understanding of CX_PACKET wait types, MAXDOP, parallelism threshold and various other concepts. Date and Time: March 23, 2012, 12:15 to 13:15 Location: Hotel Lalit Ashok - Kumara Krupa High Grounds, Bengaluru – 560001, Karnataka, India. Add to Calendar Please submit your questions in the comments area and I will be for sure discussing them during my session. If I pick your question to discuss during my session, here is your gift I commit right now – SQL Server Interview Questions and Answers Book. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Performance, SQL Query, SQL Server, SQL Tips and Tricks, SQL Wait Stats, SQL Wait Types, T SQL, Technology Tagged: TechEd, TechEdIn

Read the article

SQL server virtual memory usage and performance

- by user365035

Hello, I have a very large DB used mostly for analytics. The performance overall is very sluggish. I just noticed that when running the query below, the amount of virtual memory used greatly exceeds the amount of physical memory available. Currently, physical memory is 10GB (10238k bytes) whereas the virtual memory returns significantly more - 8388607k bytes. That seems really wrong, but I'm at a bit of a loss on how to proceed. USE [master]; GO select cpu_count , hyperthread_ratio , physical_memory_in_bytes / 1048576 as 'mem_MB' , virtual_memory_in_bytes / 1048576 as 'virtual_mem_MB' , max_workers_count , os_error_mode , os_priority_class from sys.dm_os_sys_info

Read the article

Oracle Database 12c ????????(?????)

- by OTN-J Master

Oracle Database 12c????????????? ?????????????????????????????????????¦ ??????????????? ????????? Oracle Database 12c????OTN??? Oracle Database 12c???? ¦ Oracle Database 12c???????????????? Oracle Database 12c ?????? (PDF)¦ Oracle Database 12c???????????? ??????????????? Oracle Database 12c (??????????????) "Oracle Database 12c???"??????????????????? ??????NEC???????????????? ???????????????????????¦ Oracle Database 12c??????????????????? ???????????Oracle Database 12c ????? (PC/????????????!) ????????????????????????????? ?????????????????! Oracle Database 12c???????? (@IT /Database Expert) ??????Oracle Database 12c????????·?????????????????? (EnterpriseZine/DB Online) ??????????!Oracle Database 12c???? (EnterpriseZine/DB Online) ¦ Oracle Database 12c?????????????????????? ??????????Oracle Database 12c ?????! (EnterpriseZine/DB Online) ¦ Oracle Database 12c???????????? ????????? OTN???????????????????????????????????????????? ?????????/??????? ~????????????????12c??????????!~ Oracle Database 12c????????!???????????? ¦ Oracle Database 12c??????????????????? Oracle University?? Oracle Database 12c: ?????? ¦ Oracle Database 12c?????????????????????????? ?????????????????? ¦ Oracle Database 12c????????????????? ?????????????????????????????????????????(??)???????????????????????????OTN Community(??????)?????????OTN Community??? ¦ Oracle Database 12c???????????????? 12c????????????????????????OTN????????????????Twitter(@oracletechnetjp)???????????????????????! ????????????????????????????Oracle Database 12c???????????????(?8???????????&??????????)????????

Read the article

2?????????????(Database??)

- by rika.tokumichi

???????????OTN????????? ??????????????????????????????????????????????????????? ????????????????????????????????????????? ???Database??????????????2?????????????????????????????????? ??????????? 1?:Oracle SQL Developer 2.1 (2.1.0.63.73)?Download? 2?:Oracle Database 11g Release 1?Download? 3?:Oracle Database 10g Express Edition?Download? 4?:Oracle Database 10g Release 2?Download? 5?:Oracle Database 11g Release 2?Download? (????2?1?~2?28?) ??????1??2??????????! ?????TOP5?????????????????????????? ??12????????????????????????????? ???Oracle Database 11g Release2?????Grid Infrastructure???? ??Grid Infrastructure??????????Oracle Clusterware?Oracle Automatic Storage Management(ASM)???????? ??????·????????????????????????????????????????????? ????????????????????OTN???????????????????????????????? >?????:Oracle Database 11g R2?????Oracle VM???????????? ??10?30????????????Oracle VM Forum 2009????????????????2009?9?????????Oracle Database 11g Release 2??????????????????????????????????????????????????????????????????????????????????????????????????????? >???:???????????????????2(???????) ????????????????????????????????????????????????????????????????????????????2???????????(????????????????)????????????????????????????????????????????????????? >Oracle Database 11g Release 2???????? ?????????????????????????????????????????????????????????? ???????????

Read the article

How optimize queries with fully qualified names in t-sql?

- by tomaszs

Whe I call: select * from Database.dbo.Table where NAME = 'cat' It takes: 200 ms And when I change database to Database in Management Studio and call it without fully qualified name it's much faster: select * from Table where NAME = 'cat' It takes: 17 ms Is there any way to make fully qualified queries faster without changing database?

Read the article

Users in database server or database tables

- by Batcat

Hi all, I came across an interesting issue about client server application design. We have this browser based management application where it has many users using the system. So obvisously within that application we have an user management module within it. I have always thought having an user table in the database to keep all the login details was good enough. However, a senior developer said user management should be done in the database server layer if not then is poorly designed. What he meant was, if a user wants to use the application then a user should be created in the user table AND in the database server as a user account as well. So if I have 50 users using my applications, then I should have 50 database server user logins. I personally think having just one user account in the database server for this database was enough. Just grant this user with the allowed privileges to operate all the necessary operation need by the application. The users that are interacting with the application should have their user accounts created and managed within the database table as they are more related to the application layer. I don't see and agree there is need to create a database server user account for every user created for the application in the user table. A single database server user should be enough to handle all the query sent by the application. Really hope to hear some suggestions / opinions and whether I'm missing something? performance or security issues? Thank you very much.

Read the article

SQL SERVER – Plan Recompilation and Reduce Recompilation – Performance Tuning

- by pinaldave

Recompilation process is same as compilation and degrades server performance. In SQL Server 2000 and earlier versions, this was a serious issue but in SQL server 2005, the severity of this issue has been significantly reduced by introducing a new feature called Statement-level recompilation. When SQL Server 2005 recompiles stored procedures, only the statement that [...]

Read the article

SQL SERVER – Improve Performance by Reducing IO – Creating Covered Index

- by pinaldave

This blog post is in the response of the T-SQL Tuesday #004: IO by Mike Walsh. The subject of this month is IO. Here is my quick blog post on how Cover Index can Improve Performance by Reducing IO. Let us kick off this post with disclaimers about Index. Index is a very complex subject and [...]

Read the article

Oracle Database Smart Flash Cache: Only on Oracle Linux and Oracle Solaris

- by sergio.leunissen

Oracle Database Smart Flash Cache is a feature that was first introduced with Oracle Database 11g Release 2. Only available on Oracle Linux and Oracle Solaris, this feature increases the size of the database buffer cache without having to add RAM to the system. In effect, it acts as a second level cache on flash memory and will especially benefit read-intensive database applications. The Oracle Database Smart Flash Cache white paper concludes: Available at no additional cost, Database Smart Flash Cache on Oracle Solaris and Oracle Linux has the potential to offer considerable benefit to users of Oracle Database 11g Release 2 with disk-bound read-mostly or read-only workloads, through the simple addition of flash storage such as the Sun Storage F5100 Flash Array or the Sun Flash Accelerator F20 PCIe Card. Read the white paper.

Read the article

.NET Reflector 7.2 Early Access Build 2 Released: Performance Critical

- by Bart Read

I've just posted a write-up of some of the performance tuning I've done to improve .NET Reflector 7.2's start-up time here: http://www.reflector.net/2011/05/net-reflector-7-start-up-time-running-out-of-gas-or-pedal-to-the-metal/ You can get the new build from the .NET Reflector homepage at http://www.reflector.net/. Please remember to give us your feedback in the forum, at http://forums.reflector.net/, using the tags #7.2 and #eap. Technorati Tags: reflector,early access,7.2

Read the article

OBIEE 11.1.1 - Built-in BI Metrics for Performance Monitoring

- by Ahmed Awan

You can use Fusion Middleware Control metrics to monitor System Components (BI processes) and WebLogic Server processes. Tip: · Use Oracle Enterprise Manager (EM) URL to monitor end to end OBIEE real time performance: :7001/em"http://<server>:7001/em · In Oracle Business Intelligence 11g, the perfmon URL is still valid to use i.e. :9704/analytics/saw.dll?Perfmon"http://<server>:9704/analytics/saw.dll?Perfmon

Read the article

High Performance Storage Systems for SQL Server

Rod Colledge turns his pessimistic mindset to storage systems, and describes the best way to configure the storage systems of SQL Servers for both performance and reliability. Even Rod gets a glint in his eye when he then goes on to describe the dazzling speed of solid-state storage, though he is quick to identify the risks.

Read the article

Compute Scalars, Expressions and Execution Plan Performance

- by Paul White

The humble Compute Scalar is one of the least well-understood of the execution plan operators, and usually the last place people look for query performance problems. It often appears in execution plans with a very low (or even zero) cost, which goes some way to explaining why people ignore it. Some readers will already know that a Compute Scalar can contain a call to a user-defined function, and that any T-SQL function with a BEGIN…END block in its definition can have truly disastrous consequences...(read more)

Read the article

Can frequent state changes decrease rendering performance?

- by Miro

Can frequent texture and shader binding decrease rendering performance? "Frequent" binding example: for object for material in object render part of object using that material "Low count" binding example: for material for object in material render part of object using that material I'm planning to use an octree later and with this "low count" method of rendering it can drastically increase memory consumption. So is it good idea?

Read the article

Brendan Gregg's "Systems Performance: Enterprise and the Cloud"

- by user12608550

Long ago, the prerequisite UNIX performance book was Adrian Cockcroft's 1994 classic, Sun Performance and Tuning: Sparc & Solaris, later updated in 1998 as Java and the Internet. As Solaris evolved to include the invaluable DTrace observability features, new essential performance references have been published, such as Solaris Performance and Tools: DTrace and MDB Techniques for Solaris 10 and OpenSolaris (2006) by McDougal, Mauro, and Gregg, and DTrace: Dynamic Tracing in Oracle Solaris, Mac OS X and FreeBSD (2011), also by Mauro and Gregg. Much has occurred in Solaris Land since those books appeared, notably Oracle's acquisition of Sun Microsystems in 2010 and the demise of the OpenSolaris community. But operating system technologies have continued to improve markedly in recent years, driven by stunning advances in multicore processor architecture, virtualization, and the massive scalability requirements of cloud computing. A new performance reference was needed, and I eagerly waited for something that thoroughly covered modern, distributed computing performance issues from the ground up. Well, there's a new classic now, authored yet again by Brendan Gregg, former Solaris kernel engineer at Sun and now Lead Performance Engineer at Joyent. Systems Performance: Enterprise and the Cloud is a modern, very comprehensive guide to general system performance principles and practices, as well as a highly detailed reference for specific UNIX and Linux observability tools used to examine and diagnose operating system behaviour. It provides thorough definitions of terms, explains performance diagnostic Best Practices and "Worst Practices" (called "anti-methods"), and covers key observability tools including DTrace, SystemTap, and all the traditional UNIX utilities like vmstat, ps, iostat, and many others. The book focuses on operating system performance principles and expands on these with respect to Linux (Ubuntu, Fedora, and CentOS are cited), and to Solaris and its derivatives [1]; it is not directed at any one OS so it is extremely useful as a broad performance reference. The author goes beyond the intricacies of performance analysis and shows how to interpret and visualize statistical information gathered from the observability tools. It's often difficult to extract understanding from voluminous rows of text output, and techniques are provided to assist with summarizing, visualizing, and interpreting the performance data. Gregg includes myriad useful references from the system performance literature, including a "Who's Who" of contributors to this great body of diagnostic tools and methods. This outstanding book should be required reading for UNIX and Linux system administrators as well as anyone charged with diagnosing OS performance issues. Moreover, the book can easily serve as a textbook for a graduate level course in operating systems [2]. [1] Solaris 11, of course, and Joyent's SmartOS (developed from OpenSolaris) [2] Gregg has taught system performance seminars for many years; I have also taught such courses...this book would be perfect for the OS component of an advanced CS curriculum.

Read the article

Let the RAM improves performance

- by user1717079

I have a low profile machine but with a lot of fast RAM, 4 Gb, which is really an amount of memory that i probably will never use, not even an half, since i just use this machine for coding and browsing the web. The HDD is really slow and so the overall performance are bad when booting, caching or starting new program, I'm wondering if Ubuntu can provide some setting or utility to solve this situation and let my system rely more on the RAM usage.

Read the article

Increase Performance of VS 2010 by using a SSD

- by System.Data

After searching on the internet for performance improvements when using Visual Studio 2010 with a solid state hard drive, I heard a lot of different opinions. A lot of people said that there isn't really a benefit when using a SSD, but in contrast others said the exact opposite. I am a bit confused with the contrasting opinions and I cannot really make a decision whether buying a SSD would make a difference. What are your experiences with this issue and which SSD did you use?

Read the article

Database Consolidation onto Private Clouds - updated for Oracle Database 12c

- by B R Clouse

One of our team's most popular white papers has been expanded and updated to discuss Oracle Database 12c. Now available on our OTN page, the new version of Database Consolidation onto Private Clouds covers best practices for consolidation with pluggable databases that the new mulitenant architecture provides, and expanded information on the database and schema consolidation options. These are the consolidation models the paper evaluates: server database schema pluggable databases Key considerations for consolidating workloads which the paper explores: Choosing a consolidation model How PDBs solve the IT complexity problem Isolation in consolidated environments Cloud pool design Complementary workloads Enterprise Manager 12c for consolidation planning and operations Many more white papers have been updated or are new for Oracle Database 12c. We'll continue to highlight those which tie directory to your journey to enterprise cloud.

Read the article

Using DB_PARAMS to Tune the EP_LOAD_SALES Performance

- by user702295

The DB_PARAMS table can be used to tune the EP_LOAD_SALES performance. The AWR report supplied shows 16 CPUs so I imaging that you can run with 8 or more parallel threads. This can be done by setting the following DB_PARAMS parameters. Note that most of parameter changes are just changing a 2 or 4 into an 8: DBHintEp_Load_SalesUseParallel = TRUE DBHintEp_Load_SalesUseParallelDML = TRUE DBHintEp_Load_SalesInsertErr = + parallel(@T_SRC_SALES@ 8) full(@T_SRC_SALES@) DBHintEp_Load_SalesInsertLd = + parallel(@T_SRC_SALES@ 8) DBHintEp_Load_SalesMergeSALES_DATA = + parallel(@T_SRC_SALES_LD@ 8) full(@T_SRC_SALES_LD@) DBHintMdp_AddUpdateIs_Fictive0SD = + parallel(s 8 ) DBHintMdp_AddUpdateIs_Fictive2SD = + parallel(s 8 )

Read the article

Multiple database accesses or one massive access?

- by DudeOnRock

What is a better approach when it comes to performance and optimal resource utilization: accessing a database multiple times through AJAX to only get the exact information needed when it is needed, or performing one access to retrieve an object that holds all information that might be needed, with a high probability that not all is actually needed? I know how to benchmark the actual queries, but I don't know how to test what is best when it comes to database performance when thousands of users are accessing the database simultaneously and how connection pooling comes into play.

Read the article

Use a SQL Database for a Desktop Game

- by sharethis

Developing a Game Engine I am planning a computer game and its engine. There will be a 3 dimensional world with first person view and it will be single player for now. The programming language is C++ and it uses OpenGL. Data Centered Design Decision My design decision is to use a data centered architecture where there is a global event manager and a global data manager. There are many components like physics, input, sound, renderer, ai, ... Each component can trigger and listen to events. Moreover, each component can read, edit, create and remove data. The question is about the data manager. Whether to Use a Relational Database Should I use a SQL Database, e.g. SQLite or MySQL, to store the game data? This contains virtually all game content like items, characters, inventories, ... Except of meshes and textures which are even more performance related, so I will keep them in memory. Is a SQL database fast enough to use it for realtime reading and writing game informations, like the position of a moving character? I also need to care about cross-platform compatibility. Aside from keeping everything in memory, what alternatives do I have? Advantages Would Be The advantages of using a relational database like MySQL would be the data orientated structure which allows fast computation. I would not need objects for representing entities. I could easily query data of objects near the player needed for rendering. And I don't have to take care about data of objects far away. Moreover there would be no need for savegames since the hole game state is saved in the database. Last but not least, expanding the game to an online game would be relative easy because there already is a place where the hole game state is stored.

Read the article

Python performance: iteration and operations on nested lists

- by J.J.

Problem Hey folks. I'm looking for some advice on python performance. Some background on my problem: Given: A mesh of nodes of size (x,y) each with a value (0...255) starting at 0 A list of N input coordinates each at a specified location within the range (0...x, 0...y) Increment the value of the node at the input coordinate and the node's neighbors within range Z up to a maximum of 255. Neighbors beyond the mesh edge are ignored. (No wrapping) BASE CASE: A mesh of size 1024x1024 nodes, with 400 input coordinates and a range Z of 75 nodes. Processing should be O(x*y*Z*N). I expect x, y and Z to remain roughly around the values in the base case, but the number of input coordinates N could increase up to 100,000. My goal is to minimize processing time. Current results I have 2 current implementations: f1, f2 Running speed on my 2.26 GHz Intel Core 2 Duo with Python 2.6.1: f1: 2.9s f2: 1.8s f1 is the initial naive implementation: three nested for loops. f2 is replaces the inner for loop with a list comprehension. Code is included below for your perusal. Question How can I further reduce the processing time? I'd prefer sub-1.0s for the test parameters. Please, keep the recommendations to native Python. I know I can move to a third-party package such as numpy, but I'm trying to avoid any third party packages. Also, I've generated random input coordinates, and simplified the definition of the node value updates to keep our discussion simple. The specifics have to change slightly and are outside the scope of my question. thanks much! f1 is the initial naive implementation: three nested for loops. 2.9s def f1(x,y,n,z): rows = [] for i in range(x): rows.append([0 for i in xrange(y)]) for i in range(n): inputX, inputY = (int(x*random.random()), int(y*random.random())) topleft = (inputX - z, inputY - z) for i in xrange(max(0, topleft[0]), min(topleft[0]+(z*2), x)): for j in xrange(max(0, topleft[1]), min(topleft[1]+(z*2), y)): if rows[i][j] <= 255: rows[i][j] += 1 f2 is replaces the inner for loop with a list comprehension. 1.8s def f2(x,y,n,z): rows = [] for i in range(x): rows.append([0 for i in xrange(y)]) for i in range(n): inputX, inputY = (int(x*random.random()), int(y*random.random())) topleft = (inputX - z, inputY - z) for i in xrange(max(0, topleft[0]), min(topleft[0]+(z*2), x)): l = max(0, topleft[1]) r = min(topleft[1]+(z*2), y) rows[i][l:r] = [j+1 for j in rows[i][l:r] if j < 255]

Read the article

One database or many?

- by dsims

I am developing a website that will manage data for multiple entities. No data is shared between entities, but they may be owned by the same customer. A customer may want to manage all their entities from a single "dashboard". So should I have one database for everything, or keep the data seperated into individual databases? Is there a best-practice? What are the positives/negatives for having a: database for the entire site (entity has a "customerID", data has "entityID") database for each customer (data has "entityID") database for each entity (relation of database to customer is outside of database) Multiple databases seems like it would have better performance (fewer rows and joins) but may eventually become a maintenance nightmare.

Read the article

SYS2 Scripts Updated – Scripts to monitor database backup, database space usage and memory grants now available

- by Davide Mauri

I’ve just released three new scripts of my “sys2” script collection that can be found on CodePlex: Project Page: http://sys2dmvs.codeplex.com/ Source Code Download: http://sys2dmvs.codeplex.com/SourceControl/changeset/view/57732 The three new scripts are the following sys2.database_backup_info.sql sys2.query_memory_grants.sql sys2.stp_get_databases_space_used_info.sql Here’s some more details: database_backup_info This script has been made to quickly check if and when backup was done. It will report the last full, differential and log backup date and time for each database. Along with these information you’ll also get some additional metadata that shows if a database is a read-only database and its recovery model: By default it will check only the last seven days, but you can change this value just specifying how many days back you want to check. To analyze the last seven days, and list only the database with FULL recovery model without a log backup select * from sys2.databases_backup_info(default) where recovery_model = 3 and log_backup = 0 To analyze the last fifteen days, and list only the database with FULL recovery model with a differential backup select * from sys2.databases_backup_info(15) where recovery_model = 3 and diff_backup = 1 I just love this script, I use it every time I need to check that backups are not too old and that t-log backup are correctly scheduled. query_memory_grants This is just a wrapper around sys.dm_exec_query_memory_grants that enriches the default result set with the text of the query for which memory has been granted or is waiting for a memory grant and, optionally, its execution plan stp_get_databases_space_used_info This is a stored procedure that list all the available databases and for each one the overall size, the used space within that size, the maximum size it may reach and the auto grow options. This is another script I use every day in order to be able to monitor, track and forecast database space usage. As usual feedbacks and suggestions are more than welcome!

Read the article

Performance Enhancement in Full-Text Search Query

- by Calvin Sun

Ever since its first release, we are continuing consolidating and developing InnoDB Full-Text Search feature. There is one recent improvement that worth blogging about. It is an effort with MySQL Optimizer team that simplifies some common queries’ Query Plans and dramatically shorted the query time. I will describe the issue, our solution and the end result by some performance numbers to demonstrate our efforts in continuing enhancement the Full-Text Search capability. The Issue: As we had discussed in previous Blogs, InnoDB implements Full-Text index as reversed auxiliary tables. The query once parsed will be reinterpreted into several queries into related auxiliary tables and then results are merged and consolidated to come up with the final result. So at the end of the query, we’ll have all matching records on hand, sorted by their ranking or by their Doc IDs. Unfortunately, MySQL’s optimizer and query processing had been initially designed for MyISAM Full-Text index, and sometimes did not fully utilize the complete result package from InnoDB. Here are a couple examples: Case 1: Query result ordered by Rank with only top N results: mysql> SELECT FTS_DOC_ID, MATCH (title, body) AGAINST ('database') AS SCORE FROM articles ORDER BY score DESC LIMIT 1; In this query, user tries to retrieve a single record with highest ranking. It should have a quick answer once we have all the matching documents on hand, especially if there are ranked. However, before this change, MySQL would almost retrieve rankings for almost every row in the table, sort them and them come with the top rank result. This whole retrieve and sort is quite unnecessary given the InnoDB already have the answer. In a real life case, user could have millions of rows, so in the old scheme, it would retrieve millions of rows' ranking and sort them, even if our FTS already found there are two 3 matched rows. Apparently, the million ranking retrieve is done in vain. In above case, it should just ask for 3 matched rows' ranking, all other rows' ranking are 0. If it want the top ranking, then it can just get the first record from our already sorted result. Case 2: Select Count(*) on matching records: mysql> SELECT COUNT(*) FROM articles WHERE MATCH (title,body) AGAINST ('database' IN NATURAL LANGUAGE MODE); In this case, InnoDB search can find matching rows quickly and will have all matching rows. However, before our change, in the old scheme, every row in the table was requested by MySQL one by one, just to check whether its ranking is larger than 0, and later comes up a count. In fact, there is no need for MySQL to fetch all rows, instead InnoDB already had all the matching records. The only thing need is to call an InnoDB API to retrieve the count The difference can be huge. Following query output shows how big the difference can be: mysql> select count(*) from searchindex_inno where match(si_title, si_text) against ('people') +----------+ | count(*) | +----------+ | 666877 | +----------+ 1 row in set (16 min 17.37 sec) So the query took almost 16 minutes. Let’s see how long the InnoDB can come up the result. In InnoDB, you can obtain extra diagnostic printout by turning on “innodb_ft_enable_diag_print”, this will print out extra query info: Error log: keynr=2, 'people' NL search Total docs: 10954826 Total words: 0 UNION: Searching: 'people' Processing time: 2 secs: row(s) 666877: error: 10 ft_init() ft_init_ext() keynr=2, 'people' NL search Total docs: 10954826 Total words: 0 UNION: Searching: 'people' Processing time: 3 secs: row(s) 666877: error: 10 Output shows it only took InnoDB only 3 seconds to get the result, while the whole query took 16 minutes to finish. So large amount of time has been wasted on the un-needed row fetching. The Solution: The solution is obvious. MySQL can skip some of its steps, optimize its plan and obtain useful information directly from InnoDB. Some of savings from doing this include: 1) Avoid redundant sorting. Since InnoDB already sorted the result according to ranking. MySQL Query Processing layer does not need to sort to get top matching results. 2) Avoid row by row fetching to get the matching count. InnoDB provides all the matching records. All those not in the result list should all have ranking of 0, and no need to be retrieved. And InnoDB has a count of total matching records on hand. No need to recount. 3) Covered index scan. InnoDB results always contains the matching records' Document ID and their ranking. So if only the Document ID and ranking is needed, there is no need to go to user table to fetch the record itself. 4) Narrow the search result early, reduce the user table access. If the user wants to get top N matching records, we do not need to fetch all matching records from user table. We should be able to first select TOP N matching DOC IDs, and then only fetch corresponding records with these Doc IDs. Performance Results and comparison with MyISAM The result by this change is very obvious. I includes six testing result performed by Alexander Rubin just to demonstrate how fast the InnoDB query now becomes when comparing MyISAM Full-Text Search. These tests are base on the English Wikipedia data of 5.4 Million rows and approximately 16G table. The test was performed on a machine with 1 CPU Dual Core, SSD drive, 8G of RAM and InnoDB_buffer_pool is set to 8 GB. Table 1: SELECT with LIMIT CLAUSE mysql> SELECT si_title, match(si_title, si_text) against('family') as rel FROM si WHERE match(si_title, si_text) against('family') ORDER BY rel desc LIMIT 10; InnoDB MyISAM Times Faster Time for the query 1.63 sec 3 min 26.31 sec 127 You can see for this particular query (retrieve top 10 records), InnoDB Full-Text Search is now approximately 127 times faster than MyISAM. Table 2: SELECT COUNT QUERY mysql>select count(*) from si where match(si_title, si_text) against('family‘); +----------+ | count(*) | +----------+ | 293955 | +----------+ InnoDB MyISAM Times Faster Time for the query 1.35 sec 28 min 59.59 sec 1289 In this particular case, where there are 293k matching results, InnoDB took only 1.35 second to get all of them, while take MyISAM almost half an hour, that is about 1289 times faster!. Table 3: SELECT ID with ORDER BY and LIMIT CLAUSE for selected terms mysql> SELECT <ID>, match(si_title, si_text) against(<TERM>) as rel FROM si_<TB> WHERE match(si_title, si_text) against (<TERM>) ORDER BY rel desc LIMIT 10; Term InnoDB (time to execute) MyISAM(time to execute) Times Faster family 0.5 sec 5.05 sec 10.1 family film 0.95 sec 25.39 sec 26.7 Pizza restaurant orange county California 0.93 sec 32.03 sec 34.4 President united states of America 2.5 sec 36.98 sec 14.8 Table 4: SELECT title and text with ORDER BY and LIMIT CLAUSE for selected terms mysql> SELECT <ID>, si_title, si_text, ... as rel FROM si_<TB> WHERE match(si_title, si_text) against (<TERM>) ORDER BY rel desc LIMIT 10; Term InnoDB (time to execute) MyISAM(time to execute) Times Faster family 0.61 sec 41.65 sec 68.3 family film 1.15 sec 47.17 sec 41.0 Pizza restaurant orange county california 1.03 sec 48.2 sec 46.8 President united states of america 2.49 sec 44.61 sec 17.9 Table 5: SELECT ID with ORDER BY and LIMIT CLAUSE for selected terms mysql> SELECT <ID>, match(si_title, si_text) against(<TERM>) as rel FROM si_<TB> WHERE match(si_title, si_text) against (<TERM>) ORDER BY rel desc LIMIT 10; Term InnoDB (time to execute) MyISAM(time to execute) Times Faster family 0.5 sec 5.05 sec 10.1 family film 0.95 sec 25.39 sec 26.7 Pizza restaurant orange county califormia 0.93 sec 32.03 sec 34.4 President united states of america 2.5 sec 36.98 sec 14.8 Table 6: SELECT COUNT(*) mysql> SELECT count(*) FROM si_<TB> WHERE match(si_title, si_text) against (<TERM>) LIMIT 10; Term InnoDB (time to execute) MyISAM(time to execute) Times Faster family 0.47 sec 82 sec 174.5 family film 0.83 sec 131 sec 157.8 Pizza restaurant orange county califormia 0.74 sec 106 sec 143.2 President united states of america 1.96 sec 220 sec 112.2 Again, table 3 to table 6 all showing InnoDB consistently outperform MyISAM in these queries by a large margin. It becomes obvious the InnoDB has great advantage over MyISAM in handling large data search. Summary: These results demonstrate the great performance we could achieve by making MySQL optimizer and InnoDB Full-Text Search more tightly coupled. I think there are still many cases that InnoDB’s result info have not been fully taken advantage of, which means we still have great room to improve. And we will continuously explore the area, and get more dramatic results for InnoDB full-text searches. Jimmy Yang, September 29, 2012

Search Results

Search found 40567 results on 1623 pages for 'database performance'.

Page 10/1623 | < Previous Page | 6 7 8 9 10 11 12 13 14 15 16 17 | Next Page >

- by pinaldave

- by user365035

- by OTN-J Master

- by rika.tokumichi

- by tomaszs

- by Batcat

- by pinaldave

- by pinaldave

- by sergio.leunissen

- by Bart Read

- by Ahmed Awan

- by Paul White

- by Miro

- by user12608550

- by user1717079

- by System.Data

- by B R Clouse

- by user702295

- by DudeOnRock

- by sharethis

- by J.J.

- by dsims

- by Davide Mauri

- by Calvin Sun

< Previous Page | 6 7 8 9 10 11 12 13 14 15 16 17 | Next Page >