Search Results

Search found 19055 results on 763 pages for 'high performance'.

Page 303/763 | < Previous Page | 299 300 301 302 303 304 305 306 307 308 309 310  | Next Page >

  • NUMA-aware placement of communication variables

    - by Dave
    For classic NUMA-aware programming I'm typically most concerned about simple cold, capacity and compulsory misses and whether we can satisfy the miss by locally connected memory or whether we have to pull the line from its home node over the coherent interconnect -- we'd like to minimize channel contention and conserve interconnect bandwidth. That is, for this style of programming we're quite aware of where memory is homed relative to the threads that will be accessing it. Ideally, a page is collocated on the node with the thread that's expected to most frequently access the page, as simple misses on the page can be satisfied without resorting to transferring the line over the interconnect. The default "first touch" NUMA page placement policy tends to work reasonable well in this regard. When a virtual page is first accessed, the operating system will attempt to provision and map that virtual page to a physical page allocated from the node where the accessing thread is running. It's worth noting that the node-level memory interleaving granularity is usually a multiple of the page size, so we can say that a given page P resides on some node N. That is, the memory underlying a page resides on just one node. But when thinking about accesses to heavily-written communication variables we normally consider what caches the lines underlying such variables might be resident in, and in what states. We want to minimize coherence misses and cache probe activity and interconnect traffic in general. I don't usually give much thought to the location of the home NUMA node underlying such highly shared variables. On a SPARC T5440, for instance, which consists of 4 T2+ processors connected by a central coherence hub, the home node and placement of heavily accessed communication variables has very little impact on performance. The variables are frequently accessed so likely in M-state in some cache, and the location of the home node is of little consequence because a requester can use cache-to-cache transfers to get the line. Or at least that's what I thought. Recently, though, I was exploring a simple shared memory point-to-point communication model where a client writes a request into a request mailbox and then busy-waits on a response variable. It's a simple example of delegation based on message passing. The server polls the request mailbox, and having fetched a new request value, performs some operation and then writes a reply value into the response variable. As noted above, on a T5440 performance is insensitive to the placement of the communication variables -- the request and response mailbox words. But on a Sun/Oracle X4800 I noticed that was not the case and that NUMA placement of the communication variables was actually quite important. For background an X4800 system consists of 8 Intel X7560 Xeons . Each package (socket) has 8 cores with 2 contexts per core, so the system is 8x8x2. Each package is also a NUMA node and has locally attached memory. Every package has 3 point-to-point QPI links for cache coherence, and the system is configured with a twisted ladder "mobius" topology. The cache coherence fabric is glueless -- there's not central arbiter or coherence hub. The maximum distance between any two nodes is just 2 hops over the QPI links. For any given node, 3 other nodes are 1 hop distant and the remaining 4 nodes are 2 hops distant. Using a single request (client) thread and a single response (server) thread, a benchmark harness explored all permutations of NUMA placement for the two threads and the two communication variables, measuring the average round-trip-time and throughput rate between the client and server. In this benchmark the server simply acts as a simple transponder, writing the request value plus 1 back into the reply field, so there's no particular computation phase and we're only measuring communication overheads. In addition to varying the placement of communication variables over pairs of nodes, we also explored variations where both variables were placed on one page (and thus on one node) -- either on the same cache line or different cache lines -- while varying the node where the variables reside along with the placement of the threads. The key observation was that if the client and server threads were on different nodes, then the best placement of variables was to have the request variable (written by the client and read by the server) reside on the same node as the client thread, and to place the response variable (written by the server and read by the client) on the same node as the server. That is, if you have a variable that's to be written by one thread and read by another, it should be homed with the writer thread. For our simple client-server model that means using split request and response communication variables with unidirectional message flow on a given page. This can yield up to twice the throughput of less favorable placement strategies. Our X4800 uses the QPI 1.0 protocol with source-based snooping. Briefly, when node A needs to probe a cache line it fires off snoop requests to all the nodes in the system. Those recipients then forward their response not to the original requester, but to the home node H of the cache line. H waits for and collects the responses, adjudicates and resolves conflicts and ensures memory-model ordering, and then sends a definitive reply back to the original requester A. If some node B needed to transfer the line to A, it will do so by cache-to-cache transfer and let H know about the disposition of the cache line. A needs to wait for the authoritative response from H. So if a thread on node A wants to write a value to be read by a thread on node B, the latency is dependent on the distances between A, B, and H. We observe the best performance when the written-to variable is co-homed with the writer A. That is, we want H and A to be the same node, as the writer doesn't need the home to respond over the QPI link, as the writer and the home reside on the very same node. With architecturally informed placement of communication variables we eliminate at least one QPI hop from the critical path. Newer Intel processors use the QPI 1.1 coherence protocol with home-based snooping. As noted above, under source-snooping a requester broadcasts snoop requests to all nodes. Those nodes send their response to the home node of the location, which provides memory ordering, reconciles conflicts, etc., and then posts a definitive reply to the requester. In home-based snooping the snoop probe goes directly to the home node and are not broadcast. The home node can consult snoop filters -- if present -- and send out requests to retrieve the line if necessary. The 3rd party owner of the line, if any, can respond either to the home or the original requester (or even to both) according to the protocol policies. There are myriad variations that have been implemented, and unfortunately vendor terminology doesn't always agree between vendors or with the academic taxonomy papers. The key is that home-snooping enables the use of a snoop filter to reduce interconnect traffic. And while home-snooping might have a longer critical path (latency) than source-based snooping, it also may require fewer messages and less overall bandwidth. It'll be interesting to reprise these experiments on a platform with home-based snooping. While collecting data I also noticed that there are placement concerns even in the seemingly trivial case when both threads and both variables reside on a single node. Internally, the cores on each X7560 package are connected by an internal ring. (Actually there are multiple contra-rotating rings). And the last-level on-chip cache (LLC) is partitioned in banks or slices, which with each slice being associated with a core on the ring topology. A hardware hash function associates each physical address with a specific home bank. Thus we face distance and topology concerns even for intra-package communications, although the latencies are not nearly the magnitude we see inter-package. I've not seen such communication distance artifacts on the T2+, where the cache banks are connected to the cores via a high-speed crossbar instead of a ring -- communication latencies seem more regular.

    Read the article

  • Stairway to PowerPivot and DAX - Level 3: The DAX DISTINCT() Function and Basic Distinct Counts

    Bill Pearson, Business Intelligence architect and author, exposes the DAX DISTINCT() function, and then provides some hands-on exposure to its use in generating distinct counts. Moreover, he further explores working with measures in the PivotTable in this, the third Level of our new Stairway to PowerPivot and DAX series. Optimize SQL Server performance“With SQL Monitor, we can be proactive in our optimization process, instead of waiting until a customer reports a problem,” John Trumbul, Sr. Software Engineer. Optimize your servers with a free trial.

    Read the article

  • VirtualBox 4.2.14 is now available

    - by user12611829
    The VirtualBox development team has just released version 4.2.14, and it is now available for download. This is a maintenance release for version 4.2 and contains quite a few fixes. Here is the list from the official Changelog. VMM: another TLB invalidation fix for non-present pages VMM: fixed a performance regression (4.2.8 regression; bug #11674) GUI: fixed a crash on shutdown GUI: prevent stuck keys under certain conditions on Windows hosts (bugs #2613, #6171) VRDP: fixed a rare crash on the guest screen resize VRDP: allow to change VRDP parameters (including enabling/disabling the server) if the VM is paused USB: fixed passing through devices on Mac OS X host to a VM with 2 or more virtual CPUs (bug #7462) USB: fixed hang during isochronous transfer with certain devices (4.1 regression; Windows hosts only; bug #11839) USB: properly handle orphaned URBs (bug #11207) BIOS: fixed function for returning the PCI interrupt routing table (fixes NetWare 6.x guests) BIOS: don't use the ENTER / LEAVE instructions in the BIOS as these don't work in the real mode as set up by certain guests (e.g. Plan 9 and QNX 4) DMI: allow to configure DmiChassisType (bug #11832) Storage: fixed lost writes if iSCSI is used with snapshots and asynchronous I/O (bug #11479) Storage: fixed accessing certain VHDX images created by Windows 8 (bug #11502) Storage: fixed hang when creating a snapshot using Parallels disk images (bug #9617) 3D: seamless + 3D fixes (bug #11723) 3D: version 4.2.12 was not able to read saved states of older versions under certain conditions (bug #11718) Main/Properties: don't create a guest property for non-running VMs if the property does not exist and is about to be removed (bug #11765) Main/Properties: don't forget to make new guest properties persistent after the VM was terminated (bug #11719) Main/Display: don't lose seamless regions during screen resize Main/OVF: don't crash during import if the client forgot to call Appliance::interpret() (bug #10845) Main/OVF: don't create invalid appliances by stripping the file name if the VM name is very long (bug #11814) Main/OVF: don't fail if the appliance contains multiple file references (bug #10689) Main/Metrics: fixed Solaris file descriptor leak Settings: limit depth of snapshot tree to 250 levels, as more will lead to decreased performance and may trigger crashes VBoxManage: fixed setting the parent UUID on diff images using sethdparentuuid Linux hosts: work around for not crashing as a result of automatic NUMA balancing which was introduced in Linux 3.8 (bug #11610) Windows installer: force the installation of the public certificate in background (i.e. completely prevent user interaction) if the --silent command line option is specified Windows Additions: fixed problems with partial install in the unattended case Windows Additions: fixed display glitch with the Start button in seamless mode for some themes Windows Additions: Seamless mode and auto-resize fixes Windows Additions: fixed trying to to retrieve new auto-logon credentials if current ones were not processed yet Windows Additions installer: added the /with_wddm switch to select the experimental WDDM driver by default Linux Additions: fixed setting own timed out and aborted texts in information label of the lightdm greeter Linux Additions: fixed compilation against Linux 3.2.0 Ubuntu kernels (4.2.12 regression as a side effect of the Debian kernel build fix; bug #11709) X11 Additions: reduced the CPU load of VBoxClient in drag'and'drop mode OS/2 Additions: made the mouse wheel work (bug #6793) Guest Additions: fixed problems copying and pasting between two guests on an X11 host (bug #11792) The full changelog can be found here. You can download binaries for Solaris, Linux, Windows and MacOS hosts at http://www.virtualbox.org/wiki/Downloads Technocrati Tags: Oracle Virtualization VirtualBox

    Read the article

  • Sun Blade 6000 Interactive 3D Demo

    - by ferhat
    Three dimensional  fly-by demos of Sun systems are available from your everyday Java-enabled browsers. Oracle's flexible, eco-efficient Sun Blade 6000 chassis integrates Oracle's x86 and SPARC server blade modules with high-capacity networking and storage blades to support a wide range of application environments.  Click on the static picture below to enter the interactive 3D demo mode: Visit Oracle Technology Network pages and product pages for more information on Oracle's Sun Blades Servers.   

    Read the article

  • Decoding the SQL Server Index Structure

    A deep dive into the implementation of indexes in SQL Server 2008 R2. This is information that you must know in order to tune your queries for optimum performance. Partial scans of indexes are now possible! SQL Server monitoring made easy "Keeping an eye on our many SQL Server instances is much easier with SQL Response." Mike Lile.Download a free trial of SQL Response now.

    Read the article

  • A System Monitoring Tool Primer

    <b>CertCities:</b> "Linux comes with a number of utilities that can be used to monitor one or more of these performance parameters. The following sections introduce a few of these utilities and show how to understand the information presented by them"

    Read the article

  • Business Strategy - Google Case Study

    Business strategy defined by SMBTN.com is a term used in business planning that implies a careful selection and application of resources to obtain a competitive advantage in anticipation of future events or trends. In more general terms business strategy is positioning a company so that it has the greatest competitive advantage over others in the markets and industries that they participate in. This process involves making corporate decisions regarding which markets to provide goods and services, pricing, acceptable quality levels, and how to interact with others in the marketplace. The primary objective of business strategy is to create and increase value for all of its shareholders and stakeholders through the creation of customer value. According to InformationWeek.com, Google has a distinctive technology advantage over its competitors like Microsoft, eBay, Amazon, Yahoo. Google utilizes custom high-performance systems which are cost efficient because they can scale to extreme workloads. This hardware allows for a huge cost advantage over its competitors. In addition, InformationWeek.com interviewed Stephen Arnold who stated that Google’s programmers are 50%-100% more productive compared to programmers working for their competitors.  He based this theory on Google’s competitors having to spend up to four times as much just to keep up. In addition to Google’s technological advantage, they also have developed a decentralized management schema where employees report directly to multiple managers and team project leaders. This allows for the responsibility of the technology department to be shared amongst multiple senior level engineers and removes the need for a singular department head to oversee the activities of the department.  This is a unique approach from the standard management style. Typically a department head like a CIO or CTO would oversee the department’s global initiatives and business functionality.  This would then be passed down and administered through middle management and implemented by programmers, business analyst, network administrators and Database administrators. It goes without saying that an IT professional’s responsibilities would be directed by Google’s technological advantage and management strategy.  Simply because they work within the department, and would have to design, develop, and support the high-performance systems and would have to report multiple managers and project leaders on a regular basis. Since Google was established and driven by new and immerging technology, all other departments would be directly impacted by the technology department.  In fact, they would have to cater to the technology department since it is a huge driving for in the success of Google. Reference: http://www.smbtn.com/smallbusinessdictionary/#b http://www.informationweek.com/news/software/linux/showArticle.jhtml?articleID=192300292&pgno=1&queryText=&isPrev=

    Read the article

  • A Real-Time HPC Approach for Optimizing Multicore Architectures

    Complex math is at the heart of many of the biggest technical challenges. With multicore processors, the type of calculations that would have required a supercomputer can now be performed in real-time, embedded environments. High-performance computing - Supercomputer - Real-time computing - Operating system - Companies

    Read the article

  • e-Seminar para Parceiros - Junho 2010

    - by Claudia Costa
    A equipa de Alliances & Channels apresenta o novo e-Seminar para o mês de Junho. Para se inscrever para a formação que se encontra abaixo por favor utilize os link de registo indicado. Nome Dia Duração Local Introduction to Oracle GoldenGate: Real -Time Data Integration and High Availability Solutions 24 1 hora Início: 9h00 On-line        

    Read the article

  • Creating an ASP.NET Dynamic Web Page Using MS SQL Server 2008 Database (GridView Display)

    Dynamic pages pages that pull insert update and delete data or content from a database are extremely useful in modern websites. They provide a high level of user interactivity that improves user experience. This article will show you how to create such pages in ASP.NET that use a Microsoft SQL Server 2 8 database.... GoGrid Cloud Center Connect Cloud and Dedicated Servers on Your Private Data Center

    Read the article

  • Heroku Postgres: A New SQL Database-as-a-Service

    Idera, a Houston-based company known worldwide for its SQL Server solutions in the realms of backup and recovery, performance monitoring, auditing, security, and more, recently announced that it had won five of SQL Server Magazine's 2011 Community Choice Awards. SQL Server Magazine, a publication produced by Penton Media, offers SQL Server users, both beginning and advanced, a host of hands-on information delivered by SQL Server experts. The magazine presented Idera with 2011 Community Choice Awards for five separate products which will only serve to boost the already strong reputation of it...

    Read the article

  • On the Internet Content is King!

    People don't just visit sites with great graphics and wonderful design, they go for the information they learn from that website. Having high quality content will not just attract visitors, it will also attract search engines and improve your rankings in the search engines.

    Read the article

  • T-SQL User-Defined Functions: the good, the bad, and the ugly (part 4)

    - by Hugo Kornelis
    Scalar user-defined functions are bad for performance. I already showed that for T-SQL scalar user-defined functions without and with data access, and for most CLR scalar user-defined functions without data access , and in this blog post I will show that CLR scalar user-defined functions with data access fit into that picture. First attempt Sticking to my simplistic example of finding the triple of an integer value by reading it from a pre-populated lookup table and following the standard recommendations...(read more)

    Read the article

  • An XEvent a Day (28 of 31) – Tracking Page Compression Operations

    - by Jonathan Kehayias
    The Database Compression feature in SQL Server 2008 Enterprise Edition can provide some significant reductions in storage requirements for SQL Server databases, and in the right implementations and scenarios performance improvements as well.  There isn’t really a whole lot of information about the operations of database compression that is documented as being available in the DMV’s or SQL Trace.  Paul Randal pointed out on Twitter today that sys.dm_db_index_operational_stats() provides...(read more)

    Read the article

  • An XEvent a Day (26 of 31) – Configuring Session Options

    - by Jonathan Kehayias
    There are 7 Session level options that can be configured in Extended Events that affect the way an Event Session operates.  These options can impact performance and should be considered when configuring an Event Session.  I have made use of a few of these periodically throughout this months blog posts, and in today’s blog post I’ll cover each of the options separately, and provide further information about their usage.  Mike Wachal from the Extended Events team at Microsoft, talked...(read more)

    Read the article

  • Major Steps of SEO

    In modern world, we find that web marketing is gaining an increased foothold potential. Every organization aims at increasing the number of customers through online promotion and advertising. It can be achieved through higher ranking of the websites in the search engine. The websites which rank high in the search engine will definitely get a good number of visits by the searchers than the other sources.

    Read the article

  • 9/18 Live Webcast: Three Compelling Reasons to Upgrade to Oracle Database 11g - Still time to register

    - by jgelhaus
    If you or your organization is still working with Oracle Database 10g or an even older version, now is the time to upgrade. Oracle Database 11g offers a wide variety of advantages to enhance your operation. Join us 10 am PT / 1pm ET September 18th for this live Webcast and learn about what you’re missing: the business, operational, and technical benefits. With Oracle Database 11g, you can: Upgrade with zero downtime Improve application performance and database security Reduce the amount of storage required Save time and money Register today 

    Read the article

  • NoFollow and DoFollow Blog - Comment Links Affect Search Engine Optimization

    Many sellers of information products leave comments on blogs, especially popular ones with high Google PageRank, thinking that they are getting those good inbound links to their sites. But there's a problem. Most blogs put a "No Follow" tag in the link to your website. Sure, readers can click on it and check you out, and that's a good thing. But you get no SEO benefit.

    Read the article

  • On-Demand Webcast: Managing Oracle Exadata with Oracle Enterprise Manager 11g

    - by Scott McNeil
    Watch this on-demand webcast and discover how Oracle Enterprise Manager 11g's unique management capabilities allow you to efficiently manage all stages of Oracle Exadata's lifecycle, from testing applications on Exadata to deployment. You'll learn how to: Maximize and predict database performance Drive down IT operational costs through automation Ensure service quality with proactive management Register today and unlock the potential of Oracle Exadata for your enterprise. Register Now!

    Read the article

  • what's included in a typical computer architecture class? [closed]

    - by sq1020
    Does this description fit what's usually included in a computer architecture class? Computer Organization and Assembly Language An introduction to the hardware organization and assembly language of the Intel processor. Topics include memory hierarchy and design- CPU design- pipelining- addressing modes- subroutine linkage- polled input/output- interrupts- high level language interfacing and macros.

    Read the article

  • London User Group Meetings this week (19th/20th May); 26th May-Agile Data Warehousing; 17th June-Kim

    - by tonyrogerson
    Got two user group meetings in London for you, we've also started the Cuppa Corner sessions - the first 3 are up on the site - A trip to First Normal Form, Lookup and Cache Transform in SSIS and Pipeline Limiter in SSIS - we are aiming for at least one per week. WhereScape are doing a breakfast meeting on Agile techniques to Data Warehousing and Kimberly Tripp and Paul Randal are over in June for a 1 day master class. Finally a 3 day performance and monitoring workshop on 22- 24th June in London...(read more)

    Read the article

  • User connection management in Reporting Services configuration

    - by Testas
    IT professionals will use Reporting Services Configuration Manager to perform post installation tasks for SQL Server Reporting Services. Introduced in SQL Server 2005, Reporting Services Configuration Manager provides an intuitive interface to perform tasks including specifying the report server database, report manager url, and indeed one of the first post installation tasks that should be performed is backing up the encryption keys that are used to protect the sensitive information within the rdl files.  Many of the options that are selected within Reporting Services Configuration Manager are written to a number of configuration files including the rsreportserver.config file located in C:\Program Files\Microsoft SQL Server\Report Server InstanceName\Reporting Services\ReportServer folder.When opening this file you will notice that there are more configuration settings within the rsreportserver.config file than is available through the Reporting Services Configuration Manager Interface. As a result there are additional configuration options that can be defined within this file.  A customer was having a problem performing stress tests against a new Report Server that would be going live for an enterprise reporting system. One aspect of the stress test was to fire 50 connections from a single user account. When performing the stress test an error described that the maximum active request had been exceeded. Within the rsreportserver.config, there is a key that is added to the file:  <Add Key=”MaxActiveReqForOneUser” Value=”20”/>  Changing the value from 20 to 50 accommodated the needs of the stress test, however, a wider question should be asked pertaining to this setting when implementing Reporting Services to a production environment. Within an intranet environment, the default setting is appropriate when network bandwidth is high, users are known and demand for reports is particularly high from a group of users.  However, when deploying a Reporting Server solution to an extranet, or the internet, you may want to consider reducing this setting to reduce to scope of connections that can be acquired by a single user and placing unnecessary pressure on the report server. I do hope that Reporting Services Configuration Manager evolves to include an advanced page that includes an intuitive interface to change configuration settings such as the MaxActiveReqForOneUser, and also configure rendering and data extensions and define secure connection levels to the report server. All these options can be configured within the rsreportserver.config file, and these are setting that customers would like to see in Reporting Services Configuration Manager in the future.   If you think that the SQL community would benefit from this addition, you can vote on it at Microsoft Connect  https://connect.microsoft.com/SQLServer/feedback/details/565575/extending-reporting-services-configuration-manager-rscm    

    Read the article

< Previous Page | 299 300 301 302 303 304 305 306 307 308 309 310  | Next Page >