Search Results

Search found 258446 results on 10338 pages for 'stack memory'.

Page 185/10338 | < Previous Page | 181 182 183 184 185 186 187 188 189 190 191 192  | Next Page >

  • Gnome 3 freezes on logon on samsung RV 509

    - by Noufal
    I have a Samsung NP-RV509 A0FIN and I tried to install GNU/Linux with gnome 3.2 on it. I tried Fedora 16, Ubuntu 11.10 and Linux Mint 12 RC, but with no success. All of these freezes upon login into gnome shell. I think it is the problem with graphics driver, so I tried xorg-edgers ppa on my last installation, ie., Linux Mint. I also tried various intel graphics packages listed on Synaptic package manager, but no success again. My device configuration is as follows(obtained from windows 7): More details about my computer Component Details Subscore Base score Processor Intel(R) Pentium(R) CPU P6200 @ 2.13GHz 5.6 4.6 Memory (RAM) 4.00 GB 7.2 Graphics Intel(R) HD Graphics 4.6 Gaming graphics 1562 MB Total available graphics memory 5.2 Primary hard disk 12GB Free (50GB Total) 5.9 Windows 7 Ultimate System -------------------------------------------------------------------------------- Manufacturer SAMSUNG ELECTRONICS CO., LTD. Model RV409/RV509/RV709 Total amount of system memory 4.00 GB RAM System type 32-bit operating system Number of processor cores 2 64-bit capable Yes Storage -------------------------------------------------------------------------------- Total size of hard disk(s) 418 GB Disk partition (C:) 12 GB Free (50 GB Total) Media drive (D:) CD/DVD Disk partition (E:) 526 MB Free (191 GB Total) Disk partition (F:) 101 GB Free (177 GB Total) Graphics -------------------------------------------------------------------------------- Display adapter type Intel(R) HD Graphics Total available graphics memory 1562 MB Dedicated graphics memory 64 MB Dedicated system memory 0 MB Shared system memory 1498 MB Display adapter driver version 8.15.10.2202 Primary monitor resolution 1366x768 DirectX version DirectX 10 Network -------------------------------------------------------------------------------- Network Adapter Realtek PCIe GBE Family Controller Network Adapter Broadcom 802.11n Network Adapter Network Adapter Microsoft Virtual WiFi Miniport Adapter Notes -------------------------------------------------------------------------------- The gaming graphics score is based on the primary graphics adapter. If this system has linked or multiple graphics adapters, some software applications may see additional performance benefits. Any help is appreciated, and thanks in advance.

    Read the article

  • How to Avoid Your Next 12-Month Science Project

    - by constant
    While most customers immediately understand how the magic of Oracle's Hybrid Columnar Compression, intelligent storage servers and flash memory make Exadata uniquely powerful against home-grown database systems, some people think that Exalogic is nothing more than a bunch of x86 servers, a storage appliance and an InfiniBand (IB) network, built into a single rack. After all, isn't this exactly what the High Performance Computing (HPC) world has been doing for decades? On the surface, this may be true. And some people tried exactly that: They tried to put together their own version of Exalogic, but then they discover there's a lot more to building a system than buying hardware and assembling it together. IT is not Ikea. Why is that so? Could it be there's more going on behind the scenes than merely putting together a bunch of servers, a storage array and an InfiniBand network into a rack? Let's explore some of the special sauce that makes Exalogic unique and un-copyable, so you can save yourself from your next 6- to 12-month science project that distracts you from doing real work that adds value to your company. Engineering Systems is Hard Work! The backbone of Exalogic is its InfiniBand network: 4 times better bandwidth than even 10 Gigabit Ethernet, and only about a tenth of its latency. What a potential for increased scalability and throughput across the middleware and database layers! But InfiniBand is a beast that needs to be tamed: It is true that Exalogic uses a standard, open-source Open Fabrics Enterprise Distribution (OFED) InfiniBand driver stack. Unfortunately, this software has been developed by the HPC community with fastest speed in mind (which is good) but, despite the name, not many other enterprise-class requirements are included (which is less good). Here are some of the improvements that Oracle's InfiniBand development team had to add to the OFED stack to make it enterprise-ready, simply because typical HPC users didn't have the need to implement them: More than 100 bug fixes in the pieces that were not related to the Message Passing Interface Protocol (MPI), which is the protocol that HPC users use most of the time, but which is less useful in the enterprise. Performance optimizations and tuning across the whole IB stack: From Switches, Host Channel Adapters (HCAs) and drivers to low-level protocols, middleware and applications. Yes, even the standard HPC IB stack could be improved in terms of performance. Ethernet over IB (EoIB): Exalogic uses InfiniBand internally to reach high performance, but it needs to play nicely with datacenters around it. That's why Oracle added Ethernet over InfiniBand technology to it that allows for creating many virtual 10GBE adapters inside Exalogic's nodes that are aggregated and connected to Exalogic's IB gateway switches. While this is an open standard, it's up to the vendor to implement it. In this case, Oracle integrated the EoIB stack with Oracle's own IB to 10GBE gateway switches, and made it fully virtualized from the beginning. This means that Exalogic customers can completely rewire their server infrastructure inside the rack without having to physically pull or plug a single cable - a must-have for every cloud deployment. Anybody who wants to match this level of integration would need to add an InfiniBand switch development team to their project. Or just buy Oracle's gateway switches, which are conveniently shipped with a whole server infrastructure attached! IPv6 support for InfiniBand's Sockets Direct Protocol (SDP), Reliable Datagram Sockets (RDS), TCP/IP over IB (IPoIB) and EoIB protocols. Because no IPv6 = not very enterprise-class. HA capability for SDP. High Availability is not a big requirement for HPC, but for enterprise-class application servers it is. Every node in Exalogic's InfiniBand network is connected twice for redundancy. If any cable or port or HCA fails, there's always a replacement link ready to take over. This requires extra magic at the protocol level to work. So in addition to Weblogic's failover capabilities, Oracle implemented IB automatic path migration at the SDP level to avoid unnecessary failover operations at the middleware level. Security, for example spoof-protection. Another feature that is less important for traditional users of InfiniBand, but very important for enterprise customers. InfiniBand Partitioning and Quality-of-Service (QoS): One of the first questions we get from customers about Exalogic is: “How can we implement multi-tenancy?” The answer is to partition your IB network, which effectively creates many networks that work independently and that are protected at the lowest networking layer possible. In addition to that, QoS allows administrators to prioritize traffic flow in multi-tenancy environments so they can keep their service levels where it matters most. Resilient IB Fabric Management: InfiniBand is a self-managing network, so a lot of the magic lies in coming up with the right topology and in teaching the subnet manager how to properly discover and manage the network. Oracle's Infiniband switches come with pre-integrated, highly available fabric management with seamless integration into Oracle Enterprise Manager Ops Center. In short: Oracle elevated the OFED InfiniBand stack into an enterprise-class networking infrastructure. Many years and multiple teams of manpower went into the above improvements - this is something you can only get from Oracle, because no other InfiniBand vendor can give you these features across the whole stack! Exabus: Because it's not About the Size of Your Network, it's How You Use it! So let's assume that you somehow were able to get your hands on an enterprise-class IB driver stack. Or maybe you don't care and are just happy with the standard OFED one? Anyway, the next step is to actually leverage that InfiniBand performance. Here are the choices: Use traditional TCP/IP on top of the InfiniBand stack, Develop your own integration between your middleware and the lower-level (but faster) InfiniBand protocols. While more bandwidth is always a good thing, it's actually the low latency that enables superior performance for your applications when running on any networking infrastructure: The lower the latency, the faster the response travels through the network and the more transactions you can close per second. The reason why InfiniBand is such a low latency technology is that it gets rid of most if not all of your traditional networking protocol stack: Data is literally beamed from one region of RAM in one server into another region of RAM in another server with no kernel/drivers/UDP/TCP or other networking stack overhead involved! Which makes option 1 a no-go: Adding TCP/IP on top of InfiniBand is like adding training wheels to your racing bike. It may be ok in the beginning and for development, but it's not quite the performance IB was meant to deliver. Which only leaves option 2: Integrating your middleware with fast, low-level InfiniBand protocols. And this is what Exalogic's "Exabus" technology is all about. Here are a few Exabus features that help applications leverage the performance of InfiniBand in Exalogic: RDMA and SDP integration at the JDBC driver level (SDP), for Oracle Weblogic (SDP), Oracle Coherence (RDMA), Oracle Tuxedo (RDMA) and the new Oracle Traffic Director (RDMA) on Exalogic. Using these protocols, middleware can communicate a lot faster with each other and the Oracle database than by using standard networking protocols, Seamless Integration of Ethernet over InfiniBand from Exalogic's Gateway switches into the OS, Oracle Weblogic optimizations for handling massive amounts of parallel transactions. Because if you have an 8-lane Autobahn, you also need to improve your ramps so you can feed it with many cars in parallel. Integration of Weblogic with Oracle Exadata for faster performance, optimized session management and failover. As you see, “Exabus” is Oracle's word for describing all the InfiniBand enhancements Oracle put into Exalogic: OFED stack enhancements, protocols for faster IB access, and InfiniBand support and optimizations at the virtualization and middleware level. All working together to deliver the full potential of InfiniBand performance. Who else has 100% control over their middleware so they can develop their own low-level protocol integration with InfiniBand? Even if you take an open source approach, you're looking at years of development work to create, test and support a whole new networking technology in your middleware! The Extras: Less Hassle, More Productivity, Faster Time to Market And then there are the other advantages of Engineered Systems that are true for Exalogic the same as they are for every other Engineered System: One simple purchasing process: No headaches due to endless RFPs and no “Will X work with Y?” uncertainties. Everything has been engineered together: All kinds of bugs and problems have been already fixed at the design level that would have only manifested themselves after you have built the system from scratch. Everything is built, tested and integrated at the factory level . Less integration pain for you, faster time to market. Every Exalogic machine world-wide is identical to Oracle's own machines in the lab: Instant replication of any problems you may encounter, faster time to resolution. Simplified patching, management and operations. One throat to choke: Imagine finger-pointing hell for systems that have been put together using several different vendors. Oracle's Engineered Systems have a single phone number that customers can call to get their problems solved. For more business-centric values, read The Business Value of Engineered Systems. Conclusion: Buy Exalogic, or get ready for a 6-12 Month Science Project And here's the reason why it's not easy to "build your own Exalogic": There's a lot of work required to make such a system fly. In fact, anybody who is starting to "just put together a bunch of servers and an InfiniBand network" is really looking at a 6-12 month science project. And the outcome is likely to not be very enterprise-class. And it won't have Exalogic's performance either. Because building an Engineered System is literally rocket science: It takes a lot of time, effort, resources and many iterations of design/test/analyze/fix to build such a system. That's why InfiniBand has been reserved for HPC scientists for such a long time. And only Oracle can bring the power of InfiniBand in an enterprise-class, ready-to use, pre-integrated version to customers, without the develop/integrate/support pain. For more details, check the new Exalogic overview white paper which was updated only recently. P.S.: Thanks to my colleagues Ola, Paul, Don and Andy for helping me put together this article! var flattr_uid = '26528'; var flattr_tle = 'How to Avoid Your Next 12-Month Science Project'; var flattr_dsc = 'While most customers immediately understand how the magic of Oracle's Hybrid Columnar Compression, intelligent storage servers and flash memory make Exadata uniquely powerful against home-grown database systems, some people think that Exalogic is nothing more than a bunch of x86 servers, a storage appliance and an InfiniBand (IB) network, built into a single rack.After all, isn't this exactly what the High Performance Computing (HPC) world has been doing for decades?On the surface, this may be true. And some people tried exactly that: They tried to put together their own version of Exalogic, but then they discover there's a lot more to building a system than buying hardware and assembling it together. IT is not Ikea.Why is that so? Could it be there's more going on behind the scenes than merely putting together a bunch of servers, a storage array and an InfiniBand network into a rack? Let's explore some of the special sauce that makes Exalogic unique and un-copyable, so you can save yourself from your next 6- to 12-month science project that distracts you from doing real work that adds value to your company.'; var flattr_tag = 'Engineered Systems,Engineered Systems,Infiniband,Integration,latency,Oracle,performance'; var flattr_cat = 'text'; var flattr_url = 'http://constantin.glez.de/blog/2012/04/how-avoid-your-next-12-month-science-project'; var flattr_lng = 'en_GB'

    Read the article

  • Organizations &amp; Architecture UNISA Studies &ndash; Chap 7

    - by MarkPearl
    Learning Outcomes Name different device categories Discuss the functions and structure of I/.O modules Describe the principles of Programmed I/O Describe the principles of Interrupt-driven I/O Describe the principles of DMA Discuss the evolution characteristic of I/O channels Describe different types of I/O interface Explain the principles of point-to-point and multipoint configurations Discuss the way in which a FireWire serial bus functions Discuss the principles of InfiniBand architecture External Devices An external device attaches to the computer by a link to an I/O module. The link is used to exchange control, status, and data between the I/O module and the external device. External devices can be classified into 3 categories… Human readable – e.g. video display Machine readable – e.g. magnetic disk Communications – e.g. wifi card I/O Modules An I/O module has two major functions… Interface to the processor and memory via the system bus or central switch Interface to one or more peripheral devices by tailored data links Module Functions The major functions or requirements for an I/O module fall into the following categories… Control and timing Processor communication Device communication Data buffering Error detection I/O function includes a control and timing requirement, to coordinate the flow of traffic between internal resources and external devices. Processor communication involves the following… Command decoding Data Status reporting Address recognition The I/O device must be able to perform device communication. This communication involves commands, status information, and data. An essential task of an I/O module is data buffering due to the relative slow speeds of most external devices. An I/O module is often responsible for error detection and for subsequently reporting errors to the processor. I/O Module Structure An I/O module functions to allow the processor to view a wide range of devices in a simple minded way. The I/O module may hide the details of timing, formats, and the electro mechanics of an external device so that the processor can function in terms of simple reads and write commands. An I/O channel/processor is an I/O module that takes on most of the detailed processing burden, presenting a high-level interface to the processor. There are 3 techniques are possible for I/O operations Programmed I/O Interrupt[t I/O DMA Access Programmed I/O When a processor is executing a program and encounters an instruction relating to I/O it executes that instruction by issuing a command to the appropriate I/O module. With programmed I/O, the I/O module will perform the requested action and then set the appropriate bits in the I/O status register. The I/O module takes no further actions to alert the processor. I/O Commands To execute an I/O related instruction, the processor issues an address, specifying the particular I/O module and external device, and an I/O command. There are four types of I/O commands that an I/O module may receive when it is addressed by a processor… Control – used to activate a peripheral and tell it what to do Test – Used to test various status conditions associated with an I/O module and its peripherals Read – Causes the I/O module to obtain an item of data from the peripheral and place it in an internal buffer Write – Causes the I/O module to take an item of data form the data bus and subsequently transmit that data item to the peripheral The main disadvantage of this technique is it is a time consuming process that keeps the processor busy needlessly I/O Instructions With programmed I/O there is a close correspondence between the I/O related instructions that the processor fetches from memory and the I/O commands that the processor issues to an I/O module to execute the instructions. Typically there will be many I/O devices connected through I/O modules to the system – each device is given a unique identifier or address – when the processor issues an I/O command, the command contains the address of the address of the desired device, thus each I/O module must interpret the address lines to determine if the command is for itself. When the processor, main memory and I/O share a common bus, two modes of addressing are possible… Memory mapped I/O Isolated I/O (for a detailed explanation read page 245 of book) The advantage of memory mapped I/O over isolated I/O is that it has a large repertoire of instructions that can be used, allowing more efficient programming. The disadvantage of memory mapped I/O over isolated I/O is that valuable memory address space is sued up. Interrupts driven I/O Interrupt driven I/O works as follows… The processor issues an I/O command to a module and then goes on to do some other useful work The I/O module will then interrupts the processor to request service when is is ready to exchange data with the processor The processor then executes the data transfer and then resumes its former processing Interrupt Processing The occurrence of an interrupt triggers a number of events, both in the processor hardware and in software. When an I/O device completes an I/O operations the following sequence of hardware events occurs… The device issues an interrupt signal to the processor The processor finishes execution of the current instruction before responding to the interrupt The processor tests for an interrupt – determines that there is one – and sends an acknowledgement signal to the device that issues the interrupt. The acknowledgement allows the device to remove its interrupt signal The processor now needs to prepare to transfer control to the interrupt routine. To begin, it needs to save information needed to resume the current program at the point of interrupt. The minimum information required is the status of the processor and the location of the next instruction to be executed. The processor now loads the program counter with the entry location of the interrupt-handling program that will respond to this interrupt. It also saves the values of the process registers because the Interrupt operation may modify these The interrupt handler processes the interrupt – this includes examination of status information relating to the I/O operation or other event that caused an interrupt When interrupt processing is complete, the saved register values are retrieved from the stack and restored to the registers Finally, the PSW and program counter values from the stack are restored. Design Issues Two design issues arise in implementing interrupt I/O Because there will be multiple I/O modules, how does the processor determine which device issued the interrupt? If multiple interrupts have occurred, how does the processor decide which one to process? Addressing device recognition, 4 general categories of techniques are in common use… Multiple interrupt lines Software poll Daisy chain Bus arbitration For a detailed explanation of these approaches read page 250 of the textbook. Interrupt driven I/O while more efficient than simple programmed I/O still requires the active intervention of the processor to transfer data between memory and an I/O module, and any data transfer must traverse a path through the processor. Thus is suffers from two inherent drawbacks… The I/O transfer rate is limited by the speed with which the processor can test and service a device The processor is tied up in managing an I/O transfer; a number of instructions must be executed for each I/O transfer Direct Memory Access When large volumes of data are to be moved, an efficient technique is direct memory access (DMA) DMA Function DMA involves an additional module on the system bus. The DMA module is capable of mimicking the processor and taking over control of the system from the processor. It needs to do this to transfer data to and from memory over the system bus. DMA must the bus only when the processor does not need it, or it must force the processor to suspend operation temporarily (most common – referred to as cycle stealing). When the processor wishes to read or write a block of data, it issues a command to the DMA module by sending to the DMA module the following information… Whether a read or write is requested using the read or write control line between the processor and the DMA module The address of the I/O device involved, communicated on the data lines The starting location in memory to read from or write to, communicated on the data lines and stored by the DMA module in its address register The number of words to be read or written, communicated via the data lines and stored in the data count register The processor then continues with other work, it delegates the I/O operation to the DMA module which transfers the entire block of data, one word at a time, directly to or from memory without going through the processor. When the transfer is complete, the DMA module sends an interrupt signal to the processor, this the processor is involved only at the beginning and end of the transfer. I/O Channels and Processors Characteristics of I/O Channels As one proceeds along the evolutionary path, more and more of the I/O function is performed without CPU involvement. The I/O channel represents an extension of the DMA concept. An I/O channel ahs the ability to execute I/O instructions, which gives it complete control over I/O operations. In a computer system with such devices, the CPU does not execute I/O instructions – such instructions are stored in main memory to be executed by a special purpose processor in the I/O channel itself. Two types of I/O channels are common A selector channel controls multiple high-speed devices. A multiplexor channel can handle I/O with multiple characters as fast as possible to multiple devices. The external interface: FireWire and InfiniBand Types of Interfaces One major characteristic of the interface is whether it is serial or parallel parallel interface – there are multiple lines connecting the I/O module and the peripheral, and multiple bits are transferred simultaneously serial interface – there is only one line used to transmit data, and bits must be transmitted one at a time With new generation serial interfaces, parallel interfaces are becoming less common. In either case, the I/O module must engage in a dialogue with the peripheral. In general terms the dialog may look as follows… The I/O module sends a control signal requesting permission to send data The peripheral acknowledges the request The I/O module transfers data The peripheral acknowledges receipt of data For a detailed explanation of FireWire and InfiniBand technology read page 264 – 270 of the textbook

    Read the article

  • How to keep track of previous scenes and return to them in libgdx

    - by MxyL
    I have three scenes: SceneTitle, SceneMenu, SceneLoad. (The difference between the title scene and the menu scene is that the title scene is what you see when you first turn on the game, and the menu scene is what you can access during the game. During the game, meaning, after you've hit "play!" in the title scene.) I provide the ability to save progress and consequently load a particular game. An issue that I've run into is being able to easily keep track of the previous scene. For example, if you enter the load scene and then decide to change your mind, the game needs to go back to where you were before; this isn't something that can be hardcoded. Now, an easy solution off the top of my head is to simply maintain a scene stack, which basically keeps track of history for me. A simple transaction would be as follows I'm currently in the menu scene, so the top of the stack is SceneMenu I go to the load scene, so the game pushes SceneLoad onto the stack. When I return from the load scene, the game pops SceneLoad off the stack and initializes the scene that's currently at the top, which is SceneMenu I'm coding in Java, so I can't simply pass around Classes as if they were objects, so I've decided implemented as enum for eac scene and put that on the stack and then have my scene managing class go through a list of if conditions to return the appropriate instance of the class. How can I implement my scene stack without having to do too much work maintaining it?

    Read the article

  • More Denali Execution Plan Warning Goodies

    - by Dave Ballantyne
    In my last blog, I showed how the execution plan in denali has been enhanced by 2 new warnings ,conversion affecting cardinality and conversion affecting seek, which are shown when a data type conversion has happened either implicitly or explicitly. That is not all though, there is more .  Also added are two warnings when performance has been affected due to memory issues. Memory spills to tempdb are a costly operation and happen when SqlServer is under memory pressure and needs to free some up. For a long time you have been able to see these as warnings in a profiler trace as a sort or hash warning event,  but now they are included right in the execution plan.  Not only that but also you can see which operator caused the spill , not just which statement.  Pretty damn handy. Another cause of performance problems relating to memory are memory grant waits.  Here is an informative write up on them,  but simply speaking , SQLServer has to allocate a certain amount of memory for each statement. If it is unable to you get a “memory grant wait”.  Once again there are other methods of analyzing these,  but the plan now shows these too. Don't worry that’s not real production code There is one other new warning that is of interest to me, “Unmatched Indexes”.  Once I find out the conditions under which that fires ill blog about it.

    Read the article

  • Series On Embedded Development (Part 1)

    - by user12612705
    This is the first in a series of entries on developing applications for the embedded environment. Most of this information is relevant to any type of embedded development (and even for desktop and server too), not just Java. This information is based on a talk Hinkmond Wong and I gave at JavaOne 2012 entitled Reducing Dynamic Memory in Java Embedded Applications. One thing to remember when developing embeddded applications is that memory matters. Yes, memory matters in desktop and server environments as well, but there's just plain less of it in embedded devices. So I'm going to be talking about saving this precious resource as well as another precious resource, CPU cycles...and a bit about power too. CPU matters too, and again, in embedded devices, there's just plain less of it. What you'll find, no surprise, is that there's a trade-off between performance and memory. To get better performance, you need to use more memory, and to save more memory, you need to need to use more CPU cycles. I'll be discussing three Memory Reduction Categories: - Optionality, both build-time and runtime. Optionality is about providing options so you can get rid of the stuff you don't need and include the stuff you do need. - Tunability, which is about providing options so you can tune your application by trading performance for size, and vice-versa. - Efficiency, which is about balancing size savings with performance.

    Read the article

  • Nokogiri pull parser (Nokogiri::XML::Reader) issue with self closing tag

    - by Vlad Zloteanu
    I have a huge XML(400MB) containing products. Using a DOM parser is therefore excluded, so i tried to parse and process it using a pull parser. Below is a snippet from the each_product(&block) method where i iterate over the product list. Basically, using a stack, i transform each <product> ... </product> node into a hash and process it. while (reader.read) case reader.node_type #start element when Nokogiri::XML::Node::ELEMENT_NODE elem_name = reader.name.to_s stack.push([elem_name, {}]) #text element when Nokogiri::XML::Node::TEXT_NODE, Nokogiri::XML::Node::CDATA_SECTION_NODE stack.last[1] = reader.value #end element when Nokogiri::XML::Node::ELEMENT_DECL return if stack.empty? elem = stack.pop parent = stack.last if parent.nil? yield(elem[1]) elem = nil next end key = elem[0] parent_childs = parent[1] # ... parent_childs[key] = elem[1] end The issue is on self-closing tags (EG <country/>), as i can not make the difference between a 'normal' and a 'self-closing' tag. They both are of type Nokogiri::XML::Node::ELEMENT_NODE and i am not able to find any other discriminator in the documentation. Any ideas on how to solve this issue?

    Read the article

  • Imagemagick PDF to JPG conversion failing

    - by sbressler
    I'm trying to convert the first page of a PDF to a JPG. I'm pretty sure I got this to work with certain PDFs, but is it really possible that certain PDFs are made incorrectly and cannot be converted? I tried running this first: $ convert 10-03-26.pdf[1] test.jpg And I got the follow: Error: /syntaxerror in readxref Operand stack: Execution stack: %interp_exit .runexec2 --nostringval-- --nostringval-- --nostringval-- 2 %stopped_push --nostringval-- --nostringval-- --nostringval-- false 1 %stopped_push 1 3 %oparray_pop 1 3 %oparray_pop --nostringval-- --nostringval-- --nostringval-- --nostringval-- --nostringval-- --nostringval-- Dictionary stack: --dict:1062/1417(ro)(G)-- --dict:0/20(G)-- --dict:73/200(L)-- --dict:73/200(L)-- --dict:97/127(ro)(G)-- --dict:229/230(ro)(G)-- --dict:14/15(L)-- Current allocation mode is local ESP Ghostscript 7.07.1: Unrecoverable error, exit code 1 convert: Postscript delegate failed `10-03-26.pdf'. Running this instead: $ convert -verbose -colorspace rgb '10-03-26.pdf[1]' test.jpg I get the following: Error: /syntaxerror in readxref Operand stack: Execution stack: %interp_exit .runexec2 --nostringval-- --nostringval-- --nostringval-- 2 %stopped_push --nostringval-- --nostringval-- --nostringval-- false 1 %stopped_push 1 3 %oparray_pop 1 3 %oparray_pop --nostringval-- --nostringval-- --nostringval-- --nostringval-- --nostringval-- --nostringval-- Dictionary stack: --dict:1062/1417(ro)(G)-- --dict:0/20(G)-- --dict:73/200(L)-- --dict:73/200(L)-- --dict:97/127(ro)(G)-- --dict:229/230(ro)(G)-- --dict:14/15(L)-- Current allocation mode is local ESP Ghostscript 7.07.1: Unrecoverable error, exit code 1 "gs" -q -dBATCH -dSAFER -dMaxBitmap=500000000 -dNOPAUSE -dAlignToPixels=0 "-sDEVICE=pnmraw" -dTextAlphaBits=4 -dGraphicsAlphaBits=4 "-g792x1611" "-r72x72" -dFirstPage=2 -dLastPage=2 "-sOutputFile=/tmp/magick-XXU3T44P" "-f/tmp/magick-XXoMKL8Z" "-f/tmp/magic2eec1F"Start of Image Define Huffman Table 0x00 0 1 5 1 1 1 1 1 1 0 0 0 0 0 0 0 Define Huffman Table 0x01 0 3 1 1 1 1 1 1 1 1 1 0 0 0 0 0 Define Huffman Table 0x10 0 2 1 3 3 2 4 3 5 5 4 4 0 0 1 125 Define Huffman Table 0x11 0 2 1 2 4 4 3 4 7 5 4 4 0 1 2 119 End Of Image convert: Postscript delegate failed `10-03-26.pdf'. Why would the conversion fail? Thanks!

    Read the article

  • Issue with getting 2 chars from string using indexer

    - by Learner
    I am facing an issue in reading char values. See my program below. I want to evaluate an infix expression. As you can see I want to read '10' , '*', '20' and then use them...but if I use string indexer s[0] will be '1' and not '10' and hence I am not able to get the expected result. Can you guys suggest me something? Code is in c# class Program { static void Main(string[] args) { string infix = "10*2+20-20+3"; float result = EvaluateInfix(infix); Console.WriteLine(result); Console.ReadKey(); } public static float EvaluateInfix(string s) { Stack<float> operand = new Stack<float>(); Stack<char> operator1 = new Stack<char>(); int len = s.Length; for (int i = 0; i < len; i++) { if (isOperator(s[i])) // I am having an issue here as s[i] gives each character and I want the number 10 operator1.Push(s[i]); else { operand.Push(s[i]); if (operand.Count == 2) Compute(operand, operator1); } } return operand.Pop(); } public static void Compute(Stack<float> operand, Stack<char> operator1) { float operand1 = operand.Pop(); float operand2 = operand.Pop(); char op = operator1.Pop(); if (op == '+') operand.Push(operand1 + operand2); else if(op=='-') operand.Push(operand1 - operand2); else if(op=='*') operand.Push(operand1 * operand2); else if(op=='/') operand.Push(operand1 / operand2); } public static bool isOperator(char c) { bool result = false; if (c == '+' || c == '-' || c == '*' || c == '/') result = true; return result; } } }

    Read the article

  • setcontext and makecontext to call a generic function pointer

    - by Simone Margaritelli
    In another question i had the problem to port the code unsigned long stack[] = { 1, 23, 33, 43 }; /* save all the registers and the stack pointer */ unsigned long esp; asm __volatile__ ( "pusha" ); asm __volatile__ ( "mov %%esp, %0" :"=m" (esp)); for( i = 0; i < sizeof(stack); i++ ){ unsigned long val = stack[i]; asm __volatile__ ( "push %0" :: "m"(val) ); } unsigned long ret = function_pointer(); /* restore registers and stack pointer */ asm __volatile__ ( "mov %0, %%esp" :: "m" (esp) ); asm __volatile__ ( "popa" ); To a 64bit platform and many guys told me i should use the setcontext and makecontext functions set instead due to the calling conversion differences between 32 and 64 bits and portability issues. Well, i really can't find any useful documentation online, or at least not the kind i need to implement this, so, how can i use those functions to push arguments onto the stack, call a generic function pointer, obtain the return value and then restore the registers?

    Read the article

  • Asterisk: Dropping calls with an "ast_yyerror"

    - by Nick
    I'm having an issue where asterisk will play our greeting to the caller, and then drop the call instead of making our phones ring. The bit of information I could find said it was caused by an error in evaluating a dialplan expression. I'm thinking it's this line: exten = START,n,GotoIf($[${FORCE_CLOSED}=TRUE]?CLOSED,1) But I'm not sure what's wrong with it. I see the following error on the console: [Apr 4 16:29:49] WARNING[27038]: ast_expr2.fl:459 ast_yyerror: ast_yyerror(): syntax error: syntax error, unexpected '=', expecting $end; Input:=TRUE^ Surrounding Console output: -- Executing [START@AGInbound:1] Answer("IAX2/AtlantaTeliax-10086", "") in new stack -- Executing [START@AGInbound:2] BackGround("IAX2/AtlantaTeliax-10086", 0000_AG_THANK_YOU_FOR_CALLING_AG") in new stack -- Playing '0000_AG_THANK_YOU_FOR_CALLING_AG.slin' (language 'en') [Apr 4 16:29:49] WARNING[27038]: ast_expr2.fl:459 ast_yyerror: ast_yyerror(): syntax error: syntax error, unexpected '=', expecting $end; Input: =TRUE ^ [Apr 4 16:29:49] WARNING[27038]: ast_expr2.fl:463 ast_yyerror: If you have questions, please refer to doc/tex/channelvariables.tex in the asterisk source. -- Executing [START@AGInbound:3] GotoIf("IAX2/AtlantaTeliax-10086", "?CLOSED,1") in new stack -- Executing [START@AGInbound:4] GotoIfTime("IAX2/AtlantaTeliax-10086", "9:30-17:0|mon-fri|*|*?OPEN,1") in new stack -- Executing [START@AGInbound:5] GotoIfTime("IAX2/AtlantaTeliax-10086", "10:0-18:30|sat|*|*?OPEN,1") in new stack -- Executing [START@AGInbound:6] GotoIfTime("IAX2/AtlantaTeliax-10086", "12:0-17:0|sun|*|*?OPEN,1") in new stack Relevant lines from the dial plan: exten = START,1,Answer() exten = START,n,Background(0000_AG_THANK_YOU_FOR_CALLING_AG) ; See if we're open ; Force Closed if no one's going to be answering exten = START,n,GotoIf($[${FORCE_CLOSED}=TRUE]?CLOSED,1) exten = START,n,GotoIfTime(${AG_WEEKDAY_OPEN_HOUR}:${AG_WEEKDAY_OPEN_MIN}-${AG$ exten = START,n,GotoIfTime(${AG_SATURDAY_OPEN_HOUR}:${AG_SATURDAY_OPEN_MIN}-${$ exten = START,n,GotoIfTime(${AG_SUNDAY_OPEN_HOUR}:${AG_SUNDAY_OPEN_MIN}-${AG_S$ ; ...and we're not. But maybe the time of day has been overridden? exten = START,n,GotoIf($[${OVERRIDE_TIME_OF_DAY}=TRUE]?OPEN,1) ; No override... We're definatly closed. exten = START,n,Goto(CLOSED,1)

    Read the article

  • The cross-thread usage of "HttpContext.Current" property and related things

    - by smwikipedia
    I read from < Essential ASP.NET with Examples in C# the following statement: Another useful property to know about is the static Current property of the HttpContext class. This property always points to the current instance of the HttpContext class for the request being serviced. This can be convenient if you are writing helper classes that will be used from pages or other pipeline classes and may need to access the context for whatever reason. By using the static Current property to retrieve the context, you can avoid passing a reference to it to helper classes. For example, the class shown in Listing 4-1 uses the Current property of the context to access the QueryString and print something to the current response buffer. Note that for this static property to be correctly initialized, the caller must be executing on the original request thread, so if you have spawned additional threads to perform work during a request, you must take care to provide access to the context class yourself. I am wondering about the root cause of the bold part, and one thing leads to another, here is my thoughts: We know that a process can have multiple threads. Each of these threads have their own stacks, respectively. These threads also have access to a shared memory area, the heap. The stack then, as I understand it, is kind of where all the context for that thread is stored. For a thread to access something in the heap it must use a pointer, and the pointer is stored on its stack. So when we make some cross-thread calls, we must make sure that all the necessary context info is passed from the caller thread's stack to the callee thread's stack. But I am not quite sure if I made any mistake. Any comments will be deeply appreciated. Thanks. ADD Here the stack is limited to user stack.

    Read the article

  • How can I get size in bytes of an object sent using RMI?

    - by Lucas Batistussi
    I'm implementing a cache server with MongoDB and ConcurrentHashMap java class. When there are available space to put object in memory, it will put at. Otherwise, the object will be saved in a mongodb database. Is allowed that user specify a size limit in memory (this should not exceed heap size limit obviously!) for the memory cache. The clients can use the cache service connecting through RMI. I need to know the size of each object to verify if a new incoming object can be put into memory. I searched over internet and i got this solution to get size: public long getObjectSize(Object o){ try { ByteArrayOutputStream bos = new ByteArrayOutputStream(); ObjectOutputStream oos = new ObjectOutputStream(bos); oos.writeObject(o); oos.close(); return bos.size(); } catch (Exception e) { return Long.MAX_VALUE; } } This solution works very well. But, in terms of memory use doesn't solve my problem. :( If many clients are verifying the object size at same time this will cause stack overflow, right? Well... some people can say: Why you don't get the specific object size and store it in memory and when another object is need to put in memory check the object size? This is not possible because the objects are variable in size. :( Someone can help me? I was thinking in get socket from RMI communication, but I don't know how to do this...

    Read the article

  • ORA-4030 Troubleshooting

    - by [email protected]
    QUICKLINK: Note 399497.1 FAQ ORA-4030 Note 1088087.1 : ORA-4030 Diagnostic Tools [Video]   Have you observed an ORA-0430 error reported in your alert log? ORA-4030 errors are raised when memory or resources are requested from the Operating System and the Operating System is unable to provide the memory or resources.   The arguments included with the ORA-4030 are often important to narrowing down the problem. For more specifics on the ORA-4030 error and scenarios that lead to this problem, see Note 399497.1 FAQ ORA-4030.   Looking for the best way to diagnose? There are several available diagnostic tools (error tracing, 11g Diagnosibility, OCM, Process Memory Guides, RDA, OSW, diagnostic scripts) that collectively can prove powerful for identifying the cause of the ORA-4030.    Error Tracing   The ORA-4030 error usually occurs on the client workstation and for this reason, a trace file and alert log entry may not have been generated on the server side.  It may be necessary to add additional tracing events to get initial diagnostics on the problem. To setup tracing to trap the ORA-4030, on the server use the following in SQLPlus: alter system set events '4030 trace name heapdump level 536870917;name errorstack level 3';Once the error reoccurs with the event set, you can turn off  tracing using the following command in SQLPlus:alter system set events '4030 trace name context off; name context off';NOTE:   See more diagnostics information to collect in Note 399497.1  11g DiagnosibilityStarting with Oracle Database 11g Release 1, the Diagnosability infrastructure was introduced which places traces and core files into a location controlled by the DIAGNOSTIC_DEST initialization parameter when an incident, such as an ORA-4030 occurs.  For earlier versions, the trace file will be written to either USER_DUMP_DEST (if the error was caught in a user process) or BACKGROUND_DUMP_DEST (if the error was caught in a background process like PMON or SMON). The trace file may contain vital information about what led to the error condition.    Note 443529.1 11g Quick Steps to Package and Send Critical Error Diagnostic Informationto Support[Video]  Oracle Configuration Manager (OCM) Oracle Configuration Manager (OCM) works with My Oracle Support to enable proactive support capability that helps you organize, collect and manage your Oracle configurations. Oracle Configuration Manager Quick Start Guide Note 548815.1: My Oracle Support Configuration Management FAQ Note 250434.1: BULLETIN: Learn More About My Oracle Support Configuration Manager    General Process Memory Guides   An ORA-4030 indicates a limit has been reached with respect to the Oracle process private memory allocation.    Each Operating System will handle memory allocations with Oracle slightly differently. Solaris     Note 163763.1Linux       Note 341782.1IBM AIX   Notes 166491.1 and 123754.1HP           Note 166490.1Windows Note 225349.1, Note 373602.1, Note 231159.1, Note 269495.1, Note 762031.1Generic    Note 169706.1   RDAThe RDA report will show more detailed information about the database and Server Configuration. Note 414966.1 RDA Documentation Index Download RDA -- refer to Note 314422.1 Remote Diagnostic Agent (RDA) 4 - Getting Started OS Watcher (OSW)This tool is designed to gather Operating System side statistics to compare with the findings from the database.  This is a key tool in cases where memory usage is higher than expected on the server while not experiencing ORA-4030 errors currently. Reference more details on setup and usage in Note 301137.1 OS Watcher User Guide Diagnostic Scripts   Refer to Note 1088087.1 : ORA-4030 Diagnostic Tools [Video] Common Causes/Solutions The ORA-4030 can occur for a variety of reasons.  Some common causes are:   * OS Memory limit reached such as physical memory and/or swap/virtual paging.   For instance, IBM AIX can experience ORA-4030 issues related to swap scenarios.  See Note 740603.1 10.2.0.4 not using large pages on AIX for more on that problem. Also reference Note 188149.1 for pointers on 10g and stack size issues.* OS limits reached (kernel or user shell limits) that limit overall, user level or process level memory * OS limit on PGA memory size due to SGA attach address           Reference: Note 1028623.6 SOLARIS How to Relocate the SGA* Oracle internal limit on functionality like PL/SQL varrays or bulk collections. ORA-4030 errors will include arguments like "pl/sql vc2" "pmucalm coll" "pmuccst: adt/re".  See Coding Pointers for pointers on application design to get around these issues* Application design causing limits to be reached* Bug - space leaks, heap leaks   ***For reference to the content in this blog, refer to Note.1088267.1 Master Note for Diagnosing ORA-4030

    Read the article

  • SPARC M7 Chip - 32 cores - Mind Blowing performance

    - by Angelo-Oracle
    The M7 Chip Oracle just announced its Next Generation Processor at the HotChips HC26 conference. As the Tech Lead in our Systems Division's Partner group, I had a front row seat to the extraordinary price performance advantage of Oracle current T5 and M6 based systems. Partner after partner tested  these systems and were impressed with it performance. Just read some of the quotes to see what our partner has been saying about our hardware. We just announced our next generation processor, the M7. This has 32 cores (up from 16-cores in T5 and 12-cores in M6). With 20 nm technology  this is our most advanced processor. The processor has more cores than anything else in the industry today. After the Sun acquisition Oracle has released 5 processors in 4 years and this is the 6th.  The S4 core  The M7 is built using the foundation of the S4 core. This is the next generation core technology. Like its predecessor, the S4 has 8 dynamic threads. It increases the frequency while maintaining the Pipeline depth. Each core has its own fine grain power estimator that keeps the core within its power envelop in 250 nano-sec granularity. Each core also includes Software in Silicon features for Application Acceleration Support. Each core includes features to improve Application Data Integrity, with almost no performance loss. The core also allows using part of the Virtual Address to store meta-data.  User-Level Synchronization Instructions are also part of the S4 core. Each core has 16 KB Instruction and 16 KB Data L1 cache. The Core Clusters  The cores on the M7 chip are organized in sets of 4-core clusters. The core clusters share  L2 cache.  All four cores in the complex share 256 KB of 4 way set associative L2 Instruction Cache, with over 1/2 TB/s of throughput. Two cores share 256 KB of 8 way set associative L2 Data Cache, with over 1/2 TB/s of throughput. With this innovative Core Cluster architecture, the M7 doubles core execution bandwidth. to maximize per-thread performance.  The Chip  Each  M7 chip has 8 sets of these core-clusters. The chip has 64 MB on-chip L3 cache. This L3 caches is shared among all the cores and is partitioned into 8 x 8 MB chunks. Each chunk is  8-way set associative cache. The aggregate bandwidth for the L3 cache on the chip is over 1.6TB/s. Each chip has 4 DDR4 memory controllers and can support upto 16 DDR4 DIMMs, allowing for 2 TB of RAM/chip. The chip also includes 4 internal links of PCIe Gen3 I/O controllers.  Each chip has 7 coherence links, allowing for 8 of these chips to be connected together gluelessly. Also 32 of these chips can be connected in an SMP configuration. A potential system with 32 chips will have 1024 cores and 8192 threads and 64 TB of RAM.  Software in Silicon The M7 chip has many built in Application Accelerators in Silicon. These features will be exposed to our Software partners using the SPARC Accelerator Program.  The M7  has built-in logic to decompress data at the speed of memory access. This means that applications can directly work on compressed data in memory increasing the data access rates. The VA Masking feature allows the use of part of the virtual address to store meta-data.  Realtime Application Data Integrity The Realtime Application Data Integrity feature helps applications safeguard against invalid, stale memory reference and buffer overflows. The first 4-bits if the Pointer can be used to store a version number and this version number is also maintained in the memory & cache lines. When a pointer accesses memory the hardware checks to make sure the two versions match. A SEGV signal is raised when there is a mismatch. This feature can be used by the Database, applications and the OS.  M7 Database In-Memory Query Accelerator The M7 chip also includes a In-Silicon Query Engines.  These accelerate tasks that work on In-Memory Columnar Vectors. Oracle In-Memory options stores data in Column Format. The M7 Query Engine can speed up In-Memory Format Conversion, Value and Range Comparisons and Set Membership lookups. This engine can work on Compressed data - this means not only are we accelerating the query performance but also increasing the memory bandwidth for queries.  SPARC Accelerated Program  At the Hotchips conference we also introduced the SPARC Accelerated Program to provide our partners and third part developers access to all the goodness of the M7's SPARC Application Acceleration features. Please get in touch with us if you are interested in knowing more about this program. 

    Read the article

  • What does DetourAttach(&(PVOID &)BindKeyT, BindKeyD); mean? Attaching a detour to a memory address..

    - by user288546
    Hello everyone! This is just a simple question. I've been reading the source of something which attaches to a memory address of a subroutine using DetourAttach(&(PVOID &)BindKeyT, BindKeyD); where BindKeyT is the address to a subroutine in memory. I'm curious, what exactly does (&(PVOID &) mean in english? I understand that PVOID is a void pointer, but how does this get translated into a function which can be used to attach a detour to?

    Read the article

  • Template inheritence c++

    - by Chris Condy
    I have made a template singleton class, I have also made a data structure that is templated. My question is; how do I make my templated data structure inherit from a singleton so you can only have one float type of this structure? I have tested both seperate and have found no problems. Code provided under... (That is the problem) template <class Type> class AbstractRManagers : public Singleton<AbstractRManagers<Type> > The problem is the code above doesn't work I get alot of errors. I cant get it to no matter what I do template a templated singleton class... I was asking for maybe advice or maybe if the code above is incorrect guidence? #ifndef SINGLETON_H #define SINGLETON_H template <class Type> class Singleton { public: virtual ~Singleton(); Singleton(); static Type* m_instance; }; template <class Type> Type* Singleton<Type>::m_instance = 0; #include "Singleton.cpp" #endif #ifndef SINGLETON_CPP #define SINGLETON_CPP #include "Singleton.h" template <class Type> Singleton<Type>::Singleton() { } template <class Type> Singleton<Type>::~Singleton() { } template <class Type> Type* Singleton<Type>::getInstance() { if(m_instance==nullptr) { m_instance = new Type; } return m_instance; } #endif #ifndef ABSTRACTRMANAGERS_H #define ABSTRACTRMANAGERS_H #include <vector> #include <map> #include <stack> #include "Singleton.h" template <class Type> class AbstractRManagers : public Singleton<AbstractRManagers<Type> > { public: virtual ~AbstractRManagers(); int insert(Type* type, std::string name); Type* remove(int i); Type* remove(std::string name); Type* get(int i); Type* getS(std::string name); int get(std::string name); int get(Type* i); bool check(std::string name); int resourceSize(); protected: private: std::vector<Type*> m_resources; std::map<std::string,int> m_map; std::stack<int> m_freePos; }; #include "AbstractRManagers.cpp" #endif #ifndef ABSTRACTRMANAGERS_CPP #define ABSTRACTRMANAGERS_CPP #include "AbstractRManagers.h" template <class Type> int AbstractRManagers<Type>::insert(Type* type, std::string name) { int i=0; if(!check(name)) { if(m_freePos.empty()) { m_resources.push_back(type); i = m_resources.size()-1; m_map[name] = i; } else { i = m_freePos.top(); m_freePos.pop(); m_resources[i] = type; m_map[name] = i; } } else i = -1; return i; } template <class Type> int AbstractRManagers<Type>::resourceSize() { return m_resources.size(); } template <class Type> bool AbstractRManagers<Type>::check(std::string name) { std::map<std::string,int>::iterator it; it = m_map.find(name); if(it==m_map.end()) return false; return true; } template <class Type> Type* AbstractRManagers<Type>::remove(std::string name) { Type* temp = m_resources[m_map[name]]; if(temp!=NULL) { std::map<std::string,int>::iterator it; it = m_map[name]; m_resources[m_map[name]] = NULL; m_freePos.push(m_map[name]); delete (*it).second; delete (*it).first; return temp; } return NULL; } template <class Type> Type* AbstractRManagers<Type>::remove(int i) { if((i < m_resources.size())&&(i > 0)) { Type* temp = m_resources[i]; m_resources[i] = NULL; m_freePos.push(i); std::map<std::string,int>::iterator it; for(it=m_map.begin();it!=m_map.end();it++) { if((*it).second == i) { delete (*it).second; delete (*it).first; return temp; } } return temp; } return NULL; } template <class Type> int AbstractRManagers<Type>::get(Type* i) { for(int i2=0;i2<m_resources.size();i2++) { if(i == m_resources[i2]) { return i2; } } return -1; } template <class Type> Type* AbstractRManagers<Type>::get(int i) { if((i < m_resources.size())&&(i >= 0)) { return m_resources[i]; } return NULL; } template <class Type> Type* AbstractRManagers<Type>::getS(std::string name) { return m_resources[m_map[name]]; } template <class Type> int AbstractRManagers<Type>::get(std::string name) { return m_map[name]; } template <class Type> AbstractRManagers<Type>::~AbstractRManagers() { } #endif #include "AbstractRManagers.h" struct b { float x; }; int main() { b* a = new b(); AbstractRManagers<b>::getInstance()->insert(a,"a"); return 0; } This program produces next errors when compiled : 1> main.cpp 1>c:\program files\microsoft visual studio 10.0\vc\include\xfunctional(125): error C2784: 'bool std::operator <(const std::stack<_Ty,_Container> &,const std::stack<_Ty,_Container> &)' : could not deduce template argument for 'const std::stack<_Ty,_Container> &' from 'const std::string' 1> c:\program files\microsoft visual studio 10.0\vc\include\stack(166) : see declaration of 'std::operator <' 1> c:\program files\microsoft visual studio 10.0\vc\include\xfunctional(124) : while compiling class template member function 'bool std::less<_Ty>::operator ()(const _Ty &,const _Ty &) const' 1> with 1> [ 1> _Ty=std::string 1> ] 1> c:\program files\microsoft visual studio 10.0\vc\include\map(71) : see reference to class template instantiation 'std::less<_Ty>' being compiled 1> with 1> [ 1> _Ty=std::string 1> ] 1> c:\program files\microsoft visual studio 10.0\vc\include\xtree(451) : see reference to class template instantiation 'std::_Tmap_traits<_Kty,_Ty,_Pr,_Alloc,_Mfl>' being compiled 1> with 1> [ 1> _Kty=std::string, 1> _Ty=int, 1> _Pr=std::less<std::string>, 1> _Alloc=std::allocator<std::pair<const std::string,int>>, 1> _Mfl=false 1> ] 1> c:\program files\microsoft visual studio 10.0\vc\include\xtree(520) : see reference to class template instantiation 'std::_Tree_nod<_Traits>' being compiled 1> with 1> [ 1> _Traits=std::_Tmap_traits<std::string,int,std::less<std::string>,std::allocator<std::pair<const std::string,int>>,false> 1> ] 1> c:\program files\microsoft visual studio 10.0\vc\include\xtree(659) : see reference to class template instantiation 'std::_Tree_val<_Traits>' being compiled 1> with 1> [ 1> _Traits=std::_Tmap_traits<std::string,int,std::less<std::string>,std::allocator<std::pair<const std::string,int>>,false> 1> ] 1> c:\program files\microsoft visual studio 10.0\vc\include\map(81) : see reference to class template instantiation 'std::_Tree<_Traits>' being compiled 1> with 1> [ 1> _Traits=std::_Tmap_traits<std::string,int,std::less<std::string>,std::allocator<std::pair<const std::string,int>>,false> 1> ] 1> c:\users\chris\desktop\311\ideas\idea1\idea1\abstractrmanagers.h(28) : see reference to class template instantiation 'std::map<_Kty,_Ty>' being compiled 1> with 1> [ 1> _Kty=std::string, 1> _Ty=int 1> ] 1> c:\users\chris\desktop\311\ideas\idea1\idea1\abstractrmanagers.h(30) : see reference to class template instantiation 'AbstractRManagers<Type>' being compiled 1>c:\program files\microsoft visual studio 10.0\vc\include\xfunctional(125): error C2784: 'bool std::operator <(const std::stack<_Ty,_Container> &,const std::stack<_Ty,_Container> &)' : could not deduce template argument for 'const std::stack<_Ty,_Container> &' from 'const std::string' 1> c:\program files\microsoft visual studio 10.0\vc\include\stack(166) : see declaration of 'std::operator <' 1>c:\program files\microsoft visual studio 10.0\vc\include\xfunctional(125): error C2784: 'bool std::operator <(const std::stack<_Ty,_Container> &,const std::stack<_Ty,_Container> &)' : could not deduce template argument for 'const std::stack<_Ty,_Container> &' from 'const std::string' 1> c:\program files\microsoft visual studio 10.0\vc\include\stack(166) : see declaration of 'std::operator <' 1>c:\program files\microsoft visual studio 10.0\vc\include\xfunctional(125): error C2784: 'bool std::operator <(const std::deque<_Ty,_Alloc> &,const std::deque<_Ty,_Alloc> &)' : could not deduce template argument for 'const std::deque<_Ty,_Alloc> &' from 'const std::string' 1> c:\program files\microsoft visual studio 10.0\vc\include\deque(1725) : see declaration of 'std::operator <' 1>c:\program files\microsoft visual studio 10.0\vc\include\xfunctional(125): error C2784: 'bool std::operator <(const std::deque<_Ty,_Alloc> &,const std::deque<_Ty,_Alloc> &)' : could not deduce template argument for 'const std::deque<_Ty,_Alloc> &' from 'const std::string' 1> c:\program files\microsoft visual studio 10.0\vc\include\deque(1725) : see declaration of 'std::operator <' 1>c:\program files\microsoft visual studio 10.0\vc\include\xfunctional(125): error C2784: 'bool std::operator <(const std::deque<_Ty,_Alloc> &,const std::deque<_Ty,_Alloc> &)' : could not deduce template argument for 'const std::deque<_Ty,_Alloc> &' from 'const std::string' 1> c:\program files\microsoft visual studio 10.0\vc\include\deque(1725) : see declaration of 'std::operator <' 1>c:\program files\microsoft visual studio 10.0\vc\include\xfunctional(125): error C2784: 'bool std::operator <(const std::_Tree<_Traits> &,const std::_Tree<_Traits> &)' : could not deduce template argument for 'const std::_Tree<_Traits> &' from 'const std::string' 1> c:\program files\microsoft visual studio 10.0\vc\include\xtree(1885) : see declaration of 'std::operator <' 1>c:\program files\microsoft visual studio 10.0\vc\include\xfunctional(125): error C2784: 'bool std::operator <(const std::_Tree<_Traits> &,const std::_Tree<_Traits> &)' : could not deduce template argument for 'const std::_Tree<_Traits> &' from 'const std::string' 1> c:\program files\microsoft visual studio 10.0\vc\include\xtree(1885) : see declaration of 'std::operator <' 1>c:\program files\microsoft visual studio 10.0\vc\include\xfunctional(125): error C2784: 'bool std::operator <(const std::_Tree<_Traits> &,const std::_Tree<_Traits> &)' : could not deduce template argument for 'const std::_Tree<_Traits> &' from 'const std::string' 1> c:\program files\microsoft visual studio 10.0\vc\include\xtree(1885) : see declaration of 'std::operator <' 1>c:\program files\microsoft visual studio 10.0\vc\include\xfunctional(125): error C2784: 'bool std::operator <(const std::vector<_Ty,_Ax> &,const std::vector<_Ty,_Ax> &)' : could not deduce template argument for 'const std::vector<_Ty,_Ax> &' from 'const std::string' 1> c:\program files\microsoft visual studio 10.0\vc\include\vector(1502) : see declaration of 'std::operator <' 1>c:\program files\microsoft visual studio 10.0\vc\include\xfunctional(125): error C2784: 'bool std::operator <(const std::vector<_Ty,_Ax> &,const std::vector<_Ty,_Ax> &)' : could not deduce template argument for 'const std::vector<_Ty,_Ax> &' from 'const std::string' 1> c:\program files\microsoft visual studio 10.0\vc\include\vector(1502) : see declaration of 'std::operator <' 1>c:\program files\microsoft visual studio 10.0\vc\include\xfunctional(125): error C2784: 'bool std::operator <(const std::vector<_Ty,_Ax> &,const std::vector<_Ty,_Ax> &)' : could not deduce template argument for 'const std::vector<_Ty,_Ax> &' from 'const std::string' 1> c:\program files\microsoft visual studio 10.0\vc\include\vector(1502) : see declaration of 'std::operator <' 1>c:\program files\microsoft visual studio 10.0\vc\include\xfunctional(125): error C2784: 'bool std::operator <(const std::unique_ptr<_Ty,_Dx> &,const std::unique_ptr<_Ty2,_Dx2> &)' : could not deduce template argument for 'const std::unique_ptr<_Ty,_Dx> &' from 'const std::string' 1> c:\program files\microsoft visual studio 10.0\vc\include\memory(2582) : see declaration of 'std::operator <' 1>c:\program files\microsoft visual studio 10.0\vc\include\xfunctional(125): error C2784: 'bool std::operator <(const std::unique_ptr<_Ty,_Dx> &,const std::unique_ptr<_Ty2,_Dx2> &)' : could not deduce template argument for 'const std::unique_ptr<_Ty,_Dx> &' from 'const std::string' 1> c:\program files\microsoft visual studio 10.0\vc\include\memory(2582) : see declaration of 'std::operator <' 1>c:\program files\microsoft visual studio 10.0\vc\include\xfunctional(125): error C2784: 'bool std::operator <(const std::unique_ptr<_Ty,_Dx> &,const std::unique_ptr<_Ty2,_Dx2> &)' : could not deduce template argument for 'const std::unique_ptr<_Ty,_Dx> &' from 'const std::string' 1> c:\program files\microsoft visual studio 10.0\vc\include\memory(2582) : see declaration of 'std::operator <' 1>c:\program files\microsoft visual studio 10.0\vc\include\xfunctional(125): error C2784: 'bool std::operator <(const std::reverse_iterator<_RanIt> &,const std::reverse_iterator<_RanIt2> &)' : could not deduce template argument for 'const std::reverse_iterator<_RanIt> &' from 'const std::string' 1> c:\program files\microsoft visual studio 10.0\vc\include\xutility(1356) : see declaration of 'std::operator <' 1>c:\program files\microsoft visual studio 10.0\vc\include\xfunctional(125): error C2784: 'bool std::operator <(const std::reverse_iterator<_RanIt> &,const std::reverse_iterator<_RanIt2> &)' : could not deduce template argument for 'const std::reverse_iterator<_RanIt> &' from 'const std::string' 1> c:\program files\microsoft visual studio 10.0\vc\include\xutility(1356) : see declaration of 'std::operator <' 1>c:\program files\microsoft visual studio 10.0\vc\include\xfunctional(125): error C2784: 'bool std::operator <(const std::reverse_iterator<_RanIt> &,const std::reverse_iterator<_RanIt2> &)' : could not deduce template argument for 'const std::reverse_iterator<_RanIt> &' from 'const std::string' 1> c:\program files\microsoft visual studio 10.0\vc\include\xutility(1356) : see declaration of 'std::operator <' 1>c:\program files\microsoft visual studio 10.0\vc\include\xfunctional(125): error C2784: 'bool std::operator <(const std::_Revranit<_RanIt,_Base> &,const std::_Revranit<_RanIt2,_Base2> &)' : could not deduce template argument for 'const std::_Revranit<_RanIt,_Base> &' from 'const std::string' 1> c:\program files\microsoft visual studio 10.0\vc\include\xutility(1179) : see declaration of 'std::operator <' 1>c:\program files\microsoft visual studio 10.0\vc\include\xfunctional(125): error C2784: 'bool std::operator <(const std::_Revranit<_RanIt,_Base> &,const std::_Revranit<_RanIt2,_Base2> &)' : could not deduce template argument for 'const std::_Revranit<_RanIt,_Base> &' from 'const std::string' 1> c:\program files\microsoft visual studio 10.0\vc\include\xutility(1179) : see declaration of 'std::operator <' 1>c:\program files\microsoft visual studio 10.0\vc\include\xfunctional(125): error C2784: 'bool std::operator <(const std::_Revranit<_RanIt,_Base> &,const std::_Revranit<_RanIt2,_Base2> &)' : could not deduce template argument for 'const std::_Revranit<_RanIt,_Base> &' from 'const std::string' 1> c:\program files\microsoft visual studio 10.0\vc\include\xutility(1179) : see declaration of 'std::operator <' 1>c:\program files\microsoft visual studio 10.0\vc\include\xfunctional(125): error C2784: 'bool std::operator <(const std::pair<_Ty1,_Ty2> &,const std::pair<_Ty1,_Ty2> &)' : could not deduce template argument for 'const std::pair<_Ty1,_Ty2> &' from 'const std::string' 1> c:\program files\microsoft visual studio 10.0\vc\include\utility(318) : see declaration of 'std::operator <' 1>c:\program files\microsoft visual studio 10.0\vc\include\xfunctional(125): error C2784: 'bool std::operator <(const std::pair<_Ty1,_Ty2> &,const std::pair<_Ty1,_Ty2> &)' : could not deduce template argument for 'const std::pair<_Ty1,_Ty2> &' from 'const std::string' 1> c:\program files\microsoft visual studio 10.0\vc\include\utility(318) : see declaration of 'std::operator <' 1>c:\program files\microsoft visual studio 10.0\vc\include\xfunctional(125): error C2784: 'bool std::operator <(const std::pair<_Ty1,_Ty2> &,const std::pair<_Ty1,_Ty2> &)' : could not deduce template argument for 'const std::pair<_Ty1,_Ty2> &' from 'const std::string' 1> c:\program files\microsoft visual studio 10.0\vc\include\utility(318) : see declaration of 'std::operator <' 1>c:\program files\microsoft visual studio 10.0\vc\include\xfunctional(125): error C2676: binary '<' : 'const std::string' does not define this operator or a conversion to a type acceptable to the predefined operator ========== Build: 0 succeeded, 1 failed, 0 up-to-date, 0 skipped ==========

    Read the article

  • How John Got 15x Improvement Without Really Trying

    - by rchrd
    The following article was published on a Sun Microsystems website a number of years ago by John Feo. It is still useful and worth preserving. So I'm republishing it here.  How I Got 15x Improvement Without Really Trying John Feo, Sun Microsystems Taking ten "personal" program codes used in scientific and engineering research, the author was able to get from 2 to 15 times performance improvement easily by applying some simple general optimization techniques. Introduction Scientific research based on computer simulation depends on the simulation for advancement. The research can advance only as fast as the computational codes can execute. The codes' efficiency determines both the rate and quality of results. In the same amount of time, a faster program can generate more results and can carry out a more detailed simulation of physical phenomena than a slower program. Highly optimized programs help science advance quickly and insure that monies supporting scientific research are used as effectively as possible. Scientific computer codes divide into three broad categories: ISV, community, and personal. ISV codes are large, mature production codes developed and sold commercially. The codes improve slowly over time both in methods and capabilities, and they are well tuned for most vendor platforms. Since the codes are mature and complex, there are few opportunities to improve their performance solely through code optimization. Improvements of 10% to 15% are typical. Examples of ISV codes are DYNA3D, Gaussian, and Nastran. Community codes are non-commercial production codes used by a particular research field. Generally, they are developed and distributed by a single academic or research institution with assistance from the community. Most users just run the codes, but some develop new methods and extensions that feed back into the general release. The codes are available on most vendor platforms. Since these codes are younger than ISV codes, there are more opportunities to optimize the source code. Improvements of 50% are not unusual. Examples of community codes are AMBER, CHARM, BLAST, and FASTA. Personal codes are those written by single users or small research groups for their own use. These codes are not distributed, but may be passed from professor-to-student or student-to-student over several years. They form the primordial ocean of applications from which community and ISV codes emerge. Government research grants pay for the development of most personal codes. This paper reports on the nature and performance of this class of codes. Over the last year, I have looked at over two dozen personal codes from more than a dozen research institutions. The codes cover a variety of scientific fields, including astronomy, atmospheric sciences, bioinformatics, biology, chemistry, geology, and physics. The sources range from a few hundred lines to more than ten thousand lines, and are written in Fortran, Fortran 90, C, and C++. For the most part, the codes are modular, documented, and written in a clear, straightforward manner. They do not use complex language features, advanced data structures, programming tricks, or libraries. I had little trouble understanding what the codes did or how data structures were used. Most came with a makefile. Surprisingly, only one of the applications is parallel. All developers have access to parallel machines, so availability is not an issue. Several tried to parallelize their applications, but stopped after encountering difficulties. Lack of education and a perception that parallelism is difficult prevented most from trying. I parallelized several of the codes using OpenMP, and did not judge any of the codes as difficult to parallelize. Even more surprising than the lack of parallelism is the inefficiency of the codes. I was able to get large improvements in performance in a matter of a few days applying simple optimization techniques. Table 1 lists ten representative codes [names and affiliation are omitted to preserve anonymity]. Improvements on one processor range from 2x to 15.5x with a simple average of 4.75x. I did not use sophisticated performance tools or drill deep into the program's execution character as one would do when tuning ISV or community codes. Using only a profiler and source line timers, I identified inefficient sections of code and improved their performance by inspection. The changes were at a high level. I am sure there is another factor of 2 or 3 in each code, and more if the codes are parallelized. The study’s results show that personal scientific codes are running many times slower than they should and that the problem is pervasive. Computational scientists are not sloppy programmers; however, few are trained in the art of computer programming or code optimization. I found that most have a working knowledge of some programming language and standard software engineering practices; but they do not know, or think about, how to make their programs run faster. They simply do not know the standard techniques used to make codes run faster. In fact, they do not even perceive that such techniques exist. The case studies described in this paper show that applying simple, well known techniques can significantly increase the performance of personal codes. It is important that the scientific community and the Government agencies that support scientific research find ways to better educate academic scientific programmers. The inefficiency of their codes is so bad that it is retarding both the quality and progress of scientific research. # cacheperformance redundantoperations loopstructures performanceimprovement 1 x x 15.5 2 x 2.8 3 x x 2.5 4 x 2.1 5 x x 2.0 6 x 5.0 7 x 5.8 8 x 6.3 9 2.2 10 x x 3.3 Table 1 — Area of improvement and performance gains of 10 codes The remainder of the paper is organized as follows: sections 2, 3, and 4 discuss the three most common sources of inefficiencies in the codes studied. These are cache performance, redundant operations, and loop structures. Each section includes several examples. The last section summaries the work and suggests a possible solution to the issues raised. Optimizing cache performance Commodity microprocessor systems use caches to increase memory bandwidth and reduce memory latencies. Typical latencies from processor to L1, L2, local, and remote memory are 3, 10, 50, and 200 cycles, respectively. Moreover, bandwidth falls off dramatically as memory distances increase. Programs that do not use cache effectively run many times slower than programs that do. When optimizing for cache, the biggest performance gains are achieved by accessing data in cache order and reusing data to amortize the overhead of cache misses. Secondary considerations are prefetching, associativity, and replacement; however, the understanding and analysis required to optimize for the latter are probably beyond the capabilities of the non-expert. Much can be gained simply by accessing data in the correct order and maximizing data reuse. 6 out of the 10 codes studied here benefited from such high level optimizations. Array Accesses The most important cache optimization is the most basic: accessing Fortran array elements in column order and C array elements in row order. Four of the ten codes—1, 2, 4, and 10—got it wrong. Compilers will restructure nested loops to optimize cache performance, but may not do so if the loop structure is too complex, or the loop body includes conditionals, complex addressing, or function calls. In code 1, the compiler failed to invert a key loop because of complex addressing do I = 0, 1010, delta_x IM = I - delta_x IP = I + delta_x do J = 5, 995, delta_x JM = J - delta_x JP = J + delta_x T1 = CA1(IP, J) + CA1(I, JP) T2 = CA1(IM, J) + CA1(I, JM) S1 = T1 + T2 - 4 * CA1(I, J) CA(I, J) = CA1(I, J) + D * S1 end do end do In code 2, the culprit is conditionals do I = 1, N do J = 1, N If (IFLAG(I,J) .EQ. 0) then T1 = Value(I, J-1) T2 = Value(I-1, J) T3 = Value(I, J) T4 = Value(I+1, J) T5 = Value(I, J+1) Value(I,J) = 0.25 * (T1 + T2 + T5 + T4) Delta = ABS(T3 - Value(I,J)) If (Delta .GT. MaxDelta) MaxDelta = Delta endif enddo enddo I fixed both programs by inverting the loops by hand. Code 10 has three-dimensional arrays and triply nested loops. The structure of the most computationally intensive loops is too complex to invert automatically or by hand. The only practical solution is to transpose the arrays so that the dimension accessed by the innermost loop is in cache order. The arrays can be transposed at construction or prior to entering a computationally intensive section of code. The former requires all array references to be modified, while the latter is cost effective only if the cost of the transpose is amortized over many accesses. I used the second approach to optimize code 10. Code 5 has four-dimensional arrays and loops are nested four deep. For all of the reasons cited above the compiler is not able to restructure three key loops. Assume C arrays and let the four dimensions of the arrays be i, j, k, and l. In the original code, the index structure of the three loops is L1: for i L2: for i L3: for i for l for l for j for k for j for k for j for k for l So only L3 accesses array elements in cache order. L1 is a very complex loop—much too complex to invert. I brought the loop into cache alignment by transposing the second and fourth dimensions of the arrays. Since the code uses a macro to compute all array indexes, I effected the transpose at construction and changed the macro appropriately. The dimensions of the new arrays are now: i, l, k, and j. L3 is a simple loop and easily inverted. L2 has a loop-carried scalar dependence in k. By promoting the scalar name that carries the dependence to an array, I was able to invert the third and fourth subloops aligning the loop with cache. Code 5 is by far the most difficult of the four codes to optimize for array accesses; but the knowledge required to fix the problems is no more than that required for the other codes. I would judge this code at the limits of, but not beyond, the capabilities of appropriately trained computational scientists. Array Strides When a cache miss occurs, a line (64 bytes) rather than just one word is loaded into the cache. If data is accessed stride 1, than the cost of the miss is amortized over 8 words. Any stride other than one reduces the cost savings. Two of the ten codes studied suffered from non-unit strides. The codes represent two important classes of "strided" codes. Code 1 employs a multi-grid algorithm to reduce time to convergence. The grids are every tenth, fifth, second, and unit element. Since time to convergence is inversely proportional to the distance between elements, coarse grids converge quickly providing good starting values for finer grids. The better starting values further reduce the time to convergence. The downside is that grids of every nth element, n > 1, introduce non-unit strides into the computation. In the original code, much of the savings of the multi-grid algorithm were lost due to this problem. I eliminated the problem by compressing (copying) coarse grids into continuous memory, and rewriting the computation as a function of the compressed grid. On convergence, I copied the final values of the compressed grid back to the original grid. The savings gained from unit stride access of the compressed grid more than paid for the cost of copying. Using compressed grids, the loop from code 1 included in the previous section becomes do j = 1, GZ do i = 1, GZ T1 = CA(i+0, j-1) + CA(i-1, j+0) T4 = CA1(i+1, j+0) + CA1(i+0, j+1) S1 = T1 + T4 - 4 * CA1(i+0, j+0) CA(i+0, j+0) = CA1(i+0, j+0) + DD * S1 enddo enddo where CA and CA1 are compressed arrays of size GZ. Code 7 traverses a list of objects selecting objects for later processing. The labels of the selected objects are stored in an array. The selection step has unit stride, but the processing steps have irregular stride. A fix is to save the parameters of the selected objects in temporary arrays as they are selected, and pass the temporary arrays to the processing functions. The fix is practical if the same parameters are used in selection as in processing, or if processing comprises a series of distinct steps which use overlapping subsets of the parameters. Both conditions are true for code 7, so I achieved significant improvement by copying parameters to temporary arrays during selection. Data reuse In the previous sections, we optimized for spatial locality. It is also important to optimize for temporal locality. Once read, a datum should be used as much as possible before it is forced from cache. Loop fusion and loop unrolling are two techniques that increase temporal locality. Unfortunately, both techniques increase register pressure—as loop bodies become larger, the number of registers required to hold temporary values grows. Once register spilling occurs, any gains evaporate quickly. For multiprocessors with small register sets or small caches, the sweet spot can be very small. In the ten codes presented here, I found no opportunities for loop fusion and only two opportunities for loop unrolling (codes 1 and 3). In code 1, unrolling the outer and inner loop one iteration increases the number of result values computed by the loop body from 1 to 4, do J = 1, GZ-2, 2 do I = 1, GZ-2, 2 T1 = CA1(i+0, j-1) + CA1(i-1, j+0) T2 = CA1(i+1, j-1) + CA1(i+0, j+0) T3 = CA1(i+0, j+0) + CA1(i-1, j+1) T4 = CA1(i+1, j+0) + CA1(i+0, j+1) T5 = CA1(i+2, j+0) + CA1(i+1, j+1) T6 = CA1(i+1, j+1) + CA1(i+0, j+2) T7 = CA1(i+2, j+1) + CA1(i+1, j+2) S1 = T1 + T4 - 4 * CA1(i+0, j+0) S2 = T2 + T5 - 4 * CA1(i+1, j+0) S3 = T3 + T6 - 4 * CA1(i+0, j+1) S4 = T4 + T7 - 4 * CA1(i+1, j+1) CA(i+0, j+0) = CA1(i+0, j+0) + DD * S1 CA(i+1, j+0) = CA1(i+1, j+0) + DD * S2 CA(i+0, j+1) = CA1(i+0, j+1) + DD * S3 CA(i+1, j+1) = CA1(i+1, j+1) + DD * S4 enddo enddo The loop body executes 12 reads, whereas as the rolled loop shown in the previous section executes 20 reads to compute the same four values. In code 3, two loops are unrolled 8 times and one loop is unrolled 4 times. Here is the before for (k = 0; k < NK[u]; k++) { sum = 0.0; for (y = 0; y < NY; y++) { sum += W[y][u][k] * delta[y]; } backprop[i++]=sum; } and after code for (k = 0; k < KK - 8; k+=8) { sum0 = 0.0; sum1 = 0.0; sum2 = 0.0; sum3 = 0.0; sum4 = 0.0; sum5 = 0.0; sum6 = 0.0; sum7 = 0.0; for (y = 0; y < NY; y++) { sum0 += W[y][0][k+0] * delta[y]; sum1 += W[y][0][k+1] * delta[y]; sum2 += W[y][0][k+2] * delta[y]; sum3 += W[y][0][k+3] * delta[y]; sum4 += W[y][0][k+4] * delta[y]; sum5 += W[y][0][k+5] * delta[y]; sum6 += W[y][0][k+6] * delta[y]; sum7 += W[y][0][k+7] * delta[y]; } backprop[k+0] = sum0; backprop[k+1] = sum1; backprop[k+2] = sum2; backprop[k+3] = sum3; backprop[k+4] = sum4; backprop[k+5] = sum5; backprop[k+6] = sum6; backprop[k+7] = sum7; } for one of the loops unrolled 8 times. Optimizing for temporal locality is the most difficult optimization considered in this paper. The concepts are not difficult, but the sweet spot is small. Identifying where the program can benefit from loop unrolling or loop fusion is not trivial. Moreover, it takes some effort to get it right. Still, educating scientific programmers about temporal locality and teaching them how to optimize for it will pay dividends. Reducing instruction count Execution time is a function of instruction count. Reduce the count and you usually reduce the time. The best solution is to use a more efficient algorithm; that is, an algorithm whose order of complexity is smaller, that converges quicker, or is more accurate. Optimizing source code without changing the algorithm yields smaller, but still significant, gains. This paper considers only the latter because the intent is to study how much better codes can run if written by programmers schooled in basic code optimization techniques. The ten codes studied benefited from three types of "instruction reducing" optimizations. The two most prevalent were hoisting invariant memory and data operations out of inner loops. The third was eliminating unnecessary data copying. The nature of these inefficiencies is language dependent. Memory operations The semantics of C make it difficult for the compiler to determine all the invariant memory operations in a loop. The problem is particularly acute for loops in functions since the compiler may not know the values of the function's parameters at every call site when compiling the function. Most compilers support pragmas to help resolve ambiguities; however, these pragmas are not comprehensive and there is no standard syntax. To guarantee that invariant memory operations are not executed repetitively, the user has little choice but to hoist the operations by hand. The problem is not as severe in Fortran programs because in the absence of equivalence statements, it is a violation of the language's semantics for two names to share memory. Codes 3 and 5 are C programs. In both cases, the compiler did not hoist all invariant memory operations from inner loops. Consider the following loop from code 3 for (y = 0; y < NY; y++) { i = 0; for (u = 0; u < NU; u++) { for (k = 0; k < NK[u]; k++) { dW[y][u][k] += delta[y] * I1[i++]; } } } Since dW[y][u] can point to the same memory space as delta for one or more values of y and u, assignment to dW[y][u][k] may change the value of delta[y]. In reality, dW and delta do not overlap in memory, so I rewrote the loop as for (y = 0; y < NY; y++) { i = 0; Dy = delta[y]; for (u = 0; u < NU; u++) { for (k = 0; k < NK[u]; k++) { dW[y][u][k] += Dy * I1[i++]; } } } Failure to hoist invariant memory operations may be due to complex address calculations. If the compiler can not determine that the address calculation is invariant, then it can hoist neither the calculation nor the associated memory operations. As noted above, code 5 uses a macro to address four-dimensional arrays #define MAT4D(a,q,i,j,k) (double *)((a)->data + (q)*(a)->strides[0] + (i)*(a)->strides[3] + (j)*(a)->strides[2] + (k)*(a)->strides[1]) The macro is too complex for the compiler to understand and so, it does not identify any subexpressions as loop invariant. The simplest way to eliminate the address calculation from the innermost loop (over i) is to define a0 = MAT4D(a,q,0,j,k) before the loop and then replace all instances of *MAT4D(a,q,i,j,k) in the loop with a0[i] A similar problem appears in code 6, a Fortran program. The key loop in this program is do n1 = 1, nh nx1 = (n1 - 1) / nz + 1 nz1 = n1 - nz * (nx1 - 1) do n2 = 1, nh nx2 = (n2 - 1) / nz + 1 nz2 = n2 - nz * (nx2 - 1) ndx = nx2 - nx1 ndy = nz2 - nz1 gxx = grn(1,ndx,ndy) gyy = grn(2,ndx,ndy) gxy = grn(3,ndx,ndy) balance(n1,1) = balance(n1,1) + (force(n2,1) * gxx + force(n2,2) * gxy) * h1 balance(n1,2) = balance(n1,2) + (force(n2,1) * gxy + force(n2,2) * gyy)*h1 end do end do The programmer has written this loop well—there are no loop invariant operations with respect to n1 and n2. However, the loop resides within an iterative loop over time and the index calculations are independent with respect to time. Trading space for time, I precomputed the index values prior to the entering the time loop and stored the values in two arrays. I then replaced the index calculations with reads of the arrays. Data operations Ways to reduce data operations can appear in many forms. Implementing a more efficient algorithm produces the biggest gains. The closest I came to an algorithm change was in code 4. This code computes the inner product of K-vectors A(i) and B(j), 0 = i < N, 0 = j < M, for most values of i and j. Since the program computes most of the NM possible inner products, it is more efficient to compute all the inner products in one triply-nested loop rather than one at a time when needed. The savings accrue from reading A(i) once for all B(j) vectors and from loop unrolling. for (i = 0; i < N; i+=8) { for (j = 0; j < M; j++) { sum0 = 0.0; sum1 = 0.0; sum2 = 0.0; sum3 = 0.0; sum4 = 0.0; sum5 = 0.0; sum6 = 0.0; sum7 = 0.0; for (k = 0; k < K; k++) { sum0 += A[i+0][k] * B[j][k]; sum1 += A[i+1][k] * B[j][k]; sum2 += A[i+2][k] * B[j][k]; sum3 += A[i+3][k] * B[j][k]; sum4 += A[i+4][k] * B[j][k]; sum5 += A[i+5][k] * B[j][k]; sum6 += A[i+6][k] * B[j][k]; sum7 += A[i+7][k] * B[j][k]; } C[i+0][j] = sum0; C[i+1][j] = sum1; C[i+2][j] = sum2; C[i+3][j] = sum3; C[i+4][j] = sum4; C[i+5][j] = sum5; C[i+6][j] = sum6; C[i+7][j] = sum7; }} This change requires knowledge of a typical run; i.e., that most inner products are computed. The reasons for the change, however, derive from basic optimization concepts. It is the type of change easily made at development time by a knowledgeable programmer. In code 5, we have the data version of the index optimization in code 6. Here a very expensive computation is a function of the loop indices and so cannot be hoisted out of the loop; however, the computation is invariant with respect to an outer iterative loop over time. We can compute its value for each iteration of the computation loop prior to entering the time loop and save the values in an array. The increase in memory required to store the values is small in comparison to the large savings in time. The main loop in Code 8 is doubly nested. The inner loop includes a series of guarded computations; some are a function of the inner loop index but not the outer loop index while others are a function of the outer loop index but not the inner loop index for (j = 0; j < N; j++) { for (i = 0; i < M; i++) { r = i * hrmax; R = A[j]; temp = (PRM[3] == 0.0) ? 1.0 : pow(r, PRM[3]); high = temp * kcoeff * B[j] * PRM[2] * PRM[4]; low = high * PRM[6] * PRM[6] / (1.0 + pow(PRM[4] * PRM[6], 2.0)); kap = (R > PRM[6]) ? high * R * R / (1.0 + pow(PRM[4]*r, 2.0) : low * pow(R/PRM[6], PRM[5]); < rest of loop omitted > }} Note that the value of temp is invariant to j. Thus, we can hoist the computation for temp out of the loop and save its values in an array. for (i = 0; i < M; i++) { r = i * hrmax; TEMP[i] = pow(r, PRM[3]); } [N.B. – the case for PRM[3] = 0 is omitted and will be reintroduced later.] We now hoist out of the inner loop the computations invariant to i. Since the conditional guarding the value of kap is invariant to i, it behooves us to hoist the computation out of the inner loop, thereby executing the guard once rather than M times. The final version of the code is for (j = 0; j < N; j++) { R = rig[j] / 1000.; tmp1 = kcoeff * par[2] * beta[j] * par[4]; tmp2 = 1.0 + (par[4] * par[4] * par[6] * par[6]); tmp3 = 1.0 + (par[4] * par[4] * R * R); tmp4 = par[6] * par[6] / tmp2; tmp5 = R * R / tmp3; tmp6 = pow(R / par[6], par[5]); if ((par[3] == 0.0) && (R > par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * tmp5; } else if ((par[3] == 0.0) && (R <= par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * tmp4 * tmp6; } else if ((par[3] != 0.0) && (R > par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * TEMP[i] * tmp5; } else if ((par[3] != 0.0) && (R <= par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * TEMP[i] * tmp4 * tmp6; } for (i = 0; i < M; i++) { kap = KAP[i]; r = i * hrmax; < rest of loop omitted > } } Maybe not the prettiest piece of code, but certainly much more efficient than the original loop, Copy operations Several programs unnecessarily copy data from one data structure to another. This problem occurs in both Fortran and C programs, although it manifests itself differently in the two languages. Code 1 declares two arrays—one for old values and one for new values. At the end of each iteration, the array of new values is copied to the array of old values to reset the data structures for the next iteration. This problem occurs in Fortran programs not included in this study and in both Fortran 77 and Fortran 90 code. Introducing pointers to the arrays and swapping pointer values is an obvious way to eliminate the copying; but pointers is not a feature that many Fortran programmers know well or are comfortable using. An easy solution not involving pointers is to extend the dimension of the value array by 1 and use the last dimension to differentiate between arrays at different times. For example, if the data space is N x N, declare the array (N, N, 2). Then store the problem’s initial values in (_, _, 2) and define the scalar names new = 2 and old = 1. At the start of each iteration, swap old and new to reset the arrays. The old–new copy problem did not appear in any C program. In programs that had new and old values, the code swapped pointers to reset data structures. Where unnecessary coping did occur is in structure assignment and parameter passing. Structures in C are handled much like scalars. Assignment causes the data space of the right-hand name to be copied to the data space of the left-hand name. Similarly, when a structure is passed to a function, the data space of the actual parameter is copied to the data space of the formal parameter. If the structure is large and the assignment or function call is in an inner loop, then copying costs can grow quite large. While none of the ten programs considered here manifested this problem, it did occur in programs not included in the study. A simple fix is always to refer to structures via pointers. Optimizing loop structures Since scientific programs spend almost all their time in loops, efficient loops are the key to good performance. Conditionals, function calls, little instruction level parallelism, and large numbers of temporary values make it difficult for the compiler to generate tightly packed, highly efficient code. Conditionals and function calls introduce jumps that disrupt code flow. Users should eliminate or isolate conditionls to their own loops as much as possible. Often logical expressions can be substituted for if-then-else statements. For example, code 2 includes the following snippet MaxDelta = 0.0 do J = 1, N do I = 1, M < code omitted > Delta = abs(OldValue ? NewValue) if (Delta > MaxDelta) MaxDelta = Delta enddo enddo if (MaxDelta .gt. 0.001) goto 200 Since the only use of MaxDelta is to control the jump to 200 and all that matters is whether or not it is greater than 0.001, I made MaxDelta a boolean and rewrote the snippet as MaxDelta = .false. do J = 1, N do I = 1, M < code omitted > Delta = abs(OldValue ? NewValue) MaxDelta = MaxDelta .or. (Delta .gt. 0.001) enddo enddo if (MaxDelta) goto 200 thereby, eliminating the conditional expression from the inner loop. A microprocessor can execute many instructions per instruction cycle. Typically, it can execute one or more memory, floating point, integer, and jump operations. To be executed simultaneously, the operations must be independent. Thick loops tend to have more instruction level parallelism than thin loops. Moreover, they reduce memory traffice by maximizing data reuse. Loop unrolling and loop fusion are two techniques to increase the size of loop bodies. Several of the codes studied benefitted from loop unrolling, but none benefitted from loop fusion. This observation is not too surpising since it is the general tendency of programmers to write thick loops. As loops become thicker, the number of temporary values grows, increasing register pressure. If registers spill, then memory traffic increases and code flow is disrupted. A thick loop with many temporary values may execute slower than an equivalent series of thin loops. The biggest gain will be achieved if the thick loop can be split into a series of independent loops eliminating the need to write and read temporary arrays. I found such an occasion in code 10 where I split the loop do i = 1, n do j = 1, m A24(j,i)= S24(j,i) * T24(j,i) + S25(j,i) * U25(j,i) B24(j,i)= S24(j,i) * T25(j,i) + S25(j,i) * U24(j,i) A25(j,i)= S24(j,i) * C24(j,i) + S25(j,i) * V24(j,i) B25(j,i)= S24(j,i) * U25(j,i) + S25(j,i) * V25(j,i) C24(j,i)= S26(j,i) * T26(j,i) + S27(j,i) * U26(j,i) D24(j,i)= S26(j,i) * T27(j,i) + S27(j,i) * V26(j,i) C25(j,i)= S27(j,i) * S28(j,i) + S26(j,i) * U28(j,i) D25(j,i)= S27(j,i) * T28(j,i) + S26(j,i) * V28(j,i) end do end do into two disjoint loops do i = 1, n do j = 1, m A24(j,i)= S24(j,i) * T24(j,i) + S25(j,i) * U25(j,i) B24(j,i)= S24(j,i) * T25(j,i) + S25(j,i) * U24(j,i) A25(j,i)= S24(j,i) * C24(j,i) + S25(j,i) * V24(j,i) B25(j,i)= S24(j,i) * U25(j,i) + S25(j,i) * V25(j,i) end do end do do i = 1, n do j = 1, m C24(j,i)= S26(j,i) * T26(j,i) + S27(j,i) * U26(j,i) D24(j,i)= S26(j,i) * T27(j,i) + S27(j,i) * V26(j,i) C25(j,i)= S27(j,i) * S28(j,i) + S26(j,i) * U28(j,i) D25(j,i)= S27(j,i) * T28(j,i) + S26(j,i) * V28(j,i) end do end do Conclusions Over the course of the last year, I have had the opportunity to work with over two dozen academic scientific programmers at leading research universities. Their research interests span a broad range of scientific fields. Except for two programs that relied almost exclusively on library routines (matrix multiply and fast Fourier transform), I was able to improve significantly the single processor performance of all codes. Improvements range from 2x to 15.5x with a simple average of 4.75x. Changes to the source code were at a very high level. I did not use sophisticated techniques or programming tools to discover inefficiencies or effect the changes. Only one code was parallel despite the availability of parallel systems to all developers. Clearly, we have a problem—personal scientific research codes are highly inefficient and not running parallel. The developers are unaware of simple optimization techniques to make programs run faster. They lack education in the art of code optimization and parallel programming. I do not believe we can fix the problem by publishing additional books or training manuals. To date, the developers in questions have not studied the books or manual available, and are unlikely to do so in the future. Short courses are a possible solution, but I believe they are too concentrated to be much use. The general concepts can be taught in a three or four day course, but that is not enough time for students to practice what they learn and acquire the experience to apply and extend the concepts to their codes. Practice is the key to becoming proficient at optimization. I recommend that graduate students be required to take a semester length course in optimization and parallel programming. We would never give someone access to state-of-the-art scientific equipment costing hundreds of thousands of dollars without first requiring them to demonstrate that they know how to use the equipment. Yet the criterion for time on state-of-the-art supercomputers is at most an interesting project. Requestors are never asked to demonstrate that they know how to use the system, or can use the system effectively. A semester course would teach them the required skills. Government agencies that fund academic scientific research pay for most of the computer systems supporting scientific research as well as the development of most personal scientific codes. These agencies should require graduate schools to offer a course in optimization and parallel programming as a requirement for funding. About the Author John Feo received his Ph.D. in Computer Science from The University of Texas at Austin in 1986. After graduate school, Dr. Feo worked at Lawrence Livermore National Laboratory where he was the Group Leader of the Computer Research Group and principal investigator of the Sisal Language Project. In 1997, Dr. Feo joined Tera Computer Company where he was project manager for the MTA, and oversaw the programming and evaluation of the MTA at the San Diego Supercomputer Center. In 2000, Dr. Feo joined Sun Microsystems as an HPC application specialist. He works with university research groups to optimize and parallelize scientific codes. Dr. Feo has published over two dozen research articles in the areas of parallel parallel programming, parallel programming languages, and application performance.

    Read the article

  • Bukkit saving inventory

    - by HcgRandon
    Alright i will make this quick... I am working on a command in my plugin to allow you to transfer worlds and i am trying to save inventory but i am getting a problem here is the code: public void savePlayerInv(Player p, World w){ File playerInvConfigFile = new File(plugin.getDataFolder() + File.separator + "players" + File.separator + p.getName(), "inventory.yml"); FileConfiguration pInv = YamlConfiguration.loadConfiguration(playerInvConfigFile); PlayerInventory inv = p.getInventory(); int i = 0; for (ItemStack stack : inv.getContents()) { //increment integer i++; String startInventory = w.getName() + ".inv." + Integer.toString(i); //save inv pInv.set(startInventory + ".amount", stack.getAmount()); pInv.set(startInventory + ".durability", Short.toString(stack.getDurability())); pInv.set(startInventory + ".type", stack.getTypeId()); //pInv.set(startInventory + ".enchantment", stack.getEnchantments()); //TODO add enchant saveing } i = 0; for (ItemStack armor : inv.getArmorContents()){ i++; String startArmor = w.getName() + ".armor." + Integer.toString(i); //save armor pInv.set(startArmor + ".amount", armor.getAmount()); pInv.set(startArmor + ".durability", armor.getDurability()); pInv.set(startArmor + ".type", armor.getTypeId()); //pInv.set(startArmor + ".enchantment", armor.getEnchantments()); } //save exp if (p.getExp() != 0) { pInv.set(w.getName() + ".exp", p.getExp()); } } Now here is the stack trace i recive it is commplaing about line 130 which is this line pInv.set(startInventory + ".amount", stack.getAmount()); okay now trace 2012-03-21 13:23:25 [SEVERE] null org.bukkit.command.CommandException: Unhandled exception executing command 'wtp' in plugin Needs v1.0 at org.bukkit.command.PluginCommand.execute(PluginCommand.java:42) at org.bukkit.command.SimpleCommandMap.dispatch(SimpleCommandMap.java:166) at org.bukkit.craftbukkit.CraftServer.dispatchCommand(CraftServer.java:461) at net.minecraft.server.NetServerHandler.handleCommand(NetServerHandler.java:818) at net.minecraft.server.NetServerHandler.chat(NetServerHandler.java:778) at net.minecraft.server.NetServerHandler.a(NetServerHandler.java:761) at net.minecraft.server.Packet3Chat.handle(Packet3Chat.java:33) at net.minecraft.server.NetworkManager.b(NetworkManager.java:229) at net.minecraft.server.NetServerHandler.a(NetServerHandler.java:112) at net.minecraft.server.NetworkListenThread.a(NetworkListenThread.java:78) at net.minecraft.server.MinecraftServer.w(MinecraftServer.java:554) at net.minecraft.server.MinecraftServer.run(MinecraftServer.java:452) at net.minecraft.server.ThreadServerApplication.run(SourceFile:490) Caused by: java.lang.NullPointerException at com.devoverflow.improved.needs.commands.CommandWorldtp.savePlayerInv(CommandWorldtp.java:130) at com.devoverflow.improved.needs.commands.CommandWorldtp.onCommand(CommandWorldtp.java:60) at org.bukkit.command.PluginCommand.execute(PluginCommand.java:40) ... 12 more

    Read the article

  • WPF MVVM UserControl Binding "Container", dispose to avoid memory leak.

    - by user178657
    For simplicity. I have a window, with a bindable usercontrol <UserControl Content="{Binding Path = BindingControl, UpdateSourceTrigger=PropertyChanged}"> I have two user controls, ControlA, and ControlB, Both UserControls have their own Datacontext ViewModel, ControlAViewModel and ControlBViewModel. Both ControlAViewModel and ControlBViewModel inh. from a "ViewModelBase" public abstract class ViewModelBase : DependencyObject, INotifyPropertyChanged, IDisposable........ Main window was added to IoC... To set the property of the Bindable UC, i do ComponentRepository.Resolve<MainViewWindow>().Bindingcontrol= new ControlA; ControlA, in its Datacontext, creates a DispatcherTimer, to do "somestuff".. Later on., I need to navigate elsewhere, so the other usercontrol is loaded into the container ComponentRepository.Resolve<MainViewWindow>().Bindingcontrol= new ControlB If i put a break point in the "someStuff" that was in ControlA's datacontext. The DispatcherTimer is still running... i.e. loading a new usercontrol into the bindable Usercontrol on mainwindow does not dispose/close/GC the DispatcherTimer that was created in the DataContext View Model... Ive looked around, and as stated by others, dispose doesnt get called because its not supposed to... :) Not all my usercontrols have DispatcherTimer, just a few that need to do some sort of "read and refresh" updates./. Should i track these DispatcherTimer objects in the ViewModelBase that all Usercontrols inh. and manually stop/dispose them everytime a new usercontrol is loaded? Is there a better way?

    Read the article

  • DexFile.class error in eclipse

    - by ninjasense
    I get this weird error everytime I debug in eclipse. It just seemed to appear one day and I was wondering if anyone else was running int the same problem. It does not affect my app in anyway visibly and does not cause a crash but it is an annoyance while debugging. Here is the full error: // Compiled from DexFile.java (version 1.5 : 49.0, super bit) public final class dalvik.system.DexFile { // Method descriptor #8 (Ljava/io/File;)V // Stack: 3, Locals: 2 public DexFile(java.io.File file) throws java.io.IOException; 0 aload_0 [this] 1 invokespecial java.lang.Object() [1] 4 new java.lang.RuntimeException [2] 7 dup 8 ldc <String "Stub!"> [3] 10 invokespecial java.lang.RuntimeException(java.lang.String) [4] 13 athrow Line numbers: [pc: 0, line: 4] Local variable table: [pc: 0, pc: 14] local: this index: 0 type: dalvik.system.DexFile [pc: 0, pc: 14] local: file index: 1 type: java.io.File // Method descriptor #18 (Ljava/lang/String;)V // Stack: 3, Locals: 2 public DexFile(java.lang.String fileName) throws java.io.IOException; 0 aload_0 [this] 1 invokespecial java.lang.Object() [1] 4 new java.lang.RuntimeException [2] 7 dup 8 ldc <String "Stub!"> [3] 10 invokespecial java.lang.RuntimeException(java.lang.String) [4] 13 athrow Line numbers: [pc: 0, line: 5] Local variable table: [pc: 0, pc: 14] local: this index: 0 type: dalvik.system.DexFile [pc: 0, pc: 14] local: fileName index: 1 type: java.lang.String // Method descriptor #22 (Ljava/lang/String;Ljava/lang/String;I)Ldalvik/system/DexFile; // Stack: 3, Locals: 3 public static dalvik.system.DexFile loadDex(java.lang.String sourcePathName, java.lang.String outputPathName, int flags) throws java.io.IOException; 0 new java.lang.RuntimeException [2] 3 dup 4 ldc <String "Stub!"> [3] 6 invokespecial java.lang.RuntimeException(java.lang.String) [4] 9 athrow Line numbers: [pc: 0, line: 6] Local variable table: [pc: 0, pc: 10] local: sourcePathName index: 0 type: java.lang.String [pc: 0, pc: 10] local: outputPathName index: 1 type: java.lang.String [pc: 0, pc: 10] local: flags index: 2 type: int // Method descriptor #28 ()Ljava/lang/String; // Stack: 3, Locals: 1 public java.lang.String getName(); 0 new java.lang.RuntimeException [2] 3 dup 4 ldc <String "Stub!"> [3] 6 invokespecial java.lang.RuntimeException(java.lang.String) [4] 9 athrow Line numbers: [pc: 0, line: 7] Local variable table: [pc: 0, pc: 10] local: this index: 0 type: dalvik.system.DexFile // Method descriptor #30 ()V // Stack: 3, Locals: 1 public void close() throws java.io.IOException; 0 new java.lang.RuntimeException [2] 3 dup 4 ldc <String "Stub!"> [3] 6 invokespecial java.lang.RuntimeException(java.lang.String) [4] 9 athrow Line numbers: [pc: 0, line: 8] Local variable table: [pc: 0, pc: 10] local: this index: 0 type: dalvik.system.DexFile // Method descriptor #32 (Ljava/lang/String;Ljava/lang/ClassLoader;)Ljava/lang/Class; // Stack: 3, Locals: 3 public java.lang.Class loadClass(java.lang.String name, java.lang.ClassLoader loader); 0 new java.lang.RuntimeException [2] 3 dup 4 ldc <String "Stub!"> [3] 6 invokespecial java.lang.RuntimeException(java.lang.String) [4] 9 athrow Line numbers: [pc: 0, line: 9] Local variable table: [pc: 0, pc: 10] local: this index: 0 type: dalvik.system.DexFile [pc: 0, pc: 10] local: name index: 1 type: java.lang.String [pc: 0, pc: 10] local: loader index: 2 type: java.lang.ClassLoader // Method descriptor #37 ()Ljava/util/Enumeration; // Signature: ()Ljava/util/Enumeration<Ljava/lang/String;>; // Stack: 3, Locals: 1 public java.util.Enumeration entries(); 0 new java.lang.RuntimeException [2] 3 dup 4 ldc <String "Stub!"> [3] 6 invokespecial java.lang.RuntimeException(java.lang.String) [4] 9 athrow Line numbers: [pc: 0, line: 10] Local variable table: [pc: 0, pc: 10] local: this index: 0 type: dalvik.system.DexFile // Method descriptor #30 ()V // Stack: 3, Locals: 1 protected void finalize() throws java.io.IOException; 0 new java.lang.RuntimeException [2] 3 dup 4 ldc <String "Stub!"> [3] 6 invokespecial java.lang.RuntimeException(java.lang.String) [4] 9 athrow Line numbers: [pc: 0, line: 11] Local variable table: [pc: 0, pc: 10] local: this index: 0 type: dalvik.system.DexFile // Method descriptor #42 (Ljava/lang/String;)Z public static native boolean isDexOptNeeded(java.lang.String arg0) throws java.io.FileNotFoundException, java.io.IOException; } Thanks

    Read the article

  • How to write mmap input memory to O_DIRECT output file?

    - by Friedrich
    why doesn't following pseudo-code work (O_DIRECT results in EFAULT) in_fd = open("/dev/mem"); in_mmap = mmap(in_fd); out_fd = open("/tmp/file", O_DIRECT); write(out_fd, in_mmap, PAGE_SIZE); while following does (no O_DIRECT) in_fd = open("/dev/mem"); in_mmap = mmap(in_fd); out_fd = open("/tmp/file"); write(out_fd, in_mmap, PAGE_SIZE); I guess it's something with virtual kernel pages to virtual user pages, which cannot be translated in the write call? Best regards, Friedrich

    Read the article

  • Handling android player errors

    - by stack hoss
    I am anew android developer and i made ashoutcast radio player and work good but when i open app it work for afew time but suddenly stop and need to press stop and play again but i need Handling android player errors to automatic restart on errors package com.test.test; import java.io.IOException; import android.app.Notification; import android.app.NotificationManager; import android.app.PendingIntent; import android.app.Service; import android.content.Context; import android.content.Intent; import android.content.SharedPreferences; import android.media.AudioManager; import android.media.AudioManager.OnAudioFocusChangeListener; import android.media.MediaPlayer; import android.os.IBinder; import android.preference.PreferenceManager; import android.util.Log; public class StreamService extends Service { private static final String TAG = "StreamService"; MediaPlayer mp; boolean isPlaying; Intent MainActivity; SharedPreferences prefs; SharedPreferences.Editor editor; Notification n; NotificationManager notificationManager; // Change this int to some number specifically for this app int notifId = 85; private OnAudioFocusChangeListener focusChangeListener = new OnAudioFocusChangeListener() { public void onAudioFocusChange(int focusChange) { switch (focusChange) { case (AudioManager.AUDIOFOCUS_LOSS_TRANSIENT_CAN_DUCK) : // Lower the volume while ducking. mp.setVolume(0.2f, 0.2f); break; case (AudioManager.AUDIOFOCUS_LOSS_TRANSIENT) : mp.pause(); break; case (AudioManager.AUDIOFOCUS_LOSS) : mp.stop(); break; case (AudioManager.AUDIOFOCUS_GAIN) : // Return the volume to normal and resume if paused. mp.setVolume(1f, 1f); mp.start(); break; default: break; } } }; @Override public IBinder onBind(Intent arg0) { // TODO Auto-generated method stub return null; } @SuppressWarnings("deprecation") @Override public void onCreate() { super.onCreate(); Log.d(TAG, "onCreate"); // Init the SharedPreferences and Editor prefs = PreferenceManager.getDefaultSharedPreferences(getApplicationContext()); editor = prefs.edit(); // Set up the buffering notification notificationManager = (NotificationManager) getApplicationContext() .getSystemService(NOTIFICATION_SERVICE); Context context = getApplicationContext(); String notifTitle = context.getResources().getString(R.string.app_name); String notifMessage = context.getResources().getString(R.string.buffering); n = new Notification(); n.icon = R.drawable.ic_launcher; n.tickerText = "Buffering"; n.when = System.currentTimeMillis(); Intent nIntent = new Intent(context, MainActivity.class); nIntent.setFlags(Intent.FLAG_ACTIVITY_NEW_TASK | Intent.FLAG_ACTIVITY_SINGLE_TOP); PendingIntent pIntent = PendingIntent.getActivity(context, 0, nIntent, 0); n.setLatestEventInfo(context, notifTitle, notifMessage, pIntent); notificationManager.notify(notifId, n); // It's very important that you put the IP/URL of your ShoutCast stream here // Otherwise you'll get Webcom Radio String url = "http://47.182.19.93:9888/"; mp = new MediaPlayer(); mp.setAudioStreamType(AudioManager.STREAM_MUSIC); try { mp.reset(); mp.setDataSource(url); mp.prepare(); mp.start(); } catch (IllegalArgumentException e) { // TODO Auto-generated catch block e.printStackTrace(); } catch (SecurityException e) { // TODO Auto-generated catch block Log.e(TAG, "SecurityException"); } catch (IllegalStateException e) { // TODO Auto-generated catch block Log.e(TAG, "IllegalStateException"); } catch (IOException e) { // TODO Auto-generated catch block Log.e(TAG, "IOException"); } } @SuppressWarnings("deprecation") @Override public void onStart(Intent intent, int startId) { Log.d(TAG, "onStart"); mp.start(); // Set the isPlaying preference to true editor.putBoolean("isPlaying", true); editor.commit(); Context context = getApplicationContext(); String notifTitle = context.getResources().getString(R.string.app_name); String notifMessage = context.getResources().getString(R.string.now_playing); n.icon = R.drawable.ic_launcher; n.tickerText = notifMessage; n.flags = Notification.FLAG_NO_CLEAR; n.when = System.currentTimeMillis(); Intent nIntent = new Intent(context, MainActivity.class); PendingIntent pIntent = PendingIntent.getActivity(context, 0, nIntent, 0); n.setLatestEventInfo(context, notifTitle, notifMessage, pIntent); // Change 5315 to some nother number notificationManager.notify(notifId, n); AudioManager am = (AudioManager)getSystemService(Context.AUDIO_SERVICE); // Request audio focus for playback int result = am.requestAudioFocus(focusChangeListener, // Use the music stream. AudioManager.STREAM_MUSIC, // Request permanent focus. AudioManager.AUDIOFOCUS_GAIN); if (result == AudioManager.AUDIOFOCUS_REQUEST_GRANTED) { // other app had stopped playing song now , so u can do u stuff now . } } @Override public void onDestroy() { Log.d(TAG, "onDestroy"); mp.stop(); mp.release(); mp = null; editor.putBoolean("isPlaying", false); editor.commit(); notificationManager.cancel(notifId); AudioManager am = (AudioManager)getSystemService(Context.AUDIO_SERVICE); am.abandonAudioFocus(focusChangeListener); } }

    Read the article

< Previous Page | 181 182 183 184 185 186 187 188 189 190 191 192  | Next Page >