Search Results

Search found 14206 results on 569 pages for 'compressed folder'.

Page 124/569 | < Previous Page | 120 121 122 123 124 125 126 127 128 129 130 131  | Next Page >

  • Efficient way to check for changes to the contents of folders

    - by MrVimes
    I am creating an application that maintains a database of files of a certain type in a given folder (and all subfolders) Initially the program will recurse the folders and add any file it finds of that type to the database. I want the application to have the ability to re-scan the folder and add any files that were not there the last time the folders were scanned. It can't use the date created property of the file because there is a high chance of a file being added to the folders that isn't a new file. I am wondering what the most efficient way of doing this is, and if there is a way that doesn't involve checking each file is in the database already (which, if there are 5000 files would mean 5000 queries of a list 5000 items in size, or 25 million 'checks' for the sql engine to perform) I suppose a more specific question to acheive the same goal would be - is there a property of a file (in Microsoft Windows) that will reliably tell you when that file arrived in that folder.

    Read the article

  • MySQL vs. SQL Server GoDaddy, What is the difference between hosted DB and App_Data Db

    - by Nate Gates
    I'm using GoDdady for site hosting, and I'm currently using MySQL, because there are less limits on size,etc. My question is what is the difference between using a hosted GoDaddy Db such as MySQL vs. creating a SQL Server database in the the App_Data folder? My guess is security? Would it be a bad idea to use a SQL ServerDB that's located in the App_Data folder? Additional Well I am able to create a .mdf (SQL Server DB file) in the App_Data folder, but I'm really unsure if should use that or not, If I did use it it would simplify using some of the Microsoft tools. Like I said my guess is that it would be less secure, but I don't really know. I know I have a 10gb, file system limit, so I'm assuming my db would have to share that space.

    Read the article

  • Which folders need to be backed up for migration in Joomla?

    - by Devdatta Tengshe
    I'm helping someone update & migrate their old website, built on the Joomla framework. Currently it is running on Joomla 1.5.8 which is an ancient version. I've convinced them to upgrade Joomla to at least 2.5 I have already made a backup of the database. Most links I have seen talk of backing up the entire public_html folder (The website runs on a shared host). But in my fresh Joomla installation there are several folders that are in the public_html folder. So which of the folders in the public_html folder are from the content of the website, and which are of the old Joomla framework? I'm afraid that I might overwrite files of the new Joomla framework with the old framework, if copy all the files and folders into the new installation.

    Read the article

  • new installation won't sync directories with my account

    - by barrydrake
    I had a dual-boot arrangement on my desktop with 12.04 and 12.10 (testing). Last week, I managed to trash both and ended up re-installing both. After doing all the updates, I activated Ubuntu-one which worked just fine on the previous installations. On both 12.04 and 12.10, the Ubuntu-One window fails to show the folders for syncing and only syncs the Ubuntu-One folder. Adding a folder on my net book on which sync is working properly brings up a pop up on my desktop that tells me a new folder is available for syncing - but the Ubuntu window fails to show it. I have put a screenshot from my net book (working OK) and one from my desktop which shows the problem. Any suggestions please?

    Read the article

  • Which folders need to be backedup for migration in Joomla?

    - by Devdatta Tengshe
    I have never used Joomla before, so this might be a noobish question. I have tried searching, but haven't found anything relevant. I'm helping someone update & migrate their old website, built on the Joomla framework. Currently it is running on Joomla 1.5.8 which is an ancient version. I've convinced them to upgrade Joomla to atleast 2.5 I have already taken the database backup. Most links I have seen talk of backing up the entire public_html folder (The website runs on a shared host). But in my fresh Joomla installation there are several folders that are in the public_html folder. So which of the folders in the public_html folder are from the content of the website, and which are of the old Joomla framework? I'm afraid that I might overwrite files of the new Joomla framework with the old framework, if copy all the files and folders into the new installation.

    Read the article

  • Why not sync folders outside home with ubuntu-one?

    - by peer
    It took me a while to find out that with ubuntu-one I can sync only folders in my home-folder. On all other folders the ubuntu-one - option is available in preferences, but the actual actions are greyed out. The ubuntu-faq is quite clear on that: https://wiki.ubuntu.com/UbuntuOne/FAQ/CanISyncAFolderOutsideMyHomeFolder But I actually wonder why and if this is going to change and if there is a trick around it (an other one than setting my home to /) I personally dont have any important data in my home-folder other than the program-configs. All documents, pictures, music are on a folder called /data that is actually on a different partition. That makes it much easier when one wants to reinstall ubuntu.

    Read the article

  • boot loading problem after wubi installation of Ubuntu 12.04 on Win7

    - by user63085
    I was new in Ubuntu OS and I was just trying to use wubi windows installer to get a Ubuntu first for hands-on. I followed exactly the same as the instructions and after reboot win7, there is no Ubuntu selection in windows boot manager, with only Windows 7 showing there evilly -.- What I've found out was that the grub folder inside Ubuntu folder ( in my C:\ drive) was empty, either inside the ubuntu\disks\grub or ubuntu\install\grub. I thought this might be the reason why I could not load ubuntu during startup. Cause I've also looked into the EasyBCD settings, and ubuntu entry with Bootloader Path: \ubuntu\winboot\wubildr.mbr was lying there peacefully, looking perfectly fine. However it was not in boot loader actually. Is there a way to restore the grub folder with grub2, or is there any way to fix this problem so that I can find the "Ubuntu" selection at windows startup? Very appreciate your help :) Henry

    Read the article

  • Why not sync folders outside home with Ubuntu One?

    - by peer
    It took me a while to find out that with Ubuntu One I can sync only folders in my home folder. On all other folders the Ubuntu One option is available in preferences, but the actual actions are greyed out. The Ubuntu One FAQ is quite clear on that: No, currently you can only select to synchronize folders inside your home directory. But I actually wonder why and if this is going to change and if there is a trick around it (an other one than setting my home to /) ? I personally don't have any important data in my home folder other than the program configs. All documents, pictures, music are on a folder called /data that is actually on a different partition. That makes it much easier when one wants to reinstall Ubuntu.

    Read the article

  • Clone Unity Dock for All Users?

    - by jduc
    I've spent 3 days trying to figure out how to clone the Unity dock from my firstuser profile in Ubuntu 12.04 to all of the other profiles I'm hoping to create. I've read where the /etc/skel folder contents get populated to all new profiles but this doesn't seem to include the dock or the desktop folders and icons. I've also copied the entire contents of my /home/firstuser folder to my /home/seconduser folder (including .* files) but the dock is still showing the stock icons and not the icons I've designated in my firstuser profile dock. Does anyone have any suggestions?

    Read the article

  • Recovering problems

    - by Sova
    Just simple question(s): Is it possible to recover one whole folder? It was with photos..and if no, so when recovering, will it recover also files from the folder? (I use Photorec, but if you know better one..). I tried to get back my pics and what has returned wasn't from folder. ? And..is it normal, that it has recovered me the same things for about 5 times, with different size? Thanks in advance.

    Read the article

  • FxCop ... Where have you been all my life?

    - by PhilSando
    I was recently introduced to microsoft's tool that analyzes managed code assemblies called FxCop. It points out possible design, localization, performance, and security improvements against a pre defined set of rules (and also accepts custom rules). At first I was unsure how to go about using it as it seems to be aimed at software developers (.exe and .dll) . Its easy to get around this with the following steps: 1)Create a new folder (i.e C:\Code Analysis) 2)Publish your web application into the new folder 3)Open FxCop and add all the dll files from the newly created bin folder  to be scrutinized. Lots more info / docs available here on msdn and you can also download fxcop free

    Read the article

  • Cannot add folders to Ubuntu One

    - by Akmur
    I have signed up with Ubuntu One (20GB) and I'm having the following issue: I basically cannot add folders through the panel interface. I have been able to add five folders so far, but I'd like to add some more (yes, they are inside the home folder). I don't get no errors, but nothing is added to the folders list. By using the command line interface like this u1sdtool --create-folder /home/alex/Web it basically hangs. Nothing happens. If I then list the folders with command line, my folder is not there. Any idea? (I'm on 12.10)

    Read the article

  • nothing happen after a command mount -t (worked before)

    - by user3449429
    i'm having a weird problem. i used to lauch manually the mount command to link a folder on my PLEX server with a folder on my NAS since yesterday it was ok, but i had to halt my plex server and when i tried to mount again the folder, nothing happen. it ask me the su password and that's all. here the command i use in my fstab: //192.168.1.2/Series_TV /home/cidou/Series_TV cifs _netdev,credentials=/home/cidou/.smbcredentials 0 0 //192.168.1.2/films /home/cidou/Films cifs _netdev,credentials=/home/cidou/.smbcredentials 0 0 i tried this command too: sudo mount -t smbfs //192.168.1.2/Films /home/cidou/Films -o user=myname,password=mypass,sec=ntlm --verbose i run an ubuntu 12.04 LTS uname -a Linux plex 3.8.0-29-generic #42~precise1-Ubuntu SMP Wed Aug 14 15:31:16 UTC 2013 i686 i686 i386 GNU/Linux Welcome to Ubuntu 12.04.4 LTS (GNU/Linux 3.8.0-29-generic i686) * Documentation: https://help.ubuntu.com/ System information disabled due to load higher than 2.0 Thanks for reading

    Read the article

  • background: why not sync folders outside home with ubuntu-one?

    - by peer
    It took me a while to find out that with ubuntu-one I can sync only folders in my home-folder. On all other folders the ubuntu-one - option is available in preferences, but the actual actions are greyed out. the ubuntu-faq is quite clear on that: https://wiki.ubuntu.com/UbuntuOne/FAQ/CanISyncAFolderOutsideMyHomeFolder But I actually wonder why and if this is going to change and if there is a trick around it (an other one than setting my home to / ;) ) I personally dont have any important data in my home-folder other than the program-configs. All documents, pictures, music are on a folder called /data that is actually on a different partition. That makes it much easier when one wants to reinstall ubuntu. thnx, p

    Read the article

  • transfer Thunderbird (17) Profile on Win7 to Ubuntu 12.04

    - by William Curran
    I want to transfer Thunderbird Profile on Win7 to thunderbird (17) Ubuntu 12.04. I already copied the profile folder from Windows to Ubuntu and modified progile.ini on the ubuntu machine to include [Profile] Name=Bill IsRelative=1 Path=(the name of the transfered profile folder) I think the problem is that the Win. TB profile content (files and folder structure) look VERY different that the that of the unbuntu TB profile's that was created on installation. The Ubuntu install is new where as the Win TB has undergone many updates. Seems the system for profile storage has changed drastically. I tried to start TB in safe mode but could't get the path correct to start TB in the terminal with the -safe-mode switch. What can I do? Bill

    Read the article

  • To track or not to track user created images for display in a dynamic website?

    - by Question Overflow
    I have a website that allows user to upload and display images. Currently, I am not using a database to track these user images. A folder having the user id as the folder name is created for each user and under each folder, the image files are labelled numerically with filenames ranging from 01.jpg to 20.jpg. Up to 20 such images can be displayed on each user page. I am using javascript to hide these images in case of any of them is absent. I have seen many websites having user images with unique random filenames and possibly tracking these files with a database Since obscurity is not something that I need for these web accessible images, is there any reason why I should track them with a DB? I am not sure if a reduction of 404 errors is a good enough reason to justify the added complexity of maintaining a database or can anyone enlighten me?

    Read the article

  • How to write the path to a Windows shared folders address in terminal?

    - by user206996
    I want to permanently mount a few Windows shared folders. I googled that topic and found a lot of good answers, but with all the solutions I have a little problem: I believe I can't properly give an address of shared folder. I can find it following way: "Network" - "Windows Network" - "WORKGROUP_name" - "Computer_name" - and there is some folder i would like to mount. How can I write the path to the folders in terminal? UPD i can access needed folder throw address in location bar smb://server/ro/, but i try mount it by editing /etc/fstab file (using guide https://wiki.ubuntu.com/MountWindowsSharesPermanently) i get error mount error: could not resolve address for server: Unknown error PS in /etc/fstab : //SERVER/RO /media/server cifs username=user,password=pwd, uid=1000,iocharset=utf8 0 0

    Read the article

  • Dropbox and IIS

    - by DigitalAce7
    I'm running into the problem of using Dropbox and IIS. I am using the Dropbox Folder Sync plugin to sync a folder outside the Dropbox folder. They are synced into my inetpub\root\downloads\ directory. The problem is not that it doesn't sync. The problem is that the files that get synced have no permissions attached to them. I can't open any of them on IIS. Initially the downloads were to be only PDFs but they don't open so I tried an .ASPX file and it fails to load as well. Is there any way around this or another program I can use to sync files but allow them to be opened on an IIS server. Thanks for any help!

    Read the article

  • How to automatically remove Flash history/privacy trail? Or stop Flash from storing it?

    - by Arjan van Bentem
    Many people have heard about third-party cookies, and some browsers even block those by default. Some people may even be using Private Browsing modes. However, only few seem to realise that Adobe's Flash player also leaves a cross-browser trail on your local hard drive, and allows for sending cookie-like information back to the server, including third-party sites. And because it is a plugin, Flash does not take any of the browser's privacy settings into account. Sorry for the long post, but first some details about why using Flash raises a privacy concern, followed by the results of my tests: The Flash player keeps a cross-browser history of the domain names of the Flash-sites your computer has visited. Unlike your browser's history, this history is not limited to a certain number of days. History is also recorded while using so-called Private Browsing modes. It is stored on your hard drive (though, as described below, without going to Adobe's site you won't know what is stored). I am not sure if any date and time information is kept about each visit, but to see the domain names: right-click on some Flash content, open the settings dialog, and click the Help icon or click the Advanced button within the Privacy tab. This opens a browser to the help pages on Adobe.com, where one can click through to the Website Storage Settings panel. One can clear the existing list, but one cannot stop it from being recorded again. Flash allows for storing data on your local hard drive, using so-called Local Shared Objects (aka "Flash Cookies"). Just like HTTP cookies, this data can be sent back to the server, for tracking purposes. They are cross-browser, have no expiration date, and no user defined maximum lifetime can be set in the Flash preferences either. These not being HTTP cookies, they are (of course) not blocked by a browser's cookies preferences and are not removed when the normal HTTP cookies are deleted. Adobe has announced that version 10.1 will obey Private Browsing in most popular browsers, but unfortunately no word about also removing the data whenever normal cookies are deleted manually. And its implementation might be confusing: [..] if the browser is in normal browsing mode when the Flash Player instance is created, then that particular instance will forever be in normal browsing mode (private browsing is turned off). Accordingly, toggling private browsing on or off without refreshing the page or closing the private browsing window will not impact Flash Player. Local Shared Objects are not limited to the site you visit, and third-party storage is enabled by default. At the Global Storage Settings panel one can deselect the default Allow third-party Flash content to store data on your computer. Because of the cross-browser and expiration-less nature (and the fact that few people know about it), I feel that the cross-browser third-party Flash Cookies are more dangerous for visitor tracking than third-party normal HTTP cookies. They are even used to restore plain HTTP cookies that the user tried to delete: "All advertisers, websites and networks use cookies for targeted advertising, but cookies are under attack. According to current research they are being erased by 40% of users creating serious problems," says Mookie Tenembaum, founder of United Virtualities. "From simple frequency capping to the more sophisticated behavioral targeting, cookies are an essential part of any online ad campaign. PIE ["Persistent Identification Element"] will give publishers and third-party providers a persistent backup to cookies effectively rendering them unassailable", adds Tenembaum. [..] To justify this tracking mechanism, UV's Tenembaum said, "The user is not proficient enough in technology to know if the cookie is good or bad, or how it works." When selecting None (zero KB) for Specify the amount of disk space that website websites that you haven't yet visited can use to store information on your computer, and checking Never ask again then some sites do not work. However, the same site might work when setting it to None but without selecting Never ask again, and then choose Deny whenever prompted. Both options would result in zero KB of data being allowed, but the behaviour differs. The plugin also provides a Flash Player cache for Adobe-signed files. I guess these files are not an issue. So: how to automatically delete that information? On a Mac, one can find a settings.sol file and a folder for each visited Flash-website in: $HOME/Library/Preferences/Macromedia/Flash Player/macromedia.com/support/flashplayer/sys/ Deleting the settings.sol file and all the folders in sys, removes the trail from the settings panels. However, the actual Local Shared Ojects are elsewhere (see Wikipedia for locations on other operating systems), in a randomly named subfolder of: $HOME/Library/Preferences/Macromedia/Flash Player/#SharedObjects But then: how to remove this automatically? Simply removing the folders and the settings.sol file every now and then (like by using launchd or Windows' Task Scheduler) may interfere with active browsers. Or is it safe to assume that, given the cross-browser nature, the plugin would not care if things are removed while it is active? Only clearing during log-off may not work for those who hibernate all the time. Firefox users can install BetterPrivacy or Objection to delete the Local Shared Objects (for all others browsers as well). I don't know if that also deletes the trail of website domain names. Or: how to stop Flash from storing a history trail? Change of plans: I'm currently testing prohibiting Flash to write to its own sys and #SharedObjects folders. So far, Flash has not tried to restore permissions (though, when deleting the folders, Flash will of course recreate them). I've not encountered any problems but this may take some while to validate, using multiple browsers and sites. I've not yet found a log that reports errors. On a Mac: cd "$HOME/Library/Preferences/Macromedia/Flash Player/macromedia.com/support/flashplayer" rm -r sys/* chmod u-w sys cd "$HOME/Library/Preferences/Macromedia/Flash Player" # preserve the randomly named subfolders (only preserving the latest would suffice; see below) rm -r \#SharedObjects/*/* chmod -R u-w \#SharedObjects I guess the above chmods cannot be achieved on an old Windows system (I'm not sure about XP and Vista?). Though maybe on Windows one could replace the folders sys and #SharedObjects with dummy files with the same names? Anyone? Obviously, keeping Flash from storing those Local Shared Objects for all sites may cause problems. Some test results (Flash 10 on Mac OS X): When blocking the sys folder (even when leaving the #SharedObjects folder writable) then YouTube won't remember your volume settings while viewing multiple videos. Temporarily allowing write access to the blocked folders while visiting trusted sites (to only create folders for domains you like, maybe including references in settings.sol) solves that. This way, for YouTube, Flash could be allowed to write to sys/#s.ytimg.com and #SharedObjects/s.ytimg.com, while Flash could not create new folders for other domains. One may also need to make settings.sol read-only afterwards, or delete it again. When blocking both the sys and #SharedObjects folders, YouTube and Vimeo work fine (though they might not remember any settings). However, Bits on the Run refuses to even show the video player. This is solved by temporarily unblocking the #SharedObjects folder, to allow Flash to create a subfolder with some random name. Within this folder, it would create yet another folder for the current Flash website (content.bitsontherun.com). Removing that website-specific folder, and blocking both #SharedObjects and the randomly named subfolder, still seems to allow Bits on the Run to operate, even though it still cannot write anything to disk. So: the existence of the randomly named subfolder (even when write protected) is important for some sites. When I first found the #SharedObjects folder, it held many subfolders with random names, some created on the very same day. I wonder when Flash decides it wants a new folder, and how it determines (and remembers) that random name. For a moment I considered not blocking write access for sys and #SharedObjects, but explicitly creating read-only folders for well-known third-party tracking domains (like based on a list from, for example, AdBlock Plus). That way, any other domain could still create Local Shared Objects. But the list would be long, and the domains from AdBlock Plus are probably all third-party domains anyway, so disabling Allow third-party Flash content to store data on your computer might have the very same result. Any experience anyone? (Final notes: if the above links to the settings panels do not work in the future, then use the URL that is known to Flash player as a starting point: www.adobe.com/go/settingsmanager. See also "You Deleted Your Cookies? Think Again" at Wired.com -- which uses Flash cookies itself as well... For the very suspicious using Time Machine: you may want to exclude both folders, for each user, and remove the trace that is already on your backup.)

    Read the article

  • Install Quartz.Net as a windows service and Test installation

    - by Tarun Arora
    In this blog post I’ll be covering, 01: Where to download Quartz.net from 02: How to install Quartz.net as a Windows service 03: Test the Quartz.net Installation If you are new to Quartz.net I would recommend reading the blog post on a brief introduction to Quartz.net. 01 – Where to download Quartz.net? http://sourceforge.net/projects/quartznet/files/quartznet/       Currently version  Quartz.Net 2.0.1 is the recommended download version. 02 – How to install Quartz.net as a Windows service         Go to the download location and unzip the Quartz.net package Navigate to the folder Quartz.Net \ Server \ bin – This is where you will find different .net version installers of the quartz.net packages. For example in the screen shot above, you can see the Quartz.net .net 3.5 and .net 4 packages. Open up the Quartz.net .net 4.0 folder, this folder contains the files you need to install Quartz.net as a windows service Copy the contents of the folder Downloads\Quartz.NET-2.0.1\server\bin\4.0 to the folder %program files%\Quartz.net   5. Open up a new CMD as an administrator and run the below command to install Quartz.net as a windows service /> Quartz.Server.exe install 6. How do I know that Quartz.Net service has installed as a Windows service? Go to run prompt and type ‘services.msc’ you should now see all the windows services installed on your machine. Navigate down to look for Quartz.Net. The service installs itself as an automatic startup Type and log on as ‘Local System’. You can easily change this to your prefer account that you would like to run the service as. If you wanted to name the Quartz service something else then that’s also possible… Can I change the default display name of the quartz.net windows service? Yes, you can! Navigate to C:\Program Files (x86)\Quartz.Net\ and open up the config file ‘quartz.config’ - You can change the instance name - You can change the default thread count of 10 - The port that the service listens to (by default this is port 555) A blog post on more configuration details can be found here. 03 – Test Quartz.Net windows service installation So, I have installed Quartz.Net as a windows service, how do I test whether my installation has been successful. Open up cmd as an administrator and run the below command, C:\Program Files (x86)\Quartz.Net> Quartz.Server.exe –i Since by default the Quartz.net windows service writes INFO level diagnostics (this can be changed from Quartz.Server.exe.config) you should see the service information show up on the console. For instance in the example above I can see that the service is running in a NON CLUSTERED mode, its currently not started and is currently in standby mode with 0 number of jobs executed so far… This was second in the series of posts on enterprise scheduling using Quartz.net, in the next post I’ll be covering how to run your first scheduled task using Quartz.net windows service. Thank you for taking the time out and reading this blog post. If you enjoyed the post, remember to subscribe to http://feeds.feedburner.com/TarunArora. Stay tuned!

    Read the article

  • Add the Recycle Bin to Start Menu in Windows 7

    - by Matthew Guay
    Have you ever tried to open the Recycle Bin by searching for “recycle bin” in the Start menu search, only to find nothing?  Here’s a quick trick that will let you find the Recycle Bin directly from your Windows Start menu search. The Start menu search may be the best timesaver ever added to Windows.  In fact, we use it so much that it seems painful to manually search for a program when using Windows XP or older versions of Windows.  You can easily find files, folders, programs and more through the Start menu search in both Vista and Windows 7. However, one thing you cannot find is the recycle bin; if you enter this in the start menu search it will not find it. Here’s how to add the Recycle Bin to your Start menu search. What to do To access the Recycle Bin from the Start menu search, we need to add a shortcut to the start menu.  Windows includes a personal Start menu folder, and an All Users start menu folder which all users on the computer can see.  This trick only works in the personal Start menu folder. Open up an Explorer window (Simply click the Computer link in the start menu), click the white part of the address bar, and, enter the following (substitute your username for your_user_name) and hit Enter. C:\Users\your_user_name\AppData\Roaming\Microsoft\Windows\Start Menu Now, right-click in the folder, select New, and then click Shortcut. In the location box, enter the following: explorer.exe shell:RecycleBinFolder When you’ve done this, click Next. Now, enter a name for the shortcut.  You can enter Recycle Bin like the standard shortcut, or you could name it something else such as Trash…if that’s easier for you to remember.  Click Finish when your done. By default it will have a folder icon.  Let’s switch that to the standard Recycle Bin icon.  Right-click on the new shortcut and click Properties. Click Change Icon… Type the following in the “Look for icons in this file:” box, and press the Enter key on your keyboard: %SystemRoot%\system32\imageres.dll Now, scroll and find the Recycle Bin icon and click Ok. Click Ok in the previous dialog, and now your Recycle Bin shortcut has the correct icon.   You can even have multiple shortcuts with different names, so when you searched either Recycle Bin or Trash it would come up in the Start menu.  To do that, simply repeat these directions, and enter another name of your choice at the prompt.  Here we have both a Recycle Bin and a Trash icon. Now, when you enter Recycle Bin (or trash, depending on what you chose) in your Start menu search, you will see it at the top of your Start menu.  Simply press Enter or click on the icon to open the Recycle Bin.   This trick will work in Windows Vista too!  Simply follow these same directions, and you can add the Recycle Bin to your Vista Start menu and find it via search. This is a simple trick, but may make it  much easier for you to open your Recycle Bin directly from your Windows Vista or 7 Start menu search.  If you’re using Windows 7, you can also check out our directions on how to Add the Recycle Bin to the Taskbar in Windows 7. Similar Articles Productive Geek Tips Hide, Delete, or Destroy the Recycle Bin Icon in Windows 7 or VistaDisable Deletion of the Recycle Bin in Windows VistaHide the Recycle Bin Icon Text on Windows VistaAdd the Recycle Bin to the Taskbar in Windows 7Resize the Recycle Bin in XP TouchFreeze Alternative in AutoHotkey The Icy Undertow Desktop Windows Home Server – Backup to LAN The Clear & Clean Desktop Use This Bookmarklet to Easily Get Albums Use AutoHotkey to Assign a Hotkey to a Specific Window Latest Software Reviews Tinyhacker Random Tips Revo Uninstaller Pro Registry Mechanic 9 for Windows PC Tools Internet Security Suite 2010 PCmover Professional StockFox puts a Lightweight Stock Ticker in your Statusbar Explore Google Public Data Visually The Ultimate Excel Cheatsheet Convert the Quick Launch Bar into a Super Application Launcher Automate Tasks in Linux with Crontab Discover New Bundled Feeds in Google Reader

    Read the article

  • Customize Team Build 2010 – Part 12: How to debug my custom activities

    In the series the following parts have been published Part 1: Introduction Part 2: Add arguments and variables Part 3: Use more complex arguments Part 4: Create your own activity Part 5: Increase AssemblyVersion Part 6: Use custom type for an argument Part 7: How is the custom assembly found Part 8: Send information to the build log Part 9: Impersonate activities (run under other credentials) Part 10: Include Version Number in the Build Number Part 11: Speed up opening my build process template Part 12: How to debug my custom activities Part 13: Get control over the Build Output Part 14: Execute a PowerShell script Part 15: Fail a build based on the exit code of a console application       Developers are “spoilt” persons who expect to be able to have easy debugging experiences for every technique they work with. So they also expect it when developing custom activities for the build process template. This post describes how you can debug your custom activities without having to develop on the build server itself. Remote debugging prerequisites The prerequisite for these steps are to install the Microsoft Visual Studio Remote Debugging Monitor. You can find information how to install this at http://msdn.microsoft.com/en-us/library/bt727f1t.aspx. I chose for the option to run the remote debugger on the build server from a file share. Debugging symbols prerequisites To be able to start the debugging, you need to have the pdb files on the buildserver together with the assembly. The pdb must have been build with Full Debug Info. Steps In my setup I have a development machine and a build server. To setup the remote debugging, I performed the following steps Locate on your development machine the folder C:\Program Files (x86)\Microsoft Visual Studio 10.0\Common7\IDE\Remote Debugger Create a share for the Remote Debugger folder. Make sure that the share (and the folder) has the correct permissions so the user on the build server has access to the share. On the build server go to the shared “Remote Debugger” folder Start msvsmon.exe which is located in the folder that represents the platform of the build server. This will open a winform application like   Go back to your development machine and open the BuildProcess solution. Start the Attach to process command (Ctrl+Alt+P) Type in the Qualifier the name of the build server. In my case the user account that has started the msvsmon is another user then the user on my development machine. In that case you have to type the qualifier in the format that is shown in the Remote Debugging Monitor (in my case LOCAL\Administrator@TFSLAB) and confirm it by pressing <Enter> Since the build service is running with other credentials, check the option “Show processes from all users”. Now the Attach to process dialog shows the TFSBuildServiceHost process Set the breakpoint in the activity you want to debug and kick of a build. Be aware that when you attach to the TFSBuildServiceHost that you debug every single build that is run by this windows service, so make sure you don’t debug the build server that is in production! You can download the full solution at BuildProcess.zip. It will include the sources of every part and will continue to evolve.

    Read the article

  • Read Mobi eBooks on Kindle for PC

    - by Matthew Guay
    Do you use your PC as a eBook reader?  Kindle for PC makes it easy to read thousands of books from the Kindle Store on your computer. What you may not know is that is also works with .mobi format too, so you can increase the amount of books you can read. Amazon has jumpstarted the eBook market with their popular Kindle device.  Last fall Amazon unveiled Kindle for PC, and we reviewed how you can Read Kindle Books On Your Computer with Kindle for PC.  Whether or not you own a Kindle or other eBook reader, this is a great way to take advantage of the thousands of eBooks available from the Kindle Store today. It supports azw, prc, and tpz format, which are sold from the Kindle store, but it also supports Mobipocket (.mobi) eBooks that are not DRM protected.  Here’s how you can add them to Kindle for PC so you can easily read them on your PC Getting Started: First, make sure you have Kindle for PC (link below) installed on your computer. Sign in with your Amazon account when you first run it. Kindle for PC lets you easily read eBooks downloaded from the Kindle Store, but it doesn’t have any way to add other eBooks directly from the program. To add eBooks, you can sometimes download and double-click on the books, and they will open in Kindle for PC and be automatically added to the library.  However, this does not always seem to work. So instead, browse to your Documents folder (simply click on the Documents link on your Start menu), and double-click on the My Kindle Content folder. This folder contains all the Kindle books you have downloaded.  If you have other eBooks you would like to add to Kindle for PC, simply drag-and-drop or copy and paste them into this folder.  Here we have a .mobi formatted book downloaded from the Gutenberg Project that we’re dragging into the folder. Now, close and reopen Kindle for PC.  It should now show your new eBook right beside the eBooks you have downloaded from the Kindle Store. These eBooks work just the same as the ones downloaded from the Kindle store, and you can change font size and add bookmarks just as with other eBooks. The eBooks downloaded this way may show up with either a Amazon logo or a mobile device icon.  You should only see the mobile device icon on .mobi files formatted for mobile devices; other ones should show up with the Amazon logo.  In this screen, Pilgrim’s Progress is a standard .mobi book, The Adventures of Sherlock Holmes is a mobipocket book, and the others are downloaded from the Kindle Store. Conclusion This is a great way to read eBooks from across the internet on Kindle for PC.  Wikipedia’s Kindle page has a list of websites that offer eBooks formatted for the Kindle, so be sure to check it out for more books. Links Download Kindle for PC List of websites that offer eBooks that will work on Kindle – via Wikipedia Similar Articles Productive Geek Tips Read Kindle Books On Your Computer with Kindle for PCInstall Adobe PDF Reader on Ubuntu EdgyHow to Access your Box.Net Account from Ubuntu the Easy Way TouchFreeze Alternative in AutoHotkey The Icy Undertow Desktop Windows Home Server – Backup to LAN The Clear & Clean Desktop Use This Bookmarklet to Easily Get Albums Use AutoHotkey to Assign a Hotkey to a Specific Window Latest Software Reviews Tinyhacker Random Tips Revo Uninstaller Pro Registry Mechanic 9 for Windows PC Tools Internet Security Suite 2010 PCmover Professional New Stinger from McAfee Helps Remove ‘FakeAlert’ Threats Google Apps Marketplace: Tools & Services For Google Apps Users Get News Quick and Precise With Newser Scan for Viruses in Ubuntu using ClamAV Replace Your Windows Task Manager With System Explorer Create Talking Photos using Fotobabble

    Read the article

  • How John Got 15x Improvement Without Really Trying

    - by rchrd
    The following article was published on a Sun Microsystems website a number of years ago by John Feo. It is still useful and worth preserving. So I'm republishing it here.  How I Got 15x Improvement Without Really Trying John Feo, Sun Microsystems Taking ten "personal" program codes used in scientific and engineering research, the author was able to get from 2 to 15 times performance improvement easily by applying some simple general optimization techniques. Introduction Scientific research based on computer simulation depends on the simulation for advancement. The research can advance only as fast as the computational codes can execute. The codes' efficiency determines both the rate and quality of results. In the same amount of time, a faster program can generate more results and can carry out a more detailed simulation of physical phenomena than a slower program. Highly optimized programs help science advance quickly and insure that monies supporting scientific research are used as effectively as possible. Scientific computer codes divide into three broad categories: ISV, community, and personal. ISV codes are large, mature production codes developed and sold commercially. The codes improve slowly over time both in methods and capabilities, and they are well tuned for most vendor platforms. Since the codes are mature and complex, there are few opportunities to improve their performance solely through code optimization. Improvements of 10% to 15% are typical. Examples of ISV codes are DYNA3D, Gaussian, and Nastran. Community codes are non-commercial production codes used by a particular research field. Generally, they are developed and distributed by a single academic or research institution with assistance from the community. Most users just run the codes, but some develop new methods and extensions that feed back into the general release. The codes are available on most vendor platforms. Since these codes are younger than ISV codes, there are more opportunities to optimize the source code. Improvements of 50% are not unusual. Examples of community codes are AMBER, CHARM, BLAST, and FASTA. Personal codes are those written by single users or small research groups for their own use. These codes are not distributed, but may be passed from professor-to-student or student-to-student over several years. They form the primordial ocean of applications from which community and ISV codes emerge. Government research grants pay for the development of most personal codes. This paper reports on the nature and performance of this class of codes. Over the last year, I have looked at over two dozen personal codes from more than a dozen research institutions. The codes cover a variety of scientific fields, including astronomy, atmospheric sciences, bioinformatics, biology, chemistry, geology, and physics. The sources range from a few hundred lines to more than ten thousand lines, and are written in Fortran, Fortran 90, C, and C++. For the most part, the codes are modular, documented, and written in a clear, straightforward manner. They do not use complex language features, advanced data structures, programming tricks, or libraries. I had little trouble understanding what the codes did or how data structures were used. Most came with a makefile. Surprisingly, only one of the applications is parallel. All developers have access to parallel machines, so availability is not an issue. Several tried to parallelize their applications, but stopped after encountering difficulties. Lack of education and a perception that parallelism is difficult prevented most from trying. I parallelized several of the codes using OpenMP, and did not judge any of the codes as difficult to parallelize. Even more surprising than the lack of parallelism is the inefficiency of the codes. I was able to get large improvements in performance in a matter of a few days applying simple optimization techniques. Table 1 lists ten representative codes [names and affiliation are omitted to preserve anonymity]. Improvements on one processor range from 2x to 15.5x with a simple average of 4.75x. I did not use sophisticated performance tools or drill deep into the program's execution character as one would do when tuning ISV or community codes. Using only a profiler and source line timers, I identified inefficient sections of code and improved their performance by inspection. The changes were at a high level. I am sure there is another factor of 2 or 3 in each code, and more if the codes are parallelized. The study’s results show that personal scientific codes are running many times slower than they should and that the problem is pervasive. Computational scientists are not sloppy programmers; however, few are trained in the art of computer programming or code optimization. I found that most have a working knowledge of some programming language and standard software engineering practices; but they do not know, or think about, how to make their programs run faster. They simply do not know the standard techniques used to make codes run faster. In fact, they do not even perceive that such techniques exist. The case studies described in this paper show that applying simple, well known techniques can significantly increase the performance of personal codes. It is important that the scientific community and the Government agencies that support scientific research find ways to better educate academic scientific programmers. The inefficiency of their codes is so bad that it is retarding both the quality and progress of scientific research. # cacheperformance redundantoperations loopstructures performanceimprovement 1 x x 15.5 2 x 2.8 3 x x 2.5 4 x 2.1 5 x x 2.0 6 x 5.0 7 x 5.8 8 x 6.3 9 2.2 10 x x 3.3 Table 1 — Area of improvement and performance gains of 10 codes The remainder of the paper is organized as follows: sections 2, 3, and 4 discuss the three most common sources of inefficiencies in the codes studied. These are cache performance, redundant operations, and loop structures. Each section includes several examples. The last section summaries the work and suggests a possible solution to the issues raised. Optimizing cache performance Commodity microprocessor systems use caches to increase memory bandwidth and reduce memory latencies. Typical latencies from processor to L1, L2, local, and remote memory are 3, 10, 50, and 200 cycles, respectively. Moreover, bandwidth falls off dramatically as memory distances increase. Programs that do not use cache effectively run many times slower than programs that do. When optimizing for cache, the biggest performance gains are achieved by accessing data in cache order and reusing data to amortize the overhead of cache misses. Secondary considerations are prefetching, associativity, and replacement; however, the understanding and analysis required to optimize for the latter are probably beyond the capabilities of the non-expert. Much can be gained simply by accessing data in the correct order and maximizing data reuse. 6 out of the 10 codes studied here benefited from such high level optimizations. Array Accesses The most important cache optimization is the most basic: accessing Fortran array elements in column order and C array elements in row order. Four of the ten codes—1, 2, 4, and 10—got it wrong. Compilers will restructure nested loops to optimize cache performance, but may not do so if the loop structure is too complex, or the loop body includes conditionals, complex addressing, or function calls. In code 1, the compiler failed to invert a key loop because of complex addressing do I = 0, 1010, delta_x IM = I - delta_x IP = I + delta_x do J = 5, 995, delta_x JM = J - delta_x JP = J + delta_x T1 = CA1(IP, J) + CA1(I, JP) T2 = CA1(IM, J) + CA1(I, JM) S1 = T1 + T2 - 4 * CA1(I, J) CA(I, J) = CA1(I, J) + D * S1 end do end do In code 2, the culprit is conditionals do I = 1, N do J = 1, N If (IFLAG(I,J) .EQ. 0) then T1 = Value(I, J-1) T2 = Value(I-1, J) T3 = Value(I, J) T4 = Value(I+1, J) T5 = Value(I, J+1) Value(I,J) = 0.25 * (T1 + T2 + T5 + T4) Delta = ABS(T3 - Value(I,J)) If (Delta .GT. MaxDelta) MaxDelta = Delta endif enddo enddo I fixed both programs by inverting the loops by hand. Code 10 has three-dimensional arrays and triply nested loops. The structure of the most computationally intensive loops is too complex to invert automatically or by hand. The only practical solution is to transpose the arrays so that the dimension accessed by the innermost loop is in cache order. The arrays can be transposed at construction or prior to entering a computationally intensive section of code. The former requires all array references to be modified, while the latter is cost effective only if the cost of the transpose is amortized over many accesses. I used the second approach to optimize code 10. Code 5 has four-dimensional arrays and loops are nested four deep. For all of the reasons cited above the compiler is not able to restructure three key loops. Assume C arrays and let the four dimensions of the arrays be i, j, k, and l. In the original code, the index structure of the three loops is L1: for i L2: for i L3: for i for l for l for j for k for j for k for j for k for l So only L3 accesses array elements in cache order. L1 is a very complex loop—much too complex to invert. I brought the loop into cache alignment by transposing the second and fourth dimensions of the arrays. Since the code uses a macro to compute all array indexes, I effected the transpose at construction and changed the macro appropriately. The dimensions of the new arrays are now: i, l, k, and j. L3 is a simple loop and easily inverted. L2 has a loop-carried scalar dependence in k. By promoting the scalar name that carries the dependence to an array, I was able to invert the third and fourth subloops aligning the loop with cache. Code 5 is by far the most difficult of the four codes to optimize for array accesses; but the knowledge required to fix the problems is no more than that required for the other codes. I would judge this code at the limits of, but not beyond, the capabilities of appropriately trained computational scientists. Array Strides When a cache miss occurs, a line (64 bytes) rather than just one word is loaded into the cache. If data is accessed stride 1, than the cost of the miss is amortized over 8 words. Any stride other than one reduces the cost savings. Two of the ten codes studied suffered from non-unit strides. The codes represent two important classes of "strided" codes. Code 1 employs a multi-grid algorithm to reduce time to convergence. The grids are every tenth, fifth, second, and unit element. Since time to convergence is inversely proportional to the distance between elements, coarse grids converge quickly providing good starting values for finer grids. The better starting values further reduce the time to convergence. The downside is that grids of every nth element, n > 1, introduce non-unit strides into the computation. In the original code, much of the savings of the multi-grid algorithm were lost due to this problem. I eliminated the problem by compressing (copying) coarse grids into continuous memory, and rewriting the computation as a function of the compressed grid. On convergence, I copied the final values of the compressed grid back to the original grid. The savings gained from unit stride access of the compressed grid more than paid for the cost of copying. Using compressed grids, the loop from code 1 included in the previous section becomes do j = 1, GZ do i = 1, GZ T1 = CA(i+0, j-1) + CA(i-1, j+0) T4 = CA1(i+1, j+0) + CA1(i+0, j+1) S1 = T1 + T4 - 4 * CA1(i+0, j+0) CA(i+0, j+0) = CA1(i+0, j+0) + DD * S1 enddo enddo where CA and CA1 are compressed arrays of size GZ. Code 7 traverses a list of objects selecting objects for later processing. The labels of the selected objects are stored in an array. The selection step has unit stride, but the processing steps have irregular stride. A fix is to save the parameters of the selected objects in temporary arrays as they are selected, and pass the temporary arrays to the processing functions. The fix is practical if the same parameters are used in selection as in processing, or if processing comprises a series of distinct steps which use overlapping subsets of the parameters. Both conditions are true for code 7, so I achieved significant improvement by copying parameters to temporary arrays during selection. Data reuse In the previous sections, we optimized for spatial locality. It is also important to optimize for temporal locality. Once read, a datum should be used as much as possible before it is forced from cache. Loop fusion and loop unrolling are two techniques that increase temporal locality. Unfortunately, both techniques increase register pressure—as loop bodies become larger, the number of registers required to hold temporary values grows. Once register spilling occurs, any gains evaporate quickly. For multiprocessors with small register sets or small caches, the sweet spot can be very small. In the ten codes presented here, I found no opportunities for loop fusion and only two opportunities for loop unrolling (codes 1 and 3). In code 1, unrolling the outer and inner loop one iteration increases the number of result values computed by the loop body from 1 to 4, do J = 1, GZ-2, 2 do I = 1, GZ-2, 2 T1 = CA1(i+0, j-1) + CA1(i-1, j+0) T2 = CA1(i+1, j-1) + CA1(i+0, j+0) T3 = CA1(i+0, j+0) + CA1(i-1, j+1) T4 = CA1(i+1, j+0) + CA1(i+0, j+1) T5 = CA1(i+2, j+0) + CA1(i+1, j+1) T6 = CA1(i+1, j+1) + CA1(i+0, j+2) T7 = CA1(i+2, j+1) + CA1(i+1, j+2) S1 = T1 + T4 - 4 * CA1(i+0, j+0) S2 = T2 + T5 - 4 * CA1(i+1, j+0) S3 = T3 + T6 - 4 * CA1(i+0, j+1) S4 = T4 + T7 - 4 * CA1(i+1, j+1) CA(i+0, j+0) = CA1(i+0, j+0) + DD * S1 CA(i+1, j+0) = CA1(i+1, j+0) + DD * S2 CA(i+0, j+1) = CA1(i+0, j+1) + DD * S3 CA(i+1, j+1) = CA1(i+1, j+1) + DD * S4 enddo enddo The loop body executes 12 reads, whereas as the rolled loop shown in the previous section executes 20 reads to compute the same four values. In code 3, two loops are unrolled 8 times and one loop is unrolled 4 times. Here is the before for (k = 0; k < NK[u]; k++) { sum = 0.0; for (y = 0; y < NY; y++) { sum += W[y][u][k] * delta[y]; } backprop[i++]=sum; } and after code for (k = 0; k < KK - 8; k+=8) { sum0 = 0.0; sum1 = 0.0; sum2 = 0.0; sum3 = 0.0; sum4 = 0.0; sum5 = 0.0; sum6 = 0.0; sum7 = 0.0; for (y = 0; y < NY; y++) { sum0 += W[y][0][k+0] * delta[y]; sum1 += W[y][0][k+1] * delta[y]; sum2 += W[y][0][k+2] * delta[y]; sum3 += W[y][0][k+3] * delta[y]; sum4 += W[y][0][k+4] * delta[y]; sum5 += W[y][0][k+5] * delta[y]; sum6 += W[y][0][k+6] * delta[y]; sum7 += W[y][0][k+7] * delta[y]; } backprop[k+0] = sum0; backprop[k+1] = sum1; backprop[k+2] = sum2; backprop[k+3] = sum3; backprop[k+4] = sum4; backprop[k+5] = sum5; backprop[k+6] = sum6; backprop[k+7] = sum7; } for one of the loops unrolled 8 times. Optimizing for temporal locality is the most difficult optimization considered in this paper. The concepts are not difficult, but the sweet spot is small. Identifying where the program can benefit from loop unrolling or loop fusion is not trivial. Moreover, it takes some effort to get it right. Still, educating scientific programmers about temporal locality and teaching them how to optimize for it will pay dividends. Reducing instruction count Execution time is a function of instruction count. Reduce the count and you usually reduce the time. The best solution is to use a more efficient algorithm; that is, an algorithm whose order of complexity is smaller, that converges quicker, or is more accurate. Optimizing source code without changing the algorithm yields smaller, but still significant, gains. This paper considers only the latter because the intent is to study how much better codes can run if written by programmers schooled in basic code optimization techniques. The ten codes studied benefited from three types of "instruction reducing" optimizations. The two most prevalent were hoisting invariant memory and data operations out of inner loops. The third was eliminating unnecessary data copying. The nature of these inefficiencies is language dependent. Memory operations The semantics of C make it difficult for the compiler to determine all the invariant memory operations in a loop. The problem is particularly acute for loops in functions since the compiler may not know the values of the function's parameters at every call site when compiling the function. Most compilers support pragmas to help resolve ambiguities; however, these pragmas are not comprehensive and there is no standard syntax. To guarantee that invariant memory operations are not executed repetitively, the user has little choice but to hoist the operations by hand. The problem is not as severe in Fortran programs because in the absence of equivalence statements, it is a violation of the language's semantics for two names to share memory. Codes 3 and 5 are C programs. In both cases, the compiler did not hoist all invariant memory operations from inner loops. Consider the following loop from code 3 for (y = 0; y < NY; y++) { i = 0; for (u = 0; u < NU; u++) { for (k = 0; k < NK[u]; k++) { dW[y][u][k] += delta[y] * I1[i++]; } } } Since dW[y][u] can point to the same memory space as delta for one or more values of y and u, assignment to dW[y][u][k] may change the value of delta[y]. In reality, dW and delta do not overlap in memory, so I rewrote the loop as for (y = 0; y < NY; y++) { i = 0; Dy = delta[y]; for (u = 0; u < NU; u++) { for (k = 0; k < NK[u]; k++) { dW[y][u][k] += Dy * I1[i++]; } } } Failure to hoist invariant memory operations may be due to complex address calculations. If the compiler can not determine that the address calculation is invariant, then it can hoist neither the calculation nor the associated memory operations. As noted above, code 5 uses a macro to address four-dimensional arrays #define MAT4D(a,q,i,j,k) (double *)((a)->data + (q)*(a)->strides[0] + (i)*(a)->strides[3] + (j)*(a)->strides[2] + (k)*(a)->strides[1]) The macro is too complex for the compiler to understand and so, it does not identify any subexpressions as loop invariant. The simplest way to eliminate the address calculation from the innermost loop (over i) is to define a0 = MAT4D(a,q,0,j,k) before the loop and then replace all instances of *MAT4D(a,q,i,j,k) in the loop with a0[i] A similar problem appears in code 6, a Fortran program. The key loop in this program is do n1 = 1, nh nx1 = (n1 - 1) / nz + 1 nz1 = n1 - nz * (nx1 - 1) do n2 = 1, nh nx2 = (n2 - 1) / nz + 1 nz2 = n2 - nz * (nx2 - 1) ndx = nx2 - nx1 ndy = nz2 - nz1 gxx = grn(1,ndx,ndy) gyy = grn(2,ndx,ndy) gxy = grn(3,ndx,ndy) balance(n1,1) = balance(n1,1) + (force(n2,1) * gxx + force(n2,2) * gxy) * h1 balance(n1,2) = balance(n1,2) + (force(n2,1) * gxy + force(n2,2) * gyy)*h1 end do end do The programmer has written this loop well—there are no loop invariant operations with respect to n1 and n2. However, the loop resides within an iterative loop over time and the index calculations are independent with respect to time. Trading space for time, I precomputed the index values prior to the entering the time loop and stored the values in two arrays. I then replaced the index calculations with reads of the arrays. Data operations Ways to reduce data operations can appear in many forms. Implementing a more efficient algorithm produces the biggest gains. The closest I came to an algorithm change was in code 4. This code computes the inner product of K-vectors A(i) and B(j), 0 = i < N, 0 = j < M, for most values of i and j. Since the program computes most of the NM possible inner products, it is more efficient to compute all the inner products in one triply-nested loop rather than one at a time when needed. The savings accrue from reading A(i) once for all B(j) vectors and from loop unrolling. for (i = 0; i < N; i+=8) { for (j = 0; j < M; j++) { sum0 = 0.0; sum1 = 0.0; sum2 = 0.0; sum3 = 0.0; sum4 = 0.0; sum5 = 0.0; sum6 = 0.0; sum7 = 0.0; for (k = 0; k < K; k++) { sum0 += A[i+0][k] * B[j][k]; sum1 += A[i+1][k] * B[j][k]; sum2 += A[i+2][k] * B[j][k]; sum3 += A[i+3][k] * B[j][k]; sum4 += A[i+4][k] * B[j][k]; sum5 += A[i+5][k] * B[j][k]; sum6 += A[i+6][k] * B[j][k]; sum7 += A[i+7][k] * B[j][k]; } C[i+0][j] = sum0; C[i+1][j] = sum1; C[i+2][j] = sum2; C[i+3][j] = sum3; C[i+4][j] = sum4; C[i+5][j] = sum5; C[i+6][j] = sum6; C[i+7][j] = sum7; }} This change requires knowledge of a typical run; i.e., that most inner products are computed. The reasons for the change, however, derive from basic optimization concepts. It is the type of change easily made at development time by a knowledgeable programmer. In code 5, we have the data version of the index optimization in code 6. Here a very expensive computation is a function of the loop indices and so cannot be hoisted out of the loop; however, the computation is invariant with respect to an outer iterative loop over time. We can compute its value for each iteration of the computation loop prior to entering the time loop and save the values in an array. The increase in memory required to store the values is small in comparison to the large savings in time. The main loop in Code 8 is doubly nested. The inner loop includes a series of guarded computations; some are a function of the inner loop index but not the outer loop index while others are a function of the outer loop index but not the inner loop index for (j = 0; j < N; j++) { for (i = 0; i < M; i++) { r = i * hrmax; R = A[j]; temp = (PRM[3] == 0.0) ? 1.0 : pow(r, PRM[3]); high = temp * kcoeff * B[j] * PRM[2] * PRM[4]; low = high * PRM[6] * PRM[6] / (1.0 + pow(PRM[4] * PRM[6], 2.0)); kap = (R > PRM[6]) ? high * R * R / (1.0 + pow(PRM[4]*r, 2.0) : low * pow(R/PRM[6], PRM[5]); < rest of loop omitted > }} Note that the value of temp is invariant to j. Thus, we can hoist the computation for temp out of the loop and save its values in an array. for (i = 0; i < M; i++) { r = i * hrmax; TEMP[i] = pow(r, PRM[3]); } [N.B. – the case for PRM[3] = 0 is omitted and will be reintroduced later.] We now hoist out of the inner loop the computations invariant to i. Since the conditional guarding the value of kap is invariant to i, it behooves us to hoist the computation out of the inner loop, thereby executing the guard once rather than M times. The final version of the code is for (j = 0; j < N; j++) { R = rig[j] / 1000.; tmp1 = kcoeff * par[2] * beta[j] * par[4]; tmp2 = 1.0 + (par[4] * par[4] * par[6] * par[6]); tmp3 = 1.0 + (par[4] * par[4] * R * R); tmp4 = par[6] * par[6] / tmp2; tmp5 = R * R / tmp3; tmp6 = pow(R / par[6], par[5]); if ((par[3] == 0.0) && (R > par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * tmp5; } else if ((par[3] == 0.0) && (R <= par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * tmp4 * tmp6; } else if ((par[3] != 0.0) && (R > par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * TEMP[i] * tmp5; } else if ((par[3] != 0.0) && (R <= par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * TEMP[i] * tmp4 * tmp6; } for (i = 0; i < M; i++) { kap = KAP[i]; r = i * hrmax; < rest of loop omitted > } } Maybe not the prettiest piece of code, but certainly much more efficient than the original loop, Copy operations Several programs unnecessarily copy data from one data structure to another. This problem occurs in both Fortran and C programs, although it manifests itself differently in the two languages. Code 1 declares two arrays—one for old values and one for new values. At the end of each iteration, the array of new values is copied to the array of old values to reset the data structures for the next iteration. This problem occurs in Fortran programs not included in this study and in both Fortran 77 and Fortran 90 code. Introducing pointers to the arrays and swapping pointer values is an obvious way to eliminate the copying; but pointers is not a feature that many Fortran programmers know well or are comfortable using. An easy solution not involving pointers is to extend the dimension of the value array by 1 and use the last dimension to differentiate between arrays at different times. For example, if the data space is N x N, declare the array (N, N, 2). Then store the problem’s initial values in (_, _, 2) and define the scalar names new = 2 and old = 1. At the start of each iteration, swap old and new to reset the arrays. The old–new copy problem did not appear in any C program. In programs that had new and old values, the code swapped pointers to reset data structures. Where unnecessary coping did occur is in structure assignment and parameter passing. Structures in C are handled much like scalars. Assignment causes the data space of the right-hand name to be copied to the data space of the left-hand name. Similarly, when a structure is passed to a function, the data space of the actual parameter is copied to the data space of the formal parameter. If the structure is large and the assignment or function call is in an inner loop, then copying costs can grow quite large. While none of the ten programs considered here manifested this problem, it did occur in programs not included in the study. A simple fix is always to refer to structures via pointers. Optimizing loop structures Since scientific programs spend almost all their time in loops, efficient loops are the key to good performance. Conditionals, function calls, little instruction level parallelism, and large numbers of temporary values make it difficult for the compiler to generate tightly packed, highly efficient code. Conditionals and function calls introduce jumps that disrupt code flow. Users should eliminate or isolate conditionls to their own loops as much as possible. Often logical expressions can be substituted for if-then-else statements. For example, code 2 includes the following snippet MaxDelta = 0.0 do J = 1, N do I = 1, M < code omitted > Delta = abs(OldValue ? NewValue) if (Delta > MaxDelta) MaxDelta = Delta enddo enddo if (MaxDelta .gt. 0.001) goto 200 Since the only use of MaxDelta is to control the jump to 200 and all that matters is whether or not it is greater than 0.001, I made MaxDelta a boolean and rewrote the snippet as MaxDelta = .false. do J = 1, N do I = 1, M < code omitted > Delta = abs(OldValue ? NewValue) MaxDelta = MaxDelta .or. (Delta .gt. 0.001) enddo enddo if (MaxDelta) goto 200 thereby, eliminating the conditional expression from the inner loop. A microprocessor can execute many instructions per instruction cycle. Typically, it can execute one or more memory, floating point, integer, and jump operations. To be executed simultaneously, the operations must be independent. Thick loops tend to have more instruction level parallelism than thin loops. Moreover, they reduce memory traffice by maximizing data reuse. Loop unrolling and loop fusion are two techniques to increase the size of loop bodies. Several of the codes studied benefitted from loop unrolling, but none benefitted from loop fusion. This observation is not too surpising since it is the general tendency of programmers to write thick loops. As loops become thicker, the number of temporary values grows, increasing register pressure. If registers spill, then memory traffic increases and code flow is disrupted. A thick loop with many temporary values may execute slower than an equivalent series of thin loops. The biggest gain will be achieved if the thick loop can be split into a series of independent loops eliminating the need to write and read temporary arrays. I found such an occasion in code 10 where I split the loop do i = 1, n do j = 1, m A24(j,i)= S24(j,i) * T24(j,i) + S25(j,i) * U25(j,i) B24(j,i)= S24(j,i) * T25(j,i) + S25(j,i) * U24(j,i) A25(j,i)= S24(j,i) * C24(j,i) + S25(j,i) * V24(j,i) B25(j,i)= S24(j,i) * U25(j,i) + S25(j,i) * V25(j,i) C24(j,i)= S26(j,i) * T26(j,i) + S27(j,i) * U26(j,i) D24(j,i)= S26(j,i) * T27(j,i) + S27(j,i) * V26(j,i) C25(j,i)= S27(j,i) * S28(j,i) + S26(j,i) * U28(j,i) D25(j,i)= S27(j,i) * T28(j,i) + S26(j,i) * V28(j,i) end do end do into two disjoint loops do i = 1, n do j = 1, m A24(j,i)= S24(j,i) * T24(j,i) + S25(j,i) * U25(j,i) B24(j,i)= S24(j,i) * T25(j,i) + S25(j,i) * U24(j,i) A25(j,i)= S24(j,i) * C24(j,i) + S25(j,i) * V24(j,i) B25(j,i)= S24(j,i) * U25(j,i) + S25(j,i) * V25(j,i) end do end do do i = 1, n do j = 1, m C24(j,i)= S26(j,i) * T26(j,i) + S27(j,i) * U26(j,i) D24(j,i)= S26(j,i) * T27(j,i) + S27(j,i) * V26(j,i) C25(j,i)= S27(j,i) * S28(j,i) + S26(j,i) * U28(j,i) D25(j,i)= S27(j,i) * T28(j,i) + S26(j,i) * V28(j,i) end do end do Conclusions Over the course of the last year, I have had the opportunity to work with over two dozen academic scientific programmers at leading research universities. Their research interests span a broad range of scientific fields. Except for two programs that relied almost exclusively on library routines (matrix multiply and fast Fourier transform), I was able to improve significantly the single processor performance of all codes. Improvements range from 2x to 15.5x with a simple average of 4.75x. Changes to the source code were at a very high level. I did not use sophisticated techniques or programming tools to discover inefficiencies or effect the changes. Only one code was parallel despite the availability of parallel systems to all developers. Clearly, we have a problem—personal scientific research codes are highly inefficient and not running parallel. The developers are unaware of simple optimization techniques to make programs run faster. They lack education in the art of code optimization and parallel programming. I do not believe we can fix the problem by publishing additional books or training manuals. To date, the developers in questions have not studied the books or manual available, and are unlikely to do so in the future. Short courses are a possible solution, but I believe they are too concentrated to be much use. The general concepts can be taught in a three or four day course, but that is not enough time for students to practice what they learn and acquire the experience to apply and extend the concepts to their codes. Practice is the key to becoming proficient at optimization. I recommend that graduate students be required to take a semester length course in optimization and parallel programming. We would never give someone access to state-of-the-art scientific equipment costing hundreds of thousands of dollars without first requiring them to demonstrate that they know how to use the equipment. Yet the criterion for time on state-of-the-art supercomputers is at most an interesting project. Requestors are never asked to demonstrate that they know how to use the system, or can use the system effectively. A semester course would teach them the required skills. Government agencies that fund academic scientific research pay for most of the computer systems supporting scientific research as well as the development of most personal scientific codes. These agencies should require graduate schools to offer a course in optimization and parallel programming as a requirement for funding. About the Author John Feo received his Ph.D. in Computer Science from The University of Texas at Austin in 1986. After graduate school, Dr. Feo worked at Lawrence Livermore National Laboratory where he was the Group Leader of the Computer Research Group and principal investigator of the Sisal Language Project. In 1997, Dr. Feo joined Tera Computer Company where he was project manager for the MTA, and oversaw the programming and evaluation of the MTA at the San Diego Supercomputer Center. In 2000, Dr. Feo joined Sun Microsystems as an HPC application specialist. He works with university research groups to optimize and parallelize scientific codes. Dr. Feo has published over two dozen research articles in the areas of parallel parallel programming, parallel programming languages, and application performance.

    Read the article

< Previous Page | 120 121 122 123 124 125 126 127 128 129 130 131  | Next Page >