Search Results

Search found 18533 results on 742 pages for 'intel visual fortran'.

Page 344/742 | < Previous Page | 340 341 342 343 344 345 346 347 348 349 350 351 | Next Page >

Python module: Trouble Installing Bitarray 0.8.0 on Mac OSX 10.7.4

- by Gabriele

I'm new here! I have trouble installing bitarray (vers 0.8.0) on my Mac OSX 10.7.4. Thanks! ('gcc' does not seem to be the problem) Last login: Sun Sep 9 22:24:25 on ttys000 host-001:~ gabriele$ gcc -version i686-apple-darwin11-llvm-gcc-4.2: no input files host-001:~ gabriele$ Last login: Sun Sep 9 22:18:41 on ttys000 host-001:~ gabriele$ cd /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/bitarray-0.8.0/ host-001:bitarray-0.8.0 gabriele$ python2.7 setup.py installrunning install running bdist_egg running egg_info creating bitarray.egg-info writing bitarray.egg-info/PKG-INFO writing top-level names to bitarray.egg-info/top_level.txt writing dependency_links to bitarray.egg-info/dependency_links.txt writing manifest file 'bitarray.egg-info/SOURCES.txt' reading manifest file 'bitarray.egg-info/SOURCES.txt' writing manifest file 'bitarray.egg-info/SOURCES.txt' installing library code to build/bdist.macosx-10.6-intel/egg running install_lib running build_py creating build creating build/lib.macosx-10.6-intel-2.7 creating build/lib.macosx-10.6-intel-2.7/bitarray copying bitarray/__init__.py -> build/lib.macosx-10.6-intel-2.7/bitarray copying bitarray/test_bitarray.py -> build/lib.macosx-10.6-intel-2.7/bitarray running build_ext building 'bitarray._bitarray' extension creating build/temp.macosx-10.6-intel-2.7 creating build/temp.macosx-10.6-intel-2.7/bitarray gcc-4.2 -fno-strict-aliasing -fno-common -dynamic -isysroot /Developer/SDKs/MacOSX10.6.sdk -arch i386 -arch x86_64 -g -O2 -DNDEBUG -g -O3 -I/Library/Frameworks/Python.framework/Versions/2.7/include/python2.7 -c bitarray/_bitarray.c -o build/temp.macosx-10.6-intel-2.7/bitarray/_bitarray.o unable to execute gcc-4.2: No such file or directory error: command 'gcc-4.2' failed with exit status 1 host-001:bitarray-0.8.0 gabriele$

Read the article
Recovering data from failed Raid configuration with 4 drives and two raid sets (Asus P6T / Intel ICH10r)

- by user56365

I've added the complete detailed version for my question below for those who can help, but want to quickly summarize my question first. I setup two Raid arrays using (4) WD Raptors, a striped set for the OS and 1+0 set for crucial data. After booting once out of the 50 times a cable fell out, the drive wasn't recognized in the array anymore. After trying to fix it, another drive did the same. I now have two drives remaining, luckily with the parity information. I know the striped set is gone, but I need the data on the other set. Can anyone recommend anything to recover the data, or fix the two drives that doesn't allow the raid controller to recognize the drives, even though they are listed on the utility screen as still apart of the configuration but that they are not found? More Details I recently upgraded to a ASUS P6T motherboard with an Intel ICH10R raid controller and changed my previous 4 drive raid array from strictly a Raid 1+0 set to a Raid 0 for the OS/Page/Scratch drive and a Raid 1+0 set for crucial data. I never had problems after upgrading with my configuration, even when a drive died and was replaced. I managed to rebuild the array fine. Unfortunately this time around, a cable came unattached and I booted my system up until the raid status screen with the degraded error. This shouldn't have been a problem, but after I attached the drive it was no longer recognized as a member in the array. Both drives actually show up as a non-member disk. I've spent a very, very long time online trying to find information or support and haven't had much luck. After spending time trying to scan the drive for errors, damaged partition info, etc.. another drive in the set decided it didn't want to be recognized as a part of the array. At this point, I have two out of the four drives still functioning, but the Raid 1+0 array went from degraded to failed and I must find a way to retrieve that data. I think the two drives still in the array have the parity information because they show up as OS (110GB),BACKUP(80GB) and OS:1(110GB),BACKUP(80GB) under windows data management. The other two are simply 74gb Raw unallocated Is it possible recover the data using those two drive only, and which tool would I use? Could it be a simple partition table or any other error that is repairable with hard drive utilities out there? I know the Raid 0 set is done for, but I would assume because the correct drives failed in a 1+0 config to save the data I can retrieve it some how.

Read the article
Is Visual Source Safe (The latest Version) really that bad? Why? What's the Best Alternative? Why? [closed]

- by hanzolo

Over the years I've constantly heard horror stories, had people say "Real Programmers Dont Use VSS", and so on. BUT, then in the workplace I've worked at two companies, one, a very well known public facing high traffic website, and another high end Financial Services "Web-Based" hosted solution catering to some very large, very well known companies, which is where I currently Reside and everything's working just fine (KNOCK KNOCK!!). I'm constantly interfacing with EXTREMELY Old technology with some of these financial institutions.. OLD LIKE YOU WOULDN'T BELIEVE.. which leads me to the conclusion that if it works "LEAVE IT", and that maybe there's some value in old technology? at least enough value to overrule a rewrite!? right?? Is there something fundamentally flawed with the underlying technology that VSS uses? I have a feeling that if i said "someone said VSS Sucks" they would beg to differ, most likely give me this look like i dont know -ish, and I'd never gain back their respect and my credibility (well, that'll be hard to blow.. lol), BUT, give me an argument that I can take to someone whose been coding for 30 years, that builds Platforms that leverage current technology (.NET 3.5 / SQL 2008 R2 ), write's their own ORM with scaffolding and is able to provide a quality platform that supports thousands of concurrent users on a multi-tenant hosted solution, and does not agree with any benefits from having Source Control Integrated, and yet uses the Infamous Visual Source Safe. I have extensive experience with TFS up to 2010, and honestly I think it's great when a team (beyond developers) can embrace it. I've worked side by side with someone whose a die hard SVN'r and from a purist standpoint, I see the beauty in it (I need a bit more, out of my SS, but it surely suffices). So, why are such smarties not running away from Visual Source Safe? surely if it was so bad, it would've have been realized by now, and I would not be sitting here with this simple old, Check In, Check Out, Version Resistant, Label Intensive system. But here I am... I would love to drop an argument that would be the end all argument, but if it's a matter of opinion and personal experience, there seems to be too much leeway for keeping VSS. UPDATE: I guess the best case is to have the VSS supporters check other people's experiences and draw from that until we (please no) experience the breaking factor ourselves. Until then, i wont be engaging in a discussion to migrate off of VSS.. UPDATE 11-2012: So i was able to convince everyone at my work place that since MS is sun downing Visual Source Safe it might be time to migrate over to TFS. I was able to convince them and have recently upgraded our team to Visual Studio 2012 and TFS 2012. The migration was fairly painless, had to run analyze.exe which found a bunch of errors (not sure they'll ever affect the project) and then manually run the VSSConverter.exe. Again, painless, except it took 16 hours to migrate 5 years worth of everything.. and now we're on TFS.. much more integrated.. much more cooler.. so all in all, VSS served it's purpose for years without hick-up. There were no horror stories and Visual Source Save as source control worked just fine. so to all the nay sayers (me included). there's nothing wrong with using VSS. i wouldnt start a new project with it, and i would definitely consider migrating to TFS. (it's really not super difficult and a new "wizard" type converter is due out any day now so migrating should be painless). But from my experience, it worked just fine and got the job done.

Read the article
Combining two operators in Evil-mode Emacs

- by Dyslexic Tangent

In vim I've remapped > and < when in visual mode to >gv and <gv respectively, like so: vnoremap > >gv vnoremap < <gv Since my target for this question are folks experienced with emacs and not vim, what > and < do is indent/dedent visually selected text. What gv does is reselect the previously selected text. These maps cause > and < to indent/dedent and then reselect the previously selected text. I'm trying out emacs with evil-mode and I'd like to do the same, but I'm having some difficulty figuring out how, exactly, to accomplish the automatic reselection. It looks like I need to somehow call evil-shift-right and evil-visual-restore sequentially, but I don't know how to create a map that will do both, so I tried creating my own function which would call both sequentially and map that instead, but it didn't work, possibly due to the fact that both of them are defined, not as functions with defun but instead as operators with evil-define-operator. I tried creating my own operators: (evil-define-operator shift-left-reselect (beg end) (evil-shift-left beg end) (evil-visual-restore)) (evil-define-operator shift-right-reselect (beg end) (evil-shift-right beg end) (evil-visual-restore)) but that doesn't restore visual as expected. A stab in the dark gave me this: (evil-define-operator shift-left-reselect (beg end) (evil-shift-left beg end) ('evil-visual-restore)) (evil-define-operator shift-right-reselect (beg end) (evil-shift-right beg end) ('evil-visual-restore)) but that selects one additional line whenever it is supposed to reselect. For now I've been using the following, which only has the problem where it reselects an additional line in the < operator. (evil-define-operator shift-right-reselect (beg end) (evil-shift-right beg end) (evil-visual-make-selection beg end)) (evil-define-operator shift-left-reselect (beg end) (evil-shift-left beg end) (evil-visual-make-selection beg end)) and I've mapped them: (define-key evil-visual-state-map ">" 'shift-right-reselect) (define-key evil-visual-state-map "<" 'shift-left-reselect) any help / pointers / tips would be greatly appreciated. Thanks in advance.

Read the article
Suitable Ubuntu distribution

- by Dr AMD

I need help choosing a suitable distrbution for my PC. I am using an HP d530 CMT with: ?• CPU Type: Intel Pentium 4, 3000 MHz (15 x 200) ?• Motherboard Chipset: Intel Springdale-G i865G ?• System Memory: 1015 MB (PC3200 DDR SDRAM) ?• Video Adapter: Intel(R) 82865G Graphics Controller (96 MB) ?• 3D Accelerator: Intel Extreme Graphics 2 ?• Audio Adapter: Analog Devices AD1981B(L) @ Intel 82801EB ICH5 - AC'97 Audio Controller ?• Network Adapter: Broadcom NetXtreme Gigabit Ethernet I have tried to install Ubuntu 13.10 and 12.04 LTS. Everything is OK on Ubuntu 12.04 except, that the video card was not recognized and the media player, YouTube,etc. did not work properly.

Read the article
xutility file???

- by user574290

Hi all. I'm trying to use c code with opencv in face detection and counting, but I cannot build the source. I am trying to compile my project and I am having a lot of problems with a line in the xutility file. the error message show that it error with xutility file. Please help me, how to solve this problem? this is my code // Include header files #include "stdafx.h" #include "cv.h" #include "highgui.h" #include <stdio.h> #include <stdlib.h> #include <string.h> #include <assert.h> #include <math.h> #include <float.h> #include <limits.h> #include <time.h> #include <ctype.h> #include <iostream> #include <fstream> #include <vector> using namespace std; #ifdef _EiC #define WIN32 #endif int countfaces=0; int numFaces = 0; int k=0 ; int list=0; char filelist[512][512]; int timeCount = 0; static CvMemStorage* storage = 0; static CvHaarClassifierCascade* cascade = 0; void detect_and_draw( IplImage* image ); void WriteInDB(); int found_face(IplImage* img,CvPoint pt1,CvPoint pt2); int load_DB(char * filename); const char* cascade_name = "C:\\Program Files\\OpenCV\\OpenCV2.1\\data\\haarcascades\\haarcascade_frontalface_alt_tree.xml"; // BEGIN NEW CODE #define WRITEVIDEO char* outputVideo = "c:\\face_counting1_tracked.avi"; //int faceCount = 0; int posBuffer = 100; int persistDuration = 10; //faces can drop out for 10 frames int timestamp = 0; float sameFaceDistThreshold = 30; //pixel distance CvPoint facePositions[100]; int facePositionsTimestamp[100]; float distance( CvPoint a, CvPoint b ) { float dist = sqrt(float ( (a.x-b.x)*(a.x-b.x) + (a.y-b.y)*(a.y-b.y) ) ); return dist; } void expirePositions() { for (int i = 0; i < posBuffer; i++) { if (facePositionsTimestamp[i] <= (timestamp - persistDuration)) //if a tracked pos is older than three frames { facePositions[i] = cvPoint(999,999); } } } void updateCounter(CvPoint center) { bool newFace = true; for(int i = 0; i < posBuffer; i++) { if (distance(center, facePositions[i]) < sameFaceDistThreshold) { facePositions[i] = center; facePositionsTimestamp[i] = timestamp; newFace = false; break; } } if(newFace) { //push out oldest tracker for(int i = 1; i < posBuffer; i++) { facePositions[i] = facePositions[i - 1]; } //put new tracked position on top of stack facePositions[0] = center; facePositionsTimestamp[0] = timestamp; countfaces++; } } void drawCounter(IplImage* image) { // Create Font char buffer[5]; CvFont font; cvInitFont(&font, CV_FONT_HERSHEY_SIMPLEX, .5, .5, 0, 1); cvPutText(image, "Faces:", cvPoint(20, 20), &font, CV_RGB(0,255,0)); cvPutText(image, itoa(countfaces, buffer, 10), cvPoint(80, 20), &font, CV_RGB(0,255,0)); } #ifdef WRITEVIDEO CvVideoWriter* videoWriter = cvCreateVideoWriter(outputVideo, -1, 30, cvSize(240, 180)); #endif //END NEW CODE int main( int argc, char** argv ) { CvCapture* capture = 0; IplImage *frame, *frame_copy = 0; int optlen = strlen("--cascade="); const char* input_name; if( argc > 1 && strncmp( argv[1], "--cascade=", optlen ) == 0 ) { cascade_name = argv[1] + optlen; input_name = argc > 2 ? argv[2] : 0; } else { cascade_name = "C:\\Program Files\\OpenCV\\OpenCV2.1\\data\\haarcascades\\haarcascade_frontalface_alt_tree.xml"; input_name = argc > 1 ? argv[1] : 0; } cascade = (CvHaarClassifierCascade*)cvLoad( cascade_name, 0, 0, 0 ); if( !cascade ) { fprintf( stderr, "ERROR: Could not load classifier cascade\n" ); fprintf( stderr, "Usage: facedetect --cascade=\"<cascade_path>\" [filename|camera_index]\n" ); return -1; } storage = cvCreateMemStorage(0); //if( !input_name || (isdigit(input_name[0]) && input_name[1] == '\0') ) // capture = cvCaptureFromCAM( !input_name ? 0 : input_name[0] - '0' ); //else capture = cvCaptureFromAVI( "c:\\face_counting1.avi" ); cvNamedWindow( "result", 1 ); if( capture ) { for(;;) { if( !cvGrabFrame( capture )) break; frame = cvRetrieveFrame( capture ); if( !frame ) break; if( !frame_copy ) frame_copy = cvCreateImage( cvSize(frame->width,frame->height), IPL_DEPTH_8U, frame->nChannels ); if( frame->origin == IPL_ORIGIN_TL ) cvCopy( frame, frame_copy, 0 ); else cvFlip( frame, frame_copy, 0 ); detect_and_draw( frame_copy ); if( cvWaitKey( 30 ) >= 0 ) break; } cvReleaseImage( &frame_copy ); cvReleaseCapture( &capture ); } else { if( !input_name || (isdigit(input_name[0]) && input_name[1] == '\0')) cvNamedWindow( "result", 1 ); const char* filename = input_name ? input_name : (char*)"lena.jpg"; IplImage* image = cvLoadImage( filename, 1 ); if( image ) { detect_and_draw( image ); cvWaitKey(0); cvReleaseImage( &image ); } else { /* assume it is a text file containing the list of the image filenames to be processed - one per line */ FILE* f = fopen( filename, "rt" ); if( f ) { char buf[1000+1]; while( fgets( buf, 1000, f ) ) { int len = (int)strlen(buf); while( len > 0 && isspace(buf[len-1]) ) len--; buf[len] = '\0'; image = cvLoadImage( buf, 1 ); if( image ) { detect_and_draw( image ); cvWaitKey(0); cvReleaseImage( &image ); } } fclose(f); } } } cvDestroyWindow("result"); #ifdef WRITEVIDEO cvReleaseVideoWriter(&videoWriter); #endif return 0; } void detect_and_draw( IplImage* img ) { static CvScalar colors[] = { {{0,0,255}}, {{0,128,255}}, {{0,255,255}}, {{0,255,0}}, {{255,128,0}}, {{255,255,0}}, {{255,0,0}}, {{255,0,255}} }; double scale = 1.3; IplImage* gray = cvCreateImage( cvSize(img->width,img->height), 8, 1 ); IplImage* small_img = cvCreateImage( cvSize( cvRound (img->width/scale), cvRound (img->height/scale)), 8, 1 ); CvPoint pt1, pt2; int i; cvCvtColor( img, gray, CV_BGR2GRAY ); cvResize( gray, small_img, CV_INTER_LINEAR ); cvEqualizeHist( small_img, small_img ); cvClearMemStorage( storage ); if( cascade ) { double t = (double)cvGetTickCount(); CvSeq* faces = cvHaarDetectObjects( small_img, cascade, storage, 1.1, 2, 0/*CV_HAAR_DO_CANNY_PRUNING*/, cvSize(30, 30) ); t = (double)cvGetTickCount() - t; printf( "detection time = %gms\n", t/((double)cvGetTickFrequency()*1000.) ); if (faces) { //To save the detected faces into separate images, here's a quick and dirty code: char filename[6]; for( i = 0; i < (faces ? faces->total : 0); i++ ) { /* CvRect* r = (CvRect*)cvGetSeqElem( faces, i ); CvPoint center; int radius; center.x = cvRound((r->x + r->width*0.5)*scale); center.y = cvRound((r->y + r->height*0.5)*scale); radius = cvRound((r->width + r->height)*0.25*scale); cvCircle( img, center, radius, colors[i%8], 3, 8, 0 );*/ // Create a new rectangle for drawing the face CvRect* r = (CvRect*)cvGetSeqElem( faces, i ); // Find the dimensions of the face,and scale it if necessary pt1.x = r->x*scale; pt2.x = (r->x+r->width)*scale; pt1.y = r->y*scale; pt2.y = (r->y+r->height)*scale; // Draw the rectangle in the input image cvRectangle( img, pt1, pt2, CV_RGB(255,0,0), 3, 8, 0 ); CvPoint center; int radius; center.x = cvRound((r->x + r->width*0.5)*scale); center.y = cvRound((r->y + r->height*0.5)*scale); radius = cvRound((r->width + r->height)*0.25*scale); cvCircle( img, center, radius, CV_RGB(255,0,0), 3, 8, 0 ); //update counter updateCounter(center); int y=found_face(img,pt1,pt2); if(y==0) countfaces++; }//end for printf("Number of detected faces: %d\t",countfaces); }//end if //delete old track positions from facePositions array expirePositions(); timestamp++; //draw counter drawCounter(img); #ifdef WRITEVIDEO cvWriteFrame(videoWriter, img); #endif cvShowImage( "result", img ); cvDestroyWindow("Result"); cvReleaseImage( &gray ); cvReleaseImage( &small_img ); }//end if } //end void int found_face(IplImage* img,CvPoint pt1,CvPoint pt2) { /*if (faces) {*/ CvSeq* faces = cvHaarDetectObjects( img, cascade, storage, 1.1, 2, CV_HAAR_DO_CANNY_PRUNING, cvSize(40, 40) ); int i=0; char filename[512]; for( i = 0; i < (faces ? faces->total : 0); i++ ) {//int scale = 1, i=0; //i=iface; //char filename[512]; /* extract the rectanlges only */ // CvRect face_rect = *(CvRect*)cvGetSeqElem( faces, i); CvRect face_rect = *(CvRect*)cvGetSeqElem( faces, i); //IplImage* gray_img = cvCreateImage( cvGetSize(img), IPL_DEPTH_8U, 1 ); IplImage* clone = cvCreateImage (cvSize(img->width, img->height), IPL_DEPTH_8U, img->nChannels ); IplImage* gray = cvCreateImage (cvSize(img->width, img->height), IPL_DEPTH_8U, 1 ); cvCopy (img, clone, 0); cvNamedWindow ("ROI", CV_WINDOW_AUTOSIZE); cvCvtColor( clone, gray, CV_RGB2GRAY ); face_rect.x = pt1.x; face_rect.y = pt1.y; face_rect.width = abs(pt1.x - pt2.x); face_rect.height = abs(pt1.y - pt2.y); cvSetImageROI ( gray, face_rect); //// * rectangle = cvGetImageROI ( clone ); face_rect = cvGetImageROI ( gray ); cvShowImage ("ROI", gray); k++; char *name=0; name=(char*) calloc(512, 1); sprintf(name, "Image%d.pgm", k); cvSaveImage(name, gray); //////////////// for(int j=0;j<512;j++) filelist[list][j]=name[j]; list++; WriteInDB(); //int found=SIFT("result.txt",name); cvResetImageROI( gray ); //return found; return 0; // }//end if }//end for }//end void void WriteInDB() { ofstream myfile; myfile.open ("result.txt"); for(int i=0;i<512;i++) { if(strcmp(filelist[i],"")!=0) myfile << filelist[i]<<"\n"; } myfile.close(); } Error 3 error C4430: missing type specifier - int assumed. Note: C++ does not support default-int Error 8 error C4430: missing type specifier - int assumed. Note: C++ does not support default-int Error 13 error C4430: missing type specifier - int assumed. Note: C++ does not support default-int c:\program files\microsoft visual studio 9.0\vc\include\xutility 766 Error 18 error C4430: missing type specifier - int assumed. Note: C++ does not support default-int c:\program files\microsoft visual studio 9.0\vc\include\xutility 768 Error 23 error C4430: missing type specifier - int assumed. Note: C++ does not support default-int c:\program files\microsoft visual studio 9.0\vc\include\xutility 769 Error 10 error C2868: 'std::iterator_traits<_Iter>::value_type' : illegal syntax for using-declaration; expected qualified-name c:\program files\microsoft visual studio 9.0\vc\include\xutility 765 Error 25 error C2868: 'std::iterator_traits<_Iter>::reference' : illegal syntax for using-declaration; expected qualified-name c:\program files\microsoft visual studio 9.0\vc\include\xutility 769 Error 20 error C2868: 'std::iterator_traits<_Iter>::pointer' : illegal syntax for using-declaration; expected qualified-name c:\program files\microsoft visual studio 9.0\vc\include\xutility 768 Error 5 error C2868: 'std::iterator_traits<_Iter>::iterator_category' : illegal syntax for using-declaration; expected qualified-name c:\program files\microsoft visual studio 9.0\vc\include\xutility 764 Error 15 error C2868: 'std::iterator_traits<_Iter>::difference_type' : illegal syntax for using-declaration; expected qualified-name c:\program files\microsoft visual studio 9.0\vc\include\xutility 766 Error 9 error C2602: 'std::iterator_traits<_Iter>::value_type' is not a member of a base class of 'std::iterator_traits<_Iter>' c:\program files\microsoft visual studio 9.0\vc\include\xutility 765 Error 24 error C2602: 'std::iterator_traits<_Iter>::reference' is not a member of a base class of 'std::iterator_traits<_Iter>' c:\program files\microsoft visual studio 9.0\vc\include\xutility 769 Error 19 error C2602: 'std::iterator_traits<_Iter>::pointer' is not a member of a base class of 'std::iterator_traits<_Iter>' c:\program files\microsoft visual studio 9.0\vc\include\xutility 768 Error 4 error C2602: 'std::iterator_traits<_Iter>::iterator_category' is not a member of a base class of 'std::iterator_traits<_Iter>' c:\program files\microsoft visual studio 9.0\vc\include\xutility 764 Error 14 error C2602: 'std::iterator_traits<_Iter>::difference_type' is not a member of a base class of 'std::iterator_traits<_Iter>' c:\program files\microsoft visual studio 9.0\vc\include\xutility 766 Error 7 error C2146: syntax error : missing ';' before identifier 'value_type' c:\program files\microsoft visual studio 9.0\vc\include\xutility 765 Error 22 error C2146: syntax error : missing ';' before identifier 'reference' c:\program files\microsoft visual studio 9.0\vc\include\xutility 769 Error 17 error C2146: syntax error : missing ';' before identifier 'pointer' c:\program files\microsoft visual studio 9.0\vc\include\xutility 768 Error 2 error C2146: syntax error : missing ';' before identifier 'iterator_category' c:\program files\microsoft visual studio 9.0\vc\include\xutility 764 Error 12 error C2146: syntax error : missing ';' before identifier 'difference_type' c:\program files\microsoft visual studio 9.0\vc\include\xutility 766 Error 6 error C2039: 'value_type' : is not a member of 'CvPoint' c:\program files\microsoft visual studio 9.0\vc\include\xutility 765 Error 21 error C2039: 'reference' : is not a member of 'CvPoint' c:\program files\microsoft visual studio 9.0\vc\include\xutility 769 Error 16 error C2039: 'pointer' : is not a member of 'CvPoint' c:\program files\microsoft visual studio 9.0\vc\include\xutility 768 Error 1 error C2039: 'iterator_category' : is not a member of 'CvPoint' c:\program files\microsoft visual studio 9.0\vc\include\xutility 764 Error 11 error C2039: 'difference_type' : is not a member of 'CvPoint' c:\program files\microsoft visual studio 9.0\vc\include\xutility 766

Read the article
No HDMI sound output on Thinkpad X1

- by nickf

I'm having problems getting my sound to output via HDMI to my TV. When I go to Sound Settings, the HDMI device does not appear. ~$ aplay -l **** List of PLAYBACK Hardware Devices **** card 0: PCH [HDA Intel PCH], device 0: CONEXANT Analog [CONEXANT Analog] Subdevices: 1/1 Subdevice #0: subdevice #0 card 0: PCH [HDA Intel PCH], device 3: HDMI 0 [HDMI 0] Subdevices: 1/1 Subdevice #0: subdevice #0 card 0: PCH [HDA Intel PCH], device 7: HDMI 1 [HDMI 1] Subdevices: 1/1 Subdevice #0: subdevice #0 card 0: PCH [HDA Intel PCH], device 8: HDMI 2 [HDMI 2] Subdevices: 1/1 Subdevice #0: subdevice #0 I don't know if the video information is helpful, but anyway: ~$ sudo lshw -C video *-display description: VGA compatible controller product: 2nd Generation Core Processor Family Integrated Graphics Controller vendor: Intel Corporation physical id: 2 bus info: pci@0000:00:02.0 version: 09 width: 64 bits clock: 33MHz capabilities: msi pm vga_controller bus_master cap_list rom configuration: driver=i915 latency=0 resources: irq:46 memory:d0000000-d03fffff memory:c0000000-cfffffff ioport:5000(size=64) Any suggestions for me?

Read the article
CodePlex Daily Summary for Friday, November 02, 2012

CodePlex Daily Summary for Friday, November 02, 2012Popular ReleasesVST.NET: VST.NET 1.0: Long overdue, but here is version 1.0! The zip contains the Debug and Release binaries for x86 and x64 as well as .NET 2.0 and .NET 4.0. Note that the samples sources are not included in the release. Refer to "Building the Source Code" to get them working in the Visual Studio Express editions. The documentation will be uploaded later... Changes: Removed nuget and fixed samples copy operation (*.exe). Finalized build automation support. Bug fixes in VS templates. Compiled and Packaged n...DevTreks -social budgeting that improves lives and livelihoods: DevTreks Version 1.0: This is the first production release.Mouse Jiggler: MouseJiggle-1.3: This adds the much-requested minimize-to-tray feature to Mouse Jiggler.Umbraco CMS: Umbraco 4.10.0 Release Candidate: This is a Release Candidate, which means that if we do not find any major issues in the next week, we will release this version as the final release of 4.10.0 on November 9th, 2012. The documentation for the MVC bits still lives in the Github version of the docs for now and will be updated on our.umbraco.org with the final release of 4.10.0. Browse the documentation here: https://github.com/umbraco/Umbraco4Docs/tree/4.8.0/Documentation/Reference/Mvc If you want to do only MVC then make sur...Skype Auto Recorder: SkypeAutoRecorder 1.3.4: New icon and images. Reworked settings window. Implemented high-quality sound encoding. Implemented a possibility to produce stereo records. Added buttons with system-wide hot keys for manual starting and canceling of recording. Added buttons for opening folder with records. Added Help button. Fixed an issue when recording is continuing after call end. Fixed an issue when recording doesn't start. Fixed several bugs and improved stability. Major refactoring and optimization...Access 2010 Application Platform - Build Your Own Database: Application Platform - 0.0.1: Initial Release This is the first version of the database. At the moment is all contained in one file to make development easier, but the obvious idea would be to split it into Front and Back End for a production version of the tool. The features it contains at the moment are the "Core" features.Python Tools for Visual Studio: Python Tools for Visual Studio 1.5: We’re pleased to announce the release of Python Tools for Visual Studio 1.5 RTM. Python Tools for Visual Studio (PTVS) is an open-source plug-in for Visual Studio which supports programming with the Python language. PTVS supports a broad range of features including CPython/IronPython, Edit/Intellisense/Debug/Profile, Cloud, HPC, IPython, etc. support. For a quick overview of the general IDE experience, please watch this video There are a number of exciting improvement in this release comp...Devpad: 4.25: Whats new for Devpad 4.25: New Theme support New Export Wordpress Minor Bug Fix's, improvements and speed upsAssaultCube Reloaded: 2.5.5: Linux has Ubuntu 11.10 32-bit precompiled binaries and Ubuntu 10.10 64-bit precompiled binaries, but you can compile your own as it also contains the source. If you are using Mac or other operating systems, please wait while we try to package for those OSes. Try to compile it. If it fails, download a virtual machine. The server pack is ready for both Windows and Linux, but you might need to compile your own for Linux (source included) Changelog: Fixed potential bot bugs: Map change, OpenAL...Edi: Edi 1.0 with DarkExpression: Added DarkExpression theme (dialogs and message boxes are not completely themed, yet)DirectX Tool Kit: October 30, 2012 (add WP8 support): October 30, 2012 Added project files for Windows Phone 8MCEBuddy 2.x: MCEBuddy 2.3.6: Changelog for 2.3.6 (32bit and 64bit) 1. Fixed a bug in multichannel audio conversion failure. AAC does not support 6 channel audio, MCEBuddy now checks for it and force the output to 2 channel if AAC codec is specified 2. Fixed a bug in Original Broadcast Date and Time. Original Broadcast Date and Time is reported in UTC timezone in WTV metadata. TVDB and MovieDB dates are reported in network timezone. It is assumed the video is recorded and converted on the same machine, i.e. local timezone...MVVM Light Toolkit: MVVM Light Toolkit V4.1 for Visual Studio 2012: This version only supports Visual Studio 2012 (and all Express editions too). If you use Visual Studio 2010, please stay tuned, we will publish an update in a few days with support for VS10. V4.1 supports: Windows Phone 8 Windows 8 (Windows RT) Silverlight 5 Silverlight 4 WPF 4.5 WPF 4 WPF 3.5 And the following development environments: Visual Studio 2012 (Pro, Premium, Ultimate) Visual Studio 2012 Express for Windows 8 Visual Studio 2012 Express for Windows Phone 8 Visual...Microsoft Ajax Minifier: Microsoft Ajax Minifier 4.73: Fix issue in Discussion #401101 (unreferenced var in a for-in statement was getting removed). add the grouping operator to the parsed output so that unminified parsed code is closer to the original. Will still strip unneeded parens later, if minifying. more cleaning of references as they are minified out of the code.RiP-Ripper & PG-Ripper: PG-Ripper 1.4.03: changes NEW: Added Support for the phun.org forum FIXED: Kitty-Kats new Forum UrlLiberty: v3.4.0.1 Release 28th October 2012: Change Log -Fixed -H4 Fixed the save verification screen showing incorrect mission and difficulty information for some saves -H4 Hopefully fixed the issue where progress did not save between missions and saves would not revert correctly -H3 Fixed crashes that occurred when trying to load player information -Proper exception dialogs will now show in place of crashesPlayer Framework by Microsoft: Player Framework for Windows 8 (Preview 7): This release is compatible with the version of the Smooth Streaming SDK released today (10/26). Release 1 of the player framework is expected to be available next week. IMPROVEMENTS & FIXESIMPORTANT: List of breaking changes from preview 6 Support for the latest smooth streaming SDK. Xaml only: Support for moving any of the UI elements outside the MediaPlayer (e.g. into the appbar). Note: Equivelent changes to the JS version due in coming week. Support for localizing all text used in t...Send multiple SMS via Way2SMS C#: SMS 1.1: Added support for 160by2Quick Launch: Quick Launch 1.0: A Lightweight and Fast Way to Manage and Launch Thousands of Tools and ApplicationsPress Win+Q and start to search and run. http://www.codeplex.com/Download?ProjectName=quicklaunch&DownloadId=523536Orchard Project: Orchard 1.6: Please read our release notes for Orchard 1.6: http://docs.orchardproject.net/Documentation/Orchard-1-6-Release-Notes Please do not post questions as reviews. Questions should be posted in the Discussions tab, where they will usually get promptly responded to. If you post a question as a review, you will pollute the rating, and you won't get an answer.New ProjectsAnother Green World: Another Green WorldApplication Data across Web Farm: A library containing a new version of the HttpApplicationState class that allow the synchronization of data across a web farm. With this library, on the web farm Data can be: • Shared • Auto synchronized • Locked for a useras3 game: it's as3 framework . Context: UIFramework,GameFrameworkBit Moose: Bit Moose is a bitcoin mining assistant program. It allows miners to run under a background windows service. Includes a GUI and console host.BUMO: A TFS Build MOnitoring Tool: BUMO is short name of Build Monitoring tool for TFS. BUMO provides a platform to Monitor TFS builds, Statistics of Builds,View Build History, Clone Builds.cantinho: cantinhoCMTS: CMTSCP866U Encoding: A simple implementation of cp866u encoding class written in C#.D3D9Client: This is a DirectX 9 Graphics Client for Orbiter SpaceFlight SimulatorDatabase Exporter for DotNetNuke By IowaComputerGurus Inc.: Database Exporter is a customized SQL module designed for selects and supports export to XML/CSV of query resultsDistributed Systems at TUM: The initial project is a simple client connecting to a server through sockets. The client may send a message and the server will echo back that message.Enneract Project: Enneract makes it easier for LDS Leadership works in theirs responsibilities through BI techniques.EntityFramework Reverse POCO Code First Generator: Reverse engineers an existing database and generates EntityFramework Code First POCO classes, DbContext and Configuration mappingsfastCSharp: ?????????????????，????????????????????，??????。 ??：????????????????，???????????，???????????。 ??? http://www.51nod.com/topic/index.html#!topicId=100000056Fault Logger!: Fault Logger is a simple web application ideal for schools and small business which allows staff to report computer faults and allows technicians to keep logs.Flow Sequencer: Simple task sequencer using Johnson's algorithm for two and three machines. Client side is made in wpf technology. Free Template Filler (FTF): Tool for generating text from templates, when given parameters. Good for generating code, etc.GameTrakXNA: This project aims to create a simple library to use the unique GameTrak controller within XNA and Flash.Greg DNN Task Manager: This is a test project that I am using to learn about DNN Module Development.GroupToolbar: A Toolbar that can group your items and is totally templatable :)Helper Project: Helpdesk project, using C# .NET, Linq, with Nhibernate ORM and interface written using ExtJS 4.HIPO Legacy Systems Flowcharter: Creates Visio HIPO flowcharts from legacy system batch files.huangli1101: jabbr projectHusqvarna Svishtov: The idea is to create content management system for web site of a small town store.Hybrid Lab Workflow: This workflow allows you to incorporate snapshots into a Build-Deploy-Test using a Standard Environment that is composed of Virtual Machines in TFS 2012.jas: Project to make an application for Jeugdondernemingen Aartselaar Service...keleyi: MD5 C# WinFormKnockout.js Declarations for TypeScript: A set of declarations for intellisense and type completion for Knockout.js in TypeScript. Last edited Today at 8:39 AM by jnosek, version 2List Rollup web part: Hi I was wondering that how to show a list from other site on my home page. But I didn’t get any proper & free solution. So I decided to create a Cross Site LisMidlands Community Management Solution (MCMS): This project is to develop an open source residential community management solution. This initiative has been taken by the IT guys of Srijan Midlands Community.MultipartHttpClient for Windows Phone 7: This is a simple MultipartHttpClient for windows phone 7MyGProject: GprojectN2F Yverdon FirePHP Extension: Extension to add FirePHP support to your N2F Yverdon projects.netbee: ???Over Look Pa Controller: Real World Application for the Collection of Credit Card, Gift Card and Driver License information. Controls access to an Observation Tower via a turnstile.PureSystems DotNetNuke GoSquared tracking module: A DotNetNuke module which adds the GoSquared tracking code to your pages.Quiz Module for DotNetNuke by IowaComputerGurus Inc.: The IowaComputerGurus DotNetNuke Quiz Module is a free extension that allows users to quickly and easily create custom quizzes for their site.RadioSmart: ? ??d??a? t?? radiosmartRSA ID Validation for SQL Server: Solution for validating South African identity numbers. Provides SQL Server CLR bindings which allow identity numbers to be efficiently validated within T-SQLSales Visualization Web App: Sales VisualizationsSonar Connector (Wagga Wagga Christian College Network connector): A network settings manager for Wagga Wagga christian college. Supports switching back and forth settings and can be used for personal use.Stopwatch - Windows Phone: This project is a Stopwatch for windows phone app. Now, it had published in Windows phone store.Study: Study for ExtJS! Come on!Team Foundation Server Test Management Tools: Team Foundation Server Test Management Tools ? Team Foundation Server ???? ???? Visual Studio ??????????????????????????。testdd11012012git01: ttesttom11012012tfs02: gdgf dgf dTimBazinga EVoting: Undergrad project - designing an e-voting software system.TNT Scripts: TNT ScriptsTrombone: Trombone makes it easier for Windows Mobile Professional users to automate status reply through SMS. It's developed in Visual C# 2008.TurbofilmTV: Turbofilm metro is the greate metro app for turbofilm users.VinculacionMicrosoft: Vinculacion Microsoft is a project for distributing Dreamsparks and Faculty Connection codes to students and professors. It is developed in ASP .Net and designed for Universities in Mexico interested in the different benefits that Microsoft has for them. Visual Studio Reference Swap: Winforms app and Nant task that will handle swapping out project references to file references.??????: ??????????: ??

Read the article
Windows Azure Evolution – Welcome to VS2012

- by Shaun

When the Microsoft released the first preview version of Windows 8 and Visual Studio, many people in the community were asking if the windows azure tool is available to it. The answer was “NO”. Microsoft promised that the windows azure tool will only support the Visual Studio 2010 but when the 2012 was final released, windows azure tool should be work. But now alone with the new windows azure platform was published we got the latest Windows Azure SDK 1.7, which is compatible to the Visual Studio 2012 RC. You can retrieve the latest version of the Windows Azure SDK through Web Platform Installer, which I think it’s the easiest and simplest way to download and install, since besides the SDK itself it also needs some other components. To download the latest windows azure SDK from Web Platform Installer, just go to the windows azure website and clicked the Develop, .NET and click the blue “install” button. Then you need to select which version of Visual Studio you want to use, Visual Studio 2010 or Visual Studio 2012 RC. After selected the current version you will download an EXE file. This file will lead you to install the Web Platform Installer 4.0 (if you haven’t installed) and the latest windows azure SDK. You can see the version name is June 2012, 1.7. Finally the WebPI will detect the dependent components you need to download and begin to install. But if you want to challenge yourself you can download the components and install them manually. The standalone installations are listed in this page with the instruction on how to install them with necessary pre-requirements. Once you finished the installation you can open the Visual Studio 2012 RC and as usual, it need to be run as administrator. If you clicked the New Project link from the start page, navigated to Cloud category you will find that there no project template available. Is there anything wrong? So, if you changed the target framework from the default .NET 4.5 to .NET 4 you will see the azure project template. This is because, currently the windows azure instance does not support .NET 4.5. After clicked OK you will see the role creation window, which is similar as what you have seen before. But there are some new role templates in this SDK. Firstly you will have ASP.NET MVC 4 web role available, which means you can create ASP.NET MVC 4 applications for internet, intranet, mobile and WebAPI on the cloud. Then there are two new worker role templates, “Cache Worker Role” and “Worker Role with Service Bus Queue”. “Worker Role with Service Bus Queue” is a worker role which had added necessary references to access the Windows Azure Service Bus Queue. It also have some basic sample code in the worker role class which could read messages from the queue when started. The “Cache Worker Role” is a worker role which has the in-memory distributed cache feature enabled by default. This feature is different than the Windows Azure Caching. It allows the role instance to use its memory as a in-memory distributed cache clusters. By using this feature you can have one or more worker roles as some dedicate cache clusters. Alternatively, you can make part of your web role and worker role’s memory as the cache clusters as well. Let’s just create an ASP.NET MVC 4 Web Role, and click F5 to run it under the local emulator. If you have been working with azure for a while you should know that I need to setup the local storage emulator before running locally if it’s a fresh azure SDK installation. But in this version when we started our azure project the Visual Studio will check if the storage emulator had been initialized. If not, it will run the initializer automatically. And as you can see, in this version the storage emulator relies on the SQL Server 2012 Local DB feature. It will create the emulator database and tables in the default local database. You can set the storage emulator to use a standard SQL Server default instance by using the command “dsinit /instance:.”. The “dsinit” tool now is located at %PROGRAM FILES%\Microsoft SDKs\Windows Azure\Emulator\devstore After the Visual Studio complied and deployed the package our website should be shown in the browser. This is the MVC 4 Web Role home page on my Windows 8 machine in IE10. Another thing you might notice is that, in this version the compute emulator utilizes IIS Express to host the web roles instead of the full IIS. You can add breakpoint in the code and debug, and you can use the local storage emulator to test your code for accessing the storage service. All of them are same as what your are doing now on SDK 1.6. You can switch to use IIS to run your web role in local emulator. Just open the windows azure porject property windows, in the Web page select “Use IIS Web Server”. For more information about this please have a look on Nuno’s blog post. In the role property page in Visual Studio there’s no massive changes. You can configure your role settings such as the endpoints, certificates and local storage, etc.. One thing was added is the Caching tab. Here you can specify enable the caching feature or not, and how much memory you want to use as the cache cluster. I will introduce more details about it in the future posts. The publish and package feature are also no change. You can publish your project to azure directly through Visual Studio 2012, while you can create the package and upload manually. Below is the SDK version of my deployment which is 1.7.30602.1703 in the developer portal. Summary In this post I introduced about the new Windows Azure SDK 1.7 especially on how it works on the latest Visual Studio 2012 RC. There’s no significant changes in the visual studio tool in this version but some small enhancement such as ASP.NET MVC 4, Cache Worker Role, using SQL 2012 Local DB and IIS Express, etc.. Hope this helps, Shaun All documents and related graphics, codes are provided "AS IS" without warranty of any kind. Copyright © Shaun Ziyan Xu. This work is licensed under the Creative Commons License.

Read the article
Migrating ASP.NET MVC 1.0 applications to ASP.NET MVC 2 RTM

- by Eilon

Note: ASP.NET MVC 2 RTM isn’t yet released! But this tool will help you get your ASP.NET MVC 1.0 applications ready for when it is! I have updated the MVC App Converter to convert projects from ASP.NET MVC 1.0 to ASP.NET MVC 2 RTM. This should be last the last major change to the MVC App Converter that I released previews of in the past several months. Download The app is a single executable: Download MvcAppConverter-MVC2RTM.zip (255 KB). Usage The only requirement for this tool is that you have .NET Framework 3.5 SP1 on the machine. You do not need to have Visual Studio or ASP.NET MVC installed (unless you want to open your project!). Even though the tool performs an automatic backup of your solution it is recommended that you perform a manual backup of your solution as well. To convert an ASP.NET MVC 1.0 project built with Visual Studio 2008 to an ASP.NET MVC 2 project in Visual Studio 2008 perform these steps: Launch the converter Select the solution Click the “Convert” button To convert an ASP.NET MVC 1.0 project built with Visual Studio 2008 to an ASP.NET MVC 2 project in Visual Studio 2010: Wait until Visual Studio 2010 is released (next month!) and it will have a built-in version of this tool that will run automatically when you open an ASP.NET MVC 1.0 project Perform the above steps, then open the project in Visual Studio 2010 and it will perform the remaining conversion steps What it can do Open up ASP.NET MVC 1.0 projects from Visual Studio 2008 (no other versions of ASP.NET MVC or Visual Studio are supported) Create a full backup of your solution’s folder For every VB or C# project that has a reference to System.Web.Mvc.dll it will (this includes ASP.NET MVC web application projects as well as ASP.NET MVC test projects): Update references to ASP.NET MVC 2 Add a reference to System.ComponentModel.DataAnnotations 3.5 (if not already present) For every VB or C# ASP.NET MVC Web Application it will: Change the project type to an ASP.NET MVC 2 project Update the root ~/web.config references to ASP.NET MVC 2 Update the root ~/web.config to have a binding redirect from ASP.NET MVC 1.0 to ASP.NET MVC 2 Update the ~/Views/web.config references to ASP.NET MVC 2 Add or update the JavaScript files (add jQuery, add jQuery.Validate, add Microsoft AJAX, add/update Microsoft MVC AJAX, add Microsoft MVC Validation adapter) Unknown project types or project types that have nothing to do with ASP.NET MVC will not be updated What it can’t do It cannot convert projects directly to Visual Studio 2010 or to .NET Framework 4. It can have issues if your solution contains projects that are not located under the solution directory. If you are using a source control system it might have problems overwriting files. It is recommended that before converting you check out all files from the source control system. It cannot change code in the application that might need to be changed due to breaking changes between ASP.NET MVC 1.0 and ASP.NET MVC 2. Feedback, Please! If you need to convert a project to ASP.NET MVC 2 please try out this application and hopefully you’re good to go. If you spot any bugs or features that don’t work leave a comment here and I will try to address these issues in an updated release.

Read the article
How John Got 15x Improvement Without Really Trying

- by rchrd

The following article was published on a Sun Microsystems website a number of years ago by John Feo. It is still useful and worth preserving. So I'm republishing it here. How I Got 15x Improvement Without Really Trying John Feo, Sun Microsystems Taking ten "personal" program codes used in scientific and engineering research, the author was able to get from 2 to 15 times performance improvement easily by applying some simple general optimization techniques. Introduction Scientific research based on computer simulation depends on the simulation for advancement. The research can advance only as fast as the computational codes can execute. The codes' efficiency determines both the rate and quality of results. In the same amount of time, a faster program can generate more results and can carry out a more detailed simulation of physical phenomena than a slower program. Highly optimized programs help science advance quickly and insure that monies supporting scientific research are used as effectively as possible. Scientific computer codes divide into three broad categories: ISV, community, and personal. ISV codes are large, mature production codes developed and sold commercially. The codes improve slowly over time both in methods and capabilities, and they are well tuned for most vendor platforms. Since the codes are mature and complex, there are few opportunities to improve their performance solely through code optimization. Improvements of 10% to 15% are typical. Examples of ISV codes are DYNA3D, Gaussian, and Nastran. Community codes are non-commercial production codes used by a particular research field. Generally, they are developed and distributed by a single academic or research institution with assistance from the community. Most users just run the codes, but some develop new methods and extensions that feed back into the general release. The codes are available on most vendor platforms. Since these codes are younger than ISV codes, there are more opportunities to optimize the source code. Improvements of 50% are not unusual. Examples of community codes are AMBER, CHARM, BLAST, and FASTA. Personal codes are those written by single users or small research groups for their own use. These codes are not distributed, but may be passed from professor-to-student or student-to-student over several years. They form the primordial ocean of applications from which community and ISV codes emerge. Government research grants pay for the development of most personal codes. This paper reports on the nature and performance of this class of codes. Over the last year, I have looked at over two dozen personal codes from more than a dozen research institutions. The codes cover a variety of scientific fields, including astronomy, atmospheric sciences, bioinformatics, biology, chemistry, geology, and physics. The sources range from a few hundred lines to more than ten thousand lines, and are written in Fortran, Fortran 90, C, and C++. For the most part, the codes are modular, documented, and written in a clear, straightforward manner. They do not use complex language features, advanced data structures, programming tricks, or libraries. I had little trouble understanding what the codes did or how data structures were used. Most came with a makefile. Surprisingly, only one of the applications is parallel. All developers have access to parallel machines, so availability is not an issue. Several tried to parallelize their applications, but stopped after encountering difficulties. Lack of education and a perception that parallelism is difficult prevented most from trying. I parallelized several of the codes using OpenMP, and did not judge any of the codes as difficult to parallelize. Even more surprising than the lack of parallelism is the inefficiency of the codes. I was able to get large improvements in performance in a matter of a few days applying simple optimization techniques. Table 1 lists ten representative codes [names and affiliation are omitted to preserve anonymity]. Improvements on one processor range from 2x to 15.5x with a simple average of 4.75x. I did not use sophisticated performance tools or drill deep into the program's execution character as one would do when tuning ISV or community codes. Using only a profiler and source line timers, I identified inefficient sections of code and improved their performance by inspection. The changes were at a high level. I am sure there is another factor of 2 or 3 in each code, and more if the codes are parallelized. The study’s results show that personal scientific codes are running many times slower than they should and that the problem is pervasive. Computational scientists are not sloppy programmers; however, few are trained in the art of computer programming or code optimization. I found that most have a working knowledge of some programming language and standard software engineering practices; but they do not know, or think about, how to make their programs run faster. They simply do not know the standard techniques used to make codes run faster. In fact, they do not even perceive that such techniques exist. The case studies described in this paper show that applying simple, well known techniques can significantly increase the performance of personal codes. It is important that the scientific community and the Government agencies that support scientific research find ways to better educate academic scientific programmers. The inefficiency of their codes is so bad that it is retarding both the quality and progress of scientific research. # cacheperformance redundantoperations loopstructures performanceimprovement 1 x x 15.5 2 x 2.8 3 x x 2.5 4 x 2.1 5 x x 2.0 6 x 5.0 7 x 5.8 8 x 6.3 9 2.2 10 x x 3.3 Table 1 — Area of improvement and performance gains of 10 codes The remainder of the paper is organized as follows: sections 2, 3, and 4 discuss the three most common sources of inefficiencies in the codes studied. These are cache performance, redundant operations, and loop structures. Each section includes several examples. The last section summaries the work and suggests a possible solution to the issues raised. Optimizing cache performance Commodity microprocessor systems use caches to increase memory bandwidth and reduce memory latencies. Typical latencies from processor to L1, L2, local, and remote memory are 3, 10, 50, and 200 cycles, respectively. Moreover, bandwidth falls off dramatically as memory distances increase. Programs that do not use cache effectively run many times slower than programs that do. When optimizing for cache, the biggest performance gains are achieved by accessing data in cache order and reusing data to amortize the overhead of cache misses. Secondary considerations are prefetching, associativity, and replacement; however, the understanding and analysis required to optimize for the latter are probably beyond the capabilities of the non-expert. Much can be gained simply by accessing data in the correct order and maximizing data reuse. 6 out of the 10 codes studied here benefited from such high level optimizations. Array Accesses The most important cache optimization is the most basic: accessing Fortran array elements in column order and C array elements in row order. Four of the ten codes—1, 2, 4, and 10—got it wrong. Compilers will restructure nested loops to optimize cache performance, but may not do so if the loop structure is too complex, or the loop body includes conditionals, complex addressing, or function calls. In code 1, the compiler failed to invert a key loop because of complex addressing do I = 0, 1010, delta_x IM = I - delta_x IP = I + delta_x do J = 5, 995, delta_x JM = J - delta_x JP = J + delta_x T1 = CA1(IP, J) + CA1(I, JP) T2 = CA1(IM, J) + CA1(I, JM) S1 = T1 + T2 - 4 * CA1(I, J) CA(I, J) = CA1(I, J) + D * S1 end do end do In code 2, the culprit is conditionals do I = 1, N do J = 1, N If (IFLAG(I,J) .EQ. 0) then T1 = Value(I, J-1) T2 = Value(I-1, J) T3 = Value(I, J) T4 = Value(I+1, J) T5 = Value(I, J+1) Value(I,J) = 0.25 * (T1 + T2 + T5 + T4) Delta = ABS(T3 - Value(I,J)) If (Delta .GT. MaxDelta) MaxDelta = Delta endif enddo enddo I fixed both programs by inverting the loops by hand. Code 10 has three-dimensional arrays and triply nested loops. The structure of the most computationally intensive loops is too complex to invert automatically or by hand. The only practical solution is to transpose the arrays so that the dimension accessed by the innermost loop is in cache order. The arrays can be transposed at construction or prior to entering a computationally intensive section of code. The former requires all array references to be modified, while the latter is cost effective only if the cost of the transpose is amortized over many accesses. I used the second approach to optimize code 10. Code 5 has four-dimensional arrays and loops are nested four deep. For all of the reasons cited above the compiler is not able to restructure three key loops. Assume C arrays and let the four dimensions of the arrays be i, j, k, and l. In the original code, the index structure of the three loops is L1: for i L2: for i L3: for i for l for l for j for k for j for k for j for k for l So only L3 accesses array elements in cache order. L1 is a very complex loop—much too complex to invert. I brought the loop into cache alignment by transposing the second and fourth dimensions of the arrays. Since the code uses a macro to compute all array indexes, I effected the transpose at construction and changed the macro appropriately. The dimensions of the new arrays are now: i, l, k, and j. L3 is a simple loop and easily inverted. L2 has a loop-carried scalar dependence in k. By promoting the scalar name that carries the dependence to an array, I was able to invert the third and fourth subloops aligning the loop with cache. Code 5 is by far the most difficult of the four codes to optimize for array accesses; but the knowledge required to fix the problems is no more than that required for the other codes. I would judge this code at the limits of, but not beyond, the capabilities of appropriately trained computational scientists. Array Strides When a cache miss occurs, a line (64 bytes) rather than just one word is loaded into the cache. If data is accessed stride 1, than the cost of the miss is amortized over 8 words. Any stride other than one reduces the cost savings. Two of the ten codes studied suffered from non-unit strides. The codes represent two important classes of "strided" codes. Code 1 employs a multi-grid algorithm to reduce time to convergence. The grids are every tenth, fifth, second, and unit element. Since time to convergence is inversely proportional to the distance between elements, coarse grids converge quickly providing good starting values for finer grids. The better starting values further reduce the time to convergence. The downside is that grids of every nth element, n > 1, introduce non-unit strides into the computation. In the original code, much of the savings of the multi-grid algorithm were lost due to this problem. I eliminated the problem by compressing (copying) coarse grids into continuous memory, and rewriting the computation as a function of the compressed grid. On convergence, I copied the final values of the compressed grid back to the original grid. The savings gained from unit stride access of the compressed grid more than paid for the cost of copying. Using compressed grids, the loop from code 1 included in the previous section becomes do j = 1, GZ do i = 1, GZ T1 = CA(i+0, j-1) + CA(i-1, j+0) T4 = CA1(i+1, j+0) + CA1(i+0, j+1) S1 = T1 + T4 - 4 * CA1(i+0, j+0) CA(i+0, j+0) = CA1(i+0, j+0) + DD * S1 enddo enddo where CA and CA1 are compressed arrays of size GZ. Code 7 traverses a list of objects selecting objects for later processing. The labels of the selected objects are stored in an array. The selection step has unit stride, but the processing steps have irregular stride. A fix is to save the parameters of the selected objects in temporary arrays as they are selected, and pass the temporary arrays to the processing functions. The fix is practical if the same parameters are used in selection as in processing, or if processing comprises a series of distinct steps which use overlapping subsets of the parameters. Both conditions are true for code 7, so I achieved significant improvement by copying parameters to temporary arrays during selection. Data reuse In the previous sections, we optimized for spatial locality. It is also important to optimize for temporal locality. Once read, a datum should be used as much as possible before it is forced from cache. Loop fusion and loop unrolling are two techniques that increase temporal locality. Unfortunately, both techniques increase register pressure—as loop bodies become larger, the number of registers required to hold temporary values grows. Once register spilling occurs, any gains evaporate quickly. For multiprocessors with small register sets or small caches, the sweet spot can be very small. In the ten codes presented here, I found no opportunities for loop fusion and only two opportunities for loop unrolling (codes 1 and 3). In code 1, unrolling the outer and inner loop one iteration increases the number of result values computed by the loop body from 1 to 4, do J = 1, GZ-2, 2 do I = 1, GZ-2, 2 T1 = CA1(i+0, j-1) + CA1(i-1, j+0) T2 = CA1(i+1, j-1) + CA1(i+0, j+0) T3 = CA1(i+0, j+0) + CA1(i-1, j+1) T4 = CA1(i+1, j+0) + CA1(i+0, j+1) T5 = CA1(i+2, j+0) + CA1(i+1, j+1) T6 = CA1(i+1, j+1) + CA1(i+0, j+2) T7 = CA1(i+2, j+1) + CA1(i+1, j+2) S1 = T1 + T4 - 4 * CA1(i+0, j+0) S2 = T2 + T5 - 4 * CA1(i+1, j+0) S3 = T3 + T6 - 4 * CA1(i+0, j+1) S4 = T4 + T7 - 4 * CA1(i+1, j+1) CA(i+0, j+0) = CA1(i+0, j+0) + DD * S1 CA(i+1, j+0) = CA1(i+1, j+0) + DD * S2 CA(i+0, j+1) = CA1(i+0, j+1) + DD * S3 CA(i+1, j+1) = CA1(i+1, j+1) + DD * S4 enddo enddo The loop body executes 12 reads, whereas as the rolled loop shown in the previous section executes 20 reads to compute the same four values. In code 3, two loops are unrolled 8 times and one loop is unrolled 4 times. Here is the before for (k = 0; k < NK[u]; k++) { sum = 0.0; for (y = 0; y < NY; y++) { sum += W[y][u][k] * delta[y]; } backprop[i++]=sum; } and after code for (k = 0; k < KK - 8; k+=8) { sum0 = 0.0; sum1 = 0.0; sum2 = 0.0; sum3 = 0.0; sum4 = 0.0; sum5 = 0.0; sum6 = 0.0; sum7 = 0.0; for (y = 0; y < NY; y++) { sum0 += W[y][0][k+0] * delta[y]; sum1 += W[y][0][k+1] * delta[y]; sum2 += W[y][0][k+2] * delta[y]; sum3 += W[y][0][k+3] * delta[y]; sum4 += W[y][0][k+4] * delta[y]; sum5 += W[y][0][k+5] * delta[y]; sum6 += W[y][0][k+6] * delta[y]; sum7 += W[y][0][k+7] * delta[y]; } backprop[k+0] = sum0; backprop[k+1] = sum1; backprop[k+2] = sum2; backprop[k+3] = sum3; backprop[k+4] = sum4; backprop[k+5] = sum5; backprop[k+6] = sum6; backprop[k+7] = sum7; } for one of the loops unrolled 8 times. Optimizing for temporal locality is the most difficult optimization considered in this paper. The concepts are not difficult, but the sweet spot is small. Identifying where the program can benefit from loop unrolling or loop fusion is not trivial. Moreover, it takes some effort to get it right. Still, educating scientific programmers about temporal locality and teaching them how to optimize for it will pay dividends. Reducing instruction count Execution time is a function of instruction count. Reduce the count and you usually reduce the time. The best solution is to use a more efficient algorithm; that is, an algorithm whose order of complexity is smaller, that converges quicker, or is more accurate. Optimizing source code without changing the algorithm yields smaller, but still significant, gains. This paper considers only the latter because the intent is to study how much better codes can run if written by programmers schooled in basic code optimization techniques. The ten codes studied benefited from three types of "instruction reducing" optimizations. The two most prevalent were hoisting invariant memory and data operations out of inner loops. The third was eliminating unnecessary data copying. The nature of these inefficiencies is language dependent. Memory operations The semantics of C make it difficult for the compiler to determine all the invariant memory operations in a loop. The problem is particularly acute for loops in functions since the compiler may not know the values of the function's parameters at every call site when compiling the function. Most compilers support pragmas to help resolve ambiguities; however, these pragmas are not comprehensive and there is no standard syntax. To guarantee that invariant memory operations are not executed repetitively, the user has little choice but to hoist the operations by hand. The problem is not as severe in Fortran programs because in the absence of equivalence statements, it is a violation of the language's semantics for two names to share memory. Codes 3 and 5 are C programs. In both cases, the compiler did not hoist all invariant memory operations from inner loops. Consider the following loop from code 3 for (y = 0; y < NY; y++) { i = 0; for (u = 0; u < NU; u++) { for (k = 0; k < NK[u]; k++) { dW[y][u][k] += delta[y] * I1[i++]; } } } Since dW[y][u] can point to the same memory space as delta for one or more values of y and u, assignment to dW[y][u][k] may change the value of delta[y]. In reality, dW and delta do not overlap in memory, so I rewrote the loop as for (y = 0; y < NY; y++) { i = 0; Dy = delta[y]; for (u = 0; u < NU; u++) { for (k = 0; k < NK[u]; k++) { dW[y][u][k] += Dy * I1[i++]; } } } Failure to hoist invariant memory operations may be due to complex address calculations. If the compiler can not determine that the address calculation is invariant, then it can hoist neither the calculation nor the associated memory operations. As noted above, code 5 uses a macro to address four-dimensional arrays #define MAT4D(a,q,i,j,k) (double *)((a)->data + (q)*(a)->strides[0] + (i)*(a)->strides[3] + (j)*(a)->strides[2] + (k)*(a)->strides[1]) The macro is too complex for the compiler to understand and so, it does not identify any subexpressions as loop invariant. The simplest way to eliminate the address calculation from the innermost loop (over i) is to define a0 = MAT4D(a,q,0,j,k) before the loop and then replace all instances of *MAT4D(a,q,i,j,k) in the loop with a0[i] A similar problem appears in code 6, a Fortran program. The key loop in this program is do n1 = 1, nh nx1 = (n1 - 1) / nz + 1 nz1 = n1 - nz * (nx1 - 1) do n2 = 1, nh nx2 = (n2 - 1) / nz + 1 nz2 = n2 - nz * (nx2 - 1) ndx = nx2 - nx1 ndy = nz2 - nz1 gxx = grn(1,ndx,ndy) gyy = grn(2,ndx,ndy) gxy = grn(3,ndx,ndy) balance(n1,1) = balance(n1,1) + (force(n2,1) * gxx + force(n2,2) * gxy) * h1 balance(n1,2) = balance(n1,2) + (force(n2,1) * gxy + force(n2,2) * gyy)*h1 end do end do The programmer has written this loop well—there are no loop invariant operations with respect to n1 and n2. However, the loop resides within an iterative loop over time and the index calculations are independent with respect to time. Trading space for time, I precomputed the index values prior to the entering the time loop and stored the values in two arrays. I then replaced the index calculations with reads of the arrays. Data operations Ways to reduce data operations can appear in many forms. Implementing a more efficient algorithm produces the biggest gains. The closest I came to an algorithm change was in code 4. This code computes the inner product of K-vectors A(i) and B(j), 0 = i < N, 0 = j < M, for most values of i and j. Since the program computes most of the NM possible inner products, it is more efficient to compute all the inner products in one triply-nested loop rather than one at a time when needed. The savings accrue from reading A(i) once for all B(j) vectors and from loop unrolling. for (i = 0; i < N; i+=8) { for (j = 0; j < M; j++) { sum0 = 0.0; sum1 = 0.0; sum2 = 0.0; sum3 = 0.0; sum4 = 0.0; sum5 = 0.0; sum6 = 0.0; sum7 = 0.0; for (k = 0; k < K; k++) { sum0 += A[i+0][k] * B[j][k]; sum1 += A[i+1][k] * B[j][k]; sum2 += A[i+2][k] * B[j][k]; sum3 += A[i+3][k] * B[j][k]; sum4 += A[i+4][k] * B[j][k]; sum5 += A[i+5][k] * B[j][k]; sum6 += A[i+6][k] * B[j][k]; sum7 += A[i+7][k] * B[j][k]; } C[i+0][j] = sum0; C[i+1][j] = sum1; C[i+2][j] = sum2; C[i+3][j] = sum3; C[i+4][j] = sum4; C[i+5][j] = sum5; C[i+6][j] = sum6; C[i+7][j] = sum7; }} This change requires knowledge of a typical run; i.e., that most inner products are computed. The reasons for the change, however, derive from basic optimization concepts. It is the type of change easily made at development time by a knowledgeable programmer. In code 5, we have the data version of the index optimization in code 6. Here a very expensive computation is a function of the loop indices and so cannot be hoisted out of the loop; however, the computation is invariant with respect to an outer iterative loop over time. We can compute its value for each iteration of the computation loop prior to entering the time loop and save the values in an array. The increase in memory required to store the values is small in comparison to the large savings in time. The main loop in Code 8 is doubly nested. The inner loop includes a series of guarded computations; some are a function of the inner loop index but not the outer loop index while others are a function of the outer loop index but not the inner loop index for (j = 0; j < N; j++) { for (i = 0; i < M; i++) { r = i * hrmax; R = A[j]; temp = (PRM[3] == 0.0) ? 1.0 : pow(r, PRM[3]); high = temp * kcoeff * B[j] * PRM[2] * PRM[4]; low = high * PRM[6] * PRM[6] / (1.0 + pow(PRM[4] * PRM[6], 2.0)); kap = (R > PRM[6]) ? high * R * R / (1.0 + pow(PRM[4]*r, 2.0) : low * pow(R/PRM[6], PRM[5]); < rest of loop omitted > }} Note that the value of temp is invariant to j. Thus, we can hoist the computation for temp out of the loop and save its values in an array. for (i = 0; i < M; i++) { r = i * hrmax; TEMP[i] = pow(r, PRM[3]); } [N.B. – the case for PRM[3] = 0 is omitted and will be reintroduced later.] We now hoist out of the inner loop the computations invariant to i. Since the conditional guarding the value of kap is invariant to i, it behooves us to hoist the computation out of the inner loop, thereby executing the guard once rather than M times. The final version of the code is for (j = 0; j < N; j++) { R = rig[j] / 1000.; tmp1 = kcoeff * par[2] * beta[j] * par[4]; tmp2 = 1.0 + (par[4] * par[4] * par[6] * par[6]); tmp3 = 1.0 + (par[4] * par[4] * R * R); tmp4 = par[6] * par[6] / tmp2; tmp5 = R * R / tmp3; tmp6 = pow(R / par[6], par[5]); if ((par[3] == 0.0) && (R > par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * tmp5; } else if ((par[3] == 0.0) && (R <= par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * tmp4 * tmp6; } else if ((par[3] != 0.0) && (R > par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * TEMP[i] * tmp5; } else if ((par[3] != 0.0) && (R <= par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * TEMP[i] * tmp4 * tmp6; } for (i = 0; i < M; i++) { kap = KAP[i]; r = i * hrmax; < rest of loop omitted > } } Maybe not the prettiest piece of code, but certainly much more efficient than the original loop, Copy operations Several programs unnecessarily copy data from one data structure to another. This problem occurs in both Fortran and C programs, although it manifests itself differently in the two languages. Code 1 declares two arrays—one for old values and one for new values. At the end of each iteration, the array of new values is copied to the array of old values to reset the data structures for the next iteration. This problem occurs in Fortran programs not included in this study and in both Fortran 77 and Fortran 90 code. Introducing pointers to the arrays and swapping pointer values is an obvious way to eliminate the copying; but pointers is not a feature that many Fortran programmers know well or are comfortable using. An easy solution not involving pointers is to extend the dimension of the value array by 1 and use the last dimension to differentiate between arrays at different times. For example, if the data space is N x N, declare the array (N, N, 2). Then store the problem’s initial values in (_, _, 2) and define the scalar names new = 2 and old = 1. At the start of each iteration, swap old and new to reset the arrays. The old–new copy problem did not appear in any C program. In programs that had new and old values, the code swapped pointers to reset data structures. Where unnecessary coping did occur is in structure assignment and parameter passing. Structures in C are handled much like scalars. Assignment causes the data space of the right-hand name to be copied to the data space of the left-hand name. Similarly, when a structure is passed to a function, the data space of the actual parameter is copied to the data space of the formal parameter. If the structure is large and the assignment or function call is in an inner loop, then copying costs can grow quite large. While none of the ten programs considered here manifested this problem, it did occur in programs not included in the study. A simple fix is always to refer to structures via pointers. Optimizing loop structures Since scientific programs spend almost all their time in loops, efficient loops are the key to good performance. Conditionals, function calls, little instruction level parallelism, and large numbers of temporary values make it difficult for the compiler to generate tightly packed, highly efficient code. Conditionals and function calls introduce jumps that disrupt code flow. Users should eliminate or isolate conditionls to their own loops as much as possible. Often logical expressions can be substituted for if-then-else statements. For example, code 2 includes the following snippet MaxDelta = 0.0 do J = 1, N do I = 1, M < code omitted > Delta = abs(OldValue ? NewValue) if (Delta > MaxDelta) MaxDelta = Delta enddo enddo if (MaxDelta .gt. 0.001) goto 200 Since the only use of MaxDelta is to control the jump to 200 and all that matters is whether or not it is greater than 0.001, I made MaxDelta a boolean and rewrote the snippet as MaxDelta = .false. do J = 1, N do I = 1, M < code omitted > Delta = abs(OldValue ? NewValue) MaxDelta = MaxDelta .or. (Delta .gt. 0.001) enddo enddo if (MaxDelta) goto 200 thereby, eliminating the conditional expression from the inner loop. A microprocessor can execute many instructions per instruction cycle. Typically, it can execute one or more memory, floating point, integer, and jump operations. To be executed simultaneously, the operations must be independent. Thick loops tend to have more instruction level parallelism than thin loops. Moreover, they reduce memory traffice by maximizing data reuse. Loop unrolling and loop fusion are two techniques to increase the size of loop bodies. Several of the codes studied benefitted from loop unrolling, but none benefitted from loop fusion. This observation is not too surpising since it is the general tendency of programmers to write thick loops. As loops become thicker, the number of temporary values grows, increasing register pressure. If registers spill, then memory traffic increases and code flow is disrupted. A thick loop with many temporary values may execute slower than an equivalent series of thin loops. The biggest gain will be achieved if the thick loop can be split into a series of independent loops eliminating the need to write and read temporary arrays. I found such an occasion in code 10 where I split the loop do i = 1, n do j = 1, m A24(j,i)= S24(j,i) * T24(j,i) + S25(j,i) * U25(j,i) B24(j,i)= S24(j,i) * T25(j,i) + S25(j,i) * U24(j,i) A25(j,i)= S24(j,i) * C24(j,i) + S25(j,i) * V24(j,i) B25(j,i)= S24(j,i) * U25(j,i) + S25(j,i) * V25(j,i) C24(j,i)= S26(j,i) * T26(j,i) + S27(j,i) * U26(j,i) D24(j,i)= S26(j,i) * T27(j,i) + S27(j,i) * V26(j,i) C25(j,i)= S27(j,i) * S28(j,i) + S26(j,i) * U28(j,i) D25(j,i)= S27(j,i) * T28(j,i) + S26(j,i) * V28(j,i) end do end do into two disjoint loops do i = 1, n do j = 1, m A24(j,i)= S24(j,i) * T24(j,i) + S25(j,i) * U25(j,i) B24(j,i)= S24(j,i) * T25(j,i) + S25(j,i) * U24(j,i) A25(j,i)= S24(j,i) * C24(j,i) + S25(j,i) * V24(j,i) B25(j,i)= S24(j,i) * U25(j,i) + S25(j,i) * V25(j,i) end do end do do i = 1, n do j = 1, m C24(j,i)= S26(j,i) * T26(j,i) + S27(j,i) * U26(j,i) D24(j,i)= S26(j,i) * T27(j,i) + S27(j,i) * V26(j,i) C25(j,i)= S27(j,i) * S28(j,i) + S26(j,i) * U28(j,i) D25(j,i)= S27(j,i) * T28(j,i) + S26(j,i) * V28(j,i) end do end do Conclusions Over the course of the last year, I have had the opportunity to work with over two dozen academic scientific programmers at leading research universities. Their research interests span a broad range of scientific fields. Except for two programs that relied almost exclusively on library routines (matrix multiply and fast Fourier transform), I was able to improve significantly the single processor performance of all codes. Improvements range from 2x to 15.5x with a simple average of 4.75x. Changes to the source code were at a very high level. I did not use sophisticated techniques or programming tools to discover inefficiencies or effect the changes. Only one code was parallel despite the availability of parallel systems to all developers. Clearly, we have a problem—personal scientific research codes are highly inefficient and not running parallel. The developers are unaware of simple optimization techniques to make programs run faster. They lack education in the art of code optimization and parallel programming. I do not believe we can fix the problem by publishing additional books or training manuals. To date, the developers in questions have not studied the books or manual available, and are unlikely to do so in the future. Short courses are a possible solution, but I believe they are too concentrated to be much use. The general concepts can be taught in a three or four day course, but that is not enough time for students to practice what they learn and acquire the experience to apply and extend the concepts to their codes. Practice is the key to becoming proficient at optimization. I recommend that graduate students be required to take a semester length course in optimization and parallel programming. We would never give someone access to state-of-the-art scientific equipment costing hundreds of thousands of dollars without first requiring them to demonstrate that they know how to use the equipment. Yet the criterion for time on state-of-the-art supercomputers is at most an interesting project. Requestors are never asked to demonstrate that they know how to use the system, or can use the system effectively. A semester course would teach them the required skills. Government agencies that fund academic scientific research pay for most of the computer systems supporting scientific research as well as the development of most personal scientific codes. These agencies should require graduate schools to offer a course in optimization and parallel programming as a requirement for funding. About the Author John Feo received his Ph.D. in Computer Science from The University of Texas at Austin in 1986. After graduate school, Dr. Feo worked at Lawrence Livermore National Laboratory where he was the Group Leader of the Computer Research Group and principal investigator of the Sisal Language Project. In 1997, Dr. Feo joined Tera Computer Company where he was project manager for the MTA, and oversaw the programming and evaluation of the MTA at the San Diego Supercomputer Center. In 2000, Dr. Feo joined Sun Microsystems as an HPC application specialist. He works with university research groups to optimize and parallelize scientific codes. Dr. Feo has published over two dozen research articles in the areas of parallel parallel programming, parallel programming languages, and application performance.

Read the article
Solaris X86 AESNI OpenSSL Engine

- by danx

Solaris X86 AESNI OpenSSL Engine Cryptography is a major component of secure e-commerce. Since cryptography is compute intensive and adds a significant load to applications, such as SSL web servers (https), crypto performance is an important factor. Providing accelerated crypto hardware greatly helps these applications and will help lead to a wider adoption of cryptography, and lower cost, in e-commerce and other applications. The Intel Westmere microprocessor has six new instructions to acclerate AES encryption. They are called "AESNI" for "AES New Instructions". These are unprivileged instructions, so no "root", other elevated access, or context switch is required to execute these instructions. These instructions are used in a new built-in OpenSSL 1.0 engine available in Solaris 11, the aesni engine. Previous Work Previously, AESNI instructions were introduced into the Solaris x86 kernel and libraries. That is, the "aes" kernel module (used by IPsec and other kernel modules) and the Solaris pkcs11 library (for user applications). These are available in Solaris 10 10/09 (update 8) and above, and Solaris 11. The work here is to add the aesni engine to OpenSSL. X86 AESNI Instructions Intel's Xeon 5600 is one of the processors that support AESNI. This processor is used in the Sun Fire X4170 M2 As mentioned above, six new instructions acclerate AES encryption in processor silicon. The new instructions are: aesenc performs one round of AES encryption. One encryption round is composed of these steps: substitute bytes, shift rows, mix columns, and xor the round key. aesenclast performs the final encryption round, which is the same as above, except omitting the mix columns (which is only needed for the next encryption round). aesdec performs one round of AES decryption aesdeclast performs the final AES decryption round aeskeygenassist Helps expand the user-provided key into a "key schedule" of keys, one per round aesimc performs an "inverse mixed columns" operation to convert the encryption key schedule into a decryption key schedule pclmulqdq Not a AESNI instruction, but performs "carryless multiply" operations to acclerate AES GCM mode. Since the AESNI instructions are implemented in hardware, they take a constant number of cycles and are not vulnerable to side-channel timing attacks that attempt to discern some bits of data from the time taken to encrypt or decrypt the data. Solaris x86 and OpenSSL Software Optimizations Having X86 AESNI hardware crypto instructions is all well and good, but how do we access it? The software is available with Solaris 11 and is used automatically if you are running Solaris x86 on a AESNI-capable processor. AESNI is used internally in the kernel through kernel crypto modules and is available in user space through the PKCS#11 library. For OpenSSL on Solaris 11, AESNI crypto is available directly with a new built-in OpenSSL 1.0 engine, called the "aesni engine." This is in lieu of the extra overhead of going through the Solaris OpenSSL pkcs11 engine, which accesses Solaris crypto and digest operations. Instead, AESNI assembly is included directly in the new aesni engine. Instead of including the aesni engine in a separate library in /lib/openssl/engines/, the aesni engine is "built-in", meaning it is included directly in OpenSSL's libcrypto.so.1.0.0 library. This reduces overhead and the need to manually specify the aesni engine. Since the engine is built-in (that is, in libcrypto.so.1.0.0), the openssl -engine command line flag or API call is not needed to access the engine—the aesni engine is used automatically on AESNI hardware. Ciphers and Digests supported by OpenSSL aesni engine The Openssl aesni engine auto-detects if it's running on AESNI hardware and uses AESNI encryption instructions for these ciphers: AES-128-CBC, AES-192-CBC, AES-256-CBC, AES-128-CFB128, AES-192-CFB128, AES-256-CFB128, AES-128-CTR, AES-192-CTR, AES-256-CTR, AES-128-ECB, AES-192-ECB, AES-256-ECB, AES-128-OFB, AES-192-OFB, and AES-256-OFB. Implementation of the OpenSSL aesni engine The AESNI assembly language routines are not a part of the regular Openssl 1.0.0 release. AESNI is a part of the "HEAD" ("development" or "unstable") branch of OpenSSL, for future release. But AESNI is also available as a separate patch provided by Intel to the OpenSSL project for OpenSSL 1.0.0. A minimal amount of "glue" code in the aesni engine works between the OpenSSL libcrypto.so.1.0.0 library and the assembly functions. The aesni engine code is separate from the base OpenSSL code and requires patching only a few source files to use it. That means OpenSSL can be more easily updated to future versions without losing the performance from the built-in aesni engine. OpenSSL aesni engine Performance Here's some graphs of aesni engine performance I measured by running openssl speed -evp $algorithm where $algorithm is aes-128-cbc, aes-192-cbc, and aes-256-cbc. These are using the 64-bit version of openssl on the same AESNI hardware, a Sun Fire X4170 M2 with a Intel Xeon E5620 @2.40GHz, running Solaris 11 FCS. "Before" is openssl without the aesni engine and "after" is openssl with the aesni engine. The numbers are MBytes/second. OpenSSL aesni engine performance on Sun Fire X4170 M2 (Xeon E5620 @2.40GHz) (Higher is better; "before"=OpenSSL on AESNI without AESNI engine software, "after"=OpenSSL AESNI engine) As you can see the speedup is dramatic for all 3 key lengths and for data sizes from 16 bytes to 8 Kbytes—AESNI is about 7.5-8x faster over hand-coded amd64 assembly (without aesni instructions). Verifying the OpenSSL aesni engine is present The easiest way to determine if you are running the aesni engine is to type "openssl engine" on the command line. No configuration, API, or command line options are needed to use the OpenSSL aesni engine. If you are running on Intel AESNI hardware with Solaris 11 FCS, you'll see this output indicating you are using the aesni engine: intel-westmere $ openssl engine (aesni) Intel AES-NI engine (no-aesni) (dynamic) Dynamic engine loading support (pkcs11) PKCS #11 engine support If you are running on Intel without AESNI hardware you'll see this output indicating the hardware can't support the aesni engine: intel-nehalem $ openssl engine (aesni) Intel AES-NI engine (no-aesni) (dynamic) Dynamic engine loading support (pkcs11) PKCS #11 engine support For Solaris on SPARC or older Solaris OpenSSL software, you won't see any aesni engine line at all. Third-party OpenSSL software (built yourself or from outside Oracle) will not have the aesni engine either. Solaris 11 FCS comes with OpenSSL version 1.0.0e. The output of typing "openssl version" should be "OpenSSL 1.0.0e 6 Sep 2011". 64- and 32-bit OpenSSL OpenSSL comes in both 32- and 64-bit binaries. 64-bit executable is now the default, at /usr/bin/openssl, and OpenSSL 64-bit libraries at /lib/amd64/libcrypto.so.1.0.0 and libssl.so.1.0.0 The 32-bit executable is at /usr/bin/i86/openssl and the libraries are at /lib/libcrytpo.so.1.0.0 and libssl.so.1.0.0. Availability The OpenSSL AESNI engine is available in Solaris 11 x86 for both the 64- and 32-bit versions of OpenSSL. It is not available with Solaris 10. You must have a processor that supports AESNI instructions, otherwise OpenSSL will fallback to the older, slower AES implementation without AESNI. Processors that support AESNI include most Westmere and Sandy Bridge class processor architectures. Some low-end processors (such as for mobile/laptop platforms) do not support AESNI. The easiest way to determine if the processor supports AESNI is with the isainfo -v command—look for "amd64" and "aes" in the output: $ isainfo -v 64-bit amd64 applications pclmulqdq aes sse4.2 sse4.1 ssse3 popcnt tscp ahf cx16 sse3 sse2 sse fxsr mmx cmov amd_sysc cx8 tsc fpu Conclusion The Solaris 11 OpenSSL aesni engine provides easy access to powerful Intel AESNI hardware cryptography, in addition to Solaris userland PKCS#11 libraries and Solaris crypto kernel modules.

Read the article
La RC2 de ASP.NET MVC3 disponible : encore plus performante, elle est compatible avec la beta du SP 1 de Visual Studio 2010

La RC2 de ASP.NET MVC3 disponible Encore plus performante, elle est compatible avec la beta du SP 1 de Visual Studio 2010 Mise à jour du 13/12/10 Microsoft, par le billet de son vice-président de la division de développement Scott Guthrie, vient d'annoncer la sortie de la Release Candidate 2 d'ASP.NET MVC 3. Au menu de cette nouvelle version : La correction de plusieurs bugs et l'optimisation des performances. Les tests de performance sur cette version, selon Guthrie, permettent de constater qu'ASP.NET MVC 3 est nettement plus rapide que la version 2 et que les applications ASP.net MVC existantes, après une mise à...

Read the article
Solaris X86 AESNI OpenSSL Engine

- by danx

Solaris X86 AESNI OpenSSL Engine Cryptography is a major component of secure e-commerce. Since cryptography is compute intensive and adds a significant load to applications, such as SSL web servers (https), crypto performance is an important factor. Providing accelerated crypto hardware greatly helps these applications and will help lead to a wider adoption of cryptography, and lower cost, in e-commerce and other applications. The Intel Westmere microprocessor has six new instructions to acclerate AES encryption. They are called "AESNI" for "AES New Instructions". These are unprivileged instructions, so no "root", other elevated access, or context switch is required to execute these instructions. These instructions are used in a new built-in OpenSSL 1.0 engine available in Solaris 11, the aesni engine. Previous Work Previously, AESNI instructions were introduced into the Solaris x86 kernel and libraries. That is, the "aes" kernel module (used by IPsec and other kernel modules) and the Solaris pkcs11 library (for user applications). These are available in Solaris 10 10/09 (update 8) and above, and Solaris 11. The work here is to add the aesni engine to OpenSSL. X86 AESNI Instructions Intel's Xeon 5600 is one of the processors that support AESNI. This processor is used in the Sun Fire X4170 M2 As mentioned above, six new instructions acclerate AES encryption in processor silicon. The new instructions are: aesenc performs one round of AES encryption. One encryption round is composed of these steps: substitute bytes, shift rows, mix columns, and xor the round key. aesenclast performs the final encryption round, which is the same as above, except omitting the mix columns (which is only needed for the next encryption round). aesdec performs one round of AES decryption aesdeclast performs the final AES decryption round aeskeygenassist Helps expand the user-provided key into a "key schedule" of keys, one per round aesimc performs an "inverse mixed columns" operation to convert the encryption key schedule into a decryption key schedule pclmulqdq Not a AESNI instruction, but performs "carryless multiply" operations to acclerate AES GCM mode. Since the AESNI instructions are implemented in hardware, they take a constant number of cycles and are not vulnerable to side-channel timing attacks that attempt to discern some bits of data from the time taken to encrypt or decrypt the data. Solaris x86 and OpenSSL Software Optimizations Having X86 AESNI hardware crypto instructions is all well and good, but how do we access it? The software is available with Solaris 11 and is used automatically if you are running Solaris x86 on a AESNI-capable processor. AESNI is used internally in the kernel through kernel crypto modules and is available in user space through the PKCS#11 library. For OpenSSL on Solaris 11, AESNI crypto is available directly with a new built-in OpenSSL 1.0 engine, called the "aesni engine." This is in lieu of the extra overhead of going through the Solaris OpenSSL pkcs11 engine, which accesses Solaris crypto and digest operations. Instead, AESNI assembly is included directly in the new aesni engine. Instead of including the aesni engine in a separate library in /lib/openssl/engines/, the aesni engine is "built-in", meaning it is included directly in OpenSSL's libcrypto.so.1.0.0 library. This reduces overhead and the need to manually specify the aesni engine. Since the engine is built-in (that is, in libcrypto.so.1.0.0), the openssl -engine command line flag or API call is not needed to access the engine—the aesni engine is used automatically on AESNI hardware. Ciphers and Digests supported by OpenSSL aesni engine The Openssl aesni engine auto-detects if it's running on AESNI hardware and uses AESNI encryption instructions for these ciphers: AES-128-CBC, AES-192-CBC, AES-256-CBC, AES-128-CFB128, AES-192-CFB128, AES-256-CFB128, AES-128-CTR, AES-192-CTR, AES-256-CTR, AES-128-ECB, AES-192-ECB, AES-256-ECB, AES-128-OFB, AES-192-OFB, and AES-256-OFB. Implementation of the OpenSSL aesni engine The AESNI assembly language routines are not a part of the regular Openssl 1.0.0 release. AESNI is a part of the "HEAD" ("development" or "unstable") branch of OpenSSL, for future release. But AESNI is also available as a separate patch provided by Intel to the OpenSSL project for OpenSSL 1.0.0. A minimal amount of "glue" code in the aesni engine works between the OpenSSL libcrypto.so.1.0.0 library and the assembly functions. The aesni engine code is separate from the base OpenSSL code and requires patching only a few source files to use it. That means OpenSSL can be more easily updated to future versions without losing the performance from the built-in aesni engine. OpenSSL aesni engine Performance Here's some graphs of aesni engine performance I measured by running openssl speed -evp $algorithm where $algorithm is aes-128-cbc, aes-192-cbc, and aes-256-cbc. These are using the 64-bit version of openssl on the same AESNI hardware, a Sun Fire X4170 M2 with a Intel Xeon E5620 @2.40GHz, running Solaris 11 FCS. "Before" is openssl without the aesni engine and "after" is openssl with the aesni engine. The numbers are MBytes/second. OpenSSL aesni engine performance on Sun Fire X4170 M2 (Xeon E5620 @2.40GHz) (Higher is better; "before"=OpenSSL on AESNI without AESNI engine software, "after"=OpenSSL AESNI engine) As you can see the speedup is dramatic for all 3 key lengths and for data sizes from 16 bytes to 8 Kbytes—AESNI is about 7.5-8x faster over hand-coded amd64 assembly (without aesni instructions). Verifying the OpenSSL aesni engine is present The easiest way to determine if you are running the aesni engine is to type "openssl engine" on the command line. No configuration, API, or command line options are needed to use the OpenSSL aesni engine. If you are running on Intel AESNI hardware with Solaris 11 FCS, you'll see this output indicating you are using the aesni engine: intel-westmere $ openssl engine (aesni) Intel AES-NI engine (no-aesni) (dynamic) Dynamic engine loading support (pkcs11) PKCS #11 engine support If you are running on Intel without AESNI hardware you'll see this output indicating the hardware can't support the aesni engine: intel-nehalem $ openssl engine (aesni) Intel AES-NI engine (no-aesni) (dynamic) Dynamic engine loading support (pkcs11) PKCS #11 engine support For Solaris on SPARC or older Solaris OpenSSL software, you won't see any aesni engine line at all. Third-party OpenSSL software (built yourself or from outside Oracle) will not have the aesni engine either. Solaris 11 FCS comes with OpenSSL version 1.0.0e. The output of typing "openssl version" should be "OpenSSL 1.0.0e 6 Sep 2011". 64- and 32-bit OpenSSL OpenSSL comes in both 32- and 64-bit binaries. 64-bit executable is now the default, at /usr/bin/openssl, and OpenSSL 64-bit libraries at /lib/amd64/libcrypto.so.1.0.0 and libssl.so.1.0.0 The 32-bit executable is at /usr/bin/i86/openssl and the libraries are at /lib/libcrytpo.so.1.0.0 and libssl.so.1.0.0. Availability The OpenSSL AESNI engine is available in Solaris 11 x86 for both the 64- and 32-bit versions of OpenSSL. It is not available with Solaris 10. You must have a processor that supports AESNI instructions, otherwise OpenSSL will fallback to the older, slower AES implementation without AESNI. Processors that support AESNI include most Westmere and Sandy Bridge class processor architectures. Some low-end processors (such as for mobile/laptop platforms) do not support AESNI. The easiest way to determine if the processor supports AESNI is with the isainfo -v command—look for "amd64" and "aes" in the output: $ isainfo -v 64-bit amd64 applications pclmulqdq aes sse4.2 sse4.1 ssse3 popcnt tscp ahf cx16 sse3 sse2 sse fxsr mmx cmov amd_sysc cx8 tsc fpu Conclusion The Solaris 11 OpenSSL aesni engine provides easy access to powerful Intel AESNI hardware cryptography, in addition to Solaris userland PKCS#11 libraries and Solaris crypto kernel modules.

Read the article
Announcing the Winnipeg Visual Studio.NET 2010 Launch Event!

- by D'Arcy Lussier

That’s right Winnipeg, we’re having our own Visual Studio.NET 2010 launch event on May 11th brought to you by your local Winnipeg .NET User Group, Anvil Digital, Imaginet, Microsoft, and Protegra! We’re excited to bring a day of sessions highlighting developer productivity, application lifecycle management, and web development using these new technologies! We’re also thrilled to have this event at the IMAX Theatre at Portage Place! The day looks like this: The event is FREE and we’re providing a continental breakfast for attendees. To register for the event, visit our registration site here. If you have any questions, please contact me through comments on this post or via email through my blog. D’Arcy

Read the article
Courier pour Windows 8 : l'équipe Visual C++ de Microsoft sort une "Killer App" sous forme de démonstration de force pour le C++

Courier pour Windows 8 : l'équipe Visual C++ de Microsoft sort une « Killer App » Potentielle sous forme de démonstration de force pour le C++ Et si Courier devenait la « Killer App » qui impose Windows 8 dans sa version tactile auprès du grand public ? Plus simple que OneNote, Courier est un outil de prise de notes qui transforme littéralement une tablette en carnet. Carnet de voyage, de croquis, de photos? ou de notes donc. Le genre de fonctionnalités qui peut faire passer une tablette du stade de simple gadget à celui d'outil informatique dont on se sert dans de multiples situations. Ce n'est d'ailleurs pas un hasard si Microsoft commence à communiquer sur ce qui était au...

Read the article
How much should I rely on Visual Studio's Auto Generated Code?

- by Ant

So I'm reading up on ASP.NET with VB.NET and I want to start making my own, professionally built website using ASP. I'm wondering though; I'm still using the basics so I'm really just a novice, but how much should I rely on Visual Studio to create my elements? Should I make my own text boxes and have my own login routine, or should I just use ASP's login features? I know eventually you have to use your own classes and such which is where the real coding comes in, but I'm not sure how relaible, flexible and secure the pre-wrote elements are? Any help would be greatly appreciated.

Read the article
Employee Info Starter Kit (v4.0.0) - Visual Studio 2010 and .NET 4.0 Version is Available Now

Employee Info Starter Kit is a ASP.NET based web application, which includes very simple user requirements, where we can create, read, update and delete (crud) the employee info of a company. Based on just a database table, it explores and solves most of the major problems in web development architectural space. This open source starter kit extensively uses major features available in latest Visual Studio, ASP.NET and Sql Server to make robust, scalable, secured and maintainable web applications...Did you know that DotNetSlackers also publishes .net articles written by top known .net Authors? We already have over 80 articles in several categories including Silverlight. Take a look: here.

Read the article
How Visual Studio could help to avoid duplicating code?

- by MegaMind

I work within a team of developers. Everyone is making their changes without carrying too much if the same thing is already implemented in the codebase. This leads to classes constantly growing and to severe duplication. I want to add line items to class definitions from which a developer could judge what this class has. Would it help? How to do it in Visual Studio? If it wouldn't help, what would be the better alternative to encourage the developers to check if something exists before implementing it?

Read the article
Visual Studio 2010 RC and Entity Framework 4 RC Support in the New Version of ADO.NET Data Providers

Devart has recently announced the release of dotConnect products for Oracle, MySQL, PostgreSQL, and SQLite - ADO.NET providers that offer Entity Framework support, LINQ to SQL support, and contain an ORM model designer for developing LINQ to SQL and EF models based on different database engines. New dotConnect ADO.NET providers offer complete support for Visual Studio 2010 Release Candidate and Entity Framework 4 Release Candidate. Entity Developer 2.80, a designer for modeling and code generation...Did you know that DotNetSlackers also publishes .net articles written by top known .net Authors? We already have over 80 articles in several categories including Silverlight. Take a look: here.

Read the article
Révisions : Microsoft met en ligne ses cahiers de pré-rentrée sur Visual Studio 2012, Windows Phone, Windows 8, Azure et Windows Server 2012

Révision de pré-rentrée : Microsoft a mis en ligne ses cahiers de vacances Sur Visual Studio 2012, Windows Phone, Windows 8, Azure et Windows Server Mea Culpa. Nous aurions pu en parler avant. Mais mieux vaut tard que jamais. D'autant plus que les révisions, c'est bien aussi quand ça ne dure pas toutes les vacances. Quoiqu'il en soit, Microsoft a mis en ligne deux « Cahiers de vacances » pour faire le point, se tester et/ou apprendre à maitrise toutes les nouveautés (et Dieu sait qu'elles sont nombreuses) autour de ses produits. Le premier, « J'en ai rien à déployer », revient sur la RC de Windows Server 2012, et sur la virtualisation av...

Read the article
Microsoft renforce le support de C++ 11 dans Visual Studio 2012, la mise à jour du compilateur C++ disponible

Microsoft renforce le support de C++ 11 dans Visual Studio 2012 la mise à jour du compilateur C++ disponible « Le futur du C++ » : c'est le titre de la session de près d'une heure sur le langage, qui s'est déroulée en fin de semaine dernière lors de la conférence Builds, la grande messe des développeurs organisée par Microsoft. Présentée par Herb Sutter, président du comité de normalisation ISO C++ et platform evangelist chez Microsoft, la conférence a permis de faire le point sur les projets récents, en cours et les futures orientations du C++, aussi bien chez Microsoft que chez les autres acteurs de l'industrie. Pour ce qui est de Microsoft, une mise à jour du compilateu...

Read the article
WCF RIA Services v1.0 and Silverlight Tools for Visual Studio 2010 are Here!

Today both the WCF RIA Services v1.0 and the Silverlight 4 Tools for Visual Studio 2010 are officially released! You can download the the tools right here. You can find full details about this release on the download site. NOTE: To celebrate these releases, Silverlight TV is rolling out 2 shows today instead of our regular schedule. We have recorded 2 new shows of Silverlight TV to ring in these new releases. The first show is Silverlight TV #27 (see details below) where we have Mark Wilson-Thomas...Did you know that DotNetSlackers also publishes .net articles written by top known .net Authors? We already have over 80 articles in several categories including Silverlight. Take a look: here.

Read the article
[News] NDepend 3.0 en version b?ta, 100% int?gr? dans Visual Studio 2010

Patrick Smacchia nous annonce la disponibilit? de NDepend V3 en b?ta. Cette version s'int?gre d?sormais dans Visual Studio .NET (2008, 2005 et 2010!). Vous pouvez d?couvrir sur le site de NDepend les copies d'?crans et les nouveaut?s. Un gros travail a aussi ?t? apport? aux performances. Voil? un bel outil ? acqu?rir pour Microsoft ...

Read the article
Putting WPF Controls into canvas with Visuals

- by Mikhail

I am writing a WPF chart and use Visuals for performance. The code looks like: public class DrawingCanvas2 : Canvas { private List<Visual> _visuals = new List<Visual>(); private List<Visual> _hits = new List<Visual>(); protected override Visual GetVisualChild( int index ) { return _visuals[index]; } protected override int VisualChildrenCount { get { return _visuals.Count; } } public void AddVisual( Visual visual ) { _visuals.Add( visual ); base.AddVisualChild( visual ); base.AddLogicalChild( visual ); } } Beside DrawingVisual elements (line, text) I need a ComboBox in the chart. So I tried this: public DrawingCanvas2() { ComboBox box = new ComboBox(); AddVisual( box ); box.Width = 100; box.Height = 30; Canvas.SetLeft( box, 10 ); Canvas.SetTop( box, 10 ); } but it does not work, there is no ComboBox displayed. What I am missing?

Read the article

< Previous Page | 340 341 342 343 344 345 346 347 348 349 350 351 | Next Page >