Search Results

Search found 7672 results on 307 pages for 'compiler optimization'.

Page 10/307 | < Previous Page | 6 7 8 9 10 11 12 13 14 15 16 17  | Next Page >

  • Performance Optimization for Matrix Rotation

    - by Summer_More_More_Tea
    Hello everyone: I'm now trapped by a performance optimization lab in the book "Computer System from a Programmer's Perspective" described as following: In a N*N matrix M, where N is multiple of 32, the rotate operation can be represented as: Transpose: interchange elements M(i,j) and M(j,i) Exchange rows: Row i is exchanged with row N-1-i A example for matrix rotation(N is 3 instead of 32 for simplicity): ------- ------- |1|2|3| |3|6|9| ------- ------- |4|5|6| after rotate is |2|5|8| ------- ------- |7|8|9| |1|4|7| ------- ------- A naive implementation is: #define RIDX(i,j,n) ((i)*(n)+(j)) void naive_rotate(int dim, pixel *src, pixel *dst) { int i, j; for (i = 0; i < dim; i++) for (j = 0; j < dim; j++) dst[RIDX(dim-1-j, i, dim)] = src[RIDX(i, j, dim)]; } I come up with an idea by inner-loop-unroll. The result is: Code Version Speed Up original 1x unrolled by 2 1.33x unrolled by 4 1.33x unrolled by 8 1.55x unrolled by 16 1.67x unrolled by 32 1.61x I also get a code snippet from pastebin.com that seems can solve this problem: void rotate(int dim, pixel *src, pixel *dst) { int stride = 32; int count = dim >> 5; src += dim - 1; int a1 = count; do { int a2 = dim; do { int a3 = stride; do { *dst++ = *src; src += dim; } while(--a3); src -= dim * stride + 1; dst += dim - stride; } while(--a2); src += dim * (stride + 1); dst -= dim * dim - stride; } while(--a1); } After carefully read the code, I think main idea of this solution is treat 32 rows as a data zone, and perform the rotating operation respectively. Speed up of this version is 1.85x, overwhelming all the loop-unroll version. Here are the questions: In the inner-loop-unroll version, why does increment slow down if the unrolling factor increase, especially change the unrolling factor from 8 to 16, which does not effect the same when switch from 4 to 8? Does the result have some relationship with depth of the CPU pipeline? If the answer is yes, could the degrade of increment reflect pipeline length? What is the probable reason for the optimization of data-zone version? It seems that there is no too much essential difference from the original naive version. EDIT: My test environment is Intel Centrino Duo processor and the verion of gcc is 4.4 Any advice will be highly appreciated! Kind regards!

    Read the article

  • Need help with basic optimization problem

    - by ??iu
    I know little of optimization problems, so hopefully this will be didactic for me: rotors = [1, 2, 3, 4...] widgets = ['a', 'b', 'c', 'd' ...] assert len(rotors) == len(widgets) part_values = [ (1, 'a', 34), (1, 'b', 26), (1, 'c', 11), (1, 'd', 8), (2, 'a', 5), (2, 'b', 17), .... ] Given a fixed number of widgets and a fixed number of rotors, how can you get a series of widget-rotor pairs that maximizes the total value where each widget and rotor can only be used once?

    Read the article

  • Any Javascript optimization benchmarks?

    - by int3
    I watched Nicholas Zakas' talk, Speed up your Javascript, with some interest. I liked how he benchmarked the various performance improvements created by various optimization techniques, e.g. reducing calls to deeply nested objects, changing loops to count down instead of up, etc. I would like to run these benchmarks myself though, to see exactly how our current browsers are faring. I guess it wouldn't be too difficult to cook up some timed loops, but I'd like to know if there are any existing implementations out there.

    Read the article

  • Java code optimization leads to numerical inaccuracies and errors

    - by rano
    I'm trying to implement a version of the Fuzzy C-Means algorithm in Java and I'm trying to do some optimization by computing just once everything that can be computed just once. This is an iterative algorithm and regarding the updating of a matrix, the clusters x pixels membership matrix U, this is the update rule I want to optimize: where the x are the element of a matrix X (pixels x features) and v belongs to the matrix V (clusters x features). And m is a parameter that ranges from 1.1 to infinity. The distance used is the euclidean norm. If I had to implement this formula in a banal way I'd do: for(int i = 0; i < X.length; i++) { int count = 0; for(int j = 0; j < V.length; j++) { double num = D[i][j]; double sumTerms = 0; for(int k = 0; k < V.length; k++) { double thisDistance = D[i][k]; sumTerms += Math.pow(num / thisDistance, (1.0 / (m - 1.0))); } U[i][j] = (float) (1f / sumTerms); } } In this way some optimization is already done, I precomputed all the possible squared distances between X and V and stored them in a matrix D but that is not enough, since I'm cycling througn the elements of V two times resulting in two nested loops. Looking at the formula the numerator of the fraction is independent of the sum so I can compute numerator and denominator independently and the denominator can be computed just once for each pixel. So I came to a solution like this: int nClusters = V.length; double exp = (1.0 / (m - 1.0)); for(int i = 0; i < X.length; i++) { int count = 0; for(int j = 0; j < nClusters; j++) { double distance = D[i][j]; double denominator = D[i][nClusters]; double numerator = Math.pow(distance, exp); U[i][j] = (float) (1f / (numerator * denominator)); } } Where I precomputed the denominator into an additional column of the matrix D while I was computing the distances: for (int i = 0; i < X.length; i++) { for (int j = 0; j < V.length; j++) { double sum = 0; for (int k = 0; k < nDims; k++) { final double d = X[i][k] - V[j][k]; sum += d * d; } D[i][j] = sum; D[i][B.length] += Math.pow(1 / D[i][j], exp); } } By doing so I encounter numerical differences between the 'banal' computation and the second one that leads to different numerical value in U (not in the first iterates but soon enough). I guess that the problem is that exponentiate very small numbers to high values (the elements of U can range from 0.0 to 1.0 and exp , for m = 1.1, is 10) leads to ver y small values, whereas by dividing the numerator and the denominator and THEN exponentiating the result seems to be better numerically. The problem is it involves much more operations. Am I doing something wrong? Is there a possible solution to get both the code optimized and numerically stable? Any suggestion or criticism will be appreciated.

    Read the article

  • Prioritize compiler functionality/tasks, when designing a new language

    - by Mahdi
    Well, the question should be so hard to ask and I expect couple of down votes, however, I'm really interested to have your ideas and recommendations. :) I've already made a very simple compiler, with a few and limited functionality. Now I'm getting more on it to make it more like a real-world compiler. I definitely need to start over 'cause I've much more experience and ideas in this area rather a few years ago. So, I want to know, right now, from the very first step again, which tasks/features for the new compiler should implement first and which tasks has lower priority rather than others? For example, I'd say, first I'd go to decide about the object-oriented structure for the new language, but you might say, hey, just go for a compiler that could define a variable, when you finished that, then start thinking about OOP designs ... I prefer to hear the pros and cons for your suggestions also. Actually I like to start from Bottom to Top, where I could add simplest tasks first, and later adding more complex ones, but I'm totally open for any new ideas, and really appreciate that. Also please consider that I'm thinking about the design concepts. Actually I expect answers like: Priority from Highest to Lowest: variables, because .... functions, because .... loops, because .... ... Not: define a syntax for your new language, and start parsing your source code ...

    Read the article

  • Reliance on the compiler

    - by koan
    I've been programming in C and C++ for some time, although I would say I'm far from being expert. For some time I've been using various strategies to develop my code such as unit tests, test driven design, code reviews and so on. When I wrote my first programs in BASIC I typed in long listings before finding they would not run and they were a nightmare to debug. So I learnt to write a small bit and then test it. These days I often find myself repeatedly writing a small bit of code then using the compiler to find all the mistakes. That's OK if it picks up a typo but when you start adjusting the parameters types etc just to make it compile you can screw up the design. It also seems that the compiler is creeping into the design process when it should only be used for checking syntax. There's a danger here of over reliance on the compiler to make my programs better. Are there better strategies than this ? I vaguely remember some time ago an article on a company developing a type of C compiler where an extra header file also specified the prototypes. The idea was that inconsistencies in the API definition would be easier to catch if you had to define it twice in different ways.

    Read the article

  • MSBuild 4 fails to build VS2008 csproj due to 1 compiler warning

    - by David White
    We have a VS2008 CS DLL project targeting .NET 3.5. It builds successfully on our CI server when using MSBuild 3.5. When CI is upgraded to use MSBuild 4.0, the same project fails to build, due to 1 warning message: c:\WINDOWS\Microsoft.NET\Framework\v3.5\Microsoft.Common.targets(1418,9): warning MSB3283: Cannot find wrapper assembly for type library "ADODB". The warning does not occur with MSBuild 3.5, and I'm surprised that it results in Build FAILED. We do not have the project set to treat warnings as errors. All our other projects build successfully with either version of MSBuild.

    Read the article

  • g++ C++0x enum class Compiler Warnings

    - by Travis G
    I've been refactoring my horrible mess of C++ type-safe psuedo-enums to the new C++0x type-safe enums because they're way more readable. Anyway, I use them in exported classes, so I explicitly mark them to be exported: enum class __attribute__((visibility("default"))) MyEnum : unsigned int { One = 1, Two = 2 }; Compiling this with g++ yields the following warning: type attributes ignored after type is already defined This seems very strange, since, as far as I know, that warning is meant to prevent actual mistakes like: class __attribute__((visibility("default"))) MyClass { }; class __attribute__((visibility("hidden"))) MyClass; Of course, I'm clearly not doing that, since I have only marked the visibility attributes at the definition of the enum class and I'm not re-defining or declaring it anywhere else (I can duplicate this error with a single file). Ultimately, I can't make this bit of code actually cause a problem, save for the fact that, if I change a value and re-compile the consumer without re-compiling the shared library, the consumer passes the new values and the shared library has no idea what to do with them (although I wouldn't expect that to work in the first place). Am I being way too pedantic? Can this be safely ignored? I suspect so, but at the same time, having this error prevents me from compiling with Werror, which makes me uncomfortable. I would really like to see this problem go away.

    Read the article

  • CSharpCodeProvider doesn't return compiler warnings when there are no errors

    - by Sandor Drieënhuizen
    I'm using the CSharpCodeProvider class to compile a C# script which I use as a DSL in my application. When there are warnings but no errors, the Errors property of the resulting CompilerResults instance contains no items. But when I introduce an error, the warnings suddenly get listed in the Errors property as well. string script = @" using System; using System; \\ generate a warning namespace MyNamespace { public class MyClass { public void MyMethod() { \\ uncomment the next statement to generate an error \\intx = 0; } } } "; CSharpCodeProvider provider = new CSharpCodeProvider( new Dictionary<string, string>() { { "CompilerVersion", "v4.0" } }); CompilerParameters compilerParameters = new CompilerParameters(); compilerParameters.GenerateExecutable = false; compilerParameters.GenerateInMemory = true; CompilerResults results = provider.CompileAssemblyFromSource( compilerParameters, script); foreach (CompilerError error in results.Errors) { Console.Write(error.IsWarning ? "Warning: " : "Error: "); Console.WriteLine(error.ErrorText); } So how to I get hold of the warnings when there are no errors? By the way, I don't want to set TreatWarningsAsErrors to true.

    Read the article

  • creating an executable file without a compiler

    - by Alterlife
    I came across an article a long while ago on how to write out a .com file directly without using any external tools. the method was to basically copy con myfile.com and then hit ctrl+alt+number for each instruction. I've lost the url for the guide... Google isn't helping much either. If you have the link, please could you post it.

    Read the article

  • Need help figuring out scala compiler errors.

    - by klactose
    Hello all, I have been working on a project in scala, but I am getting some error messages that I don't quite understand. The classes that I am working with are relatively simple. For example: abstract class Shape case class Point(x: Int, y: Int) extends Shape case class Polygon(points: Point*) extends Shape Now suppose that I create a Polygon: val poly = new Polygon(new Point(2,5), new Point(7,0), new Point(3,1)) Then if I attempt to determine the location and size of the smallest possible rectangle that could contain the polygon, I get various errors that I don't quite understand. Below are snippets of different attempts and the corresponding error messages that they produce. val upperLeftX = poly.points.reduceLeft(Math.min(_.x, _.x)) Gives the error: "missing parameter type for expanded function ((x$1) = x$1.x)" val upperLeftX = poly.points.reduceLeft((a: Point, b: Point) => (Math.min(a.x, b.x))) Gives this error: "type mismatch; found : (Point, Point) = Int required: (Any, Point) = Any" I am very confused about both of these error messages. If anyone could explain more clearly what I am doing incorrectly, I would really appreciate it. Yes, I see that the second error says that I need type "Any" but I don't understand exactly how to implement a change that would work as I need it. Obviously simply changing "a: Point" to "a: Any" is not a viable solution, so what am I missing?

    Read the article

  • Most popular compiler?

    - by Russel
    Hello, I've been using MS Visual Studio for a lot of projects, but I notice a lot of people here like to complain about Microsoft and Visual Studio. So I'm wondering, what does everyone use? Dev-C++? mingw? What is popular? Also, what is bad about MSVS? What is "better" about the others? Thanks! --RKL

    Read the article

  • PHP website Optimization

    - by ana
    I have a high traffic website and I need make sure my site is fast enough to display my pages to everyone rapidly. I searched on Google many articles about speed and optimization and here's what I found: Cache the page Save it to the disk Caching the page in memory: This is very fast but if I need to change the content of my page I have to remove it from cache and then re-save the file on the disk. Save it to disk This is very easy to maintain but every time the page is accessed I have to read on the disk. Which method should I go with? Thanks

    Read the article

  • Bug in VS2008 compiler : DLL cannot be found

    - by BDotA
    I made some changes to my solution which contains a couple of project and wanted to compile it again .. now it says Metadata file C:\myproject\bin\myproject.DLL could not be found... I closed the VS and opened again and also deleted the bin and obj folder of that project, but still the same compile error...

    Read the article

  • iPhone Development - Compiler Warning!

    - by Mustafa
    Sometimes when i try to "build"/compile a downloaded source, i get following warning: ld: warning: directory '/Volumes/Skiiing2/CD/ViewBased/Unknown Path/System/Library/Frameworks' following -F not found Has anyone else seen this issue?

    Read the article

  • Performance optimization strategies of last resort?

    - by jerryjvl
    There are plenty of performance questions on this site already, but it occurs to me that almost all are very problem-specific and fairly narrow. And almost all repeat the advice to avoid premature optimization. Let's assume: the code already is working correctly the algorithms chosen are already optimal for the circumstances of the problem the code has been measured, and the offending routines have been isolated all attempts to optimize will also be measured to ensure they do not make matters worse What I am looking for here is strategies and tricks to squeeze out up to the last few percent in a critical algorithm when there is nothing else left to do but whatever it takes. Ideally, try to make answers language agnostic, and indicate any down-sides to the suggested strategies where applicable. I'll add a reply with my own initial suggestions, and look forward to whatever else the SO community can think of.

    Read the article

  • G++, compiler warnings, c++ templates

    - by Ian
    During the compilatiion of the C++ program those warnings appeared: c:/MinGW/bin/../lib/gcc/mingw32/3.4.5/../../../../include/c++/3.4.5/bc:/MinGW/bin/../lib/gcc/mingw32/3.4.5/../../../../include/c++/3.4.5/bits/stl_algo.h:2317: instantiated from `void std::partial_sort(_RandomAccessIterator, _RandomAccessIterator, _RandomAccessIterator, _Compare) [with _RandomAccessIterator = __gnu_cxx::__normal_iterator<Object<double>**, std::vector<Object<double>*, std::allocator<Object<double>*> > >, _Compare = sortObjects<double>]' c:/MinGW/bin/../lib/gcc/mingw32/3.4.5/../../../../include/c++/3.4.5/bits/stl_algo.h:2506: instantiated from `void std::__introsort_loop(_RandomAccessIterator, _RandomAccessIterator, _Size, _Compare) [with _RandomAccessIterator = __gnu_cxx::__normal_iterator<Object<double>**, std::vector<Object<double>*, std::allocator<Object<double>*> > >, _Size = int, _Compare = sortObjects<double>]' c:/MinGW/bin/../lib/gcc/mingw32/3.4.5/../../../../include/c++/3.4.5/bits/stl_algo.h:2589: instantiated from `void std::sort(_RandomAccessIterator, _RandomAccessIterator, _Compare) [with _RandomAccessIterator = __gnu_cxx::__normal_iterator<Object<double>**, std::vector<Object<double>*, std::allocator<Object<double>*> > >, _Compare = sortObjects<double>]' io/../structures/objects/../../algorithm/analysis/../../structures/list/ObjectsList.hpp:141: instantiated from `void ObjectsList <T>::sortObjects(unsigned int, T, T, T, T, unsigned int) [with T = double]' I do not why, because all objects have only template parameter T, their local variables are also T. The only place, where I am using double is main. There are objects of type double creating and adding into the ObjectsList... Object <double> o1; ObjectsList <double> olist; olist.push_back(o1); .... T xmin = ..., ymin = ..., xmax = ..., ymax = ...; unsigned int n = ...; olist.sortAllObjects(xmin, ymin, xmax, ymax, n); and comparator template <class T> class sortObjects { private: unsigned int n; T xmin, ymin, xmax, ymax; public: sortObjects ( const T xmin_, const T ymin_, const T xmax_, const T ymax_, const int n_ ) : xmin ( xmin_ ), ymin ( ymin_ ), xmax ( xmax_ ), ymax ( ymax_ ), n ( n_ ) {} bool operator() ( const Object <T> *o1, const Object <T> *o2 ) const { T dmax = (std::max) ( xmax - xmin, ymax - ymin ); T x_max = ( xmax - xmin ) / dmax; T y_max = ( ymax - ymin ) / dmax; ... return ....; } representing ObjectsList method: template <class T> void ObjectsList <T> ::sortAllObjects ( const T xmin, const T ymin, const T xmax, const T ymax, const unsigned int n ) { std::sort ( objects.begin(), objects.end(), sortObjects <T> ( xmin, ymin, xmax, ymax, n ) ); }

    Read the article

  • Compiler/Linking Error: Freedup

    - by nym
    I've been trying to compile a program for hardlinking duplicate files called freedup. I did try to email the author/maintainer of the program, but it's been a long time and I haven't heard anything back from him. I'm trying to compile the program from a cygwin environment using the latest stable versions of gcc (3.4.4-999) and make (3.81-2). I've tried deleting all the object files and running make, but I always get the following error: freedup.o: In function 'main': /home/[user]/freedup-1.5/freedup.c:1791: undefined reference to '_hashed' collect2: ld returned 1 exit status make: * * * [freedup] Error 1 I did take a look at the source code and saw that the "hashed" function is an inline function (which I didn't think had to be declared outside of the source file... but that's just what I gathered from some preliminary googling). If anyone would be kind enough to try compiling this program in a windows environment and has any luck, I'd really appreciate it. Thanks The direct link for the source files is: http://freedup.org/freedup-1.5-3-src.tgz

    Read the article

  • Does MATLAB perform tail call optimization?

    - by Shea Levy
    I've recently learned Haskell, and am trying to carry the pure functional style over to my other code when possible. An important aspect of this is treating all variables as immutable, i.e. constants. In order to do so, many computations that would be implemented using loops in an imperative style have to be performed using recursion, which typically incurs a memory penalty due to the allocation a new stack frame for each function call. In the special case of a tail call (where the return value of a called function is immediately returned to the callee's caller), however, this penalty can be bypassed by a process called tail call optimization (in one method, this can be done by essentially replacing a call with a jmp after setting up the stack properly). Does MATLAB perform TCO by default, or is there a way to tell it to?

    Read the article

  • compiler directive defensive programming for adding ints to nsmuatablearray FMDB/EGODB

    - by johndpope
    I would like to throw a warning message when users try to add an int to an nsmutablearray basically any insert statement that includes values that are not nsstring / nsnumber cause run time crashes. It's exactly the same crash you get when you type %@ instead of %d NSLog(int); The crash is ok, but I want to throw a friendly 'FATAL' message to user. so far I have this try catch with isKindOfClass NSObject but ints are slipping through. #define FATAL_MSG "FATAL: object is not an NSObject subclass. Are you using int? use [NSNumber numberWithInt:1] \n" #define VAToArray(firstarg) ({\ NSMutableArray* valistArray = [NSMutableArray array];\ id obj = nil;\ va_list arguments;\ va_start(arguments, sql);\ @try { \ while ((obj = va_arg(arguments, id))) {\ if([obj isKindOfClass:[NSObject class]]) [valistArray addObject:obj];\ else printf(FATAL_MSG); \ }\ } \ @catch(NSException *exception){ \ printf(FATAL_MSG); \ } \ va_end(arguments);\ valistArray;\ }) - (void)test:(NSString*)sql,... { NSLog(@"VAToArray :%@",VAToArray(sql)); } // then call this [self test:@"str",@"test",nil]; when I call this [self test:@"str",2,nil]; throw the error message.

    Read the article

  • A compiler for automata theory

    - by saadtaame
    I'm designing a programming language for automata theory. My goal is to allow programmers to use machines (DFA, NFA, etc...) as units in expressions. I'm confused whether the language should be compiled, interpreted, or jit-compiled! My intuition is that compilation is a good choice, for some operations might take too much time (converting NFA's to equivalent DFA's can be expensive). Translating to x86 seems good. There is one issue however: I want the user to be able to plot machines. Any ideas?

    Read the article

  • when is java faster than c++ (or when is JIT faster then precompiled)?

    - by kostja
    I have heard that under certain circumstances, Java programs or rather parts of java programs are able to be executed faster than the "same" code in C++ (or other precompiled code) due to JIT optimizations. This is due to the compiler being able to determine the scope of some variables, avoid some conditionals and pull similar tricks at runtime. Could you give an (or better - some) example, where this applies? And maybe outline the exact conditions under which the compiler is able to optimize the bytecode beyond what is possible with precompiled code? NOTE : This question is not about comparing Java to C++. Its about the possibilities of JIT compiling. Please no flaming. I am also not aware of any duplicates. Please point them out if you are.

    Read the article

  • switching C compiler causes: Error Initializer cannot be specified for a flexible array member

    - by user1054210
    I am trying to convert our code from one IDE to be used in a different one. The current one uses gcc which allows for this structure to be initialized from a variable array. The new tool does not use gcc gives me an error "Initializer cannot be specified for a flexible array member". So can someone help me understand how to set this up? Should I set up a blank array of variable size and then somewhere assign the #define array as seen below? Here would be an example of the code…(this is current implementation current IDE) In one header file that is Build switchable so we can build this on different hardware platforms we have the following #define #define GPIOS \ /* BANK, PIN, SPD, MODE,… */ GPIOINIT( A, 0, 2, AIN, …) \ GPIOINIT( A, 1, 2, AIN, …) \ GPIOINTINIT(A, 2, 2, AIN, …) \ . . . Then in a different header file that is used in all builds we have PLATFORM_CONFIG_T g_platformConfig = { .name = {PLATFORM_NAME}, (bunch of other stuff), .allGpios = { GPIOS /* here I get the error */ }, }; So I am thinking I can make the error line a variable array and assign to it later in some other way? The problem is the actual array "GPIO" is of different types and pin orders on different designs are different.

    Read the article

< Previous Page | 6 7 8 9 10 11 12 13 14 15 16 17  | Next Page >