Search Results

Search found 7672 results on 307 pages for 'compiler optimization'.

Page 91/307 | < Previous Page | 87 88 89 90 91 92 93 94 95 96 97 98  | Next Page >

  • Does the <script> tag position in HTML affects performance of the webpage?

    - by Rahul Joshi
    If the script tag is above or below the body in a HTML page, does it matter for the performance of a website? And what if used in between like this: <body> ..blah..blah.. <script language="JavaScript" src="JS_File_100_KiloBytes"> function f1() { .. some logic reqd. for manipulating contents in a webpage } </script> ... some text here too ... </body> Or is this better?: <script language="JavaScript" src="JS_File_100_KiloBytes"> function f1() { .. some logic reqd. for manipulating contents in a webpage } </script> <body> ..blah..blah.. ..call above functions on some events like onclick,onfocus,etc.. </body> Or this one?: <body> ..blah..blah.. ..call above functions on some events like onclick,onfocus,etc.. <script language="JavaScript" src="JS_File_100_KiloBytes"> function f1() { .. some logic reqd. for manipulating contents in a webpage } </script> </body> Need not tell everything is again in the <html> tag!! How does it affect performance of webpage while loading? Does it really? Which one is the best, either out of these 3 or some other which you know? And one more thing, I googled a bit on this, from which I went here: Best Practices for Speeding Up Your Web Site and it suggests put scripts at the bottom, but traditionally many people put it in <head> tag which is above the <body> tag. I know it's NOT a rule but many prefer it that way. If you don't believe it, just view source of this page! And tell me what's the better style for best performance.

    Read the article

  • Type classe, generic memoization

    - by nicolas
    Something quite odd is happening with y types and I quite dont understand if this is justified or not. I would tend to think not. This code works fine : type DictionarySingleton private () = static let mutable instance = Dictionary<string*obj, obj>() static member Instance = instance let memoize (f:'a -> 'b) = fun (x:'a) -> let key = f.ToString(), (x :> obj) if (DictionarySingleton.Instance).ContainsKey(key) then let r = (DictionarySingleton.Instance).[key] r :?> 'b else let res = f x (DictionarySingleton.Instance).[key] <- (res :> obj) res And this ones complains type DictionarySingleton private () = static let mutable instance = Dictionary<string*obj, _>() static member Instance = instance let memoize (f:'a -> 'b) = fun (x:'a) -> let key = f.ToString(), (x :> obj) if (DictionarySingleton.Instance).ContainsKey(key) then let r = (DictionarySingleton.Instance).[key] r :?> 'b else let res = f x (DictionarySingleton.Instance).[key] <- (res :> obj) res The difference is only the underscore in the dictionary definition. The infered types are the same, but the dynamic cast from r to type 'b exhibits an error. 'this runtime coercition ... runtime type tests are not allowed on some types, etc..' Am I missing something or is it a rough edge ?

    Read the article

  • Where are the static methods in gcc's dump file.c.135r.jump

    - by Customizer
    When I run gcc with the parameter -fdump-rtl-jump, I get a dump file with the name file.c.135r.jump, where I can read some information about the intermediate representation of the methods in my C or C++ file. I just recently discovered, that the static methods of a project are missing in this dump file. Do you know, why they are missing in that representation and if there is a possibility to include the static methods in this file, too. Update (some additional information): The test program, I'm using here, is the Hybrid OpenMP MPI Benchmark. Update2: I just reproduced the problem with a serial application, so it has nothing to do with parallel sections.

    Read the article

  • "variable tracking" is eating my compile time!

    - by wowus
    I have an auto-generated file which looks something like this... static void do_SomeFunc1(void* parameter) { // Do stuff. } // Continues on for another 4000 functions... void dispatch(int id, void* parameter) { switch(id) { case ::SomeClass1::id: return do_SomeFunc1(parameter); case ::SomeClass2::id: return do_SomeFunc2(parameter); // This continues for the next 4000 cases... } } When I build it like this, the build time is enormous. If I inline all the functions automagically into their respective cases using my script, the build time is cut in half. GCC 4.5.0 says ~50% of the build time is being taken up by "variable tracking" when I use -ftime-report. What does this mean and how can I speed compilation while still maintaining the superior cache locality of pulling out the functions from the switch? EDIT: Interestingly enough, the build time has exploded only on debug builds, as per the following profiling information of the whole project (which isn't just the file in question, but still a good metric; the file in question takes the most time to build): Debug: 8 minutes 50 seconds Release: 4 minutes, 25 seconds

    Read the article

  • most efficient method of turning multiple 1D arrays into columns of a 2D array

    - by Ty W
    As I was writing a for loop earlier today, I thought that there must be a neater way of doing this... so I figured I'd ask. I looked briefly for a duplicate question but didn't see anything obvious. The Problem: Given N arrays of length M, turn them into a M-row by N-column 2D array Example: $id = [1,5,2,8,6] $name = [a,b,c,d,e] $result = [[1,a], [5,b], [2,c], [8,d], [6,e]] My Solution: Pretty straight forward and probably not optimal, but it does work: <?php // $row is returned from a DB query // $row['<var>'] is a comma separated string of values $categories = array(); $ids = explode(",", $row['ids']); $names = explode(",", $row['names']); $titles = explode(",", $row['titles']); for($i = 0; $i < count($ids); $i++) { $categories[] = array("id" => $ids[$i], "name" => $names[$i], "title" => $titles[$i]); } ?> note: I didn't put the name = value bit in the spec, but it'd be awesome if there was some way to keep that as well.

    Read the article

  • Optimizing an embedded SELECT query in mySQL

    - by Crazy Serb
    Ok, here's a query that I am running right now on a table that has 45,000 records and is 65MB in size... and is just about to get bigger and bigger (so I gotta think of the future performance as well here): SELECT count(payment_id) as signup_count, sum(amount) as signup_amount FROM payments p WHERE tm_completed BETWEEN '2009-05-01' AND '2009-05-30' AND completed > 0 AND tm_completed IS NOT NULL AND member_id NOT IN (SELECT p2.member_id FROM payments p2 WHERE p2.completed=1 AND p2.tm_completed < '2009-05-01' AND p2.tm_completed IS NOT NULL GROUP BY p2.member_id) And as you might or might not imagine - it chokes the mysql server to a standstill... What it does is - it simply pulls the number of new users who signed up, have at least one "completed" payment, tm_completed is not empty (as it is only populated for completed payments), and (the embedded Select) that member has never had a "completed" payment before - meaning he's a new member (just because the system does rebills and whatnot, and this is the only way to sort of differentiate between an existing member who just got rebilled and a new member who got billed for the first time). Now, is there any possible way to optimize this query to use less resources or something, and to stop taking my mysql resources down on their knees...? Am I missing any info to clarify this any further? Let me know... EDIT: Here are the indexes already on that table: PRIMARY PRIMARY 46757 payment_id member_id INDEX 23378 member_id payer_id INDEX 11689 payer_id coupon_id INDEX 1 coupon_id tm_added INDEX 46757 tm_added, product_id tm_completed INDEX 46757 tm_completed, product_id

    Read the article

  • Optimizing Vector elements swaps using CUDA

    - by Orion Nebula
    Hi all, Since I am new to cuda .. I need your kind help I have this long vector, for each group of 24 elements, I need to do the following: for the first 12 elements, the even numbered elements are multiplied by -1, for the second 12 elements, the odd numbered elements are multiplied by -1 then the following swap takes place: Graph: because I don't yet have enough points, I couldn't post the image so here it is: http://www.freeimagehosting.net/image.php?e4b88fb666.png I have written this piece of code, and wonder if you could help me further optimize it to solve for divergence or bank conflicts .. //subvector is a multiple of 24, Mds and Nds are shared memory _shared_ double Mds[subVector]; _shared_ double Nds[subVector]; int tx = threadIdx.x; int tx_mod = tx ^ 0x0001; int basex = __umul24(blockDim.x, blockIdx.x); Mds[tx] = M.elements[basex + tx]; __syncthreads(); // flip the signs if (tx < (tx/24)*24 + 12) { //if < 12 and even if ((tx & 0x0001)==0) Mds[tx] = -Mds[tx]; } else if (tx < (tx/24)*24 + 24) { //if >12 and < 24 and odd if ((tx & 0x0001)==1) Mds[tx] = -Mds[tx]; } __syncthreads(); if (tx < (tx/24)*24 + 6) { //for the first 6 elements .. swap with last six in the 24elements group (see graph) Nds[tx] = Mds[tx_mod + 18]; Mds [tx_mod + 18] = Mds [tx]; Mds[tx] = Nds[tx]; } else if (tx < (tx/24)*24 + 12) { // for the second 6 elements .. swp with next adjacent group (see graph) Nds[tx] = Mds[tx_mod + 6]; Mds [tx_mod + 6] = Mds [tx]; Mds[tx] = Nds[tx]; } __syncthreads(); Thanks in advance ..

    Read the article

  • approximating log10[x^k0 + k1]

    - by Yale Zhang
    Greetings. I'm trying to approximate the function Log10[x^k0 + k1], where .21 < k0 < 21, 0 < k1 < ~2000, and x is integer < 2^14. k0 & k1 are constant. For practical purposes, you can assume k0 = 2.12, k1 = 2660. The desired accuracy is 5*10^-4 relative error. This function is virtually identical to Log[x], except near 0, where it differs a lot. I already have came up with a SIMD implementation that is ~1.15x faster than a simple lookup table, but would like to improve it if possible, which I think is very hard due to lack of efficient instructions. My SIMD implementation uses 16bit fixed point arithmetic to evaluate a 3rd degree polynomial (I use least squares fit). The polynomial uses different coefficients for different input ranges. There are 8 ranges, and range i spans (64)2^i to (64)2^(i + 1). The rational behind this is the derivatives of Log[x] drop rapidly with x, meaning a polynomial will fit it more accurately since polynomials are an exact fit for functions that have a derivative of 0 beyond a certain order. SIMD table lookups are done very efficiently with a single _mm_shuffle_epi8(). I use SSE's float to int conversion to get the exponent and significand used for the fixed point approximation. I also software pipelined the loop to get ~1.25x speedup, so further code optimizations are probably unlikely. What I'm asking is if there's a more efficient approximation at a higher level? For example: Can this function be decomposed into functions with a limited domain like log2((2^x) * significand) = x + log2(significand) hence eliminating the need to deal with different ranges (table lookups). The main problem I think is adding the k1 term kills all those nice log properties that we know and love, making it not possible. Or is it? Iterative method? don't think so because the Newton method for log[x] is already a complicated expression Exploiting locality of neighboring pixels? - if the range of the 8 inputs fall in the same approximation range, then I can look up a single coefficient, instead of looking up separate coefficients for each element. Thus, I can use this as a fast common case, and use a slower, general code path when it isn't. But for my data, the range needs to be ~2000 before this property hold 70% of the time, which doesn't seem to make this method competitive. Please, give me some opinion, especially if you're an applied mathematician, even if you say it can't be done. Thanks.

    Read the article

  • Does the order of columns in a query matter?

    - by James Simpson
    When selecting columns from a MySQL table, is performance affected by the order that you select the columns as compared to their order in the table (not considering indexes that may cover the columns)? For example, you have a table with rows uid, name, bday, and you have the following query. SELECT uid, name, bday FROM table Does MySQL see the following query any differently and thus cause any sort of performance hit? SELECT uid, bday, name FROM table

    Read the article

  • install python modules on shared web hosting

    - by Ali
    I am using a shared hosting environment that will not give me access to the command line. Can I download the python module on my computer, compile it using python setup.py installand then simply upload a .py file to the web host? If yes, where does the install statement place the compiled file?

    Read the article

  • Script Speed vs Memory Usage

    - by Doug Neiner
    I am working on an image generation script in PHP and have gotten it working two ways. One way is slow but uses a limited amount of memory, the second is much faster, but uses 6x the memory . There is no leakage in either script (as far as I can tell). In a limited benchmark, here is how they performed: -------------------------------------------- METHOD | TOTAL TIME | PEAK MEMORY | IMAGES -------------------------------------------- One | 65.626 | 540,036 | 200 Two | 20.207 | 3,269,600 | 200 -------------------------------------------- And here is the average of the previous numbers (if you don't want to do your own math): -------------------------------------------- METHOD | TOTAL TIME | PEAK MEMORY | IMAGES -------------------------------------------- One | 0.328 | 540,036 | 1 Two | 0.101 | 3,269,600 | 1 -------------------------------------------- Which method should I use and why? I anticipate this being used by a high volume of users, with each user making 10-20 requests to this script during a normal visit. I am leaning toward the faster method because though it uses more memory, it is for a 1/3 of the time and would reduce the number of concurrent requests.

    Read the article

  • Is SQL DATEDIFF(year, ..., ...) an Expensive Computation?

    - by rlb.usa
    I'm trying to optimize up some horrendously complicated SQL queries because it takes too long to finish. In my queries, I have dynamically created SQL statements with lots of the same functions, so I created a temporary table where each function is only called once instead of many, many times - this cut my execution time by 3/4. So my question is, can I expect to see much of a difference if say, 1,000 datediff computations are narrowed to 100?

    Read the article

  • Data Access from single table in sql server 2005 is too slow

    - by Muhammad Kashif Nadeem
    Following is the script of table. Accessing data from this table is too slow. SET ANSI_NULLS ON GO SET QUOTED_IDENTIFIER ON GO CREATE TABLE [dbo].[Emails]( [id] [int] IDENTITY(1,1) NOT NULL, [datecreated] [datetime] NULL CONSTRAINT [DF_Emails_datecreated] DEFAULT (getdate()), [UID] [nvarchar](250) COLLATE Latin1_General_CI_AS NULL, [From] [nvarchar](100) COLLATE Latin1_General_CI_AS NULL, [To] [nvarchar](100) COLLATE Latin1_General_CI_AS NULL, [Subject] [nvarchar](max) COLLATE Latin1_General_CI_AS NULL, [Body] [nvarchar](max) COLLATE Latin1_General_CI_AS NULL, [HTML] [nvarchar](max) COLLATE Latin1_General_CI_AS NULL, [AttachmentCount] [int] NULL, [Dated] [datetime] NULL ) ON [PRIMARY] Following query takes 50 seconds to fetch data. select id, datecreated, UID, [From], [To], Subject, AttachmentCount, Dated from emails If I include Body and Html in select then time is event worse. indexes are on: id unique clustered From Non unique non clustered To Non unique non clustered Tabls has currently 180000+ records. There might be 100,000 records each month so this will become more slow as time will pass. Does splitting data into two table will solve the problem? What other indexes should be there?

    Read the article

  • How do polymorphic inline caches work with mutable types?

    - by kingkilr
    A polymorphic inline cache works by caching the actual method by the type of the object, in order to avoid the expensive lookup procedures (usually a hashtable lookup). How does one handle the type comparison if the type objects are mutable (i.e. the method might be monkey patched into something different at run time). The one idea I've come up with would be a "class counter" that gets incremented each time a method is adjusted, however this seems like it would be exceptionally expensive in a heavily monkey patched environ since it would kill all the PICs for that class, even if the methods for them weren't altered. I'm sure there must be a good solution to this, as this issue is directly applicable to Javascript and AFAIK all 3 of the big JS VMs have PICs (wow acronym ahoy).

    Read the article

  • Is this implementation truely tail-recursive?

    - by CFP
    Hello everyone! I've come up with the following code to compute in a tail-recursive way the result of an expression such as 3 4 * 1 + cos 8 * (aka 8*cos(1+(3*4))) The code is in OCaml. I'm using a list refto emulate a stack. type token = Num of float | Fun of (float->float) | Op of (float->float->float);; let pop l = let top = (List.hd !l) in l := List.tl (!l); top;; let push x l = l := (x::!l);; let empty l = (l = []);; let pile = ref [];; let eval data = let stack = ref data in let rec _eval cont = match (pop stack) with | Num(n) -> cont n; | Fun(f) -> _eval (fun x -> cont (f x)); | Op(op) -> _eval (fun x -> cont (op x (_eval (fun y->y)))); in _eval (fun x->x) ;; eval [Fun(fun x -> x**2.); Op(fun x y -> x+.y); Num(1.); Num(3.)];; I've used continuations to ensure tail-recursion, but since my stack implements some sort of a tree, and therefore provides quite a bad interface to what should be handled as a disjoint union type, the call to my function to evaluate the left branch with an identity continuation somehow irks a little. Yet it's working perfectly, but I have the feeling than in calling the _eval (fun y->y) bit, there must be something wrong happening, since it doesn't seem that this call can replace the previous one in the stack structure... Am I misunderstanding something here? I mean, I understand that with only the first call to _eval there wouldn't be any problem optimizing the calls, but here it seems to me that evaluation the _eval (fun y->y) will require to be stacked up, and therefore will fill the stack, possibly leading to an overflow... Thanks!

    Read the article

  • Main Function Error C++

    - by Arjun Nayini
    I have this main function: #ifndef MAIN_CPP #define MAIN_CPP #include "dsets.h" using namespace std; int main(){ DisjointSets s; s.uptree.addelements(4); for(int i=0; i<s.uptree.size(); i++) cout <<uptree.at(i) << endl; return 0; } #endif And the following class: class DisjointSets { public: void addelements(int x); int find(int x); void setunion(int x, int y); private: vector<int> uptree; }; #endif My implementation is this: void DisjointSets::addelements(int x){ for(int i=0; i<x; i++) uptree.push_back(-1); } //Given an int this function finds the root associated with that node. int DisjointSets::find(int x){ //need path compression if(uptree.at(x) < 0) return x; else return find(uptree.at(x)); } //This function reorders the uptree in order to represent the union of two //subtrees void DisjointSets::setunion(int x, int y){ } Upon compiling main.cpp (g++ main.cpp) I'm getting these errors: dsets.h: In function \u2018int main()\u2019: dsets.h:25: error: \u2018std::vector DisjointSets::uptree\u2019 is private main.cpp:9: error: within this context main.cpp:9: error: \u2018class std::vector \u2019 has no member named \u2018addelements\u2019 dsets.h:25: error: \u2018std::vector DisjointSets::uptree\u2019 is private main.cpp:10: error: within this context main.cpp:11: error: \u2018uptree\u2019 was not declared in this scope I'm not sure exactly whats wrong. Any help would be appreciated.

    Read the article

  • unroll nested for loops in C++

    - by Hristo
    How would I unroll the following nested loops? for(k = begin; k != end; ++k) { for(j = 0; j < Emax; ++j) { for(i = 0; i < N; ++i) { if (j >= E[i]) continue; array[k] += foo(i, tr[k][i], ex[j][i]); } } } I tried the following, but my output isn't the same, and it should be: for(k = begin; k != end; ++k) { for(j = 0; j < Emax; ++j) { for(i = 0; i+4 < N; i+=4) { if (j >= E[i]) continue; array[k] += foo(i, tr[k][i], ex[j][i]); array[k] += foo(i+1, tr[k][i+1], ex[j][i+1]); array[k] += foo(i+2, tr[k][i+2], ex[j][i+2]); array[k] += foo(i+3, tr[k][i+3], ex[j][i+3]); } if (i < N) { for (; i < N; ++i) { if (j >= E[i]) continue; array[k] += foo(i, tr[k][i], ex[j][i]); } } } } I will be running this code in parallel using Intel's TBB so that it takes advantage of multiple cores. After this is finished running, another function prints out what is in array[] and right now, with my unrolling, the output isn't identical. Any help is appreciated. Thanks, Hristo

    Read the article

  • Is it a solvable problem to generate a regular expression that matches some input set?

    - by Roman
    I provide some input set which contains known separated number of text blocks. I want to make a program that automatically generate 1 or more regular expressions each of which matches every text block in the input set. I see some relatively easy ways to implement a brute-force search. But I'm not an expert in compilers theory. That's why I'm curious: 1) is this problem solvable? or there are some principle impossibility to make such algorithm? 2) is it possible to achieve polynomial complexity for this algorithm and avoid brute forcing?

    Read the article

  • What does ER_WARN_FIELD_RESOLVED mean?

    - by VolkerK
    When SHOW WARNINGS after a EXPLAIN EXTENDED shows a Note 1276 Field or reference 'test.foo.bar' of SELECT #2 was resolved in SELECT #1 what exactly does that mean and what impact does it have? In my case it prevents mysql from using what seems to be a perfectly good index. But it's not about fixing that specific query (as it is an irrelevant test). I found http://dev.mysql.com/doc/refman/5.0/en/error-messages-server.html butError: 1276 SQLSTATE: HY000 (ER_WARN_FIELD_RESOLVED) Message: Field or reference '%s%s%s%s%s' of SELECT #%d was resolved in SELECT #%d isn't much of an explaination.

    Read the article

  • Where is the bottleneck in this code?

    - by Mikhail
    I have the following tight loop that makes up the serial bottle neck of my code. Ideally I would parallelize the function that calls this but that is not possible. //n is about 60 for (int k = 0;k < n;k++) { double fone = z[k*n+i+1]; double fzer = z[k*n+i]; z[k*n+i+1]= s*fzer+c*fone; z[k*n+i] = c*fzer-s*fone; } Are there any optimizations that can be made such as vectorization or some evil inline that can help this code? I am looking into finding eigen solutions of tridiagonal matrices. http://www.cimat.mx/~posada/OptDoglegGraph/DocLogisticDogleg/projects/adjustedrecipes/tqli.cpp.html

    Read the article

  • How to store static content across branches in a single location in version control

    - by Shravan
    [Just a random thought] I have a pdf doc that is downloaded when the user clicks on 'help' on my website. Now, this is a pretty huge document and is saved in version control (SVN) and is thus copied for all branches that exist in SVN. This is static content and something that developers are not working on, and does not change often. Is there a more efficient way to store it (that would not hamper local deployments) that would make SVN checkouts and updates relatively faster. I know the benefit we get is not huge, this is something that came to my head none the less.

    Read the article

  • Visual C++ doesn't operator<< overload

    - by PierreBdR
    I have a vector class that I want to be able to input/output from a QTextStream object. The forward declaration of my vector class is: namespace util { template <size_t dim, typename T> class Vector; } I define the operator<< as: namespace util { template <size_t dim, typename T> QTextStream& operator<<(QTextStream& out, const util::Vector<dim,T>& vec) { ... } template <size_t dim, typename T> QTextStream& operator>>(QTextStream& in,util::Vector<dim,T>& vec) { .. } } However, if I ty to use these operators, Visual C++ returns this error: error C2678: binary '<<' : no operator found which takes a left-hand operand of type 'QTextStream' (or there is no acceptable conversion) A few things I tried: Originaly, the methods were defined as friends of the template, and it is working fine this way with g++. The methods have been moved outside the namespace util I changed the definition of the templates to fit what I found on various Visual C++ websites. The original friend declaration is: friend QTextStream& operator>>(QTextStream& ss, Vector& in) { ... } The "Visual C++ adapted" version is: friend QTextStream& operator>> <dim,T>(QTextStream& ss, Vector<dim,T>& in); with the function pre-declared before the class and implemented after. I checked the file is correctly included using: #pragma message ("Including vector header") And everything seems fine. Doesn anyone has any idea what might be wrong?

    Read the article

  • Invalid instruction suffix for push when assembling with gas

    - by vitaut
    When assembling a file with GNU assembler I get the following error: hello.s:6: Error: invalid instruction suffix for `push' Here's the file that I'm trying to assemble: .text LC0: .ascii "Hello, world!\12\0" .globl _main _main: pushl %ebp movl %esp, %ebp subl $8, %esp andl $-16, %esp movl $0, %eax movl %eax, -4(%ebp) movl -4(%ebp), %eax call __alloca call ___main movl $LC0, (%esp) call _printf movl $0, %eax leave ret What is wrong here and how do I fix it?

    Read the article

< Previous Page | 87 88 89 90 91 92 93 94 95 96 97 98  | Next Page >