Search Results

Search found 3641 results on 146 pages for 'threads'.

Page 5/146 | < Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12  | Next Page >

  • Is my way of doing threads in Android correct?

    - by Charlie
    Hi, I'm writing a live wallpaper, and I'm forking off two separate threads in my main wallpaper service. One updates, and the other draws. I was under the impression that once you call thread.start(), it took care of everything for you, but after some trial and error, it seems that if I want my update and draw threads to keep running, I have to manually keep calling their run() methods? In other words, instead of calling start() on both threads and forgetting, I have to manually set up a delayed handler event that calls thread.run() on both the update and draw threads every 16 milliseconds. Is this the correct way of having a long running thread? Also, to kill threads, I'm just setting them to be daemons, then nulling them out. Is this method ok? Most examples I see use some sort of join() / interrupt() in a while loop...I don't understand that one...

    Read the article

  • Performance degrades for more than 2 threads on Xeon X5355

    - by zoolii
    Hi All, I am writing an application using boost threads and using boost barriers to synchronize the threads. I have two machines to test the application. Machine 1 is a core2 duo (T8300) cpu machine (windows XP professional - 4GB RAM) where I am getting following performance figures : Number of threads :1 , TPS :21 Number of threads :2 , TPS :35 (66 % improvement) further increase in number of threads decreases the TPS but that is understandable as the machine has only two cores. Machine 2 is a 2 quad core ( Xeon X5355) cpu machine (windows 2003 server with 4GB RAM) and has 8 effective cores. Number of threads :1 , TPS :21 Number of threads :2 , TPS :27 (28 % improvement) Number of threads :4 , TPS :25 Number of threads :8 , TPS :24 As you can see, performance is degrading after 2 threads (though it has 8 cores). If the program has some bottle neck , then for 2 thread also it should have degraded. Any idea? , Explanations ? , Does the OS has some role in performance ? - It seems like the Core2duo (2.4GHz) scales better than Xeon X5355 (2.66GHz) though it has better clock speed. Thank you -Zoolii

    Read the article

  • Linux per-process resource limits - a deep Red Hat Mystery

    - by BobBanana
    I have my own multithreaded C program which scales in speed smoothly with the number of CPU cores.. I can run it with 1, 2, 3, etc threads and get linear speedup.. up to about 5.5x speed on a 6-core CPU on a Ubuntu Linux box. I had an opportunity to run the program on a very high end Sunfire x4450 with 4 quad-core Xeon processors, running Red Hat Enterprise Linux. I was eagerly anticipating seeing how fast the 16 cores could run my program with 16 threads.. But it runs at the same speed as just TWO threads! Much hair-pulling and debugging later, I see that my program really is creating all the threads, they really are running simultaneously, but the threads themselves are slower than they should be. 2 threads runs about 1.7x faster than 1, but 3, 4, 8, 10, 16 threads all run at just net 1.9x! I can see all the threads are running (not stalled or sleeping), they're just slow. To check that the HARDWARE wasn't at fault, I ran SIXTEEN copies of my program independently, simultaneously. They all ran at full speed. There really are 16 cores and they really do run at full speed and there really is enough RAM (in fact this machine has 64GB, and I only use 1GB per process). So, my question is if there's some OPERATING SYSTEM explanation, perhaps some per-process resource limit which automatically scales back thread scheduling to keep one process from hogging the machine. Clues are: My program does not access the disk or network. It's CPU limited. Its speed scales linearly on a single CPU box in Ubuntu Linux with a hexacore i7 for 1-6 threads. 6 threads is effectively 6x speedup. My program never runs faster than 2x speedup on this 16 core Sunfire Xeon box, for any number of threads from 2-16. Running 16 copies of my program single threaded runs perfectly, all 16 running at once at full speed. top shows 1600% of CPUs allocated. /proc/cpuinfo shows all 16 cores running at full 2.9GHz speed (not low frequency idle speed of 1.6GHz) There's 48GB of RAM free, it is not swapping. What's happening? Is there some process CPU limit policy? How could I measure it if so? What else could explain this behavior? Thanks for your ideas to solve this, the Great Xeon Slowdown Mystery of 2010!

    Read the article

  • Threads slowing down application and not working properly

    - by Belgin
    I'm making a software renderer which does per-polygon rasterization using a floating point digital differential analyzer algorithm. My idea was to create two threads for rasterization and have them work like so: one thread draws each even scanline in a polygon and the other thread draws each odd scanline, and they both start working at the same time, but the main application waits for both of them to finish and then pauses them before continuing with other computations. As this is the first time I'm making a threaded application, I'm not sure if the following method for thread synchronization is correct: First of all, I use two global variables to control the two threads, if a global variable is set to 1, that means the thread can start working, otherwise it must not work. This is checked by the thread running an infinite loop and if it detects that the global variable has changed its value, it does its job and then sets the variable back to 0 again. The main program also uses an empty while to check when both variables become 0 after setting them to 1. Second, each thread is assigned a global structure which contains information about the triangle that is about to be rasterized. The structures are filled in by the main program before setting the global variables to 1. My dilemma is that, while this process works under some conditions, it slows down the program considerably, and also it fails to run properly when compiled for Release in Visual Studio, or when compiled with any sort of -O optimization with gcc (i.e. nothing on screen, even SEGFAULTs). The program isn't much faster by default without threads, which you can see for yourself by commenting out the #define THREADS directive, but if I apply optimizations, it becomes much faster (especially with gcc -Ofast -march=native). N.B. It might not compile with gcc because of fscanf_s calls, but you can replace those with the usual fscanf, if you wish to use gcc. Because there is a lot of code, too much for here or pastebin, I created a git repository where you can view it. My questions are: Why does adding these two threads slow down my application? Why doesn't it work when compiling for Release or with optimizations? Can I speed up the application with threads? If so, how? Thanks in advance.

    Read the article

  • Does Apache ever give incorrect "out of threads" errors?

    - by Eli Courtwright
    Lately our Apache web server has been giving us this error multiple times per day: [Tue Apr 06 01:07:10 2010] [error] Server ran out of threads to serve requests. Consider raising the ThreadsPerChild setting We raised our ThreadsPerChild setting from 50 to 100, but we still get the error. Our access logs indicate that these errors never even happen at periods of high load. For example, here's an excerpt from our access log (ip addresses and some urls are edited for privacy). As you can see, the above error happened at 1:07 and only a small handful of requests occurred in the several minutes leading up to the error: 99.88.77.66 - - [06/Apr/2010:00:59:33 -0400] "GET /WebRepository/jquery/jquery-ui-1.7.1.custom/css/smoothness/images/ui-icons_222222_256x240.png HTTP/1.1" 304 - 99.88.77.66 - - [06/Apr/2010:00:59:34 -0400] "GET /WebRepository/jquery/jquery-ui-1.7.1.custom/css/smoothness/images/ui-bg_glass_75_dadada_1x400.png HTTP/1.1" 200 111 99.88.77.66 - - [06/Apr/2010:00:59:34 -0400] "GET /WebRepository/jquery/jquery-ui-1.7.1.custom/css/smoothness/images/ui-bg_glass_75_dadada_1x400.png HTTP/1.1" 200 111 99.88.77.66 - mpeu [06/Apr/2010:00:59:40 -0400] "GET /some/dynamic/content HTTP/1.1" 200 145049 55.44.33.22 - mpeu [06/Apr/2010:01:06:56 -0400] "GET /other/dynamic/content HTTP/1.1" 200 12311 55.44.33.22 - - [06/Apr/2010:01:06:56 -0400] "GET /WebRepository/jquery/jquery-ui-1.7.1.custom/css/smoothness/jquery-ui-1.7.1.custom.css HTTP/1.1" 304 - 55.44.33.22 - - [06/Apr/2010:01:06:56 -0400] "GET /WebRepository/jquery/jquery-ui-1.7.1.custom/js/jquery-1.3.2.min.js HTTP/1.1" 304 - 55.44.33.22 - - [06/Apr/2010:01:06:56 -0400] "GET /WebRepository/jquery/jquery-ui-1.7.1.custom/js/jquery-ui-1.7.1.custom.min.js HTTP/1.1" 304 - 55.44.33.22 - - [06/Apr/2010:01:06:56 -0400] "GET /WebRepository/jquery.tablesorter.min.js HTTP/1.1" 304 - 55.44.33.22 - - [06/Apr/2010:01:06:56 -0400] "GET /WebRepository/date.js HTTP/1.1" 304 - 55.44.33.22 - - [06/Apr/2010:01:06:56 -0400] "GET /WebRepository/pdfs/image1.gif HTTP/1.1" 304 - 55.44.33.22 - - [06/Apr/2010:01:06:56 -0400] "GET /WebRepository/pdfs/image2.png HTTP/1.1" 304 - 55.44.33.22 - - [06/Apr/2010:01:06:56 -0400] "GET /WebRepository/pdfs/image3.png HTTP/1.1" 304 - 55.44.33.22 - - [06/Apr/2010:01:06:56 -0400] "GET /WebRepository/pdfs/image4.png HTTP/1.1" 304 - 55.44.33.22 - - [06/Apr/2010:01:06:56 -0400] "GET /WebRepository/pdfs/image5.png HTTP/1.1" 304 - 55.44.33.22 - - [06/Apr/2010:01:06:56 -0400] "GET /WebRepository/pdfs/image6.png HTTP/1.1" 304 - 55.44.33.22 - - [06/Apr/2010:01:06:56 -0400] "GET /WebRepository/pdfs/image7.png HTTP/1.1" 304 - 55.44.33.22 - - [06/Apr/2010:01:06:57 -0400] "GET /WebRepository/pdfs/image8.png HTTP/1.1" 304 - 55.44.33.22 - - [06/Apr/2010:01:06:57 -0400] "GET /WebRepository/pdfs/image9.png HTTP/1.1" 304 - 55.44.33.22 - - [06/Apr/2010:01:06:57 -0400] "GET /WebRepository/pdfs/imageA.png HTTP/1.1" 304 - 55.44.33.22 - - [06/Apr/2010:01:06:57 -0400] "GET /WebRepository/jquery/jquery-ui-1.7.1.custom/css/smoothness/images/ui-bg_flat_75_ffffff_40x100.png HTTP/1.1" 304 - 55.44.33.22 - - [06/Apr/2010:01:06:59 -0400] "GET /WebRepository/jquery/jquery-ui-1.7.1.custom/css/smoothness/images/ui-bg_highlight-soft_75_cccccc_1x100.png HTTP/1.1" 304 - 55.44.33.22 - - [06/Apr/2010:01:06:59 -0400] "GET /WebRepository/jquery/jquery-ui-1.7.1.custom/css/smoothness/images/ui-bg_glass_75_e6e6e6_1x400.png HTTP/1.1" 200 110 55.44.33.22 - - [06/Apr/2010:01:06:59 -0400] "GET /WebRepository/jquery/jquery-ui-1.7.1.custom/css/smoothness/images/ui-bg_glass_75_e6e6e6_1x400.png HTTP/1.1" 200 110 11.22.33.44 - mpeu [06/Apr/2010:01:18:03 -0400] "GET /other/dynamic/content HTTP/1.1" 200 12311 11.22.33.44 - - [06/Apr/2010:01:18:03 -0400] "GET /WebRepository/jquery/jquery-ui-1.7.1.custom/js/jquery-1.3.2.min.js HTTP/1.1" 304 - 11.22.33.44 - - [06/Apr/2010:01:18:04 -0400] "GET /WebRepository/jquery/jquery-ui-1.7.1.custom/css/smoothness/jquery-ui-1.7.1.custom.css HTTP/1.1" 200 27374 11.22.33.44 - - [06/Apr/2010:01:18:04 -0400] "GET /WebRepository/jquery/jquery-ui-1.7.1.custom/js/jquery-ui-1.7.1.custom.min.js HTTP/1.1" 304 - 11.22.33.44 - - [06/Apr/2010:01:18:04 -0400] "GET /WebRepository/jquery.tablesorter.min.js HTTP/1.1" 200 12795 11.22.33.44 - - [06/Apr/2010:01:18:04 -0400] "GET /WebRepository/date.js HTTP/1.1" 200 25809 For what it's worth, we're running the version of Apache that ships with Oracle 10g (some 2.0 version), and we're using mod_plsql to generate our dynamic content. Since the Apache server runs as a separate process and the database doesn't record any problems when this error occurs, I'm doubtful that Oracle is the problem. Unfortunately, the errors are freaking out our sysadmins, who are inclined to blame any and all problems which occur with the server on this error. Is this a known bug in Apache that I simply haven't been able to find any reference to through Google?

    Read the article

  • Why am I getting a segmentation fault when I use binmode with threads in Perl?

    - by jAndy
    Hi Folks, this call my $th = threads->create(\&print, "Hello thread World!\n"); $th->join(); works fine. But as soon as I add binmode(STDOUT, ":encoding(ISO-8859-1)"); to my script file, I get an error like "segmentation fault", "access denied". What is wrong to define an encoding type when trying to call a perl thread? Example: use strict; use warnings; use threads; binmode(STDOUT, ":encoding(ISO-8859-1)"); my $th = threads->create(\&print, "Hello thread World!\n"); $th->join(); sub print { print @_; } This code does not work for me. Kind Regards --Andy

    Read the article

  • Multithreading synchronization interview question: Find n words given m threads

    - by rplusg
    I came across this question: You are given a paragraph , which contain n number of words, you are given m threads. What you need to do is , each thread should print one word and give the control to next thread, this way each thread will keep on printing one word , in case last thread come, it should invoke the first thread. Printing will repeat until all the words are printed in paragraph. Finally all threads should exit gracefully. What kind of synchronization will use? I strongly feel we cannot take any advantage of threads here but interviewer is trying to understand my synchronization skills? No need of code, just put some thoughts. I will implement by myself.

    Read the article

  • Sharing SCTP connection with multiple threads

    - by poly
    I have an application that needs to run in SCTP environment, I have a question in sharing the connection among multiple threads for packet receiving only, I've tried with the sctp_sendmsg and it worked without even locking the threads (is that been taking care of by the OS, in other words, is it thread safe to do that). I've tested many cases with the send and I can't see them out of sync. Anyway, back to the receiving, is it possible to create multiple threads and send each thread the sctp descriptor to start receiving messages? Do I need a lock here or is it ok without lock? I'm using C in linux.

    Read the article

  • 3 threads printing numbers of different range

    - by user875036
    This question was asked in an Electronic Arts interview: There are 3 threads. The first thread prints the numbers 1 to 10. The second thread prints the numbers 11 to 20. The third thread prints the numbers from from 21 to 30. Now, all three threads are running. The numbers are printed in an irregular order like 1, 11, 2, 21, 12 etc. If I want numbers to be printed in sorted order like 1, 2, 3, 4..., what should I do with these threads?

    Read the article

  • Setting minimum threads in thread pool

    - by expert
    I have an application with 4 worker threads from the thread pool. It was waking up every 0.5 second. as written in msdn the thread pool monitors every 0,5 second to create idle threads. I set the nuber of minimum threads to 4 and it solved the problem - no more background activity all the time. My question is - I have another applicatiopn which has the same number of threads threads-4, but here setting min thread to 4 doesn't help but when setting min thread to 5 then the background monitoring stops. What might be the difference between 2 application with the same number of threads from the thread pool- 4 threads.On one setting min threads to 4 helps and the other only setting min threads to 5 helps?

    Read the article

  • Why the performance of following code is degrading when I use threads ?

    - by DotNetBeginner
    Why the performance of following code is degrading when I use threads ? **1.Without threads int[] arr = new int[100000000]; //Array elements - [0][1][2][3]---[100000000-1] addWithOutThreading(arr); // Time required for this operation - 1.16 sec Definition for addWithOutThreading public void addWithOutThreading(int[] arr) { UInt64 result = 0; for (int i = 0; i < 100000000; i++) { result = result + Convert.ToUInt64(arr[i]); } Console.WriteLine("Addition = " + result.ToString()); } **2.With threads int[] arr = new int[100000000]; int part = (100000000 / 4); UInt64 res1 = 0, res2 = 0, res3 = 0, res4 = 0; ThreadStart starter1 = delegate { addWithThreading(arr, 0, part, ref res1); }; ThreadStart starter2 = delegate { addWithThreading(arr, part, part * 2, ref res2); }; ThreadStart starter3 = delegate { addWithThreading(arr, part * 2, part * 3, ref res3); }; ThreadStart starter4 = delegate { addWithThreading(arr, part * 3, part * 4, ref res4); }; Thread t1 = new Thread(starter1); Thread t2 = new Thread(starter2); Thread t3 = new Thread(starter3); Thread t4 = new Thread(starter4); t1.Start(); t2.Start(); t3.Start(); t4.Start(); t1.Join(); t2.Join(); t3.Join(); t4.Join(); Console.WriteLine("Addition = "+(res1+res2+res3+res4).ToString()); // Time required for this operation - 1.30 sec Definition for addWithThreading public void addWithThreading(int[] arr,int startIndex, int endIndex,ref UInt64 result) { for (int i = startIndex; i < endIndex; i++) { result = result + Convert.ToUInt64(arr[i]); } }

    Read the article

  • Threads, événements et QObject, première partie sur les boucles événementielles, traduction par Vivien Duboué

    Le but de ce document n'est pas de vous enseigner la manière d'utiliser les threads, de faire des verrouillages appropriés, d'exploiter le parallélisme ou encore d'écrire des programmes extensibles : il y a de nombreux bons livres sur ce sujet ; par exemple, jetez un coup d'%u0153il sur la liste des lectures recommandées par la documentation. Ce petit article est plutôt voué à être un guide pour introduire les utilisateurs dans le monde des threads de Qt 4, afin d'éviter les écueils les plus récurrents et de les aider à développer des codes à la fois plus robustes et mieux structurés.

    Read the article

  • shared transaction ID function among multiple threads

    - by poly
    I'm writing an application in C that requires multiple threads to request a unique transaction ID from a function as shown below; struct list{ int id; struct list *next }; function generate_id() { linked-list is built here to hold 10 millions } my concern is how to sync between two or more threads so that transaction id can be unique among them without using mutex, is this possible? Please share anything even if I need to change linked list to something else.

    Read the article

  • CUDA, more threads for same work = Longer run time despite better occupancy, Why?

    - by zenna
    I encountered a strange problem where increasing my occupancy by increasing the number of threads reduced performance. I created the following program to illustrate the problem: #include <stdio.h> #include <stdlib.h> #include <cuda_runtime.h> __global__ void less_threads(float * d_out) { int num_inliers; for (int j=0;j<800;++j) { //Do 12 computations num_inliers += threadIdx.x*1; num_inliers += threadIdx.x*2; num_inliers += threadIdx.x*3; num_inliers += threadIdx.x*4; num_inliers += threadIdx.x*5; num_inliers += threadIdx.x*6; num_inliers += threadIdx.x*7; num_inliers += threadIdx.x*8; num_inliers += threadIdx.x*9; num_inliers += threadIdx.x*10; num_inliers += threadIdx.x*11; num_inliers += threadIdx.x*12; } if (threadIdx.x == -1) d_out[blockIdx.x*blockDim.x+threadIdx.x] = num_inliers; } __global__ void more_threads(float *d_out) { int num_inliers; for (int j=0;j<800;++j) { // Do 4 computations num_inliers += threadIdx.x*1; num_inliers += threadIdx.x*2; num_inliers += threadIdx.x*3; num_inliers += threadIdx.x*4; } if (threadIdx.x == -1) d_out[blockIdx.x*blockDim.x+threadIdx.x] = num_inliers; } int main(int argc, char* argv[]) { float *d_out = NULL; cudaMalloc((void**)&d_out,sizeof(float)*25000); more_threads<<<780,128>>>(d_out); less_threads<<<780,32>>>(d_out); return 0; } Note both kernels should do the same amount of work in total, the (if threadIdx.x == -1 is a trick to stop the compiler optimising everything out and leaving an empty kernel). The work should be the same as more_threads is using 4 times as many threads but with each thread doing 4 times less work. Significant results form the profiler results are as followsL: more_threads: GPU runtime = 1474 us,reg per thread = 6,occupancy=1,branch=83746,divergent_branch = 26,instructions = 584065,gst request=1084552 less_threads: GPU runtime = 921 us,reg per thread = 14,occupancy=0.25,branch=20956,divergent_branch = 26,instructions = 312663,gst request=677381 As I said previously, the run time of the kernel using more threads is longer, this could be due to the increased number of instructions. Why are there more instructions? Why is there any branching, let alone divergent branching, considering there is no conditional code? Why are there any gst requests when there is no global memory access? What is going on here! Thanks

    Read the article

  • Thread placement policies on NUMA systems - update

    - by Dave
    In a prior blog entry I noted that Solaris used a "maximum dispersal" placement policy to assign nascent threads to their initial processors. The general idea is that threads should be placed as far away from each other as possible in the resource topology in order to reduce resource contention between concurrently running threads. This policy assumes that resource contention -- pipelines, memory channel contention, destructive interference in the shared caches, etc -- will likely outweigh (a) any potential communication benefits we might achieve by packing our threads more densely onto a subset of the NUMA nodes, and (b) benefits of NUMA affinity between memory allocated by one thread and accessed by other threads. We want our threads spread widely over the system and not packed together. Conceptually, when placing a new thread, the kernel picks the least loaded node NUMA node (the node with lowest aggregate load average), and then the least loaded core on that node, etc. Furthermore, the kernel places threads onto resources -- sockets, cores, pipelines, etc -- without regard to the thread's process membership. That is, initial placement is process-agnostic. Keep reading, though. This description is incorrect. On Solaris 10 on a SPARC T5440 with 4 x T2+ NUMA nodes, if the system is otherwise unloaded and we launch a process that creates 20 compute-bound concurrent threads, then typically we'll see a perfect balance with 5 threads on each node. We see similar behavior on an 8-node x86 x4800 system, where each node has 8 cores and each core is 2-way hyperthreaded. So far so good; this behavior seems in agreement with the policy I described in the 1st paragraph. I recently tried the same experiment on a 4-node T4-4 running Solaris 11. Both the T5440 and T4-4 are 4-node systems that expose 256 logical thread contexts. To my surprise, all 20 threads were placed onto just one NUMA node while the other 3 nodes remained completely idle. I checked the usual suspects such as processor sets inadvertently left around by colleagues, processors left offline, and power management policies, but the system was configured normally. I then launched multiple concurrent instances of the process, and, interestingly, all the threads from the 1st process landed on one node, all the threads from the 2nd process landed on another node, and so on. This happened even if I interleaved thread creating between the processes, so I was relatively sure the effect didn't related to thread creation time, but rather that placement was a function of process membership. I this point I consulted the Solaris sources and talked with folks in the Solaris group. The new Solaris 11 behavior is intentional. The kernel is no longer using a simple maximum dispersal policy, and thread placement is process membership-aware. Now, even if other nodes are completely unloaded, the kernel will still try to pack new threads onto the home lgroup (socket) of the primordial thread until the load average of that node reaches 50%, after which it will pick the next least loaded node as the process's new favorite node for placement. On the T4-4 we have 64 logical thread contexts (strands) per socket (lgroup), so if we launch 48 concurrent threads we will find 32 placed on one node and 16 on some other node. If we launch 64 threads we'll find 32 and 32. That means we can end up with our threads clustered on a small subset of the nodes in a way that's quite different that what we've seen on Solaris 10. So we have a policy that allows process-aware packing but reverts to spreading threads onto other nodes if a node becomes too saturated. It turns out this policy was enabled in Solaris 10, but certain bugs suppressed the mixed packing/spreading behavior. There are configuration variables in /etc/system that allow us to dial the affinity between nascent threads and their primordial thread up and down: see lgrp_expand_proc_thresh, specifically. In the OpenSolaris source code the key routine is mpo_update_tunables(). This method reads the /etc/system variables and sets up some global variables that will subsequently be used by the dispatcher, which calls lgrp_choose() in lgrp.c to place nascent threads. Lgrp_expand_proc_thresh controls how loaded an lgroup must be before we'll consider homing a process's threads to another lgroup. Tune this value lower to have it spread your process's threads out more. To recap, the 'new' policy is as follows. Threads from the same process are packed onto a subset of the strands of a socket (50% for T-series). Once that socket reaches the 50% threshold the kernel then picks another preferred socket for that process. Threads from unrelated processes are spread across sockets. More precisely, different processes may have different preferred sockets (lgroups). Beware that I've simplified and elided details for the purposes of explication. The truth is in the code. Remarks: It's worth noting that initial thread placement is just that. If there's a gross imbalance between the load on different nodes then the kernel will migrate threads to achieve a better and more even distribution over the set of available nodes. Once a thread runs and gains some affinity for a node, however, it becomes "stickier" under the assumption that the thread has residual cache residency on that node, and that memory allocated by that thread resides on that node given the default "first-touch" page-level NUMA allocation policy. Exactly how the various policies interact and which have precedence under what circumstances could the topic of a future blog entry. The scheduler is work-conserving. The x4800 mentioned above is an interesting system. Each of the 8 sockets houses an Intel 7500-series processor. Each processor has 3 coherent QPI links and the system is arranged as a glueless 8-socket twisted ladder "mobius" topology. Nodes are either 1 or 2 hops distant over the QPI links. As an aside the mapping of logical CPUIDs to physical resources is rather interesting on Solaris/x4800. On SPARC/Solaris the CPUID layout is strictly geographic, with the highest order bits identifying the socket, the next lower bits identifying the core within that socket, following by the pipeline (if present) and finally the logical thread context ("strand") on the core. But on Solaris on the x4800 the CPUID layout is as follows. [6:6] identifies the hyperthread on a core; bits [5:3] identify the socket, or package in Intel terminology; bits [2:0] identify the core within a socket. Such low-level details should be of interest only if you're binding threads -- a bad idea, the kernel typically handles placement best -- or if you're writing NUMA-aware code that's aware of the ambient placement and makes decisions accordingly. Solaris introduced the so-called critical-threads mechanism, which is expressed by putting a thread into the FX scheduling class at priority 60. The critical-threads mechanism applies to placement on cores, not on sockets, however. That is, it's an intra-socket policy, not an inter-socket policy. Solaris 11 introduces the Power Aware Dispatcher (PAD) which packs threads instead of spreading them out in an attempt to be able to keep sockets or cores at lower power levels. Maximum dispersal may be good for performance but is anathema to power management. PAD is off by default, but power management polices constitute yet another confounding factor with respect to scheduling and dispatching. If your threads communicate heavily -- one thread reads cache lines last written by some other thread -- then the new dense packing policy may improve performance by reducing traffic on the coherent interconnect. On the other hand if your threads in your process communicate rarely, then it's possible the new packing policy might result on contention on shared computing resources. Unfortunately there's no simple litmus test that says whether packing or spreading is optimal in a given situation. The answer varies by system load, application, number of threads, and platform hardware characteristics. Currently we don't have the necessary tools and sensoria to decide at runtime, so we're reduced to an empirical approach where we run trials and try to decide on a placement policy. The situation is quite frustrating. Relatedly, it's often hard to determine just the right level of concurrency to optimize throughput. (Understanding constructive vs destructive interference in the shared caches would be a good start. We could augment the lines with a small tag field indicating which strand last installed or accessed a line. Given that, we could augment the CPU with performance counters for misses where a thread evicts a line it installed vs misses where a thread displaces a line installed by some other thread.)

    Read the article

  • Performance degrades for more than 2 threads on Xeon X5355

    - by zoolii
    Hi All, I am writing an application using boost threads and using boost barriers to synchronize the threads. I have two machines to test the application. Machine 1 is a core2 duo (T8300) cpu machine (windows XP professional - 4GB RAM) where I am getting following performance figures : Number of threads :1 , TPS :21 Number of threads :2 , TPS :35 (66 % improvement) further increase in number of threads decreases the TPS but that is understandable as the machine has only two cores. Machine 2 is a 2 quad core ( Xeon X5355) cpu machine (windows 2003 server with 4GB RAM) and has 8 effective cores. Number of threads :1 , TPS :21 Number of threads :2 , TPS :27 (28 % improvement) Number of threads :4 , TPS :25 Number of threads :8 , TPS :24 As you can see, performance is degrading after 2 threads (though it has 8 cores). If the program has some bottle neck , then for 2 thread also it should have degraded. Any idea? , Explanations ? , Does the OS has some role in performance ? - It seems like the Core2duo (2.4GHz) scales better than Xeon X5355 (2.66GHz) though it has better clock speed. Thank you -Zoolii

    Read the article

  • Coloring of collapsed threads in mutt

    - by Rich
    I'm trying to figure out the syntax of colouring collapsed threads in the mutt index. The documentation for mutt patterns doesn't seem to include a description of how this works, and so far I've been completely unable to figure it out by trial and error. What I'd like is for collapsed threads that contain any unread (new) messages to be always coloured green. If collapsed threads with no unread messages contain any flagged messages, then I'd like them to be red. So far, every set of patterns I've tried results in threads that contain both flagged and unread messages being coloured red (I want them green). These work: color index green default "~N" # unread messages color index green default "~N~F" # unread flagged messages color index red default "~F" # flagged messages color index green default "~v~(~N)" # collapsed thread with unread But these don't: color index green default "~v~(~N~F)" # attempt to keep threads with unread green color index red default "~v~(~F)" # colours collapsed threads with flagged and unread red color index red default "~v~(!~N~F)" # ditto color index red default "~v~(^!~N~F)" # ditto color index red default "~v~(~F)~(!~N)" # ditto color index red default "~v~(~F)~v~(!~N)" # ditto I've also tried switching the order of the "~v~(~F)" and "~v~(~N)" commands in the file, but the "flagged" rule always seems to take precedence over the "new" rule. Ideally I'd like to understand how the syntax for colouring collapsed threads works, but at this point I'd happily settle for a set of rules that achieves the colourscheme described above.

    Read the article

  • Forum that integrates into CMS and has curated category pages with tagged threads

    - by user6172
    I'm looking for a forum that meets these requirements: Login using Facebook/Twitter/OpenID etc. User profiles with reward system Voting/thumbs up function Categories and tags for sorting threads Custom category pages with moderated static header Embeddable threads and categories (For example, a whole category or single thread can be integrated into wordpress) API to users, discussions etc. I've looked at forums like Vanilla, Disqus, OSQA etc, but none seem to match the above "hybrid criteria". Hosted or self-hosted doesn't matter but I'm really looking for something that can be integrated into an existing CMS to replace comments while at the same time have curated category pages and user profiles. Thanks.

    Read the article

< Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12  | Next Page >