Search Results

Search found 1933 results on 78 pages for 'genetic algorithms'.

Page 5/78 | < Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12  | Next Page >

  • Is wikipedia a valuable resource for studying data structures? (can we call it complete?)

    - by Amir Nasr
    Can I depend on wikipedia to learn data structures fully using the list of data structures http://en.wikipedia.org/wiki/List_of_data_structures and the links they refer to? The same question for algorithms http://en.wikipedia.org/wiki/List_of_algorithm_general_topics ?... What's after learning algorithms and data structures? Specializing in a certain field of algorithms such as computer graohics, memory management...etc? or what could be the plan for mastering programming after knowing the language syntax and the background about program design and programming logic? I asked about wikipedia because i would like to find a complete resource or are least a resource which would be enough for the field of data structures instead of searching for separate articles in different places in other words an alternative to books which may even be more complete.

    Read the article

  • How important is it for a programmer to know how to implement a QuickSort/MergeSort algorithm from memory?

    - by John Smith
    I was reviewing my notes and stumbled across the implementation of different sorting algorithms. As I attempted to make sense of the implementation of QuickSort and MergeSort, it occurred to me that although I do programming for a living and consider myself decent at what I do, I have neither the photographic memory nor the sheer brainpower to implement those algorithms without relying on my notes. All I remembered is that some of those algorithms are stable and some are not. Some take O(nlog(n)) or O(n^2) time to complete. Some use more memory than others... I'd feel like I don't deserve this kind of job if it weren't because my position doesn't require that I use any sorting algorithm other than those found in standard APIs. I mean, how many of you have a programming position where it actually is essential that you can remember or come up with this kind of stuff on your own?

    Read the article

  • How could I apply a genetic algorithm to a simple game that follows rollercoaster tracks?

    - by Chris
    I have free-rein over what I do on a final assignment for school, with respect to modifying a simple direct-x game that currently just has the camera follow some rollercoaster rails. I've developed an interest in genetic algorithms and would like to take this opportunity to apply one and learn something about them. However, I can't think of any way I could possibly apply one in this case. What are some options available to me?

    Read the article

  • Are proofs worth the effort?

    - by Shashank Jain
    I bought the de-facto book for learning about data structures and algorithms (CLRS). The book is though quite good but the singularity is in the proofs. The book is filled with Lemmas, theorems, peculiar symbols and unimaginable recurrence relations which are very hard to understand. I am able to somehow get the algorithms but the discrete mathematics just not for me. So should I leave them out and just concentrate on algorithims?

    Read the article

  • Taking a Projects Development to the Next Level

    - by user1745022
    I have been looking for some advice for a while on how to handle a project I am working on, but to no avail. I am pretty much on my fourth iteration of improving an "application" I am working on; the first two times were in Excel, the third Time in Access, and now in Visual Studio. The field is manufacturing. The basic idea is I am taking read-only data from a massive Sybase server, filtering it and creating much smaller tables in Access daily (using delete and append Queries) and then doing a bunch of stuff. More specifically, I use a series of queries to either combine data from multiple tables or group data in specific ways (aggregate functions), and then I place this data into a table (so I can sort and manipulate data using DAO.recordset and run multiple custom algorithms). This process is then repeated multiple times throughout the database until a set of relevant tables are created. Many times I will create a field in a query with a value such as 1.1 so that when I append it to a table I can store information in the field from the algorithms. So as the process continues the number of fields for the tables change. The overall application consists of 4 "back-end" databases linked together on a shared drive, with various output (either front-end access applications or Excel). So my question is is this how many data driven applications that solve problems essentially work? Each backend database is updated with fresh data daily and updating each takes around 10 seconds (for three) and 2 minutes(for 1). Project Objectives. I want/am moving to SQL Server soon. Front End will be a Web Application (I know basic web-development and like the administration flexibility) and visual-studio will be IDE with c#/.NET. Should these algorithms be run "inside the database," or using a series of C# functions on each server request. I know you're not supposed to store data in a database unless it is an actual data point, and in Access I have many columns that just hold calculations from algorithms in vba. The truth is, I have seen multiple professional Access applications, and have never seen one that has the complexity or does even close to what mine does (for better or worse). But I know some professional software applications are 1000 times better then mine. So Please Please Please give me a suggestion of some sort. I have been completely on my own and need some guidance on how to approach this project the right way.

    Read the article

  • What's the "Hello World!" of genetic algorithms good for?

    - by JohnIdol
    I found this very cool C++ sample , literally the "Hello World!" of genetic algorithms. I so decided to re-code the whole thing in C# and this is the result. Now I am asking myself: is there any practical application along the lines of generating a target string starting from a population of random strings? EDIT: my buddy on twitter just tweeted that "is useful for transcription type things such as translation. Does not have to be Monkey's". I wish I had a clue.

    Read the article

  • is c++ STL algorithms and containers same across platforms and performance?

    - by Abhilash M
    After learning good amount of c++, i'm now into STL containers and algorithms template library, my major concerns are, 1) Is this library same across different platforms like MS, linux n other os? 2) will quality or efficiency of program c++ module decrease with more use of STL containers and algorithms, i think i can't customize it to all needs. 3) Is this template library good to use in linux system programming, kernel modules? 4) lastly can i use this in programming contests, because it relives a lot of coding and pressure off shoulders.

    Read the article

  • Implementing algorithms via compute shaders vs. pipeline shaders

    - by TravisG
    With the availability of compute shaders for both DirectX and OpenGL it's now possible to implement many algorithms without going through the rasterization pipeline and instead use general purpose computing on the GPU to solve the problem. For some algorithms this seems to become the intuitive canonical solution because they're inherently not rasterization based, and rasterization-based shaders seemed to be a workaround to harness GPU power (simple example: creating a noise texture. No quad needs to be rasterized here). Given an algorithm that can be implemented both ways, are there general (potential) performance benefits over using compute shaders vs. going the normal route? Are there drawbacks that we should watch out for (for example, is there some kind of unusual overhead to switching from/to compute shaders at runtime)? Are there perhaps other benefits or drawbacks to consider when choosing between the two?

    Read the article

  • Collision Detection algorithms with early Collision exit

    - by Grieverheart
    I'm using collision detection in Monte Carlo simulations and at the moment I'm using GJK which is quite fast. I can't help to think it could be done even faster though. In the simulations, about 70% of the time GJK is run, it detects a collision. Thus collisions are more than non-collisions in my case. Most collision detection algorithms I know have an early non-collision exit test. Are there any collision detection algorithms that have an early collision detect instead of non-collision and could be potentially faster than GJK in case of collision?

    Read the article

  • Why does adding Crossover to my Genetic Algorithm gives me worse results?

    - by MahlerFive
    I have implemented a Genetic Algorithm to solve the Traveling Salesman Problem (TSP). When I use only mutation, I find better solutions than when I add in crossover. I know that normal crossover methods do not work for TSP, so I implemented both the Ordered Crossover and the PMX Crossover methods, and both suffer from bad results. Here are the other parameters I'm using: Mutation: Single Swap Mutation or Inverted Subsequence Mutation (as described by Tiendil here) with mutation rates tested between 1% and 25%. Selection: Roulette Wheel Selection Fitness function: 1 / distance of tour Population size: Tested 100, 200, 500, I also run the GA 5 times so that I have a variety of starting populations. Stop Condition: 2500 generations With the same dataset of 26 points, I usually get results of about 500-600 distance using purely mutation with high mutation rates. When adding crossover my results are usually in the 800 distance range. The other confusing thing is that I have also implemented a very simple Hill-Climbing algorithm to solve the problem and when I run that 1000 times (faster than running the GA 5 times) I get results around 410-450 distance, and I would expect to get better results using a GA. Any ideas as to why my GA performing worse when I add crossover? And why is it performing much worse than a simple Hill-Climb algorithm which should get stuck on local maxima as it has no way of exploring once it finds a local max?

    Read the article

  • Why does adding Crossover to my Genetic Algorithm give me worse results?

    - by MahlerFive
    I have implemented a Genetic Algorithm to solve the Traveling Salesman Problem (TSP). When I use only mutation, I find better solutions than when I add in crossover. I know that normal crossover methods do not work for TSP, so I implemented both the Ordered Crossover and the PMX Crossover methods, and both suffer from bad results. Here are the other parameters I'm using: Mutation: Single Swap Mutation or Inverted Subsequence Mutation (as described by Tiendil here) with mutation rates tested between 1% and 25%. Selection: Roulette Wheel Selection Fitness function: 1 / distance of tour Population size: Tested 100, 200, 500, I also run the GA 5 times so that I have a variety of starting populations. Stop Condition: 2500 generations With the same dataset of 26 points, I usually get results of about 500-600 distance using purely mutation with high mutation rates. When adding crossover my results are usually in the 800 distance range. The other confusing thing is that I have also implemented a very simple Hill-Climbing algorithm to solve the problem and when I run that 1000 times (faster than running the GA 5 times) I get results around 410-450 distance, and I would expect to get better results using a GA. Any ideas as to why my GA performing worse when I add crossover? And why is it performing much worse than a simple Hill-Climb algorithm which should get stuck on local maxima as it has no way of exploring once it finds a local max?

    Read the article

  • 2D vector value replacement using classes; genetic algorithm mutation

    - by gcolumbus
    I have a 2D vector as defined by the classes below. Note that I've used classes because I'm trying to program a genetic algorithm such that many, many 2D vectors will be created and they will all be different. class Quad: public std::vector<int> { public: Quad() : std::vector<int>(4,0) {} }; class QuadVec : public std::vector<Quad> { }; An important part of my algorithm, however, is that I need to be able to "mutate" (randomly change) particular values in a certain number of randomly chosen 2D vectors. This has me stumped. I can easily write code to randomly select the value within the 2D vector that will be selected for "mutation" but how do I actually enact that change using classes? I understand how this would be done with one 2D vector that has already been initialized but how do I do this if it hasn't? Please let me know if I haven't provided enough info or am not clear as I tend to do that. Thanks for your time and help!

    Read the article

  • Shortest-path algorithms which use a space-time tradeoff?

    - by Chris Mounce
    I need to find shortest paths in an unweighted, undirected graph. There are algorithms which can find a shortest path between two nodes, but this can take time. There are also algorithms for computing shortest paths for all pairs of nodes in the graph, but storing such a lookup table would take lots of disk space. What I'm wondering: Is there an algorithm which offers a space-time tradeoff that's somewhere between these two extremes? In other words, is there a way to speed up a shortest-path search, while using less disk space than would be occupied by an all-pairs shortest-path table? I know there are ways to efficiently store lookup tables for this problem, and I already have a couple of ideas for speeding up shortest-path searches using precomputed data. But I don't want to reinvent the wheel if there's already some established algorithm that solves this problem.

    Read the article

  • How to familiarize myself with Python

    - by Zel
    I am Python beginner. Started Python 1.5 months back. I downloaded the Python docs and read some part of the tutorial. I have been programming on codechef.com and solving problems of projecteuler. I am thinking of reading Introduction to algorithms and following this course on MIT opencourse ware as I haven't been getting much improvement in programming and I am wasting much time thinking just what should I do when faced with any programming problem. But I think that I still don't know the correct way to learn the language itself. Should I start the library reference or continue with Python tutorial? Is learning algorithms useful for language such as C and not so much for Python as it has "batteries included"? Are there some other resources for familiarization with the language and in general for learning to solve programming problems? Or do I need to just devote some more time?

    Read the article

  • Chambers In A Castle Algorithm

    - by 7Aces
    Problem Statement - Given a NxM grid of 1s & 0s (1s mark walls, while 0s indicate empty chambers), the task is to identify the number of chambers & the size of the largest. And just to whet my curiosity, to find in which chamber, a cell belongs. It seems like an ad hoc problem, since the regular algorithms just don't fit in. I just can't get the logic for writing an algorithm for the problem. If you get it, pseudo-code would be of great help! Note - I have tried the regular grid search algorithms, but they don't suffice the problem requirements. Source - INOI Q Paper 2003

    Read the article

  • What modern alternatives to Numerical Recipes exist?

    - by Stewart
    In the past, the Numerical Recipes book was considered the gold standard reference for numerical algorithms. The earliest Fortran Edition was followed by editions in C and C++ and others, bringing it then more up-to-date. Through these, it provided reference code for the state-of-the-art algorithms of the day. Older editions are available online for free nowadays. Unfortunately, I think it is now mostly useful only as a historic tome. The "software engineering" practises seem to me to be outdated, and the actual content hasn't kept pace with the literature. What comprehensive yet approachable references should the modern programmer look at instead?

    Read the article

  • Sorting Algorithms

    - by MarkPearl
    General Every time I go back to university I find myself wading through sorting algorithms and their implementation in C++. Up to now I haven’t really appreciated their true value. However as I discovered this last week with Dictionaries in C# – having a knowledge of some basic programming principles can greatly improve the performance of a system and make one think twice about how to tackle a problem. I’m going to cover briefly in this post the following: Selection Sort Insertion Sort Shellsort Quicksort Mergesort Heapsort (not complete) Selection Sort Array based selection sort is a simple approach to sorting an unsorted array. Simply put, it repeats two basic steps to achieve a sorted collection. It starts with a collection of data and repeatedly parses it, each time sorting out one element and reducing the size of the next iteration of parsed data by one. So the first iteration would go something like this… Go through the entire array of data and find the lowest value Place the value at the front of the array The second iteration would go something like this… Go through the array from position two (position one has already been sorted with the smallest value) and find the next lowest value in the array. Place the value at the second position in the array This process would be completed until the entire array had been sorted. A positive about selection sort is that it does not make many item movements. In fact, in a worst case scenario every items is only moved once. Selection sort is however a comparison intensive sort. If you had 10 items in a collection, just to parse the collection you would have 10+9+8+7+6+5+4+3+2=54 comparisons to sort regardless of how sorted the collection was to start with. If you think about it, if you applied selection sort to a collection already sorted, you would still perform relatively the same number of iterations as if it was not sorted at all. Many of the following algorithms try and reduce the number of comparisons if the list is already sorted – leaving one with a best case and worst case scenario for comparisons. Likewise different approaches have different levels of item movement. Depending on what is more expensive, one may give priority to one approach compared to another based on what is more expensive, a comparison or a item move. Insertion Sort Insertion sort tries to reduce the number of key comparisons it performs compared to selection sort by not “doing anything” if things are sorted. Assume you had an collection of numbers in the following order… 10 18 25 30 23 17 45 35 There are 8 elements in the list. If we were to start at the front of the list – 10 18 25 & 30 are already sorted. Element 5 (23) however is smaller than element 4 (30) and so needs to be repositioned. We do this by copying the value at element 5 to a temporary holder, and then begin shifting the elements before it up one. So… Element 5 would be copied to a temporary holder 10 18 25 30 23 17 45 35 – T 23 Element 4 would shift to Element 5 10 18 25 30 30 17 45 35 – T 23 Element 3 would shift to Element 4 10 18 25 25 30 17 45 35 – T 23 Element 2 (18) is smaller than the temporary holder so we put the temporary holder value into Element 3. 10 18 23 25 30 17 45 35 – T 23   We now have a sorted list up to element 6. And so we would repeat the same process by moving element 6 to a temporary value and then shifting everything up by one from element 2 to element 5. As you can see, one major setback for this technique is the shifting values up one – this is because up to now we have been considering the collection to be an array. If however the collection was a linked list, we would not need to shift values up, but merely remove the link from the unsorted value and “reinsert” it in a sorted position. Which would reduce the number of transactions performed on the collection. So.. Insertion sort seems to perform better than selection sort – however an implementation is slightly more complicated. This is typical with most sorting algorithms – generally, greater performance leads to greater complexity. Also, insertion sort performs better if a collection of data is already sorted. If for instance you were handed a sorted collection of size n, then only n number of comparisons would need to be performed to verify that it is sorted. It’s important to note that insertion sort (array based) performs a number item moves – every time an item is “out of place” several items before it get shifted up. Shellsort – Diminishing Increment Sort So up to now we have covered Selection Sort & Insertion Sort. Selection Sort makes many comparisons and insertion sort (with an array) has the potential of making many item movements. Shellsort is an approach that takes the normal insertion sort and tries to reduce the number of item movements. In Shellsort, elements in a collection are viewed as sub-collections of a particular size. Each sub-collection is sorted so that the elements that are far apart move closer to their final position. Suppose we had a collection of 15 elements… 10 20 15 45 36 48 7 60 18 50 2 19 43 30 55 First we may view the collection as 7 sub-collections and sort each sublist, lets say at intervals of 7 10 60 55 – 20 18 – 15 50 – 45 2 – 36 19 – 48 43 – 7 30 10 55 60 – 18 20 – 15 50 – 2 45 – 19 36 – 43 48 – 7 30 (Sorted) We then sort each sublist at a smaller inter – lets say 4 10 55 60 18 – 20 15 50 2 – 45 19 36 43 – 48 7 30 10 18 55 60 – 2 15 20 50 – 19 36 43 45 – 7 30 48 (Sorted) We then sort elements at a distance of 1 (i.e. we apply a normal insertion sort) 10 18 55 60 2 15 20 50 19 36 43 45 7 30 48 2 7 10 15 18 19 20 30 36 43 45 48 50 55 (Sorted) The important thing with shellsort is deciding on the increment sequence of each sub-collection. From what I can tell, there isn’t any definitive method and depending on the order of your elements, different increment sequences may perform better than others. There are however certain increment sequences that you may want to avoid. An even based increment sequence (e.g. 2 4 8 16 32 …) should typically be avoided because it does not allow for even elements to be compared with odd elements until the final sort phase – which in a way would negate many of the benefits of using sub-collections. The performance on the number of comparisons and item movements of Shellsort is hard to determine, however it is considered to be considerably better than the normal insertion sort. Quicksort Quicksort uses a divide and conquer approach to sort a collection of items. The collection is divided into two sub-collections – and the two sub-collections are sorted and combined into one list in such a way that the combined list is sorted. The algorithm is in general pseudo code below… Divide the collection into two sub-collections Quicksort the lower sub-collection Quicksort the upper sub-collection Combine the lower & upper sub-collection together As hinted at above, quicksort uses recursion in its implementation. The real trick with quicksort is to get the lower and upper sub-collections to be of equal size. The size of a sub-collection is determined by what value the pivot is. Once a pivot is determined, one would partition to sub-collections and then repeat the process on each sub collection until you reach the base case. With quicksort, the work is done when dividing the sub-collections into lower & upper collections. The actual combining of the lower & upper sub-collections at the end is relatively simple since every element in the lower sub-collection is smaller than the smallest element in the upper sub-collection. Mergesort With quicksort, the average-case complexity was O(nlog2n) however the worst case complexity was still O(N*N). Mergesort improves on quicksort by always having a complexity of O(nlog2n) regardless of the best or worst case. So how does it do this? Mergesort makes use of the divide and conquer approach to partition a collection into two sub-collections. It then sorts each sub-collection and combines the sorted sub-collections into one sorted collection. The general algorithm for mergesort is as follows… Divide the collection into two sub-collections Mergesort the first sub-collection Mergesort the second sub-collection Merge the first sub-collection and the second sub-collection As you can see.. it still pretty much looks like quicksort – so lets see where it differs… Firstly, mergesort differs from quicksort in how it partitions the sub-collections. Instead of having a pivot – merge sort partitions each sub-collection based on size so that the first and second sub-collection of relatively the same size. This dividing keeps getting repeated until the sub-collections are the size of a single element. If a sub-collection is one element in size – it is now sorted! So the trick is how do we put all these sub-collections together so that they maintain their sorted order. Sorted sub-collections are merged into a sorted collection by comparing the elements of the sub-collection and then adjusting the sorted collection. Lets have a look at a few examples… Assume 2 sub-collections with 1 element each 10 & 20 Compare the first element of the first sub-collection with the first element of the second sub-collection. Take the smallest of the two and place it as the first element in the sorted collection. In this scenario 10 is smaller than 20 so 10 is taken from sub-collection 1 leaving that sub-collection empty, which means by default the next smallest element is in sub-collection 2 (20). So the sorted collection would be 10 20 Lets assume 2 sub-collections with 2 elements each 10 20 & 15 19 So… again we would Compare 10 with 15 – 10 is the winner so we add it to our sorted collection (10) leaving us with 20 & 15 19 Compare 20 with 15 – 15 is the winner so we add it to our sorted collection (10 15) leaving us with 20 & 19 Compare 20 with 19 – 19 is the winner so we add it to our sorted collection (10 15 19) leaving us with 20 & _ 20 is by default the winner so our sorted collection is 10 15 19 20. Make sense? Heapsort (still needs to be completed) So by now I am tired of sorting algorithms and trying to remember why they were so important. I think every year I go through this stuff I wonder to myself why are we made to learn about selection sort and insertion sort if they are so bad – why didn’t we just skip to Mergesort & Quicksort. I guess the only explanation I have for this is that sometimes you learn things so that you can implement them in future – and other times you learn things so that you know it isn’t the best way of implementing things and that you don’t need to implement it in future. Anyhow… luckily this is going to be the last one of my sorts for today. The first step in heapsort is to convert a collection of data into a heap. After the data is converted into a heap, sorting begins… So what is the definition of a heap? If we have to convert a collection of data into a heap, how do we know when it is a heap and when it is not? The definition of a heap is as follows: A heap is a list in which each element contains a key, such that the key in the element at position k in the list is at least as large as the key in the element at position 2k +1 (if it exists) and 2k + 2 (if it exists). Does that make sense? At first glance I’m thinking what the heck??? But then after re-reading my notes I see that we are doing something different – up to now we have really looked at data as an array or sequential collection of data that we need to sort – a heap represents data in a slightly different way – although the data is stored in a sequential collection, for a sequential collection of data to be in a valid heap – it is “semi sorted”. Let me try and explain a bit further with an example… Example 1 of Potential Heap Data Assume we had a collection of numbers as follows 1[1] 2[2] 3[3] 4[4] 5[5] 6[6] For this to be a valid heap element with value of 1 at position [1] needs to be greater or equal to the element at position [3] (2k +1) and position [4] (2k +2). So in the above example, the collection of numbers is not in a valid heap. Example 2 of Potential Heap Data Lets look at another collection of numbers as follows 6[1] 5[2] 4[3] 3[4] 2[5] 1[6] Is this a valid heap? Well… element with the value 6 at position 1 must be greater or equal to the element at position [3] and position [4]. Is 6 > 4 and 6 > 3? Yes it is. Lets look at element 5 as position 2. It must be greater than the values at [4] & [5]. Is 5 > 3 and 5 > 2? Yes it is. If you continued to examine this second collection of data you would find that it is in a valid heap based on the definition of a heap.

    Read the article

  • List of generic algorithms and data structures

    - by Jake Petroules
    As part of a library project, I want to include a plethora of generic algorithms and data structures. This includes algorithms for searching and sorting, data structures like linked lists and binary trees, path-finding algorithms like A*... the works. Basically, any generic algorithm or data structure you can think of that you think might be useful in such a library, please post or add it to the list. Thanks! (NOTE: Because there is no single right answer I've of course placed this in community wiki... and also, please don't suggest algorithms which are too specialized to be provided by a generic library) The List: Data structures AVL tree B-tree B*-tree B+-tree Binary tree Binary heap Binary search tree Linked lists Singly linked list Doubly linked list Stack Queue Sorting algorithms Binary tree sort Bubble sort Heapsort Insertion sort Merge sort Quicksort Selection sort Searching algorithms

    Read the article

  • Sublinear Extra Space MergeSort

    - by hulkmeister
    I am reviewing basic algorithms from a book called Algorithms by Robert Sedgewick, and I came across a problem in MergeSort that I am, sad to say, having difficulty solving. The problem is below: Sublinear Extra Space. Develop a merge implementation that reduces that extra space requirement to max(M, N/M), based on the following idea: Divide the array into N/M blocks of size M (for simplicity in this description, assume that N is a multiple of M). Then, (i) considering the blocks as items with their first key as the sort key, sort them using selection sort; and (ii) run through the array merging the first block with the second, then the second block with the third, and so forth. The problem I have with the problem is that based on the idea Sedgewick recommends, the following set of arrays will not be sorted: {0, 10, 12}, {3, 9, 11}, {5, 8, 13}. The algorithm I use is the following: Divide the full array into subarrays of size M. Run Selection Sort on each of the subarrays. Merge each of the subarrays using the method Sedgwick recommends in (ii). (This is where I encounter the problem of where to store the results after the merge.) This leads to wanting to increase the size of the auxiliary space needed to handle at least two subarrays at a time (for merging), but based on the specifications of the problem, that is not allowed. I have also considered using the original array as space for one subarray and using the auxiliary space for the second subarray. However, I can't envision a solution that does not end up overwriting the entries of the first subarray. Any ideas on other ways this can be done? NOTE: If this is suppose to be on StackOverflow.com, please let me know how I can move it. I posted here because the question was academic.

    Read the article

  • Parallelism in .NET – Part 11, Divide and Conquer via Parallel.Invoke

    - by Reed
    Many algorithms are easily written to work via recursion.  For example, most data-oriented tasks where a tree of data must be processed are much more easily handled by starting at the root, and recursively “walking” the tree.  Some algorithms work this way on flat data structures, such as arrays, as well.  This is a form of divide and conquer: an algorithm design which is based around breaking up a set of work recursively, “dividing” the total work in each recursive step, and “conquering” the work when the remaining work is small enough to be solved easily. Recursive algorithms, especially ones based on a form of divide and conquer, are often a very good candidate for parallelization. This is apparent from a common sense standpoint.  Since we’re dividing up the total work in the algorithm, we have an obvious, built-in partitioning scheme.  Once partitioned, the data can be worked upon independently, so there is good, clean isolation of data. Implementing this type of algorithm is fairly simple.  The Parallel class in .NET 4 includes a method suited for this type of operation: Parallel.Invoke.  This method works by taking any number of delegates defined as an Action, and operating them all in parallel.  The method returns when every delegate has completed: Parallel.Invoke( () => { Console.WriteLine("Action 1 executing in thread {0}", Thread.CurrentThread.ManagedThreadId); }, () => { Console.WriteLine("Action 2 executing in thread {0}", Thread.CurrentThread.ManagedThreadId); }, () => { Console.WriteLine("Action 3 executing in thread {0}", Thread.CurrentThread.ManagedThreadId); } ); .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } Running this simple example demonstrates the ease of using this method.  For example, on my system, I get three separate thread IDs when running the above code.  By allowing any number of delegates to be executed directly, concurrently, the Parallel.Invoke method provides us an easy way to parallelize any algorithm based on divide and conquer.  We can divide our work in each step, and execute each task in parallel, recursively. For example, suppose we wanted to implement our own quicksort routine.  The quicksort algorithm can be designed based on divide and conquer.  In each iteration, we pick a pivot point, and use that to partition the total array.  We swap the elements around the pivot, then recursively sort the lists on each side of the pivot.  For example, let’s look at this simple, sequential implementation of quicksort: public static void QuickSort<T>(T[] array) where T : IComparable<T> { QuickSortInternal(array, 0, array.Length - 1); } private static void QuickSortInternal<T>(T[] array, int left, int right) where T : IComparable<T> { if (left >= right) { return; } SwapElements(array, left, (left + right) / 2); int last = left; for (int current = left + 1; current <= right; ++current) { if (array[current].CompareTo(array[left]) < 0) { ++last; SwapElements(array, last, current); } } SwapElements(array, left, last); QuickSortInternal(array, left, last - 1); QuickSortInternal(array, last + 1, right); } static void SwapElements<T>(T[] array, int i, int j) { T temp = array[i]; array[i] = array[j]; array[j] = temp; } Here, we implement the quicksort algorithm in a very common, divide and conquer approach.  Running this against the built-in Array.Sort routine shows that we get the exact same answers (although the framework’s sort routine is slightly faster).  On my system, for example, I can use framework’s sort to sort ten million random doubles in about 7.3s, and this implementation takes about 9.3s on average. Looking at this routine, though, there is a clear opportunity to parallelize.  At the end of QuickSortInternal, we recursively call into QuickSortInternal with each partition of the array after the pivot is chosen.  This can be rewritten to use Parallel.Invoke by simply changing it to: // Code above is unchanged... SwapElements(array, left, last); Parallel.Invoke( () => QuickSortInternal(array, left, last - 1), () => QuickSortInternal(array, last + 1, right) ); } This routine will now run in parallel.  When executing, we now see the CPU usage across all cores spike while it executes.  However, there is a significant problem here – by parallelizing this routine, we took it from an execution time of 9.3s to an execution time of approximately 14 seconds!  We’re using more resources as seen in the CPU usage, but the overall result is a dramatic slowdown in overall processing time. This occurs because parallelization adds overhead.  Each time we split this array, we spawn two new tasks to parallelize this algorithm!  This is far, far too many tasks for our cores to operate upon at a single time.  In effect, we’re “over-parallelizing” this routine.  This is a common problem when working with divide and conquer algorithms, and leads to an important observation: When parallelizing a recursive routine, take special care not to add more tasks than necessary to fully utilize your system. This can be done with a few different approaches, in this case.  Typically, the way to handle this is to stop parallelizing the routine at a certain point, and revert back to the serial approach.  Since the first few recursions will all still be parallelized, our “deeper” recursive tasks will be running in parallel, and can take full advantage of the machine.  This also dramatically reduces the overhead added by parallelizing, since we’re only adding overhead for the first few recursive calls.  There are two basic approaches we can take here.  The first approach would be to look at the total work size, and if it’s smaller than a specific threshold, revert to our serial implementation.  In this case, we could just check right-left, and if it’s under a threshold, call the methods directly instead of using Parallel.Invoke. The second approach is to track how “deep” in the “tree” we are currently at, and if we are below some number of levels, stop parallelizing.  This approach is a more general-purpose approach, since it works on routines which parse trees as well as routines working off of a single array, but may not work as well if a poor partitioning strategy is chosen or the tree is not balanced evenly. This can be written very easily.  If we pass a maxDepth parameter into our internal routine, we can restrict the amount of times we parallelize by changing the recursive call to: // Code above is unchanged... SwapElements(array, left, last); if (maxDepth < 1) { QuickSortInternal(array, left, last - 1, maxDepth); QuickSortInternal(array, last + 1, right, maxDepth); } else { --maxDepth; Parallel.Invoke( () => QuickSortInternal(array, left, last - 1, maxDepth), () => QuickSortInternal(array, last + 1, right, maxDepth)); } We no longer allow this to parallelize indefinitely – only to a specific depth, at which time we revert to a serial implementation.  By starting the routine with a maxDepth equal to Environment.ProcessorCount, we can restrict the total amount of parallel operations significantly, but still provide adequate work for each processing core. With this final change, my timings are much better.  On average, I get the following timings: Framework via Array.Sort: 7.3 seconds Serial Quicksort Implementation: 9.3 seconds Naive Parallel Implementation: 14 seconds Parallel Implementation Restricting Depth: 4.7 seconds Finally, we are now faster than the framework’s Array.Sort implementation.

    Read the article

  • Algorithm to use for shop floor layout?

    - by jkohlhepp
    I ran into a classroom problem yesterday (business oriented class, not computer science) and I found it interesting from an algorithmic perspective. The problem goes something like this: Assume there is a shop floor with N different rooms, and you have N different departments that need to go in those rooms. The departments and the rooms are all the same size, so any department could go in any room. There is a known travel distance from each room to each other room. There is also a known amount of trips necessary from one department to another (trips are counted the same regardless which room they originate from, so a trip from A to B is equivalent to a trip from B to A). Given those inputs, determine a layout of departments into rooms which minimizes travel time. What is the best way to approach this problem algorithmically? Is there already a particular algorithm or class of algorithms designed to solve this type of problem? Does this type of problem have a name in computer science? I am not looking for you to design an algorithm to solve this, although feel free to do so if you would like. I'm wondering if this is a problem space that has already been well defined and studied algorithmically and if so get some links to research further. I can see a lot of different data structures and algorithms that might apply to this and I'm curious which approach would be "best". And don't worry, you are not doing my homework for me. This is not a homework problem per se, as this is a business course and we were simply discussing the concepts and not trying to solve the problem algorithmically.

    Read the article

  • Why is the use of abstractions (such as LINQ) so taboo?

    - by Matthew Patrick Cashatt
    I am an independent contractor and, as such, I interview 3-4 times a year for new gigs. I am in the midst of that cycle now and got turned down for an opportunity even though I felt like the interview went well. The same thing has happened to me a couple of times this year. Now, I am not a perfect guy and I don't expect to be a good fit for every organization. That said, my batting average is lower than usual so I politely asked my last interviewer for some constructive feedback, and he delivered! The main thing, according to the interviewer, was that I seemed to lean too much towards the use of abstractions (such as LINQ) rather than towards lower-level, organically grown algorithms. On the surface, this makes sense--in fact, it made the other rejections make sense too because I blabbed about LINQ in those interviews as well and it didn't seem that the interviewers knew much about LINQ (even though they were .NET guys). So now I am left with this question: If we are supposed to be "standing on the shoulders of giants" and using abstractions that are available to us (like LINQ), then why do some folks consider it so taboo? Doesn't it make sense to pull code "off the shelf" if it accomplishes the same goals without extra cost? It would seem to me that LINQ, even if it is an abstraction, is simply an abstraction of all the same algorithms one would write to accomplish exactly the same end. Only a performance test could tell you if your custom approach was better, but if something like LINQ met the requirements, why bother writing your own classes in the first place? I don't mean to focus on LINQ here. I am sure that the JAVA world has something comparable, I just would like to know why some folks get so uncomfortable with the idea of using an abstraction that they themselves did not write. UPDATE As Euphoric pointed out, there isn't anything comparable to LINQ in the Java world. So, if you are developing on the .NET stack, why not always try and make use of it? Is it possible that people just don't fully understand what it does?

    Read the article

  • Efficient algorithm for Virtual Machine(VM) Consolidation in Cloud

    - by devansh dalal
    PROBLEM: We have N physical machines(PMs) each with ram Ri, cpu Ci and a set of currently scheduled VMs each with ram requirement ri and ci respectively Moving(Migrating) any VM from one PM to other has a cost associated which depends on its ram ri. A PM with no VMs is shut down to save power. Our target is to minimize the weighted sum of (N,migration cost) by migrating some VMs i.e. minimize the number of working PMs as well as not to degrade the service level due to excessive migrations. My Approach: Brute Force approach is choosing the minimum loaded PM and try to fit its VMs to other PMs by First Fit Decreasing algorithm or we can select the victim PMs and target PMs based on their loading level and shut down victims if possible by moving their VMs to targets. I tried this Greedy approach on the Data of Baadal(IIT-D cloud) but It isn't giving promising results. I have also tried to study the Ant colony optimization for dynamic VM consolidating but was unable to understand very much. I used the links. http://dumas.ccsd.cnrs.fr/docs/00/72/52/15/PDF/Esnault.pdf http://hal.archives-ouvertes.fr/docs/00/72/38/56/PDF/RR-8032.pdf Would anyone please clarify the solution or suggest any new approach/resources for better performance. I am basically searching for the algorithms not the physical optimizations and I also know that many commercial organizations have provided these solution but I just wanted to know more the underlying algorithms. Thanks in advance.

    Read the article

< Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12  | Next Page >