Search Results

Search found 1614 results on 65 pages for 'emps (expensive managemen'.

Page 7/65 | < Previous Page | 3 4 5 6 7 8 9 10 11 12 13 14  | Next Page >

  • Partition Wise Joins

    - by jean-pierre.dijcks
    Some say they are the holy grail of parallel computing and PWJ is the basis for a shared nothing system and the only join method that is available on a shared nothing system (yes this is oversimplified!). The magic in Oracle is of course that is one of many ways to join data. And yes, this is the old flexibility vs. simplicity discussion all over, so I won't go there... the point is that what you must do in a shared nothing system, you can do in Oracle with the same speed and methods. The Theory A partition wise join is a join between (for simplicity) two tables that are partitioned on the same column with the same partitioning scheme. In shared nothing this is effectively hard partitioning locating data on a specific node / storage combo. In Oracle is is logical partitioning. If you now join the two tables on that partitioned column you can break up the join in smaller joins exactly along the partitions in the data. Since they are partitioned (grouped) into the same buckets, all values required to do the join live in the equivalent bucket on either sides. No need to talk to anyone else, no need to redistribute data to anyone else... in short, the optimal join method for parallel processing of two large data sets. PWJ's in Oracle Since we do not hard partition the data across nodes in Oracle we use the Partitioning option to the database to create the buckets, then set the Degree of Parallelism (or run Auto DOP - see here) and get our PWJs. The main questions always asked are: How many partitions should I create? What should my DOP be? In a shared nothing system the answer is of course, as many partitions as there are nodes which will be your DOP. In Oracle we do want you to look at the workload and concurrency, and once you know that to understand the following rules of thumb. Within Oracle we have more ways of joining of data, so it is important to understand some of the PWJ ideas and what it means if you have an uneven distribution across processes. Assume we have a simple scenario where we partition the data on a hash key resulting in 4 hash partitions (H1 -H4). We have 2 parallel processes that have been tasked with reading these partitions (P1 - P2). The work is evenly divided assuming the partitions are the same size and we can scan this in time t1 as shown below. Now assume that we have changed the system and have a 5th partition but still have our 2 workers P1 and P2. The time it takes is actually 50% more assuming the 5th partition has the same size as the original H1 - H4 partitions. In other words to scan these 5 partitions, the time t2 it takes is not 1/5th more expensive, it is a lot more expensive and some other join plans may now start to look exciting to the optimizer. Just to post the disclaimer, it is not as simple as I state it here, but you get the idea on how much more expensive this plan may now look... Based on this little example there are a few rules of thumb to follow to get the partition wise joins. First, choose a DOP that is a factor of two (2). So always choose something like 2, 4, 8, 16, 32 and so on... Second, choose a number of partitions that is larger or equal to 2* DOP. Third, make sure the number of partitions is divisible through 2 without orphans. This is also known as an even number... Fourth, choose a stable partition count strategy, which is typically hash, which can be a sub partitioning strategy rather than the main strategy (range - hash is a popular one). Fifth, make sure you do this on the join key between the two large tables you want to join (and this should be the obvious one...). Translating this into an example: DOP = 8 (determined based on concurrency or by using Auto DOP with a cap due to concurrency) says that the number of partitions >= 16. Number of hash (sub) partitions = 32, which gives each process four partitions to work on. This number is somewhat arbitrary and depends on your data and system. In this case my main reasoning is that if you get more room on the box you can easily move the DOP for the query to 16 without repartitioning... and of course it makes for no leftovers on the table... And yes, we recommend up-to-date statistics. And before you start complaining, do read this post on a cool way to do stats in 11.

    Read the article

  • multiple count Pivot table in Excel

    - by Sivakanesh
    Hi all, I'm trying to put togeter a pivot table from an Excel spreadsheet. The spreadsheets look similar to the following: DeptHead, Emp, Increment x, A, 2.5% x, B, y, C, 1.5% y, D, y, E, 2.0% I would like to make a pivot table that looks like the following; DeptHead, CountOfEmp, CountOfIncrement x, 2, 1 y, 3, 2 So it provides a count of total number of Emps and total number Increments for each DeptHead ignoring the blanks. I have tried to do this in many ways in Pivot table, but the two counts are only appearing in rows and not in columns as above. Is there any way to achieve this please? Thanks

    Read the article

  • Show choosen option in a notification Feed, Django

    - by apoo
    Hey I have a model where : LIST_OPTIONS = ( ('cheap','cheap'), ('expensive','expensive'), ('normal', 'normal'), ) then I have assigned the LIST_OPTIONS to nature variable. nature = models.CharField(max_length=15, choices=LIST_OPTIONS, null=False, blank=False). then I save it: if self.pk: new=False else: new=True super(Listing, self).save(force_insert, force_update) if new and notification: notification.send(User.objects.all().exclude(id=self.owner.id), "listing_new", {'listing':self, }, ) then in my management.py: def create_notice_types(app, created_models,verbosity, **kwargs): notification.create_notice_type("listing_new", _("New Listing"), _("someone has posted a new listing"), default=2) and now in my notice.html I want to show to users different sentences based on the options that they have choose so something like this: LINK href="{{ listing.owner.get_absolute_url }} {{listing.owner}} {% ifequal listing.nature "For Sale" %} created a {{ listing.nature }} listing, <a href="{{ listing.get_absolute_url }}">{{listing.title}}</a>. {% ifequals listing.equal "Give Away"%} is {{ listing.nature }} , LINK href="{{ listing.get_absolute_url }}" {{listing.title}}. {% ifequal listing.equal "Looking For"%} is {{ listing.nature }} , LINK href="{{ listing.get_absolute_url }}" {{listing.title}} {% endifequal %} {% endifequal %} {% endifequal %} Could you please help me out with this. Thank you

    Read the article

  • How to find two most distant points?

    - by depesz
    This is a question that I was asked on a job interview some time ago. And I still can't figure out sensible answer. Question is: you are given set of points (x,y). Find 2 most distant points. Distant from each other. For example, for points: (0,0), (1,1), (-8, 5) - the most distant are: (1,1) and (-8,5) because the distance between them is larger from both (0,0)-(1,1) and (0,0)-(-8,5). The obvious approach is to calculate all distances between all points, and find maximum. The problem is that it is O(n^2), which makes it prohibitively expensive for large datasets. There is approach with first tracking points that are on the boundary, and then calculating distances for them, on the premise that there will be less points on boundary than "inside", but it's still expensive, and will fail in worst case scenario. Tried to search the web, but didn't find any sensible answer - although this might be simply my lack of search skills.

    Read the article

  • What are the Options for Storing Hierarchical Data in a Relational Database?

    - by orangepips
    Good Overviews One more Nested Intervals vs. Adjacency List comparison: the best comparison of Adjacency List, Materialized Path, Nested Set and Nested Interval I've found. Models for hierarchical data: slides with good explanations of tradeoffs and example usage Representing hierarchies in MySQL: very good overview of Nested Set in particular Hierarchical data in RDBMSs: most comprehensive and well organized set of links I've seen, but not much in the way on explanation Options Ones I am aware of and general features: Adjacency List: Columns: ID, ParentID Easy to implement. Cheap node moves, inserts, and deletes. Expensive to find level (can store as a computed column), ancestry & descendants (Bridge Hierarchy combined with level column can solve), path (Lineage Column can solve). Use Common Table Expressions in those databases that support them to traverse. Nested Set (a.k.a Modified Preorder Tree Traversal) First described by Joe Celko - covered in depth in his book Trees and Hierarchies in SQL for Smarties Columns: Left, Right Cheap level, ancestry, descendants Compared to Adjacency List, moves, inserts, deletes more expensive. Requires a specific sort order (e.g. created). So sorting all descendants in a different order requires additional work. Nested Intervals Combination of Nested Sets and Materialized Path where left/right columns are floating point decimals instead of integers and encode the path information. Bridge Table (a.k.a. Closure Table: some good ideas about how to use triggers for maintaining this approach) Columns: ancestor, descendant Stands apart from table it describes. Can include some nodes in more than one hierarchy. Cheap ancestry and descendants (albeit not in what order) For complete knowledge of a hierarchy needs to be combined with another option. Flat Table A modification of the Adjacency List that adds a Level and Rank (e.g. ordering) column to each record. Expensive move and delete Cheap ancestry and descendants Good Use: threaded discussion - forums / blog comments Lineage Column (a.k.a. Materialized Path, Path Enumeration) Column: lineage (e.g. /parent/child/grandchild/etc...) Limit to how deep the hierarchy can be. Descendants cheap (e.g. LEFT(lineage, #) = '/enumerated/path') Ancestry tricky (database specific queries) Database Specific Notes MySQL Use session variables for Adjacency List Oracle Use CONNECT BY to traverse Adjacency Lists PostgreSQL ltree datatype for Materialized Path SQL Server General summary 2008 offers HierarchyId data type appears to help with Lineage Column approach and expand the depth that can be represented.

    Read the article

  • Garbage Collection Java

    - by simion
    On the slides i am revising from it says the following; Live objects can be identified either by maintaining a count of the number of references to each object, or by tracing chains of references from the roots. Reference counting is expensive – it needs action every time a reference changes and it doesn’t spot cyclical structures, but it can reclaim space incrementally. Tracing involves identifying live objects only when you need to reclaim space – moving the cost from general access to the time at which the GC runs, typically only when you are out of memory. I understand the principles of why reference counting is expensive but do not understand what "doesn’t spot cyclical structures, but it can reclaim space incrementally." means. Could anyone help me out a little bit please? Thanks

    Read the article

  • Why doesn't ConcurrentQueue<T>.Count return 0 when IsEmpty == true?

    - by DanTup - Danny Tuppeny
    I was reading about the new concurrent collection classes in .NET 4 on James Michael Hare's blog, and the page talking about ConcurrentQueue<T> says: It’s still recommended, however, that for empty checks you call IsEmpty instead of comparing Count to zero. I'm curious - if there is a reason to use IsEmpty instead of comparing Count to 0, why does the class not internally check IsEmpty and return 0 before doing any of the expensive work to count? E.g.: public int Count { get { // Check IsEmpty so we can bail out quicker if (this.IsEmpty) return 0; // Rest of "expensive" counting code } } It seems strange to suggest this if it could be "fixed" so easily with no side-effects?

    Read the article

  • Performance implications of finalizers on JVM

    - by Alexey Romanov
    According to this post, in .Net, Finalizers are actually even worse than that. Besides that they run late (which is indeed a serious problem for many kinds of resources), they are also less powerful because they can only perform a subset of the operations allowed in a destructor (e.g., a finalizer cannot reliably use other objects, whereas a destructor can), and even when writing in that subset finalizers are extremely difficult to write correctly. And collecting finalizable objects is expensive: Each finalizable object, and the potentially huge graph of objects reachable from it, is promoted to the next GC generation, which makes it more expensive to collect by some large multiple. Does this also apply to JVMs in general and to HotSpot in particular?

    Read the article

  • Garbage Collection in Java

    - by simion
    On the slides I am revising from it says the following: Live objects can be identified either by maintaining a count of the number of references to each object, or by tracing chains of references from the roots. Reference counting is expensive – it needs action every time a reference changes and it doesn’t spot cyclical structures, but it can reclaim space incrementally. Tracing involves identifying live objects only when you need to reclaim space – moving the cost from general access to the time at which the GC runs, typically only when you are out of memory. I understand the principles of why reference counting is expensive but do not understand what "doesn’t spot cyclical structures, but it can reclaim space incrementally." means. Could anyone help me out a little bit please? Thanks

    Read the article

  • How do polymorphic inline caches work with mutable types?

    - by kingkilr
    A polymorphic inline cache works by caching the actual method by the type of the object, in order to avoid the expensive lookup procedures (usually a hashtable lookup). How does one handle the type comparison if the type objects are mutable (i.e. the method might be monkey patched into something different at run time). The one idea I've come up with would be a "class counter" that gets incremented each time a method is adjusted, however this seems like it would be exceptionally expensive in a heavily monkey patched environ since it would kill all the PICs for that class, even if the methods for them weren't altered. I'm sure there must be a good solution to this, as this issue is directly applicable to Javascript and AFAIK all 3 of the big JS VMs have PICs (wow acronym ahoy).

    Read the article

  • Symfony caching question (caching a partial)

    - by morpheous
    I am using Symfony 1.3.2 and I have a page that uses a partial from another module. I have two modules: 'foo' and 'foobar'. In module 'foo', I have an 'index' action, which uses a partial from the 'foobar' module. so foo/indexSuccess.php looks something like this: Some data here ? I want to cache 'part2' of my foo/indexSuccess.php page, because it is very expensive (slow). I want the cache to have a lifetime of about 10 minutes. In apps/frontend/modules/foo/config/cache.yml I need to know how to cache 'part2' of the page (i.e. the [very expensive] partial part of the page. can anyone tell me what entries are required in the cache.yml file?

    Read the article

  • Akka framework support for finding duplicate messages

    - by scala_is_awesome
    I'm trying to build a high-performance distributed system with Akka and Scala. If a message requesting an expensive (and side-effect-free) computation arrives, and the exact same computation has already been requested before, I want to avoid computing the result again. If the computation requested previously has already completed and the result is available, I can cache it and re-use it. However, the time window in which duplicate computation can be requested may be arbitrarily small. e.g. I could get a thousand or a million messages requesting the same expensive computation at the same instant for all practical purposes. There is a commercial product called Gigaspaces that supposedly handles this situation. However there seems to be no framework support for dealing with duplicate work requests in Akka at the moment. Given that the Akka framework already has access to all the messages being routed through the framework, it seems that a framework solution could make a lot of sense here. Here is what I am proposing for the Akka framework to do: 1. Create a trait to indicate a type of messages (say, "ExpensiveComputation" or something similar) that are to be subject to the following caching approach. 2. Smartly (hashing etc.) identify identical messages received by (the same or different) actors within a user-configurable time window. Other options: select a maximum buffer size of memory to be used for this purpose, subject to (say LRU) replacement etc. Akka can also choose to cache only the results of messages that were expensive to process; the messages that took very little time to process can be re-processed again if needed; no need to waste precious buffer space caching them and their results. 3. When identical messages (received within that time window, possibly "at the same time instant") are identified, avoid unnecessary duplicate computations. The framework would do this automatically, and essentially, the duplicate messages would never get received by a new actor for processing; they would silently vanish and the result from processing it once (whether that computation was already done in the past, or ongoing right then) would get sent to all appropriate recipients (immediately if already available, and upon completion of the computation if not). Note that messages should be considered identical even if the "reply" fields are different, as long as the semantics/computations they represent are identical in every other respect. Also note that the computation should be purely functional, i.e. free from side-effects, for the caching optimization suggested to work and not change the program semantics at all. If what I am suggesting is not compatible with the Akka way of doing things, and/or if you see some strong reasons why this is a very bad idea, please let me know. Thanks, Is Awesome, Scala

    Read the article

  • How can I catch a change event from an HTML text field when a user clicks on a link after editing?

    - by Eric the Red
    Our webapp has a form with fields and values that change depending on values entered. Each time there is a change event on one of the form elements, we use an expensive AJAX call to update other elements on the page. This works great for selects, radio buttons and checkboxes. The issue comes in when a user adds content to a text field, and then clicks a link without taking the focus from the text field. The browser moves to a new page, and the contents of the text field are never saved. Is there an easy way to code around this? The AJAX call is too expensive to call with each key press. Here's an example of my Prototype code at the moment: $$('.productOption input.text').invoke('observe', 'change', saveChangeEvent);

    Read the article

  • How to improve performance of opening Microsoft Word when automated from c#?

    - by Abdullah BaMusa
    I have Microsoft Word template that I automated filling it’s fields from my application, and when the user request print I open this template. but creating word application every time user request print after filling fields is very expensive and lead to some delay while opening the template, so I choose to cache the reference to Word then just open the new filled template. that solve the performance issue as opening file is less expensive than recreating Word each time, but this work while the user just close the document not the entire Word application which when happened my reference to Word become invalid and return with exception says: “The RPC server is unavailable” next time request opening template . I tried to subscribe to BeforClosing event but his trigger for Quitting Word as well as Closing documents. My question is how to know if the word is closing document or quit the entire application so I take the proper action, or any hint for another direction of thinking about improve performance of opening word template.

    Read the article

  • Symfony cacheing question (cacheing a partial)

    - by morpheous
    I am using Symfony 1.3.2 and I have a page that uses a partial from another module. I have two modules: 'foo' and 'foobar'. In module 'foo', I have an 'index' action, which uses a partial from the 'foobar' module. so foo/indexSuccess.php looks something like this: Some data here ? I want to cache 'part2' of my foo/indexSuccess.php page, because it is very expensive (slow). I want the cache to have a lifetime of about 10 minutes. In apps/frontend/modules/foo/config/cache.yml I need to know how to cache 'part2' of the page (i.e. the [very expensive] partial part of the page. can anyone tell me what entries are required in the cache.yml file?

    Read the article

  • How to reverse a dictionary that it has repeated values (python)

    - by Galois
    Hi guys! So, I have a dictionary with almost 100,000 (key, values) pairs and the majority of the keys map to the same values. For example imagine something like that: dict = {'a': 1, 'c': 2, 'b': 1, 'e': 2, 'd': 3, 'h': 1, 'j': 3} What I want to do, is to reverse the dictionary so that each value in dict is going to be a key at the reverse_dict and is going to map to a list of all the dict.keys that used to map to that value at the dict. So based on the example above I would get: reversed_dict = {1: ['a', 'b', 'h'], 2:['e', 'c'] , 3:['d', 'j']} I came up with a solution that is very expensive and I would really want to hear any ideas more efficient than mine. my expensive solution: reversed_dict = {} for value in dict.values(): reversed_dict[value] = [] for key in dict.keys(): if dict[key] == value: if key not in reversed_dict[value]: reversed_dict[value].append(key) Output >> reversed_dict = {1: ['a', 'b', 'h'], 2: ['c', 'e'], 3: ['d', 'j']} I would really appreciate to hear any ideas better and more efficient than than mine. Thanks!

    Read the article

  • SQLAuthority News – SafePeak’s SQL Server Performance Contest – Winners

    - by pinaldave
    SafePeak, the unique automated SQL performance acceleration and performance tuning software vendor, announced the winners of their SQL Performance Contest 2011. The contest quite unique: the writer of the best / most interesting and most community liked “performance story” would win an expensive gadget. The judges were the community DBAs that could participating and Like’ing stories and could also win expensive prizes. Robert Pearl SQL MVP, was the contest supervisor. I liked most of the stories and decided then to contact SafePeak and suggested to participate in the give-away and they have gladly accepted the same. The winner of best story is: Jason Brimhall (USA) with a story about a proc with a fair amount of business logic. Congratulations Jason! The 3 participants won the second prize of $100 gift card on amazon.com are: Michael Corey (USA), Hakim Ali (USA) and Alex Bernal (USA). And 5 participants won a printed copy of a book of mine (Book Reviews of SQL Wait Stats Joes 2 Pros: SQL Performance Tuning Techniques Using Wait Statistics, Types & Queues) are: Patrick Kansa (USA), Wagner Bianchi (USA), Riyas.V.K (India), Farzana Patwa (USA) and Wagner Crivelini (Brazil). The winners are welcome to send safepeak their mail address to receive the prizes (to “info ‘at’ safepeak.com”). Also SafePeak team asked me to welcome you all to continue sending stories, simply because they (and we all) like to read interesting stuff) as well as to send them ideas for future contests. You can do it from here: www.safepeak.com/SQL-Performance-Contest-2011/Submit-Story Congratulations to everybody! I found this very funny video about SafePeak: It looks like someone (maybe the vendor) played with video’s once and created this non-commercial like video: SafePeak dynamic caching is an immediate plug-n-play performance acceleration and scalability solution for cloud, hosted and business SQL server applications. By caching in memory result sets of queries and stored procedures, while keeping all those cache correct and up to date using unique patent pending technology, SafePeak can fix SQL performance problems and bottlenecks of most applications – most importantly: without actual code changes. By the way, I checked their website prior this contest announcement and noticed that they are running these days a special end year promotion giving between 30% to 45% discounts. Since the installation is quick and full testing can be done within couple of days – those have the need (performance problems) and have budget leftovers: I suggest you hurry. A free fully functional trial is here: www.safepeak.com/download, while those that want to start with a quote should ping here www.safepeak.com/quote. Good luck! Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Performance, SQL Puzzle, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

    Read the article

  • Use XQuery to Access XML in Emacs

    - by Gregory Burd
    There you are working on a multi-MB/GB/TB XML document or set of documents, you want to be able to quickly query the content but you don't want to load the XML into a full-blown XML database, the time spent setting things up is simply too expensive. Why not combine a great open source editor, Emacs, and a great XML XQuery engine, Berkeley DB XML? That is exactly what Donnie Cameron did. Give it a try.

    Read the article

  • An XEvent a Day (19 of 31) – Using Customizable Fields

    - by Jonathan Kehayias
    Today’s post will be somewhat short, but we’ll look at Customizable Fields on Events in Extended Events and how they are used to collect additional information.  Customizable Fields generally represent information of potential interest that may be expensive to collect, and is therefore made available for collection if specified by the Event Session.  In SQL Server 2008 and 2008 R2, there are 50 Events that have customizable columns in their payload.  In SQL Server Denali CTP1, there...(read more)

    Read the article

  • ASP.NET Asynchronous Pages and when to use them

    - by rajbk
    There have been several articles posted about using  asynchronous pages in ASP.NET but none of them go into detail as to when you should use them. I finally found a great post by Thomas Marquardt that explains the process in depth. He addresses a key misconception also: So, in your ASP.NET application, when should you perform work asynchronously instead of synchronously? Well, only 1 thread per CPU can execute at a time.  Did you catch that?  A lot of people seem to miss this point...only one thread executes at a time on a CPU. When you have more than this, you pay an expensive penalty--a context switch. However, if a thread is blocked waiting on work...then it makes sense to switch to another thread, one that can execute now.  It also makes sense to switch threads if you want work to be done in parallel as opposed to in series, but up until a certain point it actually makes much more sense to execute work in series, again, because of the expensive context switch. Pop quiz: If you have a thread that is doing a lot of computational work and using the CPU heavily, and this takes a while, should you switch to another thread? No! The current thread is efficiently using the CPU, so switching will only incur the cost of a context switch. Ok, well, what if you have a thread that makes an HTTP or SOAP request to another server and takes a long time, should you switch threads? Yes! You can perform the HTTP or SOAP request asynchronously, so that once the "send" has occurred, you can unwind the current thread and not use any threads until there is an I/O completion for the "receive". Between the "send" and the "receive", the remote server is busy, so locally you don't need to be blocking on a thread, but instead make use of the asynchronous APIs provided in .NET Framework so that you can unwind and be notified upon completion. Again, it only makes sense to switch threads if the benefit from doing so out weights the cost of the switch. Read more about it in these posts: Performing Asynchronous Work, or Tasks, in ASP.NET Applications http://blogs.msdn.com/tmarq/archive/2010/04/14/performing-asynchronous-work-or-tasks-in-asp-net-applications.aspx ASP.NET Thread Usage on IIS 7.0 and 6.0 http://blogs.msdn.com/tmarq/archive/2007/07/21/asp-net-thread-usage-on-iis-7-0-and-6-0.aspx   PS: I generally do not write posts that simply link to other posts but think it is warranted in this case.

    Read the article

  • Game Design Schools in Canada

    - by CptJackLoder
    I am a High School student in Ontario and i am trying looking for college/university programs the are specifically about game design. There are quite a few at most colleges near me, but they are all BA's and I am looking for a BSc. The only one i have been able to find is at digipen but that is across the continent and more importantly outrageously expensive. Does anyone know of and programs in Canada or the US that offer a BSc in Game design?

    Read the article

  • MySQL – Scalability on Amazon RDS: Scale out to multiple RDS instances

    - by Pinal Dave
    Today, I’d like to discuss getting better MySQL scalability on Amazon RDS. The question of the day: “What can you do when a MySQL database needs to scale write-intensive workloads beyond the capabilities of the largest available machine on Amazon RDS?” Let’s take a look. In a typical EC2/RDS set-up, users connect to app servers from their mobile devices and tablets, computers, browsers, etc.  Then app servers connect to an RDS instance (web/cloud services) and in some cases they might leverage some read-only replicas.   Figure 1. A typical RDS instance is a single-instance database, with read replicas.  This is not very good at handling high write-based throughput. As your application becomes more popular you can expect an increasing number of users, more transactions, and more accumulated data.  User interactions can become more challenging as the application adds more sophisticated capabilities. The result of all this positive activity: your MySQL database will inevitably begin to experience scalability pressures. What can you do? Broadly speaking, there are four options available to improve MySQL scalability on RDS. 1. Larger RDS Instances – If you’re not already using the maximum available RDS instance, you can always scale up – to larger hardware.  Bigger CPUs, more compute power, more memory et cetera. But the largest available RDS instance is still limited.  And they get expensive. “High-Memory Quadruple Extra Large DB Instance”: 68 GB of memory 26 ECUs (8 virtual cores with 3.25 ECUs each) 64-bit platform High I/O Capacity Provisioned IOPS Optimized: 1000Mbps 2. Provisioned IOPs – You can get provisioned IOPs and higher throughput on the I/O level. However, there is a hard limit with a maximum instance size and maximum number of provisioned IOPs you can buy from Amazon and you simply cannot scale beyond these hardware specifications. 3. Leverage Read Replicas – If your application permits, you can leverage read replicas to offload some reads from the master databases. But there are a limited number of replicas you can utilize and Amazon generally requires some modifications to your existing application. And read-replicas don’t help with write-intensive applications. 4. Multiple Database Instances – Amazon offers a fourth option: “You can implement partitioning,thereby spreading your data across multiple database Instances” (Link) However, Amazon does not offer any guidance or facilities to help you with this. “Multiple database instances” is not an RDS feature.  And Amazon doesn’t explain how to implement this idea. In fact, when asked, this is the response on an Amazon forum: Q: Is there any documents that describe the partition DB across multiple RDS? I need to use DB with more 1TB but exist a limitation during the create process, but I read in the any FAQ that you need to partition database, but I don’t find any documents that describe it. A: “DB partitioning/sharding is not an official feature of Amazon RDS or MySQL, but a technique to scale out database by using multiple database instances. The appropriate way to split data depends on the characteristics of the application or data set. Therefore, there is no concrete and specific guidance.” So now what? The answer is to scale out with ScaleBase. Amazon RDS with ScaleBase: What you get – MySQL Scalability! ScaleBase is specifically designed to scale out a single MySQL RDS instance into multiple MySQL instances. Critically, this is accomplished with no changes to your application code.  Your application continues to “see” one database.   ScaleBase does all the work of managing and enforcing an optimized data distribution policy to create multiple MySQL instances. With ScaleBase, data distribution, transactions, concurrency control, and two-phase commit are all 100% transparent and 100% ACID-compliant, so applications, services and tooling continue to interact with your distributed RDS as if it were a single MySQL instance. The result: now you can cost-effectively leverage multiple MySQL RDS instance to scale out write-intensive workloads to an unlimited number of users, transactions, and data. Amazon RDS with ScaleBase: What you keep – Everything! And how does this change your Amazon environment? 1. Keep your application, unchanged – There is no change your application development life-cycle at all.  You still use your existing development tools, frameworks and libraries.  Application quality assurance and testing cycles stay the same. And, critically, you stay with an ACID-compliant MySQL environment. 2. Keep your RDS value-added services – The value-added services that you rely on are all still available. Amazon will continue to handle database maintenance and updates for you. You can still leverage High Availability via Multi A-Z.  And, if it benefits youra application throughput, you can still use read replicas. 3. Keep your RDS administration – Finally the RDS monitoring and provisioning tools you rely on still work as they did before. With your one large MySQL instance, now split into multiple instances, you can actually use less expensive, smallersmaller available RDS hardware and continue to see better database performance. Conclusion Amazon RDS is a tremendous service, but it doesn’t offer solutions to scale beyond a single MySQL instance. Larger RDS instances get more expensive.  And when you max-out on the available hardware, you’re stuck.  Amazon recommends scaling out your single instance into multiple instances for transaction-intensive apps, but offers no services or guidance to help you. This is where ScaleBase comes in to save the day. It gives you a simple and effective way to create multiple MySQL RDS instances, while removing all the complexities typically caused by “DIY” sharding andwith no changes to your applications . With ScaleBase you continue to leverage the AWS/RDS ecosystem: commodity hardware and value added services like read replicas, multi A-Z, maintenance/updates and administration with monitoring tools and provisioning. SCALEBASE ON AMAZON If you’re curious to try ScaleBase on Amazon, it can be found here – Download NOW. Reference: Pinal Dave (http://blog.sqlauthority.com)Filed under: MySQL, PostADay, SQL, SQL Authority, SQL Optimization, SQL Performance, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

    Read the article

  • Of C# Iterators and Performance

    - by James Michael Hare
    Some of you reading this will be wondering, "what is an iterator" and think I'm locked in the world of C++.  Nope, I'm talking C# iterators.  No, not enumerators, iterators.   So, for those of you who do not know what iterators are in C#, I will explain it in summary, and for those of you who know what iterators are but are curious of the performance impacts, I will explore that as well.   Iterators have been around for a bit now, and there are still a bunch of people who don't know what they are or what they do.  I don't know how many times at work I've had a code review on my code and have someone ask me, "what's that yield word do?"   Basically, this post came to me as I was writing some extension methods to extend IEnumerable<T> -- I'll post some of the fun ones in a later post.  Since I was filtering the resulting list down, I was using the standard C# iterator concept; but that got me wondering: what are the performance implications of using an iterator versus returning a new enumeration?   So, to begin, let's look at a couple of methods.  This is a new (albeit contrived) method called Every(...).  The goal of this method is to access and enumeration and return every nth item in the enumeration (including the first).  So Every(2) would return items 0, 2, 4, 6, etc.   Now, if you wanted to write this in the traditional way, you may come up with something like this:       public static IEnumerable<T> Every<T>(this IEnumerable<T> list, int interval)     {         List<T> newList = new List<T>();         int count = 0;           foreach (var i in list)         {             if ((count++ % interval) == 0)             {                 newList.Add(i);             }         }           return newList;     }     So basically this method takes any IEnumerable<T> and returns a new IEnumerable<T> that contains every nth item.  Pretty straight forward.   The problem?  Well, Every<T>(...) will construct a list containing every nth item whether or not you care.  What happens if you were searching this result for a certain item and find that item after five tries?  You would have generated the rest of the list for nothing.   Enter iterators.  This C# construct uses the yield keyword to effectively defer evaluation of the next item until it is asked for.  This can be very handy if the evaluation itself is expensive or if there's a fair chance you'll never want to fully evaluate a list.   We see this all the time in Linq, where many expressions are chained together to do complex processing on a list.  This would be very expensive if each of these expressions evaluated their entire possible result set on call.    Let's look at the same example function, this time using an iterator:       public static IEnumerable<T> Every<T>(this IEnumerable<T> list, int interval)     {         int count = 0;         foreach (var i in list)         {             if ((count++ % interval) == 0)             {                 yield return i;             }         }     }   Notice it does not create a new return value explicitly, the only evidence of a return is the "yield return" statement.  What this means is that when an item is requested from the enumeration, it will enter this method and evaluate until it either hits a yield return (in which case that item is returned) or until it exits the method or hits a yield break (in which case the iteration ends.   Behind the scenes, this is all done with a class that the CLR creates behind the scenes that keeps track of the state of the iteration, so that every time the next item is asked for, it finds that item and then updates the current position so it knows where to start at next time.   It doesn't seem like a big deal, does it?  But keep in mind the key point here: it only returns items as they are requested. Thus if there's a good chance you will only process a portion of the return list and/or if the evaluation of each item is expensive, an iterator may be of benefit.   This is especially true if you intend your methods to be chainable similar to the way Linq methods can be chained.    For example, perhaps you have a List<int> and you want to take every tenth one until you find one greater than 10.  We could write that as:       List<int> someList = new List<int>();         // fill list here         someList.Every(10).TakeWhile(i => i <= 10);     Now is the difference more apparent?  If we use the first form of Every that makes a copy of the list.  It's going to copy the entire list whether we will need those items or not, that can be costly!    With the iterator version, however, it will only take items from the list until it finds one that is > 10, at which point no further items in the list are evaluated.   So, sounds neat eh?  But what's the cost is what you're probably wondering.  So I ran some tests using the two forms of Every above on lists varying from 5 to 500,000 integers and tried various things.    Now, iteration isn't free.  If you are more likely than not to iterate the entire collection every time, iterator has some very slight overhead:   Copy vs Iterator on 100% of Collection (10,000 iterations) Collection Size Num Iterated Type Total ms 5 5 Copy 5 5 5 Iterator 5 50 50 Copy 28 50 50 Iterator 27 500 500 Copy 227 500 500 Iterator 247 5000 5000 Copy 2266 5000 5000 Iterator 2444 50,000 50,000 Copy 24,443 50,000 50,000 Iterator 24,719 500,000 500,000 Copy 250,024 500,000 500,000 Iterator 251,521   Notice that when iterating over the entire produced list, the times for the iterator are a little better for smaller lists, then getting just a slight bit worse for larger lists.  In reality, given the number of items and iterations, the result is near negligible, but just to show that iterators come at a price.  However, it should also be noted that the form of Every that returns a copy will have a left-over collection to garbage collect.   However, if we only partially evaluate less and less through the list, the savings start to show and make it well worth the overhead.  Let's look at what happens if you stop looking after 80% of the list:   Copy vs Iterator on 80% of Collection (10,000 iterations) Collection Size Num Iterated Type Total ms 5 4 Copy 5 5 4 Iterator 5 50 40 Copy 27 50 40 Iterator 23 500 400 Copy 215 500 400 Iterator 200 5000 4000 Copy 2099 5000 4000 Iterator 1962 50,000 40,000 Copy 22,385 50,000 40,000 Iterator 19,599 500,000 400,000 Copy 236,427 500,000 400,000 Iterator 196,010       Notice that the iterator form is now operating quite a bit faster.  But the savings really add up if you stop on average at 50% (which most searches would typically do):     Copy vs Iterator on 50% of Collection (10,000 iterations) Collection Size Num Iterated Type Total ms 5 2 Copy 5 5 2 Iterator 4 50 25 Copy 25 50 25 Iterator 16 500 250 Copy 188 500 250 Iterator 126 5000 2500 Copy 1854 5000 2500 Iterator 1226 50,000 25,000 Copy 19,839 50,000 25,000 Iterator 12,233 500,000 250,000 Copy 208,667 500,000 250,000 Iterator 122,336   Now we see that if we only expect to go on average 50% into the results, we tend to shave off around 40% of the time.  And this is only for one level deep.  If we are using this in a chain of query expressions it only adds to the savings.   So my recommendation?  If you have a resonable expectation that someone may only want to partially consume your enumerable result, I would always tend to favor an iterator.  The cost if they iterate the whole thing does not add much at all -- and if they consume only partially, you reap some really good performance gains.   Next time I'll discuss some of my favorite extensions I've created to make development life a little easier and maintainability a little better.

    Read the article

  • Routing to a Controller with no View in Angular

    - by Rick Strahl
    Angular provides a nice routing, and controller to view model that makes it easy to create sophisticated JavaScript views fairly easily. But Angular's views are destroyed and re-rendered each time they are activated - what if you need to work with a persisted view that's too expensive to re-render? Here's how to build a headless controller that doesn't render a view through Angular, but rather manages the the view or markup manually.

    Read the article

< Previous Page | 3 4 5 6 7 8 9 10 11 12 13 14  | Next Page >