Search Results

Search found 294 results on 12 pages for 'algorithmic trading'.


  • What technical skills needed for algorithmic trading, HFT, etc?

    - by alchemical
    I'm interested in getting into developing trading systems, black box, HFT, etc. My primary experience is with C# and .Net (7 years). I've also done some sockets programming. I have some experience in finance working on analysis applications (2 years). My goal is to move into developing automated trading systems for a hedge fund, bank, etc. Is there any way to learn the skills needed for this without somehow getting the job first? I've looked at the open source tradelink, IB interactive brokerage, etc. I'm playing around with this framework, and may hook it up and do some paper trading. However, I'm not sure if this has much relationship with how a well-funded entity would be conducting a high-level automated trading operation. I.e. would the tools and frameworks they prefer be a totally different skill-set? Also wondering if I need to learn C++ and/or Java for these types of apps.

    Read the article

  • java+netbeans+mysql+ubuntu 64 bits LTS VS C#+MS SQL for fast develop trading system

    - by crunchor
    I am going to build a trading system that uses the Interactive Brokers API. On Windows the API supports C++ sockets, Java sockets, DDE and ActiveX; on Unix-like systems such as Ubuntu it supports the Java socket API and the POSIX C++ socket API. My trading system has to do some long real-time calculations and a lot of math for backtesting. I currently use Amibroker, a retail trading program written in C++, on 32-bit Windows XP; one serious backtest takes me days on my G620 Sandy Bridge CPU with 3 GB of RAM. So for my trading system I need 1. speed, 2. stability, 3. fast development. I have done some research: C++ is the fastest, but I am not good at it and development takes much longer. Other than C++, Java on Linux seems to have the best speed. I also did some research on databases, and it looks like MySQL has the best speed. MySQL and PostgreSQL are both very popular open source SQL databases; which one should I choose? I see MySQL now has Workbench, which looks similar to MS SQL Management Studio, so it looks like a good start. NetBeans seems to be the most popular Java IDE now, and its GUI designer seems to be as easy as Visual Studio's. I am not sure whether a GUI built with NetBeans would affect speed, or whether its GUI designer is really that good and easy to use. Ubuntu 64-bit LTS is stable and has good long-term and community support. I will buy a new computer if I can create a good trading system for live trading and backtesting, most likely an i7 or i5, depending on whether the i7 is really faster for my case. I mainly work with C# in my job; I know some Java but am not good at it. What would you recommend? Is there a better solution? This will be a big project, very likely a lifelong one for me, so I am doing serious research, including asking here, before I start and decide what to focus on. Thanks!

    Read the article

  • Algorithmic trading software safety guards

    - by Adal
    I'm working on an automatic trading system. What sorts of safe-guards should I have in place? The main idea I have is to have multiple pieces checking each other. I will have a second independent little process which will also connect to the same trading account and monitor simple things, like ensuring the total net position does not go over a certain limit, or that there are no more than N orders in 10 minutes for example, or more than M positions open simultaneously. You can also check that the actual open positions correspond to what the strategy process thinks it actually holds. As a bonus I could run this checker process on a different machine/network provider. Besides the checks in the main strategy, this will ensure that whatever weird bug occurs, nothing really bad can happen. Any other things I should monitor and be aware of?
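
    The independent checker process described above can be sketched roughly as follows. This is only an illustration in Python: the names RiskLimits and Watchdog, the idea of passing positions as symbol-to-quantity dictionaries, and treating "total net position" as the sum of signed quantities are all assumptions, not part of any broker API.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class RiskLimits:
    max_net_position: float   # cap on the absolute total net position
    max_orders: int           # N: orders allowed per time window
    window_seconds: float     # e.g. 600 for "10 minutes"
    max_open_positions: int   # M: positions open simultaneously


@dataclass
class Watchdog:
    limits: RiskLimits
    order_times: deque = field(default_factory=deque)

    def record_order(self, timestamp: float) -> None:
        """Remember when an order was seen on the account."""
        self.order_times.append(timestamp)

    def check(self, now, broker_positions, strategy_positions):
        """Return the names of violated rules (an empty list means all clear)."""
        # Forget orders that have fallen out of the rate window.
        while self.order_times and now - self.order_times[0] > self.limits.window_seconds:
            self.order_times.popleft()

        violations = []
        if abs(sum(broker_positions.values())) > self.limits.max_net_position:
            violations.append("net_position")
        if len(self.order_times) > self.limits.max_orders:
            violations.append("order_rate")
        open_count = sum(1 for qty in broker_positions.values() if qty != 0)
        if open_count > self.limits.max_open_positions:
            violations.append("open_positions")
        # Reconcile what the account holds with what the strategy believes it holds.
        symbols = set(broker_positions) | set(strategy_positions)
        if any(broker_positions.get(s, 0) != strategy_positions.get(s, 0) for s in symbols):
            violations.append("position_mismatch")
        return violations
```

    What happens on a violation (alerting, cancelling open orders, flattening positions) is deliberately left out, since that depends on the broker and on how much authority the checker is given.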

    Read the article

  • C++ : Avoid lot of boolean variable for multiple verification conditions in trading app

    - by Naveen
    Hi, I am a junior dev working on a trading app. We have an order refresh verification unit that has to verify order confirmations from the exchange. We send a bunch of different requests in bulk (NEW, MODIFY, CANCEL) to the exchange. Verification has to happen at most N times, at intervals of T, for all orders. If verification succeeds for all orders before N retries, fine; otherwise we need to flag the verification as unsuccessful. I wrote some basic code in a hurry, like below:

        for( N times )
        {
            for_each ( sent_request_order ) // SENT
            {
                1) get all the refreshed orders from DB or shared mem, i.e. REFRESHED
                2) find current sent order in REFRESHED
                if( not_found )
                    // not refreshed from exchange, continue to next order
                if( found )
                    case NEW :
                        // check for new status, mark verification done
                    case MODIFY :
                        // check for modified status;
                        // if not, mark pending, go to next order,
                        // revisit the same after T time
                    case CANCEL :
                        // check for cancelled status;
                        // if not, mark pending, go to next order,
                        // revisit the same after T time
            }
            if( all_verified )
                exit from verification.
            wait ( T sec )
        }

    order_verification_pending, order_verification_done, order_visited, order_not_visited, all_verified, all_not_verified ... a lot of boolean flags are used for indication. Is there a better approach for doing this, such as splitting responsibilities across classes? I know this is not a general question, but the flags are making this tedious to handle.
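
    One way to avoid the pile of booleans is to keep a single state value per order and derive "all verified" from those states. The sketch below shows the idea in Python rather than C++ (a C++ version would use an enum class in the same way). The function verify_orders, the EXPECTED_STATUS table and its status strings, and the fetch_refreshed callback are hypothetical names for illustration only.

```python
import time
from enum import Enum, auto


class Verification(Enum):
    NOT_VISITED = auto()   # order not yet seen in the refreshed set
    PENDING = auto()       # seen, but not yet in the expected status
    DONE = auto()          # confirmed by the exchange


# Status the refreshed order must show for each request type (assumed values).
EXPECTED_STATUS = {"NEW": "new", "MODIFY": "modified", "CANCEL": "cancelled"}


def verify_orders(sent, fetch_refreshed, max_tries, interval_sec, sleep=time.sleep):
    """sent maps order_id -> request type; fetch_refreshed() returns order_id -> status."""
    state = {order_id: Verification.NOT_VISITED for order_id in sent}
    for attempt in range(max_tries):
        refreshed = fetch_refreshed()
        for order_id, request_type in sent.items():
            if state[order_id] is Verification.DONE:
                continue                      # already confirmed earlier
            status = refreshed.get(order_id)
            if status is None:
                continue                      # not refreshed yet, try next round
            if status == EXPECTED_STATUS[request_type]:
                state[order_id] = Verification.DONE
            else:
                state[order_id] = Verification.PENDING
        # "all_verified" is derived from the states instead of being a separate flag.
        if all(s is Verification.DONE for s in state.values()):
            break
        if attempt < max_tries - 1:
            sleep(interval_sec)
    return state
```

    The caller can then treat the verification as unsuccessful if any returned state is not DONE, which replaces the all_verified / all_not_verified pair.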

    Read the article

  • Java, Massive message processing with queue manager (trading)

    - by Ronny
    Hello, I would like to design a simple application (without J2EE and JMS) that can process a massive number of messages (as in trading systems). I have created a service that receives messages and places them in a queue so that the system won't get stuck when overloaded. Then I created a service (QueueService) that wraps the queue and has a pop method that removes a message from the queue and returns it, or returns null if there are no messages; this method is marked "synchronized" for the next step. I have created a class that knows how to process a message (MessageHandler) and another class that "listens" for messages in a new thread (MessageListener). The thread runs a "while(true)" loop that constantly tries to pop a message. If a message is returned, the thread calls the MessageHandler, and when it is done it asks for another message. I have configured the application to start 10 MessageListeners so that messages can be processed concurrently, so I now have 10 threads that are looping all the time. Is that a good design? Can anyone point me to books or sites on how to handle such a scenario? Thanks, Ronny
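
    A common alternative to ten threads spinning on a pop that returns null is a blocking queue, where idle workers sleep until a message arrives. The sketch below illustrates this in Python with the standard queue and threading modules; in Java, java.util.concurrent.BlockingQueue plays a similar role. The function run_workers and the STOP sentinel are names made up for this example.

```python
import queue
import threading

STOP = object()  # sentinel that tells a worker to exit


def run_workers(messages, handler, worker_count=10):
    """Process messages with worker_count threads that block while the queue is empty."""
    q = queue.Queue()
    results = []
    results_lock = threading.Lock()

    def listen():
        while True:
            msg = q.get()            # blocks here; no busy loop while idle
            if msg is STOP:
                break
            outcome = handler(msg)
            with results_lock:
                results.append(outcome)

    threads = [threading.Thread(target=listen) for _ in range(worker_count)]
    for t in threads:
        t.start()
    for msg in messages:
        q.put(msg)
    for _ in threads:
        q.put(STOP)                  # one sentinel per worker
    for t in threads:
        t.join()
    return results
```

    Because q.get() blocks, the workers consume no CPU when there is nothing to process, and the queue itself handles the locking that the synchronized pop method was providing.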

    Read the article

  • What grid distributed computing frameworks are currently favoured for trading systems

    - by Rich
    There seem to be quite a few grid computing frameworks out there, but which ones are actually used to any great extent by the investment banks for low-latency distributed calculation? I'd be interested to hear answers covering Windows, Linux and cross-platform options. Also, what RPC mechanisms seem to be favoured most? I've heard that, for reasons of low latency and speed, the calculations themselves are quite often written in C++/C, as calculations running on VMs are several orders of magnitude slower than native code. Does this seem to be a common scenario in practice, e.g. a distributed .NET grid framework running calculations written in native C++/C?

    Read the article

  • Programming a trading strategy

    - by Rob
    Excuse me if I'm not descriptive enough, as I do not have much of a background when it comes to these things: How would I go about coding a primitive trading strategy and link it to some sort of artificial trading environment? Where do I start, and what are some other essential questions I should be asking? I am interested more in doing this because it interests me than making returns. Ideally it utilizes random/historical market data and doesn't actually execute any real trades. My background: I'm almost done my undergrad degree in computer science, and have had intro finance and economic courses. Familiar mostly with C and Java.

    Read the article

  • Advice for a getting a job in algorithmic trading - writing faster code

    - by Alex
    I am currently an intermediate Java developer working in the financial industry. I am considering trying to get into an algorithmic trading developer position. I am looking for any advice/resources that may help me obtain such a job. My naive initial thoughts are to concentrate on learning how to write faster, more memory efficient code whilst maintaining readability. Can anyone point me in the right direction of some useful resources for what I am aiming to achieve?

    Read the article

  • How to improve Algorithmic Programming Solving skill? [closed]

    - by gaurav
    Possible Duplicate: How can I improve my problem-solving ability? How do you improve your problem solving skills? Should I learn design patterns or algorithms to improve my logical thinking skills? What to do when you're faced with a problem that you can't solve quickly? Are there non-programming related activities akin to solving programming problems? I am a computer engineering graduate. I have studied programming for three years, and I am good at coding. I have been competing in algorithmic competitions on sites such as TopCoder and SPOJ for a year and a half, but I am still unable to solve anything beyond the easiest problems. People tell me that it takes practice to solve such problems. I try to solve them, but sometimes I cannot understand the problem, and even when I do understand it I cannot think of a good algorithm for it. Even when I solve one I get "Wrong Answer", and I cannot figure out what is wrong with my code, since it works on the samples given on the site but fails on test cases that are not provided. I really want to solve these problems and become good at algorithms. I have read algorithm books such as Introduction to Algorithms (CLRS) and practiced programming questions. I have gone through some existing questions, but they don't answer this one. I have seen the questions that are marked as duplicates, but they focus on programming in general, whereas I am asking about algorithmic programming for competitions, where you solve a problem statement and an online judge automatically evaluates your solution. That type of programming is quite different from the type those questions discuss.

    Read the article

  • Microeconomical simulation: coordination/planning between self-interested trading agents

    - by Milton Manfried
    In a typical perfect-information strategy game like Chess, an agent can calculate its best move by searching the state tree for the best possible move, while assuming that the opponent will also make the best possible move (i.e. Mini-max). I would like to use this approach in a "game" modeling economic activity, where the possible "moves" would be to buy or sell for a given price, and the goal, rather than a specific class of states (e.g. Checkmate), would be to maximize some function F of the agent's state (e.g. F(money, widget) = 10*money + widget).

    How to handle buy/sell actions that require coordination between both parties, at the very least agreement upon a price? The cheap way out would be to set the price beforehand, maybe based upon the current supply -- but the idea of this simulation is to examine how prices emerge when freely determined by "perfectly rational" agents.

    A great example of what I do not want is the trading algorithm in SugarScape -- paraphrasing from Growing Artificial Societies p101-102: when a pair of agents interact to trade, they each compute their internal valuations of the goods, then a bargaining process is conducted and a price is agreed to. If this price makes both agents better off, they complete the transaction.

    The protocol itself is beautiful, but what it cannot capture (as far as I can tell) is the ability for an agent to pay more than it might otherwise for a good, because it knows that it can sell it for even more at a later date -- what appears to be called "strategic thinking" in this paper at Google Books: Multi-Agent-Based Simulation III: 4th International Workshop, MABS 2003... To get realistic behavior like that, it seems one would either (1) have to build an outrageously-complex internal valuation system which could at best only cover situations that were planned for at compile-time, or otherwise (2) have some mechanism to search the state tree... which would require some way of planning future trades.

    Note: The chess analogy only works as far as the state-space search goes; the simulation isn't intended to be "zero sum", so a literal mini-max search wouldn't be appropriate -- and ideally, it should work with more than two agents.
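
    For reference, the myopic pairwise protocol paraphrased above (each agent values the good, a price is agreed, and the trade happens only if both are better off) can be sketched as below. The Agent class, the per-agent widget_value weight and the midpoint pricing rule are assumptions made for this illustration; they are not taken from SugarScape or from the post. Each agent gets its own valuation because, if both shared the exact F from the post, no price could leave both strictly better off.

```python
from dataclasses import dataclass


@dataclass
class Agent:
    money: float
    widgets: int
    widget_value: float  # hypothetical: how much money this agent thinks a widget is worth

    def utility(self, money=None, widgets=None):
        """Linear utility of a (money, widgets) state; defaults to the current state."""
        money = self.money if money is None else money
        widgets = self.widgets if widgets is None else widgets
        return money + self.widget_value * widgets


def try_trade(buyer, seller):
    """Move one widget from seller to buyer if both are strictly better off.

    Returns the agreed price, or None if no trade takes place.
    """
    if seller.widgets < 1:
        return None
    # Assumed bargaining outcome: split the difference between the valuations.
    price = (buyer.widget_value + seller.widget_value) / 2
    if buyer.money < price:
        return None
    buyer_gain = buyer.utility(buyer.money - price, buyer.widgets + 1) - buyer.utility()
    seller_gain = seller.utility(seller.money + price, seller.widgets - 1) - seller.utility()
    if buyer_gain > 0 and seller_gain > 0:
        buyer.money -= price
        buyer.widgets += 1
        seller.money += price
        seller.widgets -= 1
        return price
    return None
```

    The rule only looks at the immediate change in utility, which is exactly the limitation raised in the question: an agent here never pays above its current valuation in anticipation of a later resale.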

    Read the article

  • reference list for non-IT driven algorithmic patterns

    - by Quicker
    I am looking for a reference list of algorithmic patterns that are not IT-driven (but that can still benefit from an IT implementation). An example list would be:

    name; short desc; reference
    Travelling Salesman; find the shortest possible route on a multiple target path; http://en.wikipedia.org/wiki/Travelling_salesman_problem
    Resource Disposition (aka Regulation); distribute a limited/exceeding input over a given number of output receivers based on distribution rules; http://database-programmer.blogspot.de/2010/12/critical-analysis-of-algorithm-sproc.html

    If there is no such list, but you instantly think of something specific, please 'put it on the desk'. Maybe I can compile something out of the input I get here (actually I am very frustrated, as I did not find any such list through my own research).

    Details on scoping: I found it very hard to formulate what I want in a way that excludes everything I do not need (which may be why I did not find anything on Google). There is a database-centric definition of what I am looking for in the section 'Processes' of the second example link. That somewhat fits, but the database focus drifts away from the pattern thinking I have in mind. So here are my own thoughts on what's in and what's out:

      - I am NOT looking for a list of foundational algorithms that are implemented as the basis of any programming language. E.g. the PHP reference describes substr and strlen. Those implement algos, but are not what I am looking for.
      - The problem the algo addresses would exist even if there were no computers (or other IT components).
      - The main focus of the algo is NOT to help other algos.
      - Chances are high that implementations of the solution, or workarounds, exist somewhere in the world without any IT support.
      - However, the algo could beneficially be implemented/fully supported by a software application. That is: the problem has to be addressed anyway, but running an implementation in software automates the process (that is why I posted it on stackoverflow and not somewhere else).
      - Typically such algo implementations have more than one input field value and more than one output field value, which implies they could not be implemented as a simple function (which is fixed to produce not more than one output value).
      - In a normalized data model, the outputs of such an implementation often span multiple rows (sometimes multiple tables), and the number of output rows depends on the input parameters and on the rows in the table(s) at start time, which implies that any implementation/procedure must interact with a database (read and/or write).
      - I am mainly looking for patterns, not for specific implementations. Example: the Travelling Salesman assumes arbitrary coordinates; it does not say "you need a table targets with fields x and y". Sometimes descriptions focus heavily on examples with specific implementations. No worries, as long as the pattern is clear.

    Read the article

  • Why does derivative trading position always require C++ knowledge?

    - by Jeffrey
    I’ve never worked in a trading environment before, and I was curious to see that a few of the trading houses seem to use C# but most of them rely heavily on C++. Why is that? Is it because C++ performs better? Is it because of a legacy code base? Is it because of cross-platform issues? What about dynamic languages (Ruby, Python)? Are they too slow for this kind of work? Update: if reliability and performance are important, would Erlang be the "next big thing" in trading platforms?

    Read the article

  • Option Trading: Getting the most out of the event session options

    - by extended_events
    You can control different aspects of how an event session behaves by setting the event session options as part of the CREATE EVENT SESSION DDL. The default settings for the event session options are designed to handle most of the common event collection situations, so I generally recommend that you just use the defaults. Like everything in the real world though, there are going to be a handful of “special cases” that require something different. This post focuses on identifying the special cases and the correct use of the options to accommodate those cases.

    There is a reason it’s called Default

    The default session options specify a total event buffer size of 4 MB with a 30 second latency. In human terms, this means that by default the system will start processing events from the event buffer when we reach about 1.3 MB of events or after 30 seconds, whichever comes first.

    Aside: What’s up with the 1.3 MB, I thought you said the buffer was 4 MB? The Extended Events engine takes the total buffer size specified by MAX_MEMORY (4 MB by default) and divides it into 3 equally sized buffers. This is done so that a session can be publishing events to one buffer while other buffers are being processed. There are always at least three buffers; how to get more than three is covered later.

    Using this configuration, the Extended Events engine can “keep up” with most event sessions on standard workloads. Why is this? The fact is that most events are small, really small; on the order of a couple hundred bytes. Even when you start considering events that carry dynamically sized data (e.g. binary, text, etc.) or adding actions that collect additional data, the total size of the event is still likely to be pretty small. This means that each buffer can likely hold thousands of events before it has to be processed.
    When the event buffers are finally processed there is an economy of scale, since most targets support bulk processing of events, so events are processed at the buffer level rather than the individual event level. When all this is working together, it’s more likely that a full buffer will be processed and put back into the ready queue before the remaining buffers (remember, there are at least three) are full.

    I know what you’re going to say: “My server is exceptional! My workload is so massive it defies categorization!” OK, maybe you weren’t going to say that exactly, but you were probably thinking it. The point is that there are situations that won’t be covered by the Default, but that’s a good place to start, and this post assumes you’ve started there so that you have something to look at in order to determine if you do have a special case that needs different settings. So let’s get to the special cases…

    What event just fired?! How about now?! Now?!

    If you believe the commercial adage from Heinz Ketchup (Heinz Slow Good Ketchup ad on YouTube), some things are worth the wait. This is not a belief held by most DBAs, particularly DBAs who are looking for an answer to a troubleshooting question fast. If you’re one of these anxious DBAs, or maybe just a Program Manager doing a demo, then 30 seconds might be longer than you’re comfortable waiting. If you find yourself in this situation then consider changing the MAX_DISPATCH_LATENCY option for your event session. This option will force the event buffers to be processed based on your time schedule. This option only makes sense for the asynchronous targets, since those are the ones where we allow events to build up in the event buffer – if you’re using one of the synchronous targets this option isn’t relevant.

    Avoid forgotten events by increasing your memory

    Have you ever had one of those days where you keep forgetting things? That can happen in Extended Events too; we call it dropped events.
    In order to optimize for server performance and help ensure that Extended Events doesn’t block the server, the default behavior is to drop events that can’t be published to a buffer because the buffer is full. You can determine if events are being dropped from a session by querying the dm_xe_sessions DMV and looking at the dropped_event_count field.

    Aside: Should you care if you’re dropping events? Maybe not – think about why you’re collecting data in the first place and whether you’re really going to miss a few dropped events. For example, if you’re collecting query duration stats over thousands of executions of a query, it won’t make a huge difference to miss a couple of executions. Use your best judgment.

    If you find that your session is dropping events, it means that the event buffer is not large enough to handle the volume of events that are being published. There are two ways to address this problem. First, you could collect fewer events – examine your session to see if you are over collecting. Do you need all the actions you’ve specified? Could you apply a predicate to be more specific about when you fire the event? Assuming the session is defined correctly, the next option is to change the MAX_MEMORY option to a larger number. Picking the right event buffer size might take some trial and error, but a good place to start is with the number of dropped events compared to the number you’ve collected.

    Aside: There are three different behaviors for dropping events that you specify using the EVENT_RETENTION_MODE option. The default is to allow single event loss, and you should stick with this setting since it is the best choice for keeping the impact on server performance low. You’ll be tempted to use the setting to not lose any events (NO_EVENT_LOSS) – resist this urge, since it can result in blocking on the server. If you’re worried that you’re losing events, you should be increasing your event buffer memory as described in this section.

    Some events are too big to fail

    A less common reason for dropping an event is when an event is so large that it can’t fit into the event buffer. Even though most events are going to be small, you might find a condition that occasionally generates a very large event. You can determine if your session is dropping large events by looking at the dm_xe_sessions DMV once again; this time check the largest_event_dropped_size. If this value is larger than the size of your event buffer [remember, the size of your event buffer, by default, is max_memory / 3] then you need a large event buffer. To specify a large event buffer you set the MAX_EVENT_SIZE option to a value large enough to fit the largest event dropped, based on data from the DMV. When you set this option the Extended Events engine will create two buffers of this size to accommodate these large events. As an added bonus (no extra charge) the large event buffer will also be used to store normal events in the cases where the normal event buffers are all full and waiting to be processed. (Note: This is just a side-effect, not the intended use. If you’re dropping many normal events then you should increase your normal event buffer size.)

    Partitioning: moving your events to a sub-division

    Earlier I alluded to the fact that you can configure your event session to use more than the standard three event buffers – this is called partitioning and is controlled by the MEMORY_PARTITION_MODE option. The result of setting this option is fairly easy to explain, but knowing when to use it is a bit more art than science. First the science… You can configure partitioning in three ways: None, Per NUMA Node & Per CPU. This specifies the location where sets of event buffers are created, with fairly obvious implications.
    There are rules we follow for sub-dividing the total memory (specified by MAX_MEMORY) between all the event buffers, and they are specific to the mode used:

      - None: 3 buffers (fixed)
      - Node: 3 * number_of_nodes
      - CPU: 2.5 * number_of_cpus

    Here are some examples of what this means for different Node/CPU counts:

    Configuration        None         Node          CPU
    2 CPUs, 1 Node       3 buffers    3 buffers     5 buffers
    6 CPUs, 2 Nodes      3 buffers    6 buffers     15 buffers
    40 CPUs, 5 Nodes     3 buffers    15 buffers    100 buffers

    Aside: Buffer size on multi-processor computers. As the number of Nodes or CPUs increases, the size of the event buffer gets smaller because the total memory is sub-divided into more pieces. The defaults will hold up to this for a while, since each buffer set is holding events only from the Node or CPU that it is associated with, but at some point the buffers will get too small and you’ll either see events being dropped or you’ll get an error when you create your session because you’re below the minimum buffer size. Increase the MAX_MEMORY setting to an appropriate number for the configuration.

    The most likely reason to start partitioning is going to be related to performance. If you notice that running an event session is impacting the performance of your server beyond a reasonably expected level [Yes, there is a reasonably expected level of work required to collect events.] then partitioning might be an answer. Before you partition you might want to check a few other things:

      - Is your event retention set to NO_EVENT_LOSS and causing blocking? (I told you not to do this.) Consider changing your event loss mode or increasing memory.
      - Are you over collecting and causing more work than necessary? Consider adding predicates to events or removing unnecessary events and actions from your session.
      - Are you writing the file target to the same slow disk that you use for TempDB and your other high activity databases? <kidding> <not really> It’s always worth considering the end to end picture – if you’re writing events to a file you can be impacted by I/O, network; all the usual stuff.

    Assuming you’ve ruled out the obvious (and not so obvious) issues, there are performance conditions that will be addressed by partitioning. For example, it’s possible to have a successful event session (e.g. no dropped events) but still see a performance impact because you have many CPUs all attempting to write to the same free buffer and having to wait in line to finish their work. This is a case where partitioning would relieve the contention between the different CPUs and likely reduce the performance impact caused by the event session. There is no DMV you can check to find these conditions – sorry – that’s where the art comes in. This is largely a matter of experimentation. On the bright side, you probably won’t need to worry about this level of detail all that often. The performance impact of Extended Events is significantly lower than what you may be used to with SQL Trace. You will likely only care about the impact if you are trying to set up a long running event session that will be part of your everyday workload – sessions used for short term troubleshooting will likely fall into the “reasonably expected impact” category.

    Hey buddy – I think you forgot something

    OK, there are two options I didn’t cover: STARTUP_STATE & TRACK_CAUSALITY. If you want your event sessions to start automatically when the server starts, set the STARTUP_STATE option to ON. (Now there is only one option I didn’t cover.) I’m going to leave causality for another post since it’s not really related to session behavior, it’s more about event analysis.

    - Mike
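
    The buffer arithmetic from the post (three buffers by default, 3 per node, 2.5 per CPU, with MAX_MEMORY split between them) can be checked with a small calculation. This is only a sketch of the stated rules in Python: the function names are made up, the even split across buffers is an assumption that extends the "3 equally sized buffers" statement to the partitioned modes, and truncating 2.5 * CPUs to a whole number is a guess for odd CPU counts that the post does not cover.

```python
def buffer_layout(max_memory_kb, mode="NONE", cpus=1, nodes=1):
    """Return (buffer_count, approx_kb_per_buffer) for a partition mode."""
    if mode == "NONE":
        count = 3
    elif mode == "PER_NODE":
        count = 3 * nodes
    elif mode == "PER_CPU":
        count = int(2.5 * cpus)   # assumption: truncate for odd CPU counts
    else:
        raise ValueError(f"unknown partition mode: {mode}")
    # Assumption: the total memory is divided evenly between all buffers.
    return count, max_memory_kb / count


def needs_large_event_buffer(largest_event_dropped_kb, max_memory_kb, buffer_count=3):
    """True when the largest dropped event cannot fit in one regular buffer."""
    return largest_event_dropped_kb > max_memory_kb / buffer_count


if __name__ == "__main__":
    for cpus, nodes in [(2, 1), (6, 2), (40, 5)]:
        counts = [buffer_layout(4096, m, cpus, nodes)[0]
                  for m in ("NONE", "PER_NODE", "PER_CPU")]
        print(cpus, "CPUs,", nodes, "nodes:", counts)
```

    With the default 4 MB, the 40 CPU per-CPU case leaves roughly 41 KB per buffer under these assumptions, which illustrates why the post suggests raising MAX_MEMORY as the buffer count grows.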

    Read the article

  • Option Trading: Getting the most out of the event session options

    - by extended_events
    You can control different aspects of how an event session behaves by setting the event session options as part of the CREATE EVENT SESSION DDL. The default settings for the event session options are designed to handle most of the common event collection situations so I generally recommend that you just use the defaults. Like everything in the real world though, there are going to be a handful of “special cases” that require something different. This post focuses on identifying the special cases and the correct use of the options to accommodate those cases. There is a reason it’s called Default The default session options specify a total event buffer size of 4 MB with a 30 second latency. Translating this into human terms; this means that our default behavior is that the system will start processing events from the event buffer when we reach about 1.3 MB of events or after 30 seconds, which ever comes first. Aside: What’s up with the 1.3 MB, I thought you said the buffer was 4 MB?The Extended Events engine takes the total buffer size specified by MAX_MEMORY (4MB by default) and divides it into 3 equally sized buffers. This is done so that a session can be publishing events to one buffer while other buffers are being processed. There are always at least three buffers; how to get more than three is covered later. Using this configuration, the Extended Events engine can “keep up” with most event sessions on standard workloads. Why is this? The fact is that most events are small, really small; on the order of a couple hundred bytes. Even when you start considering events that carry dynamically sized data (eg. binary, text, etc.) or adding actions that collect additional data, the total size of the event is still likely to be pretty small. This means that each buffer can likely hold thousands of events before it has to be processed. 
When the event buffers are finally processed there is an economy of scale achieved since most targets support bulk processing of the events so they are processed at the buffer level rather than the individual event level. When all this is working together it’s more likely that a full buffer will be processed and put back into the ready queue before the remaining buffers (remember, there are at least three) are full. I know what you’re going to say: “My server is exceptional! My workload is so massive it defies categorization!” OK, maybe you weren’t going to say that exactly, but you were probably thinking it. The point is that there are situations that won’t be covered by the Default, but that’s a good place to start and this post assumes you’ve started there so that you have something to look at in order to determine if you do have a special case that needs different settings. So let’s get to the special cases… What event just fired?! How about now?! Now?! If you believe the commercial adage from Heinz Ketchup (Heinz Slow Good Ketchup ad on You Tube), some things are worth the wait. This is not a belief held by most DBAs, particularly DBAs who are looking for an answer to a troubleshooting question fast. If you’re one of these anxious DBAs, or maybe just a Program Manager doing a demo, then 30 seconds might be longer than you’re comfortable waiting. If you find yourself in this situation then consider changing the MAX_DISPATCH_LATENCY option for your event session. This option will force the event buffers to be processed based on your time schedule. This option only makes sense for the asynchronous targets since those are the ones where we allow events to build up in the event buffer – if you’re using one of the synchronous targets this option isn’t relevant. Avoid forgotten events by increasing your memory Have you ever had one of those days where you keep forgetting things? That can happen in Extended Events too; we call it dropped events. 
In order to optimize for server performance and help ensure that Extended Events doesn’t block the server, the default behavior is to drop events that can’t be published to a buffer because the buffer is full. You can determine if events are being dropped from a session by querying the dm_xe_sessions DMV and looking at the dropped_event_count field.

Aside: Should you care if you’re dropping events? Maybe not – think about why you’re collecting data in the first place and whether you’re really going to miss a few dropped events. For example, if you’re collecting query duration stats over thousands of executions of a query it won’t make a huge difference to miss a couple executions. Use your best judgment.

If you find that your session is dropping events it means that the event buffer is not large enough to handle the volume of events that are being published. There are two ways to address this problem. First, you could collect fewer events – examine your session to see if you are over collecting. Do you need all the actions you’ve specified? Could you apply a predicate to be more specific about when you fire the event? Assuming the session is defined correctly, the next option is to change the MAX_MEMORY option to a larger number. Picking the right event buffer size might take some trial and error, but a good place to start is with the number of dropped events compared to the number you’ve collected.

Aside: There are three different behaviors for dropping events that you specify using the EVENT_RETENTION_MODE option. The default is to allow single event loss and you should stick with this setting since it is the best choice for keeping the impact on server performance low. You’ll be tempted to use the setting to not lose any events (NO_EVENT_LOSS) – resist this urge since it can result in blocking on the server. If you’re worried that you’re losing events you should be increasing your event buffer memory as described in this section.
Some events are too big to fail

A less common reason for dropping an event is when an event is so large that it can’t fit into the event buffer. Even though most events are going to be small, you might find a condition that occasionally generates a very large event. You can determine if your session is dropping large events by looking at the dm_xe_sessions DMV once again, this time check the largest_event_dropped_size. If this value is larger than the size of your event buffer [remember, the size of your event buffer, by default, is max_memory / 3] then you need a large event buffer. To specify a large event buffer you set the MAX_EVENT_SIZE option to a value large enough to fit the largest event dropped based on data from the DMV. When you set this option the Extended Events engine will create two buffers of this size to accommodate these large events. As an added bonus (no extra charge) the large event buffer will also be used to store normal events in the cases where the normal event buffers are all full and waiting to be processed. (Note: This is just a side-effect, not the intended use. If you’re dropping many normal events then you should increase your normal event buffer size.)

Partitioning: moving your events to a sub-division

Earlier I alluded to the fact that you can configure your event session to use more than the standard three event buffers – this is called partitioning and is controlled by the MEMORY_PARTITION_MODE option. The result of setting this option is fairly easy to explain, but knowing when to use it is a bit more art than science. First the science…

You can configure partitioning in three ways: None, Per NUMA Node & Per CPU. This specifies the location where sets of event buffers are created with fairly obvious implication.
There are rules we follow for sub-dividing the total memory (specified by MAX_MEMORY) between all the event buffers that are specific to the mode used:

None: 3 buffers (fixed)
Node: 3 * number_of_nodes
CPU: 2.5 * number_of_cpus

Here are some examples of what this means for different Node/CPU counts:

Configuration       None        Node         CPU
2 CPUs, 1 Node      3 buffers   3 buffers    5 buffers
6 CPUs, 2 Nodes     3 buffers   6 buffers    15 buffers
40 CPUs, 5 Nodes    3 buffers   15 buffers   100 buffers

Aside: Buffer size on multi-processor computers. As the number of Nodes or CPUs increases, the size of the event buffer gets smaller because the total memory is sub-divided into more pieces. The defaults will hold up to this for a while since each buffer set is holding events only from the Node or CPU that it is associated with, but at some point the buffers will get too small and you’ll either see events being dropped or you’ll get an error when you create your session because you’re below the minimum buffer size. Increase the MAX_MEMORY setting to an appropriate number for the configuration.

The most likely reason to start partitioning is going to be related to performance. If you notice that running an event session is impacting the performance of your server beyond a reasonably expected level [Yes, there is a reasonably expected level of work required to collect events.] then partitioning might be an answer. Before you partition you might want to check a few other things: Is your event retention set to NO_EVENT_LOSS and causing blocking? (I told you not to do this.) Consider changing your event loss mode or increasing memory. Are you over collecting and causing more work than necessary? Consider adding predicates to events or removing unnecessary events and actions from your session. Are you writing the file target to the same slow disk that you use for TempDB and your other high activity databases?
<kidding> <not really> It’s always worth considering the end to end picture – if you’re writing events to a file you can be impacted by I/O, network; all the usual stuff.

Assuming you’ve ruled out the obvious (and not so obvious) issues, there are performance conditions that will be addressed by partitioning. For example, it’s possible to have a successful event session (e.g. no dropped events) but still see a performance impact because you have many CPUs all attempting to write to the same free buffer and having to wait in line to finish their work. This is a case where partitioning would relieve the contention between the different CPUs and likely reduce the performance impact caused by the event session. There is no DMV you can check to find these conditions – sorry – that’s where the art comes in. This is largely a matter of experimentation. On the bright side you probably won’t need to worry about this level of detail all that often. The performance impact of Extended Events is significantly lower than what you may be used to with SQL Trace. You will likely only care about the impact if you are trying to set up a long running event session that will be part of your everyday workload – sessions used for short term troubleshooting will likely fall into the “reasonably expected impact” category.

Hey buddy – I think you forgot something

OK, there are two options I didn’t cover: STARTUP_STATE & TRACK_CAUSALITY. If you want your event sessions to start automatically when the server starts, set the STARTUP_STATE option to ON. (Now there is only one option I didn’t cover.) I’m going to leave causality for another post since it’s not really related to session behavior, it’s more about event analysis.

- Mike
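
The buffer arithmetic behind the partitioning rules can be sketched in a few lines of Python. This is only an illustration and not part of the original post: the function names are hypothetical, the mode strings mirror the MEMORY_PARTITION_MODE values NONE, PER_NODE and PER_CPU, the even split of MAX_MEMORY across all buffers is an assumption, and so is rounding 2.5 * CPUs down for odd CPU counts.

```python
def buffer_count(mode: str, cpus: int, nodes: int) -> int:
    """Number of event buffers for a partition mode, using the rules quoted in the post."""
    if mode == "NONE":
        return 3                    # fixed, regardless of hardware
    if mode == "PER_NODE":
        return 3 * nodes
    if mode == "PER_CPU":
        return (5 * cpus) // 2      # 2.5 * cpus; rounding down is an assumption
    raise ValueError(f"unknown partition mode: {mode}")


def buffer_size_kb(max_memory_kb: int, mode: str, cpus: int, nodes: int) -> float:
    """Approximate size of one buffer, assuming MAX_MEMORY is split evenly."""
    return max_memory_kb / buffer_count(mode, cpus, nodes)


if __name__ == "__main__":
    # Reproduce the example configurations listed in the post.
    for cpus, nodes in [(2, 1), (6, 2), (40, 5)]:
        counts = [buffer_count(m, cpus, nodes) for m in ("NONE", "PER_NODE", "PER_CPU")]
        print(f"{cpus} CPUs, {nodes} node(s): {counts}")
```

Under these assumptions, the default 4 MB (4096 KB) split three ways gives roughly 1365 KB per buffer, which matches the "about 1.3 MB" figure, while per-CPU partitioning on 40 CPUs would leave only about 41 KB per buffer. That is the situation in which raising MAX_MEMORY becomes necessary.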

    Read the article

  • Help with algorithmic complexity in custom merge sort implementation

    - by bitcycle
    I've got an implementation of the merge sort in C++ using a custom doubly linked list. I'm coming up with a big O complexity of n^2, based on the merge_sort() slice operation. But, from what I've read, this algorithm should be n*log(n), where the log has a base of two. Can someone help me determine if I'm just determining the complexity incorrectly, or if the implementation can/should be improved to achieve n*log(n) complexity? If you would like some background on my goals for this project, see my blog. I've added comments in the code outlining what I understand the complexity of each method to be. Clarification - I'm focusing on the C++ implementation with this question. I've got another implementation written in Python, but that was something that was added in addition to my original goal(s).
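
For reference, an O(n) split at each call does not by itself push merge sort to n^2: the recurrence T(n) = 2T(n/2) + O(n) still solves to O(n log n), because each of the log2(n) levels of recursion does O(n) total work. The sketch below is a hypothetical illustration in Python, using a singly linked `Node` class rather than the doubly linked C++ list from the question (which is not shown here). It shows a split and a merge that are each linear per call.

```python
class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next


def split(head):
    """Cut the list roughly in half in one O(n) pass; return the head of the second half."""
    slow, fast = head, head.next
    while fast and fast.next:
        slow = slow.next
        fast = fast.next.next
    second = slow.next
    slow.next = None
    return second


def merge(a, b):
    """Merge two sorted lists in O(len(a) + len(b)) by relinking nodes."""
    dummy = tail = Node(None)
    while a and b:
        if a.value <= b.value:
            tail.next, a = a, a.next
        else:
            tail.next, b = b, b.next
        tail = tail.next
    tail.next = a or b
    return dummy.next


def merge_sort(head):
    """T(n) = 2T(n/2) + O(n), i.e. O(n log n) overall."""
    if head is None or head.next is None:
        return head
    second = split(head)
    return merge(merge_sort(head), merge_sort(second))


def from_list(values):
    head = None
    for v in reversed(values):
        head = Node(v, head)
    return head


def to_list(head):
    out = []
    while head:
        out.append(head.value)
        head = head.next
    return out
```

If an implementation really does measure as quadratic, one thing to check is whether some step costs O(n) per element rather than per call, for example an indexed lookup that walks the list from the head inside the merge loop.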

    Read the article

  • What are algorithmic paradigms?

    - by Vaibhav Agarwal
    We generally talk about paradigms of programming as functional, procedural, object oriented, imperative, etc., but what should I reply when I am asked about the paradigms of algorithms? For example, are Travelling Salesman Problem, Dijkstra's Shortest Path Algorithm, Euclid's GCD Algorithm, Binary Search, Kruskal's Minimum Spanning Tree, and Tower of Hanoi paradigms of algorithms? Or should I answer with the data structures I would use to design these algorithms?

    Read the article

  • Can anyone help solve this complex algorithmic problem?

    - by Locaaaaa
    I got this question in an interview and I was not able to solve it. You have a circular road with N gas stations. You know the amount of gas that each station has. You know the amount of gas you need to go from one station to the next one. Your car starts with 0. The question is: create an algorithm to determine from which gas station you must start driving. As an exercise for myself, I would then translate the algorithm to C#.
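
One common way to approach this puzzle is a single greedy pass, sketched here in Python rather than C#. The function name and the input layout are assumptions made for illustration: `gas[i]` is the fuel available at station i, `cost[i]` is the fuel needed to drive from station i to the next one, and the tank is taken to have unlimited capacity.

```python
from typing import List, Optional


def find_start(gas: List[int], cost: List[int]) -> Optional[int]:
    """Return a station index from which the full circuit can be driven, or None."""
    if not gas or len(gas) != len(cost):
        return None
    # If the total fuel is less than the total cost, no starting point can work.
    if sum(gas) < sum(cost):
        return None

    start = 0
    tank = 0
    for i in range(len(gas)):
        tank += gas[i] - cost[i]
        if tank < 0:
            # Starting anywhere in start..i runs dry by station i,
            # so the next candidate is the station after i.
            start = i + 1
            tank = 0
    return start
```

The pass is O(N). The key observation is that if a run beginning at `start` first goes negative at station i, every station between `start` and i is also ruled out, since each of them would be reached with a non-negative tank and still fail.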

    Read the article

  • Algorithmic Forecasting and Pattern Recognition

    - by Ryan King
    Say a user could enter project data into my software. Each project has two variables, "size" and "work", and they're related, but the relationship is not known. Is there a way to programmatically determine the relationship between the variables based on previous data, and to forecast the amount of work given only the size of a future project? For example, say the user had manually entered the following projects: Project 1 - Size: 1, Work: 4; Project 2 - Size: 2, Work: 7; Project 3 - Size: 3, Work: 10; Project 4 - Size: 4, Work: x. What should I look into to be able to programmatically determine that Work = Size*3+1 and therefore be able to say that x = 13?
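
If the relationship is assumed to be linear, ordinary least squares (simple linear regression) recovers it from the entered projects. The sketch below is a minimal pure-Python illustration; the function names are hypothetical, and real data would rarely fit a line as exactly as the example in the question does.

```python
from typing import List, Tuple


def fit_line(sizes: List[float], works: List[float]) -> Tuple[float, float]:
    """Least-squares fit of work = slope * size + intercept."""
    n = len(sizes)
    mean_x = sum(sizes) / n
    mean_y = sum(works) / n
    sxx = sum((x - mean_x) ** 2 for x in sizes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, works))
    slope = sxy / sxx              # requires at least two distinct sizes
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(size: float, slope: float, intercept: float) -> float:
    """Forecast work for a new project size using the fitted line."""
    return slope * size + intercept


if __name__ == "__main__":
    slope, intercept = fit_line([1, 2, 3], [4, 7, 10])
    print(slope, intercept, predict(4, slope, intercept))
```

For the three projects in the question this yields a slope of 3 and an intercept of 1, so the forecast for size 4 is 13. If the data turns out not to be linear, the same idea extends to polynomial or other regression models.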

    Read the article

  • Algorithmic problem - quickly finding all #'s where value %x is some given value

    - by Steve B.
    Problem I'm trying to solve, apologies in advance for the length: Given a large number of stored records, each with a unique (String) field S, I'd like to be able to find through an indexed query all records where Hash(S) % N == K for any arbitrary N, K (e.g. given a million strings, find all strings where HashCode(s) % 17 == 5). Is there some way of memoizing this so that we can quickly answer any question of this form without doing the % on every value?

The motivation for this is a system of N distributed nodes, where each record has to be assigned to at least one node. The nodes are numbered 0 to (N-1), and each node has to load up all of the records that match its number. If we have 3 nodes:

Node 0 loads all records where Hash % 3 == 0
Node 1 loads all records where Hash % 3 == 1
Node 2 loads all records where Hash % 3 == 2

Adding a 4th node, obviously all the assignments have to be recomputed: Node 0 loads all records where Hash % 4 == 0, etc. I'd like to easily find these records through an indexed query without having to compute the mod individually.

The best I've been able to come up with so far: take the prime factors of N (p1 * p2 * ...). If Hash % N == K, then Hash % p == K % p for each of N's prime factors p. E.g. with 10 nodes: if Hash % 10 == 6, then Hash % 2 == 0 == 6 % 2 and Hash % 5 == 1 == 6 % 5. So storing an array of the hash modulo the first "reasonable" number of primes for my data set should be helpful. For example, we store the hash and the residues:

HASH   PRIMES (array of [%2, %3, %5, %7, ...])
16     [0 1 1 2 ...]

So looking for Hash % 10 == 6 is equivalent to looking for all values where array[0] == 0 and array[2] == 1. However, this breaks at the first prime larger than the highest number I'm storing in the factor table. Is there a better way?
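
The residue-table idea can be sketched as follows. This is a hypothetical Python illustration, not an indexed database query: `PRIMES`, `residues` and `matches` are names invented here. It also makes one extra limitation explicit. By the Chinese remainder theorem the per-prime residues determine Hash % N only when N is a product of distinct stored primes, so the lookup fails not just for a prime outside the table but also for repeated factors such as N = 4.

```python
from typing import List

PRIMES = [2, 3, 5, 7]  # illustrative; a real table would store more primes


def residues(h: int) -> List[int]:
    """Precomputed per-record array: the hash modulo each stored prime."""
    return [h % p for p in PRIMES]


def matches(res: List[int], n: int, k: int) -> bool:
    """Decide whether hash % n == k using only the stored residues.

    Valid only when n is a product of distinct primes from PRIMES.
    """
    remaining = n
    ok = True
    for i, p in enumerate(PRIMES):
        if remaining % p == 0:
            remaining //= p
            ok = ok and res[i] == k % p
    if remaining != 1:
        # n has a repeated prime factor or a prime outside the table.
        raise ValueError(f"cannot answer modulo {n} from the stored residues")
    return ok
```

For the example in the question, `matches(residues(16), 10, 6)` compares the %2 entry against 0 and the %5 entry against 1.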

    Read the article

  • High Frequency Trading

    - by Hamza Yerlikaya
    Over the last couple of weeks I have come across lots of articles about high frequency trading. They all talk about how important computers and software are to it, but since they are all written from a financial point of view, there is no detail about what the software actually does. Can anyone explain, from a programmer's point of view, what high frequency trading is, and why computers and software are so important in this field?

    Read the article

  • Database design question (Book Trading System)

    - by Paul
    Hello all! I'm developing a Book Trading System. The user will enter their book for trading. I already have a table tblBook with "all" existing books, so the user will select one book from that list and fill in the book's condition and edition. So, what is a good database design for that case? tblBook = all books; tblUserBook = all user books. And should tblUserBook inherit from tblBook? Thanks

    Read the article

  • Automating Etrade

    - by iAlexTsang
    Hey everyone, I was wondering how I would start programming an interface for trading stocks on Etrade in Python. I am attempting to make an automated trading bot, but there is no API publicly available for automated trading with Etrade. Thanks in advance. ^^

    Read the article
