Search Results

Search found 874 results on 35 pages for 'scalability'.

Page 7/35 | < Previous Page | 3 4 5 6 7 8 9 10 11 12 13 14  | Next Page >

  • How are you taking advantage of Multicore?

    - by tgamblin
    As someone in the world of HPC who came from the world of enterprise web development, I'm always curious to see how developers back in the "real world" are taking advantage of parallel computing. This is much more relevant now that all chips are going multicore, and it'll be even more relevant when there are thousands of cores on a chip instead of just a few. My questions are: How does this affect your software roadmap? I'm particularly interested in real stories about how multicore is affecting different software domains, so specify what kind of development you do in your answer (e.g. server side, client-side apps, scientific computing, etc). What are you doing with your existing code to take advantage of multicore machines, and what challenges have you faced? Are you using OpenMP, Erlang, Haskell, CUDA, TBB, UPC or something else? What do you plan to do as concurrency levels continue to increase, and how will you deal with hundreds or thousands of cores? If your domain doesn't easily benefit from parallel computation, then explaining why is interesting, too. Finally, I've framed this as a multicore question, but feel free to talk about other types of parallel computing. If you're porting part of your app to use MapReduce, or if MPI on large clusters is the paradigm for you, then definitely mention that, too. Update: If you do answer #5, mention whether you think things will change if there get to be more cores (100, 1000, etc) than you can feed with available memory bandwidth (seeing as how bandwidth is getting smaller and smaller per core). Can you still use the remaining cores for your application?

    Read the article

  • Delphi : Sorted List

    - by Sethu
    I need to sort close to a 1,00,000 floating point entries in delphi. I am new to delphi and would like to know if there are any readymade solutions available. I tried a few language provided constructs and they take an inordinate amount of time to run to completion.(a 5-10 sec execution time is fine for the application)

    Read the article

  • NServiceBus pipeline with Distributors

    - by David
    I'm building a processing pipeline with NServiceBus but I'm having trouble with the configuration of the distributors in order to make each step in the process scalable. Here's some info: The pipeline will have a master process that says "OK, time to start" for a WorkItem, which will then start a process like a flowchart. Each step in the flowchart may be computationally expensive, so I want the ability to scale out each step. This tells me that each step needs a Distributor. I want to be able to hook additional activities onto events later. This tells me I need to Publish() messages when it is done, not Send() them. A process may need to branch based on a condition. This tells me that a process must be able to publish more than one type of message. A process may need to join forks. I imagine I should use Sagas for this. Hopefully these assumptions are good otherwise I'm in more trouble than I thought. For the sake of simplicity, let's forget about forking or joining and consider a simple pipeline, with Step A followed by Step B, and ending with Step C. Each step gets its own distributor and can have many nodes processing messages. NodeA workers contain a IHandleMessages processor, and publish EventA NodeB workers contain a IHandleMessages processor, and publish Event B NodeC workers contain a IHandleMessages processor, and then the pipeline is complete. Here are the relevant parts of the config files, where # denotes the number of the worker, (i.e. there are input queues NodeA.1 and NodeA.2): NodeA: <MsmqTransportConfig InputQueue="NodeA.#" ErrorQueue="error" NumberOfWorkerThreads="1" MaxRetries="5" /> <UnicastBusConfig DistributorControlAddress="NodeA.Distrib.Control" DistributorDataAddress="NodeA.Distrib.Data" > <MessageEndpointMappings> </MessageEndpointMappings> </UnicastBusConfig> NodeB: <MsmqTransportConfig InputQueue="NodeB.#" ErrorQueue="error" NumberOfWorkerThreads="1" MaxRetries="5" /> <UnicastBusConfig DistributorControlAddress="NodeB.Distrib.Control" DistributorDataAddress="NodeB.Distrib.Data" > <MessageEndpointMappings> <add Messages="Messages.EventA, Messages" Endpoint="NodeA.Distrib.Data" /> </MessageEndpointMappings> </UnicastBusConfig> NodeC: <MsmqTransportConfig InputQueue="NodeC.#" ErrorQueue="error" NumberOfWorkerThreads="1" MaxRetries="5" /> <UnicastBusConfig DistributorControlAddress="NodeC.Distrib.Control" DistributorDataAddress="NodeC.Distrib.Data" > <MessageEndpointMappings> <add Messages="Messages.EventB, Messages" Endpoint="NodeB.Distrib.Data" /> </MessageEndpointMappings> </UnicastBusConfig> And here are the relevant parts of the distributor configs: Distributor A: <add key="DataInputQueue" value="NodeA.Distrib.Data"/> <add key="ControlInputQueue" value="NodeA.Distrib.Control"/> <add key="StorageQueue" value="NodeA.Distrib.Storage"/> Distributor B: <add key="DataInputQueue" value="NodeB.Distrib.Data"/> <add key="ControlInputQueue" value="NodeB.Distrib.Control"/> <add key="StorageQueue" value="NodeB.Distrib.Storage"/> Distributor C: <add key="DataInputQueue" value="NodeC.Distrib.Data"/> <add key="ControlInputQueue" value="NodeC.Distrib.Control"/> <add key="StorageQueue" value="NodeC.Distrib.Storage"/> I'm testing using 2 instances of each node, and the problem seems to come up in the middle at Node B. There are basically 2 things that might happen: Both instances of Node B report that it is subscribing to EventA, and also that NodeC.Distrib.Data@MYCOMPUTER is subscribing to the EventB that Node B publishes. In this case, everything works great. Both instances of Node B report that it is subscribing to EventA, however, one worker says NodeC.Distrib.Data@MYCOMPUTER is subscribing TWICE, while the other worker does not mention it. In the second case, which seem to be controlled only by the way the distributor routes the subscription messages, if the "overachiever" node processes an EventA, all is well. If the "underachiever" processes EventA, then the publish of EventB has no subscribers and the workflow dies. So, my questions: Is this kind of setup possible? Is the configuration correct? It's hard to find any examples of configuration with distributors beyond a simple one-level publisher/2-worker setup. Would it make more sense to have one central broker process that does all the non-computationally-intensive traffic cop operations, and only sends messages to processes behind distributors when the task is long-running and must be load balanced? Then the load-balanced nodes could simply reply back to the central broker, which seems easier. On the other hand, that seems at odds with the decentralization that is NServiceBus's strength. And if this is the answer, and the long running process's done event is a reply, how do you keep the Publish that enables later extensibility on published events?

    Read the article

  • fast retrieval from MYSQL DB

    - by trojanwarrior3000
    I have a table of users - It contains around millions of rows (user-id is the primary key). I just want to retrieve user-id and their joining date. using "select user-id,joining date from table user" requires lot of time.Is there a fast way to query/retrieve the same data from this table?

    Read the article

  • How do you deal with denormalization / secondary indexes in database sharding?

    - by Continuation
    Say I have a "message" table with 2 secondary indexes: "recipient_id" "sender_id" I want to shard the "message" table by "recipient_id". That way to retrieve all messages sent to a certain recipient I only need to query one shard. But at the same time, I want to be able to make a query that ask for all messages sent by a certain sender. Now I don't want to send that query to every single shard of the "message" table. One way to do this is to duplicate the data and have a "message_by_sender" table sharded by "sender_id". The problem with that approach is that every time a message has been sent, I need to insert the message into both "message" and "message_by_sender" tables. But what if after inserting into "message" the insertion into "message_by_sender" fail? In that case the message exists in "message" but not in "message_by_sender". How do I make sure that if a message exists in "message" then it also exists in "message_by_sender" without resorting to 2 phase commit? This must be a very common issue for anyone who shards their databases. How do you deal woth it?

    Read the article

  • Practical approach to concurrency control

    - by Industrial
    Hi everyone, I'd read this article recently and are very interested on how to make a practical approach to Concurrency control on a web server. The server will run CentOS + PHP + mySQL with Memcached. How would you set it up to work? http://saasinterrupted.com/2010/02/05/high-availability-principle-concurrency-control/ Thanks!

    Read the article

  • Fast data retrieval in MySQL

    - by trojanwarrior3000
    I have a table of users - It contains around millions of rows (user-id is the primary key). I just want to retrieve user-id and their joining date. Using SELECT user-id, joining-date FROM users requires lot of time. Is there a fast way to query/retrieve the same data from this table?

    Read the article

  • Design Decision - Scaling out web based application's architecture

    - by Vadi
    This question is about design decision. I am currently working on a web project that will have 40K users to start with and in couple of month expected to grow 50M users (not concurrent users though). I would like to have a architecture that can be scaled out easily without much effort. In order to explain, I would like to use a trivial scenario. Lets say, User entities and services such as CreateUser, AuthenticateUser etc., are a simple method calls for the Page Controllers. But once the traffic increases, for example, authenticating user (or such services related to user entities) has to be moved out to a different internal server to spread the load. But at the same time using RPC calls over the network when the user count is 40K would become overkill. My proposal was to use IPC initially and when we need to scale out we can interally switch to TCP based RPC calls so that it can easily scale out. For example, I am referring to System.IO.Pipes.NamedPipeStreamServer to start with and move on to a TcpListener later on. If we have proper design that can encapsulate above said approach, it would easy for us to scale out services into multiple network servers but at the same time avoid network calls when the user count is small. Is this is a best approach? Any suggestions would be great .. Note: The database scaling is definetly the second phase optimization so we have already made architectural design in place to easily partition data when traffic increases. The primary bottleneck would be application servers over the time period.

    Read the article

  • Application Server or Lightweight Container?

    - by Jeff Storey
    Let me preface this by saying this is not an actual situation of mine but I'm asking this question more for my own knowledge and to get other people's inputs here. I've used both Spring and EJB3/JBoss, and for the smaller types of applications I've built, Spring (+Tomcat when needed) has been much simpler to use. However, when scaling up to larger applications that require things like load balancing and clustering, is Spring still a viable solution? Or is it time to turn to a solution like EJB3/JBoss when you start to get big enough to need that? I'm not sure if I've scoped the problem well enough to get a good answer, so please let me know. Thanks, Jeff

    Read the article

  • What should be the considerations for choosing SQL/NoSQL?

    - by Yuval A
    Target application is a medium-sized website built to support several hundred-thousand users an hour, with an option to scale above that. Data model is rather simple, and caching potential is pretty high (~10:1 ratio of read to edit actions). What should be the considerations when coming to choose between a relational, SQL-based datastore to a NoSQL option (such as HBase and Cassandra)?

    Read the article

  • What is optimal hardware configuration for heavy load LAMP application

    - by Piotr Kochanski
    I need to run Linux-Apache-PHP-MySQL application (Moodle e-learning platform) for a large number of concurrent users - I am aiming 5000 users. By concurrent I mean that 5000 people should be able to work with the application at the same time. "Work" means not only do database reads but writes as well. The application is not very typical, since it is doing a lot of inserts/updates on the database, so caching techniques are not helping to much. We are using InnoDB storage engine. In addition application is not written with performance in mind. For instance one Apache thread usually occupies about 30-50 MB of RAM. I would be greatful for information what hardware is needed to build scalable configuration that is able to handle this kind of load. We are using right now two HP DLG 380 with two 4 core processors which are able to handle much lower load (typically 300-500 concurrent users). Is it reasonable to invest in this kind of boxes and build cluster using them or is it better to go with some more high-end hardware? I am particularly curious how many and how powerful servers are needed (number of processors/cores, size of RAM) what network equipment should be used (what kind of switches, network cards) any other hardware, like particular disc storage solutions, etc, that are needed Another thing is how to put together everything, that is what is the most optimal architecture. Clustering with MySQL is rather hard (people are complaining about MySQL Cluster, even here on Stackoverflow).

    Read the article

  • How to estimate tomcat server requirements?

    - by Daniil
    We have a brand new webapp written that runs on Tomcat. So far, only one client is using it through the day. They run about 180 unique logins a day. Not really a lot IMO. Now, we managed to sell it to a brand new client who likes and wants to roll it out to 50,000 clients. How many of them will login at the same time - no idea. But I need to do the whole thing - allocate, create, config and maintain. OK - last is simple(errrr). The application runs off of Tomcat 5.5 on Gentoo (I'm thinking to upgrade to Tomcat 6) with MSSQL & mySQL behind. I do realize that a more enterprise ready application would be a better fit, but that's not an option at the moment. Since I've never done this before, I'm a bit lost. Can someone advice on how to go about estimating the equipment requirements for this client? Tomcat does have clustering, so that I can do. MS SQL - I'm sure they have something too. I'm thinking to stick it behind LVS (which we do use at the moment for something else too). Any help from people who deal with these details is greatly appreciated!

    Read the article

  • Database for Python Twisted

    - by Will
    There's an API for Twisted apps to talk to a database in a scalable way: twisted.enterprise.dbapi The confusing thing is, which database to pick? The database will have a Twisted app that is mostly making inserts and updates and relatively few selects, and then other strictly-read-only clients that are accessing the database directly making selects. (The read-only users are not necessarily selecting the data that the Twisted app is inserting; its not as though the database is being used as a message-queue) My understanding - which I'd like corrected/adviced - is that: Postgres is a great DB, but all the Python bindings - and there is a confusing maze of them - are abandonware There is psycopg2, but that makes a lot of noise about doing its own connection-pooling and things; does this co-exist gracefully/usefully/transparently with the Twisted async database connection pooling and such? SQLLite is a great database for little things but if used in a multi-user way it does whole-database locking, so performance would suck in the usage pattern I envisage MySQL - after the Oracle takeover, who'd want to adopt it now or adopt a fork? Is there anything else out there?

    Read the article

  • JavaEE Application Server or Lightweight Container?

    - by Jeff Storey
    Let me preface this by saying this is not an actual situation of mine but I'm asking this question more for my own knowledge and to get other people's inputs here. I've used both Spring and EJB3/JBoss, and for the smaller types of applications I've built, Spring (+Tomcat when needed) has been much simpler to use. However, when scaling up to larger applications that require things like load balancing and clustering, is Spring still a viable solution? Or is it time to turn to a solution like EJB3/JBoss when you start to get big enough to need that? I'm not sure if I've scoped the problem well enough to get a good answer, so please let me know. Thanks, Jeff

    Read the article

  • Building highly scalable web services

    - by christopher-mccann
    My team and I are in the middle of developing an application which needs to be able to handle pretty heavy traffic. Not facebook level but in the future I would like to be able to scale to that without massive code re-writes. My thought was to modularise out everything into seperate services with their own interfaces. So for example messaging would have a messaging interface that might have send and getMessages() as methods and then the PHP web app would simply query this interface through soap or curl or something like that. The messaging application could then be any kind of application so a Java application or Python or whatever was suitable for that particular functionality with its own seperate database shard. Is this a good approach?

    Read the article

  • How do you efficiently bulk index lookups?

    - by Liron Shapira
    I have these entity kinds: Molecule Atom MoleculeAtom Given a list(molecule_ids) whose lengths is in the hundreds, I need to get a dict of the form {molecule_id: list(atom_ids)}. Likewise, given a list(atom_ids) whose length is in the hunreds, I need to get a dict of the form {atom_id: list(molecule_ids)}. Both of these bulk lookups need to happen really fast. Right now I'm doing something like: atom_ids_by_molecule_id = {} for molecule_id in molecule_ids: moleculeatoms = MoleculeAtom.all().filter('molecule =', db.Key.from_path('molecule', molecule_id)).fetch(1000) atom_ids_by_molecule_id[molecule_id] = [ MoleculeAtom.atom.get_value_for_datastore(ma).id() for ma in moleculeatoms ] Like I said, len(molecule_ids) is in the hundreds. I need to do this kind of bulk index lookup on almost every single request, and I need it to be FAST, and right now it's too slow. Ideas: Will using a Molecule.atoms ListProperty do what I need? Consider that I am storing additional data on the MoleculeAtom node, and remember it's equally important for me to do the lookup in the molecule-atom and atom-molecule directions. Caching? I tried memcaching lists of atom IDs keyed by molecule ID, but I have tons of atoms and molecules, and the cache can't fit it. How about denormalizing the data by creating a new entity kind whose key name is a molecule ID and whose value is a list of atom IDs? The idea is, calling db.get on 500 keys is probably faster than looping through 500 fetches with filters, right?

    Read the article

  • Single SingOn - Best practice

    - by halfdan
    Hi Guys, I need to build a scalable single sign-on mechanism for multiple sites. Scenario: Central web application to register/manage account (Server in Europe) Several web applications that need to authenticate against my user database (Servers in US/Europe/Pacific region) I am using MySQL as database backend. The options I came up with are either replicating the user database across all servers (data security?) or allowing the servers to directly connect to my MySQL instance by explicitly allowing connections from their IPs in my.cnf (high load? single point of failure?). What would be the best way to provide a scalable and low-latency single sign-on for all web applications? In terms of data security would it be a good idea to replicate the user database across all web applications? Note: All web applications provide an API which users can use to embed widgets into their own websites. These widgets work through a token auth mechanism which will again need to authenticate against my user database.

    Read the article

  • Single SignOn - Best practice

    - by halfdan
    Hi Guys, I need to build a scalable single sign-on mechanism for multiple sites. Scenario: Central web application to register/manage account (Server in Europe) Several web applications that need to authenticate against my user database (Servers in US/Europe/Pacific region) I am using MySQL as database backend. The options I came up with are either replicating the user database across all servers (data security?) or allowing the servers to directly connect to my MySQL instance by explicitly allowing connections from their IPs in my.cnf (high load? single point of failure?). What would be the best way to provide a scalable and low-latency single sign-on for all web applications? In terms of data security would it be a good idea to replicate the user database across all web applications? Note: All web applications provide an API which users can use to embed widgets into their own websites. These widgets work through a token auth mechanism which will again need to authenticate against my user database.

    Read the article

  • A SelfHosted WCF Service over Basic HTTP Binding doesn't support more than 1000 concurrent requests

    - by Krishnan
    I have self hosted a WCF Service over BasicHttpBinding consumed by an ASMX Client. I'm simulating a concurrent user load of 1200 users. The service method takes a string parameter and returns a string. The data exchanged is less than 10KB. The processing time for a request is fixed at 2 seconds by having a Thread.Sleep(2000) statement. Nothing additional. I have removed all the DB Hits / business logic. The same piece of code runs fine for 1000 concurrent users. I get the following error when I bump up the number to 1200 users. System.Net.WebException: The underlying connection was closed: An unexpected error occurred on a receive. ---> System.IO.IOException: Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host. ---> System.Net.Sockets.SocketException: An existing connection was forcibly closed by the remote host at System.Net.Sockets.Socket.Receive(Byte[] buffer, Int32 offset, Int32 size, SocketFlags socketFlags) at System.Net.Sockets.NetworkStream.Read(Byte[] buffer, Int32 offset, Int32 size) --- End of inner exception stack trace --- at System.Net.Sockets.NetworkStream.Read(Byte[] buffer, Int32 offset, Int32 size) at System.Net.PooledStream.Read(Byte[] buffer, Int32 offset, Int32 size) at System.Net.Connection.SyncRead(HttpWebRequest request, Boolean userRetrievedStream, Boolean probeRead) --- End of inner exception stack trace --- at System.Web.Services.Protocols.WebClientProtocol.GetWebResponse(WebRequest request) at System.Web.Services.Protocols.HttpWebClientProtocol.GetWebResponse(WebRequest request) at System.Web.Services.Protocols.SoapHttpClientProtocol.Invoke(String methodName, Object[] parameters) at WCF.Throttling.Client.Service.Function2(String param) This exception is often reported on DataContract mismatch and large data exchange. But never when doing a load test. I have browsed enough and have tried most of the options which include, Enabled Trace & Message log on server side. But no errors logged. To overcome Port Exhaustion MaxUserPort is set to 65535, and TcpTimedWaitDelay 30 secs. MaxConcurrent Calls is set to 600, and MaxConcurrentInstances is set to 1200. The Open, Close, Send and Receive Timeouts are set to 10 Minutes. The HTTPWebRequest KeepAlive set to false. I have not been able to nail down the issue for the past two days. Any help would be appreciated. Thank you.

    Read the article

  • Is it wise to use temporary tables?

    - by Industrial
    Hi guys, We have a mySQL database table for products. We are utilizing a cache layer to reduce database load, but we think that it's a good idea to minimize the actual data needed to be stored in the cache layer to speed up the application further. All the products in the database, that is visible to visitors have a price attached to them: The prices are stored in a different table, called prices . There are multiple price categories depending on which discount level each visitor (customer) applies to. From time to time, there are campaigns which means that a special price for each product is available. The special prices are stored in a table called specials. Is it a bad to make a temp table that binds the tables together? It would only have the neccessary information and would ofcourse be cached. -------------|-------------|------------ | productId | hasPrice | hasSpecial -------------|-------------|------------ 1 | 1 | 0 2 | 1 | 1 By doing such, it would be super easy to know if the specific product really has a price, without having to iterate through the complete prices or specials table each time a product should be listed or presented. Are temp tables a common thing for web applications or is it just bad design?

    Read the article

  • Boost::Spirit::Qi autorules -- avoiding repeated copying of AST data structures

    - by phooji
    I've been using Qi and Karma to do some processing on several small languages. Most of the grammars are pretty small (20-40 rules). I've been able to use autorules almost exclusively, so my parse trees consist entirely of variants, structs, and std::vectors. This setup works great for the common case: 1) parse something (Qi), 2) make minor manipulations to the parse tree (visitor), and 3) output something (Karma). However, I'm concerned about what will happen if I want to make complex structural changes to a syntax tree, like moving big subtrees around. Consider the following toy example: A grammar for s-expr-style logical expressions that uses autorules... // Inside grammar class; rule names match struct names... pexpr %= pand | por | var | bconst; pand %= lit("(and ") >> (pexpr % lit(" ")) >> ")"; por %= lit("(or ") >> (pexpr % lit(" ")) >> ")"; pnot %= lit("(not ") >> pexpr >> ")"; ... which leads to parse tree representation that looks like this... struct var { std::string name; }; struct bconst { bool val; }; struct pand; struct por; struct pnot; typedef boost::variant<bconst, var, boost::recursive_wrapper<pand>, boost::recursive_wrapper<por>, boost::recursive_wrapper<pnot> > pexpr; struct pand { std::vector<pexpr> operands; }; struct por { std::vector<pexpr> operands; }; struct pnot { pexpr victim; }; // Many Fusion Macros here Suppose I have a parse tree that looks something like this: pand / ... \ por por / \ / \ var var var var (The ellipsis means 'many more children of similar shape for pand.') Now, suppose that I want negate each of the por nodes, so that the end result is: pand / ... \ pnot pnot | | por por / \ / \ var var var var The direct approach would be, for each por subtree: - create pnot node (copies por in construction); - re-assign the appropriate vector slot in the pand node (copies pnot node and its por subtree). Alternatively, I could construct a separate vector, and then replace (swap) the pand vector wholesale, eliminating a second round of copying. All of this seems cumbersome compared to a pointer-based tree representation, which would allow for the pnot nodes to be inserted without any copying of existing nodes. My question: Is there a way to avoid copy-heavy tree manipulations with autorule-compliant data structures? Should I bite the bullet and just use non-autorules to build a pointer-based AST (e.g., http://boost-spirit.com/home/2010/03/11/s-expressions-and-variants/)?

    Read the article

< Previous Page | 3 4 5 6 7 8 9 10 11 12 13 14  | Next Page >