Search Results

Search found 874 results on 35 pages for 'scalability'.


Highly scalable and dynamic "rule-based" applications?

- by Prof Plum

For a large enterprise app, being able to adjust to change is one of the most important aspects of design. I often use a rule-based approach to deal with changing business logic, with each rule stored in a DB. This allows changes to be made easily without diving into nasty details. Since C# cannot Eval("foo(bar);"), this is accomplished with formatted strings stored in rows that are then processed in JavaScript at runtime. This works fine; however, it is less than elegant, and it would not be enjoyable for anyone else to pick up once it becomes legacy. Is there a more elegant solution? With thousands of rules that change fairly frequently it becomes a real bear, but this cannot be so uncommon a problem that nobody has thought of a better way to do it. Any suggestions? Is the current method defensible? What are the alternatives? Edit: Just to clarify, this is a large enterprise app, so whichever solution is chosen, plenty of people (around 10) will be constantly maintaining its rules and data. Also, the data changes frequently enough that some sort of centralized server system is basically a must.
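
One commonly suggested alternative to storing executable strings is to store each rule as plain data (field, operator, value, action) and interpret it with a small, fixed evaluator. The sketch below is in Python rather than C#, and the rule schema, field names, and actions are all hypothetical; it only illustrates the idea of rules as data, not the asker's system.

```python
import operator

# Whitelisted comparison operators; a stored rule can only name one of these.
OPERATORS = {
    "==": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}

# Hypothetical rows, as they might be loaded from a rules table.
RULES = [
    {"field": "order_total", "op": ">=", "value": 1000, "action": "require_approval"},
    {"field": "country", "op": "!=", "value": "US", "action": "add_customs_form"},
]


def evaluate(rules, record):
    """Return the actions of every rule whose condition holds for record."""
    actions = []
    for rule in rules:
        compare = OPERATORS[rule["op"]]  # KeyError on an unknown operator
        field = rule["field"]
        if field in record and compare(record[field], rule["value"]):
            actions.append(rule["action"])
    return actions


if __name__ == "__main__":
    print(evaluate(RULES, {"order_total": 1500, "country": "US"}))
```

Because a rule can only reference whitelisted operators, changing a rule is a data edit that can be validated on save, and nothing stored in the table is ever executed as code.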

Web framework able to handle many concurrent users [closed]

- by Jonas

Social networking sites need to handle many concurrent users, e.g. for chat functionality. Which web frameworks scale well and can handle more than 10,000 concurrent users connected via Comet or WebSockets? The server is a Linux VPS with limited memory, e.g. 1GB-8GB. I have been looking at some Java frameworks, but they consume a lot of memory per connection, so I'm looking for other alternatives too. Are there any good frameworks that can handle more than 10,000 concurrent users with limited memory resources?
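
The memory constraint can be turned into a rough per-connection budget. The sketch below assumes, purely for illustration, that a quarter of the RAM is reserved for the OS and the runtime itself; that fraction and the function name are assumptions, not figures from the question.

```python
def per_connection_budget_kib(total_ram_gib, connections, reserved_fraction=0.25):
    """KiB of RAM available per connection after reserving a share for OS and runtime."""
    usable_bytes = total_ram_gib * 1024**3 * (1 - reserved_fraction)
    return usable_bytes / connections / 1024


if __name__ == "__main__":
    for ram_gib in (1, 8):
        budget = per_connection_budget_kib(ram_gib, 10_000)
        print(f"{ram_gib} GiB -> {budget:.1f} KiB per connection")
```

Under these assumptions, a framework that needs a dedicated thread stack per connection is unlikely to fit the 1GB case, which is why event-driven servers are usually suggested for this kind of workload.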

How to design highly scalable web services in Java?

- by Kshitiz Sharma

I am creating some web services that would have 2000 concurrent users. The services are offered for free and are hence expected to get a large user base. In the future it may be required to scale up to 50,000 users. There are already a few other questions that address the issue, such as "Building highly scalable web services". However, my requirements differ from that question. For example, my application does not have a user interface, so images, CSS, and JavaScript are not an issue. It is in Java, so suggestions like using HipHop to translate PHP to native code are useless. Hence I decided to ask my question separately.

This is my project setup:

- REST-based web services using Apache CXF
- Hibernate 3.0 (with relevant optimizations like lazy loading and custom HQL for tuning)
- Tomcat 6.0
- MySQL 5.5

My questions are:

1. Are there alternatives to MySQL that offer better performance for what I'm trying to do?
2. What are some general things to abide by in order to scale a Java-based web application?
3. I am thinking of putting my application in two Tomcat instances, with httpd redirecting each request to the appropriate Tomcat on the basis of load. Is this the right approach?
4. Separate Tomcat instances can help, but doesn't the database then become the bottleneck, since both applications access the same database? I am a programmer, not a DB admin; how difficult would it be to cluster a MySQL database (or whatever database is offered as an alternative in 1)?
5. How effective are caching solutions like EHCache?
6. Any other general best practices?

Some clarifications:

- Could you partition the data? Yes, we could, but we're trying to avoid it. We need to run a lot of data mining algorithms and the design will evolve over time, so we can't be sure where the lines of partition should be.
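
On the caching question, the general idea behind a read-through cache can be sketched in a few lines. This is a plain-Python illustration with hypothetical names, not Ehcache's API: a miss calls a loader (standing in for a database query), and a hit within the time-to-live skips the backing store entirely.

```python
import time


class ReadThroughCache:
    """Tiny read-through cache with a per-entry time-to-live (TTL)."""

    def __init__(self, loader, ttl_seconds, clock=time.monotonic):
        self._loader = loader      # called on a miss, e.g. a database query
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so expiry can be tested
        self._entries = {}         # key -> (expires_at, value)
        self.misses = 0

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]        # fresh hit: no call to the backing store
        self.misses += 1
        value = self._loader(key)
        self._entries[key] = (now + self._ttl, value)
        return value


if __name__ == "__main__":
    cache = ReadThroughCache(lambda key: f"row for {key}", ttl_seconds=60)
    cache.get("user:1")
    cache.get("user:1")
    print("misses:", cache.misses)
```

How much this helps depends on the read/write ratio: data that is read far more often than it changes benefits most, while frequently written rows mostly add invalidation work.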

Ruby on Rails background API polling

- by Matthew Turney

I need to build a free/busy calendar integration with Zimbra. Unlike Outlook, it seems, Zimbra requires polling its API. I need to be able to grab the free/busy data in background tasks for tens of thousands of users at a regular interval, preferably every few minutes. What would be the best way to implement this in a Rails application without bogging down our current Resque tasks? I have considered moving this process to something like node.js or something similar in Ruby. The biggest problem is that we have no control over the I/O: each client's Zimbra instance could be slow, and we don't want to create a huge backlog of tasks. Thoughts and ideas?
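
The two concerns in the question, bounded concurrency and slow remote instances, can be illustrated with an event-loop sketch. It is written in Python's asyncio rather than Ruby or node.js, and `fake_fetch` is a hypothetical stand-in for a real Zimbra request; the limits and timeouts are arbitrary example values.

```python
import asyncio


async def poll_all(user_ids, fetch, max_concurrent=100, timeout_seconds=5.0):
    """Poll every user with bounded concurrency; slow calls are cut off, not left to pile up."""
    semaphore = asyncio.Semaphore(max_concurrent)
    results = {}

    async def poll_one(user_id):
        async with semaphore:
            try:
                results[user_id] = await asyncio.wait_for(fetch(user_id), timeout_seconds)
            except asyncio.TimeoutError:
                results[user_id] = None  # record the miss; retry on the next cycle

    await asyncio.gather(*(poll_one(uid) for uid in user_ids))
    return results


async def fake_fetch(user_id):
    """Hypothetical stand-in for a free/busy request; every tenth user is slow."""
    await asyncio.sleep(0.2 if user_id % 10 == 0 else 0.001)
    return f"free/busy for {user_id}"


if __name__ == "__main__":
    out = asyncio.run(poll_all(range(50), fake_fetch, max_concurrent=10, timeout_seconds=0.05))
    print(sum(value is None for value in out.values()), "requests timed out")
```

The timeout keeps one slow instance from holding a slot indefinitely, and the semaphore caps how many requests are in flight, so a bad polling cycle degrades into missed updates rather than a growing queue.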

How would you gather client's data on Google App Engine without using Datastore/Backend Instances too much?

- by ruslan

I'm relatively new to StackExchange and not sure if it's an appropriate place to ask a design question. The site gives me a hint: "The question you're asking appears subjective and is likely to be closed". Please let me know. Anyway, one of the projects I'm working on is an online survey engine. It's my first big commercial project on Google App Engine. I need your advice on how to collect stats and efficiently record them in the Datastore without bankrupting me.

Initial requirements are:

- After a user finishes a survey, the client sends a list of pairs [ID (int) + PercentHit (double)]. This list shows how closely this user's answers match the predefined answers of reference answerers (which are identified by IDs). I call them "target IDs".
- The creator of the survey wants to see the aggregated % for given IDs for the last hour, a particular timeframe, or from the beginning of the survey.
- Some surveys may have thousands of target/reference answerers.

So I created this entity:

    public class HitsStatsDO implements Serializable {
        @Id
        transient private Long id;
        transient private Long version = (long) 0;
        transient private Long startDate;
        @Parent
        transient private Key parent; // fake parent which contains target id
        @Transient
        int targetId;
        private double avgPercent;
        private long hitCount;
    }

But writing a HitsStatsDO for each target from each user would produce a lot of data. For instance, I had a survey with 3000 targets which was answered by ~4 million people within one week, with 300K people taking the survey on the first day. Even if we assume they answered evenly over 24 hours, that gives ~10,400 writes/second. Obviously that hits the concurrent write limit of the Datastore. I decided to collect data for one hour and save that, which is why there are avgPercent and hitCount in HitsStatsDO. GAE instances are stateless, so I had to use a dynamic backend instance. There I have something like this:

    // Contains stats for one hour
    private class Shard {
        ReadWriteLock lock = new ReentrantReadWriteLock();
        Map<Integer, HitsStatsDO> map = new HashMap<Integer, HitsStatsDO>(); // Key is target ID
        public void saveToDatastore();
        public void updateStats(Long startDate, Map<Integer, Double> hits);
    }

and a map with the shard for the current hour and the previous hour (which doesn't stay here for long):

    private HashMap<Long, Shard> shards = new HashMap<Long, Shard>(); // Key is HitsStatsDO.startDate

So once per hour I dump the Shard for the previous hour to the Datastore. Plus I have a class LifetimeStats which keeps a Map<Integer, HitsStatsDO> in memcached, where the map key is the target ID. Also, in my backend shutdown hook method I dump the stats for the unfinished hour to the Datastore.

There is only one major issue here: I have only ONE backend instance :) It raises the following questions, on which I'd like to hear your opinion:

- Can I do this without using a backend instance?
- What if one instance is not enough? How can I split data between multiple dynamic backend instances? It's hard because I don't know how many I have, since Google creates new ones as load increases. I know I can launch an exact number of resident backend instances. But how many? 2, 5, 10? What if I have no load at all for a week? Constantly running 10 backend instances is too expensive.
- What do I do with data from clients while the backend instance is dead/restarting?

Thank you very much in advance for your thoughts.
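
The hourly aggregation described above boils down to keeping a running mean and a count per (hour, target ID), so each hour costs one write per target instead of one per user per target. The sketch below is a plain-Python illustration of that bookkeeping, not App Engine code; the class and function names are hypothetical.

```python
class HourlyStats:
    """In-memory aggregate per (hour, target ID): running mean plus hit count."""

    def __init__(self):
        self._buckets = {}  # (hour_start, target_id) -> [avg_percent, hit_count]

    def update(self, hour_start, hits):
        """Fold one user's {target_id: percent} submission into the hour's buckets."""
        for target_id, percent in hits.items():
            bucket = self._buckets.setdefault((hour_start, target_id), [0.0, 0])
            bucket[1] += 1
            # Incremental mean: avg += (x - avg) / n, so no per-user rows are kept.
            bucket[0] += (percent - bucket[0]) / bucket[1]

    def get(self, hour_start, target_id):
        """Return (avg_percent, hit_count) for one bucket; (0.0, 0) if empty."""
        avg, count = self._buckets.get((hour_start, target_id), (0.0, 0))
        return avg, count


def writes_per_second(users_per_day, targets):
    """Write rate if every user writes one entity per target, spread evenly over a day."""
    return users_per_day * targets / 86_400


if __name__ == "__main__":
    stats = HourlyStats()
    stats.update(0, {7: 50.0})
    stats.update(0, {7: 100.0})
    print(stats.get(0, 7))
    print(round(writes_per_second(300_000, 3000)), "writes/second without aggregation")
```

Because a mean and a count can be merged (weighted by count), several instances could each keep their own buckets and combine them at flush time, which is one way to approach the multiple-backend question.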

How to write a network game? [closed]

- by Tom Wijsman

Based on "Why is so hard to develop a MMO?":

Networked game development is not trivial; there are large obstacles to overcome in not only latency, but cheat prevention, state management and load balancing. If you're not experienced with writing a networked game, this is going to be a difficult learning exercise.

I know the theory about sockets, servers, clients, protocols, connections and such things. Now I wonder how one can learn to write a network game:

- How to balance load problems?
- How to manage the game state?
- How to keep things synchronized?
- How to protect the communication and client from reverse engineering?
- How to work around latency problems?
- Which things should be computed locally and which things on the server?
- ...

Are there any good books, tutorials, sites, interesting articles or other questions regarding this? I'm looking for broad answers, but specific ones are fine too to learn the difference.

Slashdotted web site seeks new home

- by Arthur Edelstein

I am maintaining a website that contains mostly simple html (just a little php). Normally the site receives only 4000 hits per month, but it was recently slashdotted by the New York Times (30,000 visitors and 30 GB in a day) and the web host provider (bluehost) throttled the CPU in response. This slowed down the website considerably. What web host providers would offer a more scalable solution? Ideally I would like a high-quality host that charges by the GB and can handle bandwidth to expand during sudden slashdotting episodes without a reduction in performance.

Preparing for interview questions involving scale

- by Chaitanya

I have over 6 years of software development experience. I have worked on multiple platforms, including mobile. However, I have not had a chance to work on scale-related issues. As a consequence, whenever someone asks a question involving a million inputs in an interview, I find myself out of my depth. How do I prepare for such questions? Are there any books/resources to refer to? There are books for Java, data structures, and concurrency, but I don't know of any definitive ones for learning about scaling.

Open Grid Engine or Akka/Something more fault tolerant?

- by Mike Lyons

My use case is that I have a pipeline of independent, standalone programs that I want to execute in a certain order on specific pieces of data that are output from previous pipeline stages. The pipeline is entirely linear and has no alternate paths through the pipe. I'm currently using SGE to do this and it works OK; however, occasionally a job will overstep its memory bounds and fail, and all jobs that require its output data will fail too. The pipe needs to be restarted in that case. Could whatever provides the fault tolerance in Akka solve that for me?
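
The behaviour being asked for, retrying only the failed stage and keeping the output of earlier stages, can be sketched independently of any framework. The code below is a plain-Python illustration with hypothetical names; it is not Akka's supervision model or an SGE feature.

```python
def run_pipeline(stages, data, max_attempts=3, completed=None):
    """Run stages in order, retrying a failed stage instead of restarting the pipe.

    completed maps stage name -> output, so a rerun skips finished stages.
    """
    completed = {} if completed is None else completed
    for name, stage in stages:
        if name in completed:
            data = completed[name]  # reuse the checkpointed output
            continue
        for attempt in range(1, max_attempts + 1):
            try:
                data = stage(data)
                break
            except Exception:
                if attempt == max_attempts:
                    raise  # give up; earlier checkpoints are kept
        completed[name] = data
    return data


def flaky(fail_times, func):
    """Wrap func so its first fail_times calls raise, simulating a job that hits its memory limit."""
    state = {"left": fail_times}

    def wrapper(value):
        if state["left"] > 0:
            state["left"] -= 1
            raise MemoryError("simulated failure")
        return func(value)

    return wrapper


if __name__ == "__main__":
    stages = [("double", lambda x: x * 2), ("increment", flaky(2, lambda x: x + 1))]
    print(run_pipeline(stages, 5))
```

If the checkpoint map were persisted (for example as files per stage), a restart after a hard failure would resume from the last completed stage rather than from the beginning.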

What should I do to scale out a high-traffic website?

- by makerofthings7

What Best Practices should be undertaken for a Website that needs to "scale out" to handle capacity? This is especially relevant now that people are considering the cloud, but may be missing out on the fundamentals. I'm interested in hearing about anything you consider a best practice from development-level tasks, to infrastructure, to management. Use your best judgement when posting multiple answers, since it may make sense to post them separately for voting purposes. (hint: you'll likely get more reputation points for many small answers than one large answer)

Why would more CPU cores on virtual machine slow compile times?

- by Sid

[edit#2] If anyone from VMware can hit me up with a copy of VMware Fusion, I'd be more than happy to do the same as a VirtualBox vs VMware comparison. Somehow I suspect the VMware hypervisor will be better tuned for hyperthreading (see my answer too).

I'm seeing something curious. As I increase the number of cores on my Windows 7 x64 virtual machine, the overall compile time increases instead of decreasing. Compiling is usually very well suited to parallel processing, as in the middle part (post dependency mapping) you can simply call a compiler instance on each of your .c/.cpp/.cs/whatever files to build partial objects for the linker to take over. So I would have imagined that compiling would actually scale very well with the number of cores. But what I'm seeing is:

- 8 cores: 1.89 sec
- 4 cores: 1.33 sec
- 2 cores: 1.24 sec
- 1 core: 1.15 sec

Is this simply a design artifact of a particular vendor's hypervisor implementation (type 2: VirtualBox in my case), or something more pervasive across VMs that keeps hypervisor implementations simpler? With so many factors, I seem to be able to make arguments both for and against this behavior, so if someone knows more about this than I do, I'd be curious to read your answer. Thanks, Sid

[edit: addressing comments]

- @MartinBeckett: Cold compiles were discarded.
- @MonsterTruck: Couldn't find an open source project to compile directly. Would be great, but I can't screw up my dev env right now.
- @Mr Lister, @philosodad: I have 8 hw threads and am using VirtualBox, so it should be a 1:1 mapping without emulation.
- @Thorbjorn: I have 6.5GB for the VM and a smallish VS2012 project; it's quite unlikely that I'm swapping in/out and thrashing the page file.
- @All: If someone can point to an open source VS2010/VS2012 project, that might be a better community reference than my (proprietary) VS2012 project. Orchard and DNN seem to need environment tweaking to compile in VS2012. I really would like to see if someone with VMware Fusion also sees this (for a VMware vs VirtualBox comparison).

Test details:

- Hardware: MacBook Pro Retina
- CPU: Core i7 @ 2.3GHz (quad core, hyper-threaded = 8 cores in Windows Task Manager)
- Memory: 16 GB
- Disk: 256GB SSD
- Host OS: Mac OS X 10.8
- VM type: VirtualBox 4.1.18 (type 2 hypervisor)
- Guest OS: Windows 7 x64 SP1
- Compiler: VS2012 compiling a solution with 3 C# Azure projects
- Compile times measured by a VS2012 plugin called 'VSCommands'
- All tests run 5 times, first 2 runs discarded, last 3 averaged
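
The reported timings can be expressed as speedup and parallel efficiency relative to the single-core run. This is only arithmetic on the numbers quoted in the question; the function names are mine.

```python
# Timings reported in the question: cores -> seconds
TIMINGS = {1: 1.15, 2: 1.24, 4: 1.33, 8: 1.89}


def speedup(timings, cores):
    """Speedup relative to the single-core run; values below 1.0 mean a slowdown."""
    return timings[1] / timings[cores]


def efficiency(timings, cores):
    """Speedup divided by core count; 1.0 would be perfect linear scaling."""
    return speedup(timings, cores) / cores


if __name__ == "__main__":
    for cores in sorted(TIMINGS):
        print(f"{cores} cores: speedup {speedup(TIMINGS, cores):.2f}, "
              f"efficiency {efficiency(TIMINGS, cores):.3f}")
```

With build times this close to one second, fixed per-build overhead is likely to dominate, so a longer-running build would be a more telling benchmark of how the VM scales.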

migrating product and team from startup race to quality development

- by thevikas

This is year 3 and the product is selling well enough. Now we need to enforce good software development practices. The goal is to monitor incoming bug reports and reduce them, allow a never-ending stream of features, and get ready for scaling 10x. The phrases "test-driven development" and "continuous integration" are not even understood by the team, because they spent the first 2 years in the product race. Tech team size is 5. The question is how to:

- sell/convince the team and management about TDD/unit testing/coding standards/documentation, with economics
- train the team to do more than just feature coding and start writing unit tests alongside, which looks like more work and means more time
- plan for creating unit tests for all the backlog production code

Are there any frameworks or libraries that let me run a large number of concurrent jobs on a schedule?

- by Yoga

Are there any high-level programming frameworks that let me run a large number of concurrent jobs on a schedule? E.g. I have 100K URLs whose uptime I need to check every 5 minutes. I can certainly write a program to handle this, but then I need to handle concurrency, queuing, error handling, system throttling, job distribution, etc. Is there a framework that lets me focus only on a particular job (i.e. the ping task) while the system takes care of scaling and error handling for me? I am open to any language.
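
Whatever framework is chosen, the sizing follows from the numbers in the question. The sketch below applies Little's law (jobs in flight = arrival rate x time per job); the 0.5-second average check latency is an assumed example value, not something stated in the question.

```python
import math


def required_rate(jobs, interval_seconds):
    """Jobs per second needed to finish every job once per interval."""
    return jobs / interval_seconds


def required_concurrency(jobs, interval_seconds, avg_latency_seconds):
    """Little's law: jobs in flight = arrival rate x time per job, rounded up."""
    return math.ceil(required_rate(jobs, interval_seconds) * avg_latency_seconds)


if __name__ == "__main__":
    rate = required_rate(100_000, 5 * 60)
    in_flight = required_concurrency(100_000, 5 * 60, avg_latency_seconds=0.5)
    print(f"{rate:.1f} checks/second, about {in_flight} checks in flight at 0.5 s each")
```

A few hundred concurrent network checks is within reach of a single event-driven process, so the framework's main job here would be scheduling, retries, and throttling rather than raw capacity.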

Is OpenStack suitable as a fault tolerant DB host?

- by Jit B

I am trying to design a fault-tolerant DB cluster (schema does not matter) that would not require much maintenance. After looking at almost everything from MySQL to MongoDB to HBase, I still find that no DB is easily scalable. Cassandra comes close, but it has its own set of problems. So I was thinking: what if I run something like MySQL or OrientDB on top of a large OpenStack VM? The VM would be fault tolerant by itself, so I don't need to handle it at the DB level. Is it viable? Has it been done before? If not, what are the possible problems with this approach?

Non-dynamic CMS [closed]

- by user20457

On some of the websites I visit every day (news, sports, etc.), although the content changes very often (several times per day), the URLs always have a .html extension, which makes me think that the content has been generated once and then published as a static page, rather than generated on every call or even cached in memory. For example, the fictitious site "mysports.com" has a "futbol.html" page, and then yesterday Messi gets injured and they have another item to put on that page. I presume they post the new item in their CMS, and a publishing action is then triggered automatically that recreates "futbol.html" in a CDN with the new item and probably discards the oldest one. Then the ETag changes and clients will get the new page if they try to access it. (The site is fictitious, but this is what I believe happened yesterday on the sports site I read.) This would fit the CQRS approach, and I presume they get huge performance. I know lots of CMSs (WP, Drupal, BlogEngine.net, DNN, etc.), but I have never seen any capable of doing this, or at least I was not aware of this feature. What are these distributed CMSs called? Which are the most well known? Cheers.
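
The publish-once flow described above can be sketched as: render the whole page when an item is posted, derive the ETag from the rendered bytes, and store the result where it is served from. In the sketch below a dictionary stands in for the CDN origin, and all function names are hypothetical.

```python
import hashlib
import html


def render_page(title, items):
    """Render the full static page from the current list of news items."""
    rows = "\n".join(f"  <li>{html.escape(item)}</li>" for item in items)
    return f"<html><body><h1>{html.escape(title)}</h1>\n<ul>\n{rows}\n</ul></body></html>"


def etag_for(body):
    """Derive an ETag from the page bytes, so it changes only when the content does."""
    return '"' + hashlib.sha256(body.encode("utf-8")).hexdigest()[:16] + '"'


def publish(title, items, store):
    """Regenerate the page once at publish time; store stands in for a CDN origin."""
    body = render_page(title, items)
    store[title] = (etag_for(body), body)
    return store[title][0]


if __name__ == "__main__":
    origin = {}
    before = publish("futbol", ["Match report"], origin)
    after = publish("futbol", ["Messi injured", "Match report"], origin)
    print(before, after, before != after)
```

Reads never touch the rendering code, which matches the CQRS intuition in the question: the write side (publishing) does the work once, and the read side only serves stored bytes.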

Any good stories or blog posts of a startup's server/stack evolving as they got bigger? [closed]

- by user72245

I know lots of startups often go for practical, simple, efficient. So maybe tossing a Ruby program on a basic Apache server. Get some users up and running, etc. Then Ruby starts to not be fast enough, so they throw more servers at the problem? And load balancing or something? And then when stuff gets REALLY crazy, language changes, etc? I'm looking for someone who has cleanly and simply told their own company's story like this. Are there any good ones?

Even distribution through a chain of resources

- by ClosetGeek

I'm working on an algorithm which routes tasks through a chain of distributed resources based on a hash (or random number). For example, say you have 10 gateways into a service which distribute tasks to 1000 handlers through 100 queues. 10,000 clients are expected to be connected to the gateways at any given time (numbers are very general to keep it simple). That's:

- 10,000 clients
- 10 gateways (producers)
- 100 queues
- 1000 workers/handlers (consumers)

The flow of each task is client-gateway-queue-worker. Each client will have its own hash/number, which is used to route each task from the client to the same worker each time, with each task going through the same gateway and queue each time. Yet the algorithm distributes evenly, meaning each gateway, queue, and worker will have an even workload. My question is: what exactly would this be called? Does such a thing already exist? This started off as a DHT, but I realized that DHTs can't do exactly what I need, so I started from scratch.
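
One simple way to get a fixed client-to-worker path with even load is to hash the client ID to a worker and derive the queue and gateway from the worker index. The sketch below assumes a static topology with the sizes from the example; it is an illustration of hash partitioning, not the asker's algorithm.

```python
import hashlib

WORKERS, QUEUES, GATEWAYS = 1000, 100, 10


def route(client_id):
    """Map a client to a fixed (gateway, queue, worker) triple.

    The worker is chosen by hash; queue and gateway follow from it, so each
    queue feeds 10 workers and each gateway feeds 10 queues.
    """
    digest = hashlib.sha256(str(client_id).encode("utf-8")).digest()
    worker = int.from_bytes(digest[:8], "big") % WORKERS
    queue = worker // (WORKERS // QUEUES)
    gateway = queue // (QUEUES // GATEWAYS)
    return gateway, queue, worker


if __name__ == "__main__":
    print(route("client-42"))
    print(route("client-42") == route("client-42"))
```

Plain modulo hashing like this reshuffles most clients when the number of workers changes; consistent hashing and rendezvous hashing are the usual names for schemes that keep most assignments stable across resizes.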

What techniques can I use to render very large numbers of objects more efficiently in OpenGL?

- by Luke

You can think of my application as drawing a very large ball-and-stick diagram (or graph). At times, this graph can get very large, to the point where the number of elements outnumbers the pixels on the screen. Currently I am simply passing all of my textures (as GL_POINTS) and lines to the graphics card using VBOs. When the number of elements outnumbers the number of pixels, is this the most efficient way to do this? Or should I do some calculations on the CPU side before handing everything over to the GPU? If it matters, I do use GL_DEPTH_TEST and GL_ALPHA_TEST. I do some alpha blending, but probably not enough to make a huge performance difference. My scene can be static at times, but the user has control over a typical arc-ball camera and can pan, rotate, or zoom. It is during these operations that performance degradation is noticeable.

PeerApp Scalability

- by ChaosFreak

William, In response to a question on P2P caching, you answered "PeerApp can do that but probably doesn't suit the scale you are looking at." PeerApp is the most scalable P2P cache in the world, and can handle hundreds of Gb per second of bandwidth. Their largest deployment in Taiwan handles 120Gbps with no problem. The next largest competitor, OverSi, can barely handle a tenth of that. Where do you get your information that PeerApp "doesn't suit scale"?

Scalability of Boost.Asio

- by samm

I'm curious how far others have pushed Boost.Asio in terms of scalability. I am writing an application that may use close to 1000 socket objects, a handful of acceptor objects, and many thousand timer objects. I've configured it such that there's a thread pool invoking io_service::run and use strands in the appropriate places to ensure my handlers do not stomp on each other. My platform is Red Hat Enterprise Linux with Boost 1.39, though I'm not opposed to upgrading to a more recent version of boost.

Scaling databases with cheap SSD hard drives

- by Dennis Kashkin

Hey guys! I hope that many of you are working with high-traffic database-driven websites, and chances are that your main scalability issues are in the database. I noticed a couple of things lately:

- Most large databases require a team of DBAs in order to scale. They constantly struggle with the limitations of hard drives and end up with very expensive solutions (SANs or large RAIDs, frequent maintenance windows for defragging and repartitioning, etc.). The actual annual cost of maintaining such databases is in the $100K-$1M range, which is too steep for me :)
- Finally, we have several companies like Intel, Samsung, FusionIO, etc. that have just started selling extremely fast yet affordable SSD drives based on SLC flash technology. These drives are 100 times faster in random reads/writes than the best spinning hard drives on the market (up to 50,000 random writes per second). Their seek time is pretty much zero, so the cost of random I/O is the same as sequential I/O, which is awesome for databases. These SSD drives cost around $10-$20 per gigabyte, and they are relatively small (64GB).

So, there seems to be an opportunity to avoid the HUGE costs of scaling databases the traditional way by simply building a big enough RAID 5 array of SSD drives (which would cost only a few thousand dollars). Then we don't care if the database file is fragmented, and we can afford 100 times more disk writes per second without having to spread the database across 100 spindles.

Is anybody else interested in this? I've been testing a few SSD drives and can share my results. If anybody on this site has already solved their I/O bottleneck with SSDs, I would love to hear your war stories!

PS. I know that there are plenty of expensive solutions out there that help with scalability, for example the time-proven RAM-based SANs. I want to be clear that even $50K is too expensive for my project. I have to find a solution that costs no more than $10K and does not take much time to implement.
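
The budget in the post can be checked against the quoted drive prices. The sketch below counts only the drives (no controller, chassis, or spares, which is an assumption) and uses the fact that RAID 5 gives up one drive's worth of capacity to parity.

```python
def raid5_plan(budget_usd, drive_gb, usd_per_gb):
    """Largest RAID 5 array of identical drives that fits the budget (drives only)."""
    drive_cost = drive_gb * usd_per_gb
    drives = int(budget_usd // drive_cost)
    if drives < 3:
        return None  # RAID 5 needs at least three drives
    usable_gb = (drives - 1) * drive_gb  # one drive's worth of capacity goes to parity
    return {"drives": drives, "usable_gb": usable_gb, "cost_usd": drives * drive_cost}


if __name__ == "__main__":
    for price in (10, 20):
        print(f"${price}/GB:", raid5_plan(10_000, 64, price))
```

At the quoted prices, the $10K ceiling buys somewhere between a few hundred gigabytes and just under a terabyte of usable space, so the approach fits only if the database stays within that range.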

Developing high-performance and scalable zend framework website [on hold]

- by Daniel

We are going to develop an ads website like http://www.gumtree.com/ (it will not be like this one, but just to give you an idea) and we are having some issues regarding performance and scalability. We are planning on using Zend Framework for this project, but this is all that I'm sure of at this point. I don't think a classic approach like Zend Framework (PHP) + MySQL + Memcache + jQuery (and I would throw Doctrine 2 in there too) will result in a high-performance application. I was thinking of making this a RESTful application (with Zend Framework) + NGINX (or maybe MongoDB) + Memcache (or eAccelerator; I understand this will create problems with scalability on multiple servers) + jQuery, or maybe throw Backbone.js in there, a CDN for static content, a server for images and a scalable server for the requests and the rest. My questions are:

- What do you think about my approach?
- What solutions would you recommend for developing a high-performance, scalable application expected to have a lot of traffic using PHP (Zend Framework 2)? I would be interested in your approach.

I should note that I'm a Zend developer and have been working with Zend for over 3 years, which is why I'm choosing it.

Developing high-performance and scalable zend framework website

- by Daniel

We are going to develop an ads website like http://www.gumtree.com/ (it will not be like this one, but just to give you an idea) and we are having some issues regarding performance and scalability. We are planning on using Zend Framework for this project, but this is all that I'm sure of at this point. I don't think a classic approach like Zend Framework (PHP) + MySQL + Memcache + jQuery (and I would throw Doctrine 2 in there too) will result in a high-performance application. I was thinking of making this a RESTful application (with Zend Framework) + NGINX (or maybe MongoDB) + Memcache (or eAccelerator; I understand this will create problems with scalability on multiple servers) + jQuery, a CDN for static content, a server for images and a scalable server for the requests and the rest. My questions are:

- What do you think about my approach?
- What solutions would you recommend in terms of server approach (MySQL, NGINX, MongoDB or pgsql) for a scalable application expected to have a lot of traffic using PHP? I would be interested in your approach.

Note: I'm a Zend Framework developer and don't have much experience with the server side (to determine what would be the best solution for my scalable application).

