Search Results

Search found 3912 results on 157 pages for 'distributed caching'.

Page 1/157 | 1 2 3 4 5 6 7 8 9 10 11 12  | Next Page >

  • Windows Azure Evolution – Caching (Preview)

    - by Shaun
    Caching is a popular topic when we are building a high performance and high scalable system not only on top of the cloud platform but the on-premise environment as well. On March 2011 the Windows Azure AppFabric Caching had been production launched. It provides an in-memory, distributed caching service over the cloud. And now, in this June 2012 update, the cache team announce a grand new caching solution on Windows Azure, which is called Windows Azure Caching (Preview). And the original Windows Azure AppFabric Caching was renamed to Windows Azure Shared Caching.   What’s Caching (Preview) If you had been using the Shared Caching you should know that it is constructed by a bunch of cache servers. And when you want to use you should firstly create a cache account from the developer portal and specify the size you want to use, which means how much memory you can use to store your data that wanted to be cached. Then you can add, get and remove them through your code through the cache URL. The Shared Caching is a multi-tenancy system which host all cached items across all users. So you don’t know which server your data was located. This caching mode works well and can take most of the cases. But it has some problems. The first one is the performance. Since the Shared Caching is a multi-tenancy system, which means all cache operations should go through the Shared Caching gateway and then routed to the server which have the data your are looking for. Even though there are some caches in the Shared Caching system it also takes time from your cloud services to the cache service. Secondary, the Shared Caching service works as a block box to the developer. The only thing we know is my cache endpoint, and that’s all. Someone may satisfied since they don’t want to care about anything underlying. But if you need to know more and want more control that’s impossible in the Shared Caching. The last problem would be the price and cost-efficiency. You pay the bill based on how much cache you requested per month. But when we host a web role or worker role, it seldom consumes all of the memory and CPU in the virtual machine (service instance). If using Shared Caching we have to pay for the cache service while waste of some of our memory and CPU locally. Since the issues above Microsoft offered a new caching mode over to us, which is the Caching (Preview). Instead of having a separated cache service, the Caching (Preview) leverage the memory and CPU in our cloud services (web role and worker role) as the cache clusters. Hence the Caching (Preview) runs on the virtual machines which hosted or near our cloud applications. Without any gateway and routing, since it located in the same data center and same racks, it provides really high performance than the Shared Caching. The Caching (Preview) works side-by-side to our application, initialized and worked as a Windows Service running in the virtual machines invoked by the startup tasks from our roles, we could get more information and control to them. And since the Caching (Preview) utilizes the memory and CPU from our existing cloud services, so it’s free. What we need to pay is the original computing price. And the resource on each machines could be used more efficiently.   Enable Caching (Preview) It’s very simple to enable the Caching (Preview) in a cloud service. Let’s create a new windows azure cloud project from Visual Studio and added an ASP.NET Web Role. Then open the role setting and select the Caching page. This is where we enable and configure the Caching (Preview) on a role. To enable the Caching (Preview) just open the “Enable Caching (Preview Release)” check box. And then we need to specify which mode of the caching clusters we want to use. There are two kinds of caching mode, co-located and dedicate. The co-located mode means we use the memory in the instances we run our cloud services (web role or worker role). By using this mode we must specify how many percentage of the memory will be used as the cache. The default value is 30%. So make sure it will not affect the role business execution. The dedicate mode will use all memory in the virtual machine as the cache. In fact it will reserve some for operation system, azure hosting etc.. But it will try to use as much as the available memory to be the cache. As you can see, the Caching (Preview) was defined based on roles, which means all instances of this role will apply the same setting and play as a whole cache pool, and you can consume it by specifying the name of the role, which I will demonstrate later. And in a windows azure project we can have more than one role have the Caching (Preview) enabled. Then we will have more caches. For example, let’s say I have a web role and worker role. The web role I specified 30% co-located caching and the worker role I specified dedicated caching. If I have 3 instances of my web role and 2 instances of my worker role, then I will have two caches. As the figure above, cache 1 was contributed by three web role instances while cache 2 was contributed by 2 worker role instances. Then we can add items into cache 1 and retrieve it from web role code and worker role code. But the items stored in cache 1 cannot be retrieved from cache 2 since they are isolated. Back to our Visual Studio we specify 30% of co-located cache and use the local storage emulator to store the cache cluster runtime status. Then at the bottom we can specify the named caches. Now we just use the default one. Now we had enabled the Caching (Preview) in our web role settings. Next, let’s have a look on how to consume our cache.   Consume Caching (Preview) The Caching (Preview) can only be consumed by the roles in the same cloud services. As I mentioned earlier, a cache contributed by web role can be connected from a worker role if they are in the same cloud service. But you cannot consume a Caching (Preview) from other cloud services. This is different from the Shared Caching. The Shared Caching is opened to all services if it has the connection URL and authentication token. To consume the Caching (Preview) we need to add some references into our project as well as some configuration in the Web.config. NuGet makes our life easy. Right click on our web role project and select “Manage NuGet packages”, and then search the package named “WindowsAzure.Caching”. In the package list install the “Windows Azure Caching Preview”. It will download all necessary references from the NuGet repository and update our Web.config as well. Open the Web.config of our web role and find the “dataCacheClients” node. Under this node we can specify the cache clients we are going to use. For each cache client it will use the role name to identity and find the cache. Since we only have this web role with the Caching (Preview) enabled so I pasted the current role name in the configuration. Then, in the default page I will add some code to show how to use the cache. I will have a textbox on the page where user can input his or her name, then press a button to generate the email address for him/her. And in backend code I will check if this name had been added in cache. If yes I will return the email back immediately. Otherwise, I will sleep the tread for 2 seconds to simulate the latency, then add it into cache and return back to the page. 1: protected void btnGenerate_Click(object sender, EventArgs e) 2: { 3: // check if name is specified 4: var name = txtName.Text; 5: if (string.IsNullOrWhiteSpace(name)) 6: { 7: lblResult.Text = "Error. Please specify name."; 8: return; 9: } 10:  11: bool cached; 12: var sw = new Stopwatch(); 13: sw.Start(); 14:  15: // create the cache factory and cache 16: var factory = new DataCacheFactory(); 17: var cache = factory.GetDefaultCache(); 18:  19: // check if the name specified is in cache 20: var email = cache.Get(name) as string; 21: if (email != null) 22: { 23: cached = true; 24: sw.Stop(); 25: } 26: else 27: { 28: cached = false; 29: // simulate the letancy 30: Thread.Sleep(2000); 31: email = string.Format("{0}@igt.com", name); 32: // add to cache 33: cache.Add(name, email); 34: } 35:  36: sw.Stop(); 37: lblResult.Text = string.Format( 38: "Cached = {0}. Duration: {1}s. {2} => {3}", 39: cached, sw.Elapsed.TotalSeconds.ToString("0.00"), name, email); 40: } The Caching (Preview) can be used on the local emulator so we just F5. The first time I entered my name it will take about 2 seconds to get the email back to me since it was not in the cache. But if we re-enter my name it will be back at once from the cache. Since the Caching (Preview) is distributed across all instances of the role, so we can scaling-out it by scaling-out our web role. Just use 2 instances and tweak some code to show the current instance ID in the page, and have another try. Then we can see the cache can be retrieved even though it was added by another instance.   Consume Caching (Preview) Across Roles As I mentioned, the Caching (Preview) can be consumed by all other roles within the same cloud service. For example, let’s add another web role in our cloud solution and add the same code in its default page. In the Web.config we add the cache client to one enabled in the last role, by specifying its role name here. Then we start the solution locally and go to web role 1, specify the name and let it generate the email to us. Since there’s no cache for this name so it will take about 2 seconds but will save the email into cache. And then we go to web role 2 and specify the same name. Then you can see it retrieve the email saved by the web role 1 and returned back very quickly. Finally then we can upload our application to Windows Azure and test again. Make sure you had changed the cache cluster status storage account to the real azure account.   More Awesome Features As a in-memory distributed caching solution, the Caching (Preview) has some fancy features I would like to highlight here. The first one is the high availability support. This is the first time I have heard that a distributed cache support high availability. In the distributed cache world if a cache cluster was failed, the data it stored will be lost. This behavior was introduced by Memcached and is followed by almost all distributed cache productions. But Caching (Preview) provides high availability, which means you can specify if the named cache will be backup automatically. If yes then the data belongs to this named cache will be replicated on another role instance of this role. Then if one of the instance was failed the data can be retrieved from its backup instance. To enable the backup just open the Caching page in Visual Studio. In the named cache you want to enable backup, change the Backup Copies value from 0 to 1. The value of Backup Copies only for 0 and 1. “0” means no backup and no high availability while “1” means enabled high availability with backup the data into another instance. But by using the high availability feature there are something we need to make sure. Firstly the high availability does NOT means the data in cache will never be lost for any kind of failure. For example, if we have a role with cache enabled that has 10 instances, and 9 of them was failed, then most of the cached data will be lost since the primary and backup instance may failed together. But normally is will not be happened since MS guarantees that it will use the instance in the different fault domain for backup cache. Another one is that, enabling the backup means you store two copies of your data. For example if you think 100MB memory is OK for cache, but you need at least 200MB if you enabled backup. Besides the high availability, the Caching (Preview) support more features introduced in Windows Server AppFabric Caching than the Windows Azure Shared Caching. It supports local cache with notification. It also support absolute and slide window expiration types as well. And the Caching (Preview) also support the Memcached protocol as well. This means if you have an application based on Memcached, you can use Caching (Preview) without any code changes. What you need to do is to change the configuration of how you connect to the cache. Similar as the Windows Azure Shared Caching, MS also offers the out-of-box ASP.NET session provider and output cache provide on top of the Caching (Preview).   Summary Caching is very important component when we building a cloud-based application. In the June 2012 update MS provides a new cache solution named Caching (Preview). Different from the existing Windows Azure Shared Caching, Caching (Preview) runs the cache cluster within the role instances we have deployed to the cloud. It gives more control, more performance and more cost-effect. So now we have two caching solutions in Windows Azure, the Shared Caching and Caching (Preview). If you need a central cache service which can be used by many cloud services and web sites, then you have to use the Shared Caching. But if you only need a fast, near distributed cache, then you’d better use Caching (Preview).   Hope this helps, Shaun All documents and related graphics, codes are provided "AS IS" without warranty of any kind. Copyright © Shaun Ziyan Xu. This work is licensed under the Creative Commons License.

    Read the article

  • Recommendations for distributed processing/distributed storage systems

    - by Eddie
    At my organization we have a processing and storage system spread across two dozen linux machines that handles over a petabyte of data. The system right now is very ad-hoc; processing automation and data management is handled by a collection of large perl programs on independent machines. I am looking at distributed processing and storage systems to make it easier to maintain, evenly distribute load and data with replication, and grow in disk space and compute power. The system needs to be able to handle millions of files, varying in size between 50 megabytes to 50 gigabytes. Once created, the files will not be appended to, only replaced completely if need be. The files need to be accessible via HTTP for customer download. Right now, processing is automated by perl scripts (that I have complete control over) which call a series of other programs (that I don't have control over because they are closed source) that essentially transforms one data set into another. No data mining happening here. Here is a quick list of things I am looking for: Reliability: These data must be accessible over HTTP about 99% of the time so I need something that does data replication across the cluster. Scalability: I want to be able to add more processing power and storage easily and rebalance the data on across the cluster. Distributed processing: Easy and automatic job scheduling and load balancing that fits with processing workflow I briefly described above. Data location awareness: Not strictly required but desirable. Since data and processing will be on the same set of nodes I would like the job scheduler to schedule jobs on or close to the node that the data is actually on to cut down on network traffic. Here is what I've looked at so far: Storage Management: GlusterFS: Looks really nice and easy to use but doesn't seem to have a way to figure out what node(s) a file actually resides on to supply as a hint to the job scheduler. GPFS: Seems like the gold standard of clustered filesystems. Meets most of my requirements except, like glusterfs, data location awareness. Ceph: Seems way to immature right now. Distributed processing: Sun Grid Engine: I have a lot of experience with this and it's relatively easy to use (once it is configured properly that is). But Oracle got its icy grip around it and it no longer seems very desirable. Both: Hadoop/HDFS: At first glance it looked like hadoop was perfect for my situation. Distributed storage and job scheduling and it was the only thing I found that would give me the data location awareness that I wanted. But I don't like the namename being a single point of failure. Also, I'm not really sure if the MapReduce paradigm fits the type of processing workflow that I have. It seems like you need to write all your software specifically for MapReduce instead of just using Hadoop as a generic job scheduler. OpenStack: I've done some reading on this but I'm having trouble deciding if it fits well with my problem or not. Does anyone have opinions or recommendations for technologies that would fit my problem well? Any suggestions or advise would be greatly appreciated. Thanks!

    Read the article

  • Building a Redundant / Distributed Application

    - by MattW
    This is more of a "point me in the right direction" question. My team of three and I have built a hosted web app that queues and routes customer chat requests to available customer service agents (It does other things as well, but this is enough background to illustrate the issue). The basic dev architecture today is: a single page ajax web UI (ASP.NET MVC) with floating chat windows (think Gmail) a backend Windows service to queue and route the chat requests this service also logs the chats, calculates service levels, etc a Comet server product that routes data between the web frontend and the backend Windows service this also helps us detect which Agents are still connected (online) And our hardware architecture today is: 2 servers to host the web UI portion of the application a load balancer to route requests to the 2 different web app servers a third server to host the SQL Server DB and the backend Windows service responsible for queuing / delivering chats So as it stands today, one of the web app servers could go down and we would be ok. However, if something would happen to the SQL Server / Windows Service server we would be boned. My question - how can I make this backend Windows service logic be able to be spread across multiple machines (distributed)? The Windows service is written to accept requests from the Comet server, check for available Agents, and route the chat to those agents. How can I make this more distributed? How can I make it so that I can distribute the work of the backend Windows service can be spread across multiple machines for redundancy and uptime purposes? Will I need to re-write it with distributed computing in mind? I should also note that I am hosting all of this on Rackspace Cloud instances - so maybe it is something I should be less concerned about? Thanks in advance for any help!

    Read the article

  • Synchronizing local and remote cache in distributed caching

    - by ltfishie
    With a distributed cache, a subset of the cache is kept locally while the rest is held remotely. In a get operation, if the entry is not available locally, the remote cache will be used and and the entry is added to local cache. In a put operation, both the local cache and remote cache are updated. Other nodes in the cluster also need to be notified to invalidate their local cache as well. What's a simplest way to achieve this if you implemented it yourself, assuming that nodes are not aware of each other. Edit My current implementation goes like this: Each cache entry contains a time stamp. Put operation will update local cache and remote cache Get operation will try local cache then remote cache A background thread on each node will check remote cache periodically for each entry in local cache. If the timestamp on remote is newer overwrite the local. If entry is not found in remote, delete it from local.

    Read the article

  • Data caching in ASP.Net applications

    - by nikolaosk
    In this post I will continue my series of posts on caching. You can read my other post in Output caching here .You can read on how to cache a page depending on the user's browser language. Output caching has its place as a caching mechanism. But right now I will focus on data caching .The advantages of data caching are well known but I will highlight the main points. We have improvements in response times We have reduced database round trips We have different levels of caching and it is up to us...(read more)

    Read the article

  • Java - System design with distributed Queues and Locks

    - by sunny
    Looking for inputs to evaluate a design for a system (java) which would have a distributed queue serving several (but not too many) nodes. These nodes would process objects present in the distributed queue and on occasion require a distributed lock across the cluster on an arbitrary (distributed) data structures. These (distributed) data structures could potentially lie in a distributed cache. Eliminating Terracotta (DSO),Hazelcast and Akka what could be alternative choices. Currently considering zookeeper as a distributed locking mechanism. Since the recommendation of a znode is not to exceed the 1M size , the understanding is that zookeeper should not be used a distributed queue. And also from Netflix curator tech note 4. So should a distributed cache, say like memcached, or redis be used to emulate a distributed queue ? i.e. The distributed queue will be stored in the caches and will be locked cluster-wide via zookeeper. Are there potential pitfalls with this high-level approach. The objects don't need to be taken off the queue. The object will pass through a lifecycle which will determine its removal from the queue. There would be about 10k+ objects in a queue at a given time changing states and any node could service one stage of the object's lifecycle. (Although not strictly necessary .. i.e. one node could serve the entire lifecycle if that is more efficient.) Any suggestions/alternatives ? sidenote: new to zookeeper ; redis etc.

    Read the article

  • Caching in the .NET Stack: Inside-Out

    - by Elton Stoneman
    Originally posted on: http://geekswithblogs.net/EltonStoneman/archive/2013/06/28/caching-in-the-.net-stack-inside-out.aspxI'm delighted to have my first course published on Pluralsight - Caching in the .NET Stack: Inside-out.   It's a pretty comprehensive look at caching in .NET solutions. The first half covers using local, remote and persistent cache stores inside the solution, including the .NET MemoryCache, NCache Express, AppFabric Caching, memcached, Azure Table Storage and local disk stores. The second half covers caching outside the solution in HTTP clients and proxies, and how to set up ASP.NET WebForms, MVC, Web API and WCF projects to use HTTP validation and expiration caching.   The course takes a hands-on approach, starting with a distributed solution that has no caching, analysing key points which can benefit from caching, and adding different types of cache. At the end of the course I run through a set of before and after performance tests, stressing the solution under load. Without caching and with 60 concurrent users the page response time maxes out at 18 seconds - with caching that falls to 2 seconds, so it's a huge improvement from very little effort. I’d be glad to hear feedback if you watch the course, especially if it’s as positive as my editor’s.

    Read the article

  • Sixeyed.Caching available now on NuGet and GitHub!

    - by Elton Stoneman
    Originally posted on: http://geekswithblogs.net/EltonStoneman/archive/2013/10/22/sixeyed.caching-available-now-on-nuget-and-github.aspxThe good guys at Pluralsight have okayed me to publish my caching framework (as seen in Caching in the .NET Stack: Inside-Out) as an open-source library, and it’s out now. You can get it here: Sixeyed.Caching source code on GitHub, and here: Sixeyed.Caching package v1.0.0 on NuGet. If you haven’t seen the course, there’s a preview here on YouTube: In-Process and Out-of-Process Caches, which gives a good flavour. The library is a wrapper around various cache providers, including the .NET MemoryCache, AppFabric cache, and  memcached*. All the wrappers inherit from a base class which gives you a set of common functionality against all the cache implementations: •    inherits OutputCacheProvider, so you can use your chosen cache provider as an ASP.NET output cache; •    serialization and encryption, so you can configure whether you want your cache items serialized (XML, JSON or binary) and encrypted; •    instrumentation, you can optionally use performance counters to monitor cache attempts and hits, at a low level. The framework wraps up different caches into an ICache interface, and it lets you use a provider directly like this: Cache.Memory.Get<RefData>(refDataKey); - or with configuration to use the default cache provider: Cache.Default.Get<RefData>(refDataKey); The library uses Unity’s interception framework to implement AOP caching, which you can use by flagging methods with the [Cache] attribute: [Cache] public RefData GetItem(string refDataKey) - and you can be more specific on the required cache behaviour: [Cache(CacheType=CacheType.Memory, Days=1] public RefData GetItem(string refDataKey) - or really specific: [Cache(CacheType=CacheType.Disk, SerializationFormat=SerializationFormat.Json, Hours=2, Minutes=59)] public RefData GetItem(string refDataKey) Provided you get instances of classes with cacheable methods from the container, the attributed method results will be cached, and repeated calls will be fetched from the cache. You can also set a bunch of cache defaults in application config, like whether to use encryption and instrumentation, and whether the cache system is enabled at all: <sixeyed.caching enabled="true"> <performanceCounters instrumentCacheTotalCounts="true" instrumentCacheTargetCounts="true" categoryNamePrefix ="Sixeyed.Caching.Tests"/> <encryption enabled="true" key="1234567890abcdef1234567890abcdef" iv="1234567890abcdef"/> <!-- key must be 32 characters, IV must be 16 characters--> </sixeyed.caching> For AOP and methods flagged with the cache attribute, you can override the compile-time cache settings at runtime with more config (keyed by the class and method name): <sixeyed.caching enabled="true"> <targets> <target keyPrefix="MethodLevelCachingStub.GetRandomIntCacheConfiguredInternal" enabled="false"/> <target keyPrefix="MethodLevelCachingStub.GetRandomIntCacheExpiresConfiguredInternal" seconds="1"/> </targets> It’s released under the MIT license, so you can use it freely in your own apps and modify as required. I’ll be adding more content to the GitHub wiki, which will be the main source of documentation, but for now there’s an FAQ to get you started. * - in the course the framework library also wraps NCache Express, but there's no public redistributable library that I can find, so it's not in Sixeyed.Caching.

    Read the article

  • Distributed storage and computing

    - by Tim van Elteren
    Dear Serverfault community, After researching a number of distributed file systems for deployment in a production environment with the main purpose of performing both batch and real-time distributed computing I've identified the following list as potential candidates, mainly on maturity, license and support: Ceph Lustre GlusterFS HDFS FhGFS MooseFS XtreemFS The key properties that our system should exhibit: an open source, liberally licensed, yet production ready, e.g. a mature, reliable, community and commercially supported solution; ability to run on commodity hardware, preferably be designed for it; provide high availability of the data with the most focus on reads; high scalability, so operation over multiple data centres, possibly on a global scale; removal of single points of failure with the use of replication and distribution of (meta-)data, e.g. provide fault-tolerance. The sensitivity points that were identified, and resulted in the following questions, are: transparency to the processing layer / application with respect to data locality, e.g. know where data is physically located on a server level, mainly for resource allocation and fast processing, high performance, how can this be accomplished? Do you from experience know what solutions provide this transparency and to what extent? posix compliance, or conformance, is mentioned on the wiki pages of most of the above listed solutions. The question here mainly is, how relevant is support for the posix standard? Hadoop for example isn't posix compliant by design, what are the pro's and con's? what about the difference between synchronous and asynchronous opeartion of a distributed file system. Though a synchronous distributed file system has the preference because of reliability it also imposes certain limitations with respect to scalability. What would be, from your expertise, the way to go on this? I'm looking forward to your replies. Thanks in advance! :) With kind regards, Tim van Elteren

    Read the article

  • Idea to develop a caching server between IIS and SQL Server

    - by John
    I work on a few high traffic websites that all share the same database and that are all heavily database driven. Our SQL server is max-ed out and, although we have already implemented many changes that have helped but the server is still working too hard. We employ some caching in our website but the type of queries we use negate using SQL dependency caching. We tried SQL replication to try and kind of load balance but that didn't prove very successful because the replication process is quite demanding on the servers too and it needed to be done frequently as it is important that data is up to date. We do use a Varnish web caching server (Linux based) to take a bit of the load off both the web and database server but as a lot of the sites are customised based on the user we can only do so much. Anyway, the reason for this question... Varnish gave me an idea for a possible application that might help in this situation. Just like Varnish sits between a web browser and the web server and caches response from the web server, I was wondering about the possibility of creating something that sits between the web server and the database server. Imagine that all SQL queries go through this SQL caching server. If it's a first time query then it will get recorded, and the result requested from the SQL server and stored locally on the cache server. If it's a repeat request within a set time then the result gets retrieved from the local copy without the query being sent to the SQL server. The caching server could also take advantage of SQL dependency caching notifications. This seems like a good idea in theory. There's still the same amount of data moving back and forward from the web server, but the SQL server is relieved of the work of processing the repeat queries. I wonder about how difficult it would be to build a service that sort of emulates requests and responses from SQL server, whether SQL server's own caching is doing enough of this already that this wouldn't be a benefit, or even if someone has done this before and I haven't found it? I would welcome any feedback or any references to any relevant projects.

    Read the article

  • System.Web.Caching vs. Enterprise Library Caching Block

    - by ESV
    For a .NET component that will be used in both web applications and rich client applications, there seem to be two obvious options for caching: System.Web.Caching or the Ent. Lib. Caching Block. What do you use? Why? System.Web.Caching Is this safe to use outside of web apps? I've seen mixed information, but I think the answer is maybe-kind-of-not-really. a KB article warning against 1.0 and 1.1 non web app use The 2.0 page has a comment that indicates it's OK: http://msdn.microsoft.com/en-us/library/system.web.caching.cache(VS.80).aspx Scott Hanselman is creeped out by the notion The 3.5 page includes a warning against such use Rob Howard encouraged use outside of web apps I don't expect to use one of its highlights, SqlCacheDependency, but the addition of CacheItemUpdateCallback in .NET 3.5 seems like a Really Good Thing. Enterprise Library Caching Application Block other blocks are already in use so the dependency already exists cache persistence isn't necessary; regenerating the cache on restart is OK Some cache items should always be available, but be refreshed periodically. For these items, getting a callback after an item has been removed is not very convenient. It looks like a client will have to just sleep and poll until the cache item is repopulated. Memcached for Win32 + .NET client What are the pros and cons when you don't need a distributed cache?

    Read the article

  • Alternative to distributed caching

    - by Chen
    Hi, There is a technical requirement to scale a new system easily. This new system consists of three tiered applications (as a batch processors). Each tier will contains at least 2 servers with the same application resides on each server. So, when one of the tier reaches peak performance, we could extend the scalability easily by adding a new server and the same application to off-load some of the processing loads. The problem is that one or two of the three tiers require heavy caching (about 3 million records and increasing). I'm thinking of using distributed caching system to overcome this problem but the new distributed caching system will means an additional point of failure as applications now need to interact with additional caching systems for processing. I'm currently looking at ncache but just wondering if there is an alternatives to this problem? or is there any other comparable distributed caching system that maybe similar or better than ncache and provide enterprise supports too? Thanks, Chen

    Read the article

  • Distributed cache and improvement

    - by philipl
    Have this question from interview: Web Service function given x static HashMap map (singleton created) if (!map.containsKey(x)) { perform some function to retrieve result y map.put(x, y); } return y; The interviewer asked general question such as what is wrong with this distributed cache implementation. Then asked how to improve on it, due to distributed servers will have different cached key pairs in the map. There are simple mistakes to be pointed out about synchronization and key object, but what really startled me was that this guy thinks that moving to database implementation solves the problem that different servers will have different map content, i.e., the situation when value x is not on server A but on server B, therefore redundant data has to be retrieved in server A. Does his thinking make any sense? (As I understand this is the basic cons for distributed cache against database model, seems he does not understand it at all) What is the typical solution for the cache growth issue (weak reference?) and sync issue (do not know which server has the key already cached - use load balancing)? Thanks

    Read the article

  • Elastic versus Distributed in caching.

    - by Mike Reys
    Until now, I hadn't heard about Elastic Caching yet. Today I read Mike Gualtieri's Blog entry. I immediately thought about Oracle Coherence and got a little scare throughout the reading. Elastic Caching is the next step after Distributed Caching. As we've always positioned Coherence as a Distributed Cache, I thought for a brief instance that Oracle had missed a new trend/technology. But then I started reading the characteristics of an Elastic Cache. Forrester definition: Software infrastructure that provides application developers with data caching services that are distributed across two or more server nodes that consistently perform as volumes grow can be scaled without downtime provide a range of fault-tolerance levels Hey wait a minute, doesn't Coherence fullfill all these requirements? Oh yes, I think it does! The next defintion in the article is about Elastic Application Platforms. This is mainly more of the same with the addition of code execution. Now there is analytics functionality in Oracle Coherence. The analytics capability provides data-centric functions like distributed aggregation, searching and sorting. Coherence also provides continuous querying and event-handling. I think that when it comes to providing an Elastic Application Platform (as in the Forrester definition), Oracle is close, nearly there. And what's more, as Elastic Platform is the next big thing towards the big C word, Oracle Coherence makes you cloud-ready ;-) There you go! Find more info on Oracle Coherence here.

    Read the article

  • SQL SERVER – Faster SQL Server Databases and Applications – Power and Control with SafePeak Caching Options

    - by Pinal Dave
    Update: This blog post is written based on the SafePeak, which is available for free download. Today, I’d like to examine more closely one of my preferred technologies for accelerating SQL Server databases, SafePeak. Safepeak’s software provides a variety of advanced data caching options, techniques and tools to accelerate the performance and scalability of SQL Server databases and applications. I’d like to look more closely at some of these options, as some of these capabilities could help you address lagging database and performance on your systems. To better understand the available options, it is best to start by understanding the difference between the usual “Basic Caching” vs. SafePeak’s “Dynamic Caching”. Basic Caching Basic Caching (or the stale and static cache) is an ability to put the results from a query into cache for a certain period of time. It is based on TTL, or Time-to-live, and is designed to stay in cache no matter what happens to the data. For example, although the actual data can be modified due to DML commands (update/insert/delete), the cache will still hold the same obsolete query data. Meaning that with the Basic Caching is really static / stale cache.  As you can tell, this approach has its limitations. Dynamic Caching Dynamic Caching (or the non-stale cache) is an ability to put the results from a query into cache while maintaining the cache transaction awareness looking for possible data modifications. The modifications can come as a result of: DML commands (update/insert/delete), indirect modifications due to triggers on other tables, executions of stored procedures with internal DML commands complex cases of stored procedures with multiple levels of internal stored procedures logic. When data modification commands arrive, the caching system identifies the related cache items and evicts them from cache immediately. In the dynamic caching option the TTL setting still exists, although its importance is reduced, since the main factor for cache invalidation (or cache eviction) become the actual data updates commands. Now that we have a basic understanding of the differences between “basic” and “dynamic” caching, let’s dive in deeper. SafePeak: A comprehensive and versatile caching platform SafePeak comes with a wide range of caching options. Some of SafePeak’s caching options are automated, while others require manual configuration. Together they provide a complete solution for IT and Data managers to reach excellent performance acceleration and application scalability for  a wide range of business cases and applications. Automated caching of SQL Queries: Fully/semi-automated caching of all “read” SQL queries, containing any types of data, including Blobs, XMLs, Texts as well as all other standard data types. SafePeak automatically analyzes the incoming queries, categorizes them into SQL Patterns, identifying directly and indirectly accessed tables, views, functions and stored procedures; Automated caching of Stored Procedures: Fully or semi-automated caching of all read” stored procedures, including procedures with complex sub-procedure logic as well as procedures with complex dynamic SQL code. All procedures are analyzed in advance by SafePeak’s  Metadata-Learning process, their SQL schemas are parsed – resulting with a full understanding of the underlying code, objects dependencies (tables, views, functions, sub-procedures) enabling automated or semi-automated (manually review and activate by a mouse-click) cache activation, with full understanding of the transaction logic for cache real-time invalidation; Transaction aware cache: Automated cache awareness for SQL transactions (SQL and in-procs); Dynamic SQL Caching: Procedures with dynamic SQL are pre-parsed, enabling easy cache configuration, eliminating SQL Server load for parsing time and delivering high response time value even in most complicated use-cases; Fully Automated Caching: SQL Patterns (including SQL queries and stored procedures) that are categorized by SafePeak as “read and deterministic” are automatically activated for caching; Semi-Automated Caching: SQL Patterns categorized as “Read and Non deterministic” are patterns of SQL queries and stored procedures that contain reference to non-deterministic functions, like getdate(). Such SQL Patterns are reviewed by the SafePeak administrator and in usually most of them are activated manually for caching (point and click activation); Fully Dynamic Caching: Automated detection of all dependent tables in each SQL Pattern, with automated real-time eviction of the relevant cache items in the event of “write” commands (a DML or a stored procedure) to one of relevant tables. A default setting; Semi Dynamic Caching: A manual cache configuration option enabling reducing the sensitivity of specific SQL Patterns to “write” commands to certain tables/views. An optimization technique relevant for cases when the query data is either known to be static (like archive order details), or when the application sensitivity to fresh data is not critical and can be stale for short period of time (gaining better performance and reduced load); Scheduled Cache Eviction: A manual cache configuration option enabling scheduling SQL Pattern cache eviction based on certain time(s) during a day. A very useful optimization technique when (for example) certain SQL Patterns can be cached but are time sensitive. Example: “select customers that today is their birthday”, an SQL with getdate() function, which can and should be cached, but the data stays relevant only until 00:00 (midnight); Parsing Exceptions Management: Stored procedures that were not fully parsed by SafePeak (due to too complex dynamic SQL or unfamiliar syntax), are signed as “Dynamic Objects” with highest transaction safety settings (such as: Full global cache eviction, DDL Check = lock cache and check for schema changes, and more). The SafePeak solution points the user to the Dynamic Objects that are important for cache effectiveness, provides easy configuration interface, allowing you to improve cache hits and reduce cache global evictions. Usually this is the first configuration in a deployment; Overriding Settings of Stored Procedures: Override the settings of stored procedures (or other object types) for cache optimization. For example, in case a stored procedure SP1 has an “insert” into table T1, it will not be allowed to be cached. However, it is possible that T1 is just a “logging or instrumentation” table left by developers. By overriding the settings a user can allow caching of the problematic stored procedure; Advanced Cache Warm-Up: Creating an XML-based list of queries and stored procedure (with lists of parameters) for periodically automated pre-fetching and caching. An advanced tool allowing you to handle more rare but very performance sensitive queries pre-fetch them into cache allowing high performance for users’ data access; Configuration Driven by Deep SQL Analytics: All SQL queries are continuously logged and analyzed, providing users with deep SQL Analytics and Performance Monitoring. Reduce troubleshooting from days to minutes with database objects and SQL Patterns heat-map. The performance driven configuration helps you to focus on the most important settings that bring you the highest performance gains. Use of SafePeak SQL Analytics allows continuous performance monitoring and analysis, easy identification of bottlenecks of both real-time and historical data; Cloud Ready: Available for instant deployment on Amazon Web Services (AWS). As you can see, there are many options to configure SafePeak’s SQL Server database and application acceleration caching technology to best fit a lot of situations. If you’re not familiar with their technology, they offer free-trial software you can download that comes with a free “help session” to help get you started. You can access the free trial here. Also, SafePeak is available to use on Amazon Cloud. Reference: Pinal Dave (http://blog.sqlauthority.com)Filed under: PostADay, SQL, SQL Authority, SQL Performance, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

    Read the article

  • Building a distributed system on Amazon Web Services

    - by Songo
    Would simply using AWS to build an application make this application a distributed system? For example if someone uses RDS for the database server, EC2 for the application itself and S3 for hosting user uploaded media, does that make it a distributed system? If not, then what should it be called and what is this application lacking for it to be distributed? Update Here is my take on the application to clarify my approach to building the system: The application I'm building is a social game for Facebook. I developed the application locally on a LAMP stack using Symfony2. For production I used an a single EC2 Micro instance for hosting the app itself, RDS for hosting my database, S3 for the user uploaded files and CloudFront for hosting static content. I know this may sound like a naive approach, so don't be shy to express your ideas.

    Read the article

  • Design a Distributed System

    - by Bonton255
    I am preparing for an interview on Distributed Systems. I have gone through a lot of text and understand the basics of the area. However, I need some examples of discussions on designing a distributed system given a scenario. For example, if I were to design a distributed system to calculate if a number N is primary or not, what will the be design of the system, what will be the impact of network latency, CPU performance, node failure, addition of nodes, time synchronization etc. If you guys could present your in-depth thoughts on this example, or point me to some similar discussion, that would be really helpful.

    Read the article

  • Can an issue tracking system be distributed?

    - by Klaim
    I was thinking about issue tracking software like Redmine, Trac or even the one that is in Fossil and something hit me: Is there a reason why Redmine and Trac are not possible to be distributed? Or maybe it's possible and I just don't know how it's possible? If it's not possible, why? By distributed I mean like Facebook or Google or other applications that effectively runs on multiple hardware a the same time but share data.

    Read the article

  • Best practices for caching search queries

    - by David Esteves
    I am trying to improve performance of my ASP.net Web Api by adding a data cache but I am not sure how exactly to go about it as it seems to be more complex than most caching scenarios. An example is I have a table of Locations and an api to retrieve locations via search, for an autocomplete. /api/location/Londo and the query would be something like SELECT * FROM Locations WHERE Name like 'Londo%' These locations change very infrequently so I would like to cache them to prevent trips to the database for no real reason and improve the response time. Looking at caching options I am using the Windows Azure Appfabric system, the problem is it's just a key/value cache. Since I can only retrieve items based on keys I couldn't actually use it for this scenario as far as Im aware. Is what I am trying to do bad use of a caching system? Should I try looking into NoSql DB which could possibly run as a cache for something like this to improve performance? Should I just cache the entire table/collection in a single key with a specific data structure which could assist with the searching and then do the search upon retrieval of the data?

    Read the article

  • Caching strategies for entities and collections

    - by Rob West
    We currently have an application framework in which we automatically cache both entities and collections of entities at the business layer (using .NET cache). So the method GetWidget(int id) checks the cache using a key GetWidget_Id_{0} before hitting the database, and the method GetWidgetsByStatusId(int statusId) checks the cache using GetWidgets_Collections_ByStatusId_{0}. If the objects are not in the cache they are retrieved from the database and added to the cache. This approach is obviously quick for read scenarios, and as a blanket approach is quick for us to implement, but requires large numbers of cache keys to be purged when CRUD operations are carried out on entities. Obviously as additional methods are added this impacts performance and the benefits of caching diminish. I'm interested in alternative approaches to handling caching of collections. I know that NHibernate caches a list of the identifiers in the collection rather than the actual entities. Is this an approach other people have tried - what are the pros and cons? In particular I am looking for options that optimise performance and can be implemented automatically through boilerplate generated code (we have our own code generation tool). I know some people will say that caching needs to be done by hand each time to meet the needs of the specific situation but I am looking for something that will get us most of the way automatically.

    Read the article

  • Where is a good place to start to learn about custom caching in .Net

    - by John
    I'm looking to make some performance enhancements to our site, but I'm not sure exactly where to begin. We have some custom object caching, but I think that we can do better. Our Business We aggregate news stories on a news type of web site. We get approximately 500-1000 new stories per week. We have index pages that show various lists of the items and details pages that show the individual stories. Our Current Use case: Getting an Individual Story User makes a request The Data Access Layer(DAL) checks to see if the item is in cache and if item is fresh (15 minutes). If the item is not in cache or is not fresh, retrieve the item from SQL Server, save to cache and return to user. Problems with this approach The pull nature of caching means that users have to pay the waiting cost every time that the cache is refreshed. Once a story is published, it changes infrequently and I think that we should replace the pull model with something better. My initial thoughts My initial thought is that stories should ALL be stored locally in some type of dictionary. (Cache or is there another, better way?). If the story is not found, then make a trip to the database, update the local dictionary and send the item back. Since there may be occasional updates to stories, this should be an entirely process from the user. I watched a video by Brent Ozar, How StackOverflow Scales SQL Server, in which Brent states "the fastest database query is the one that you don't make". Where do I start? At this point, I don't know exactly what the solution is. Is it caching? Is there a better way of using local storage? Do I use a Dictionary, OrderedDictionary, List ? It seems daunting and I'm just looking for some good starting points to learn more about how to do this type of optimization.

    Read the article

  • Distributed Computing Framework (.NET) - Specifically for CPU Instensive operations

    - by StevenH
    I am currently researching the options that are available (both Open Source and Commercial) for developing a distributed application. "A distributed system consists of multiple autonomous computers that communicate through a computer network." Wikipedia The application is focused on distributing highly cpu intensive operations (as opposed to data intensive) so I'm sure MapReduce solutions don't fit the bill. Any framework that you can recommend ( + give a brief summary of any experience or comparison to other frameworks ) would be greatly appreciated. Thanks.

    Read the article

  • Caching strategies - LRU, MRU, Clock-Pro

    - by golgofa
    I am going to write a bachelor's science work on caching strategies and really, can't find any links to specifications or full descriptions of some of them. Only something like summaries from wikipedia. Please, help with some links on LRU, MRU caching and new-one - Clock Pro. Thanks a lot. All links are very useful for me. The purpose of work - is to compare different cache strategies to get more effiency. It based on WebApplication with ejb 2.0, so algorithm's will be implemented there, espesially in ejbLoad() and ejbFindByPrimarKey(). Also, one of aspects of this application - it will use not common scheme of tables in database - it based on metamodel. So, if you had any experience on this topic, i would be grateful to take some of your knowledge)

    Read the article

  • Distributed Development Tools -- (Version control and Project Management)

    - by Macy Abbey
    Hello, I've recently become responsible for choosing which source control and project management software to use for a company that employs me. Currently it uses Jira (project management) and Subversion (version control). I know there are many other options out there -- the ones I know about are all in this article http://mashable.com/2010/07/14/distributed-developer-teams/ . I'm leaning towards recommending they just stay with what they have as it seems workable and any change would have to be worth the cost of switching to say github/basecamp or some other solution. Some details on the team: It's a distributed development shop. Meetings of the whole team in one room are rare. It's currently a very small development team (three developers). The project management software is used by developers and a product manager or two. What are you experiences with version control and project management web applications? Are there any you would recommend and you think are worth the switching cost of time to learn new services / implementing the change? Edit: After educating myself further on the options it appears DVCS offer powerful benefits that may be worth investing in now as opposed to later in the company's lifetime when the switching cost is higher: I'm a Subversion geek, why I should consider or not consider Mercurial or Git or any other DVCS?

    Read the article

  • Distributed Development Tools -- (Version control and Project Management)

    - by Macy Abbey
    I've recently become responsible for choosing which source control and project management software to use for a company that employs me. Currently it uses Jira (project management) and Subversion (version control). I know there are many other options out there -- the ones I know about are all in this article http://mashable.com/2010/07/14/distributed-developer-teams/ . I'm leaning towards recommending they just stay with what they have as it seems workable and any change would have to be worth the cost of switching to say github/basecamp or some other solution. Some details on the team: It's a distributed development shop. Meetings of the whole team in one room are rare. It's currently a very small development team (three developers). The project management software is used by developers and a product manager or two. What are you experiences with version control and project management web applications? Are there any you would recommend and you think are worth the switching cost of time to learn new services / implementing the change? Edit: After educating myself further on the options it appears DVCS offer powerful benefits that may be worth investing in now as opposed to later in the company's lifetime when the switching cost is higher: I'm a Subversion geek, why I should consider or not consider Mercurial or Git or any other DVCS?

    Read the article

1 2 3 4 5 6 7 8 9 10 11 12  | Next Page >