Search Results

Search found 1706 results on 69 pages for 'distributed'.

Page 1/69 | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >

Recommendations for distributed processing/distributed storage systems

- by Eddie

At my organization we have a processing and storage system spread across two dozen linux machines that handles over a petabyte of data. The system right now is very ad-hoc; processing automation and data management is handled by a collection of large perl programs on independent machines. I am looking at distributed processing and storage systems to make it easier to maintain, evenly distribute load and data with replication, and grow in disk space and compute power. The system needs to be able to handle millions of files, varying in size between 50 megabytes to 50 gigabytes. Once created, the files will not be appended to, only replaced completely if need be. The files need to be accessible via HTTP for customer download. Right now, processing is automated by perl scripts (that I have complete control over) which call a series of other programs (that I don't have control over because they are closed source) that essentially transforms one data set into another. No data mining happening here. Here is a quick list of things I am looking for: Reliability: These data must be accessible over HTTP about 99% of the time so I need something that does data replication across the cluster. Scalability: I want to be able to add more processing power and storage easily and rebalance the data on across the cluster. Distributed processing: Easy and automatic job scheduling and load balancing that fits with processing workflow I briefly described above. Data location awareness: Not strictly required but desirable. Since data and processing will be on the same set of nodes I would like the job scheduler to schedule jobs on or close to the node that the data is actually on to cut down on network traffic. Here is what I've looked at so far: Storage Management: GlusterFS: Looks really nice and easy to use but doesn't seem to have a way to figure out what node(s) a file actually resides on to supply as a hint to the job scheduler. GPFS: Seems like the gold standard of clustered filesystems. Meets most of my requirements except, like glusterfs, data location awareness. Ceph: Seems way to immature right now. Distributed processing: Sun Grid Engine: I have a lot of experience with this and it's relatively easy to use (once it is configured properly that is). But Oracle got its icy grip around it and it no longer seems very desirable. Both: Hadoop/HDFS: At first glance it looked like hadoop was perfect for my situation. Distributed storage and job scheduling and it was the only thing I found that would give me the data location awareness that I wanted. But I don't like the namename being a single point of failure. Also, I'm not really sure if the MapReduce paradigm fits the type of processing workflow that I have. It seems like you need to write all your software specifically for MapReduce instead of just using Hadoop as a generic job scheduler. OpenStack: I've done some reading on this but I'm having trouble deciding if it fits well with my problem or not. Does anyone have opinions or recommendations for technologies that would fit my problem well? Any suggestions or advise would be greatly appreciated. Thanks!

Read the article
Building a Redundant / Distributed Application

- by MattW

This is more of a "point me in the right direction" question. My team of three and I have built a hosted web app that queues and routes customer chat requests to available customer service agents (It does other things as well, but this is enough background to illustrate the issue). The basic dev architecture today is: a single page ajax web UI (ASP.NET MVC) with floating chat windows (think Gmail) a backend Windows service to queue and route the chat requests this service also logs the chats, calculates service levels, etc a Comet server product that routes data between the web frontend and the backend Windows service this also helps us detect which Agents are still connected (online) And our hardware architecture today is: 2 servers to host the web UI portion of the application a load balancer to route requests to the 2 different web app servers a third server to host the SQL Server DB and the backend Windows service responsible for queuing / delivering chats So as it stands today, one of the web app servers could go down and we would be ok. However, if something would happen to the SQL Server / Windows Service server we would be boned. My question - how can I make this backend Windows service logic be able to be spread across multiple machines (distributed)? The Windows service is written to accept requests from the Comet server, check for available Agents, and route the chat to those agents. How can I make this more distributed? How can I make it so that I can distribute the work of the backend Windows service can be spread across multiple machines for redundancy and uptime purposes? Will I need to re-write it with distributed computing in mind? I should also note that I am hosting all of this on Rackspace Cloud instances - so maybe it is something I should be less concerned about? Thanks in advance for any help!

Read the article
Java - System design with distributed Queues and Locks

- by sunny

Looking for inputs to evaluate a design for a system (java) which would have a distributed queue serving several (but not too many) nodes. These nodes would process objects present in the distributed queue and on occasion require a distributed lock across the cluster on an arbitrary (distributed) data structures. These (distributed) data structures could potentially lie in a distributed cache. Eliminating Terracotta (DSO),Hazelcast and Akka what could be alternative choices. Currently considering zookeeper as a distributed locking mechanism. Since the recommendation of a znode is not to exceed the 1M size , the understanding is that zookeeper should not be used a distributed queue. And also from Netflix curator tech note 4. So should a distributed cache, say like memcached, or redis be used to emulate a distributed queue ? i.e. The distributed queue will be stored in the caches and will be locked cluster-wide via zookeeper. Are there potential pitfalls with this high-level approach. The objects don't need to be taken off the queue. The object will pass through a lifecycle which will determine its removal from the queue. There would be about 10k+ objects in a queue at a given time changing states and any node could service one stage of the object's lifecycle. (Although not strictly necessary .. i.e. one node could serve the entire lifecycle if that is more efficient.) Any suggestions/alternatives ? sidenote: new to zookeeper ; redis etc.

Read the article
Distributed storage and computing

- by Tim van Elteren

Dear Serverfault community, After researching a number of distributed file systems for deployment in a production environment with the main purpose of performing both batch and real-time distributed computing I've identified the following list as potential candidates, mainly on maturity, license and support: Ceph Lustre GlusterFS HDFS FhGFS MooseFS XtreemFS The key properties that our system should exhibit: an open source, liberally licensed, yet production ready, e.g. a mature, reliable, community and commercially supported solution; ability to run on commodity hardware, preferably be designed for it; provide high availability of the data with the most focus on reads; high scalability, so operation over multiple data centres, possibly on a global scale; removal of single points of failure with the use of replication and distribution of (meta-)data, e.g. provide fault-tolerance. The sensitivity points that were identified, and resulted in the following questions, are: transparency to the processing layer / application with respect to data locality, e.g. know where data is physically located on a server level, mainly for resource allocation and fast processing, high performance, how can this be accomplished? Do you from experience know what solutions provide this transparency and to what extent? posix compliance, or conformance, is mentioned on the wiki pages of most of the above listed solutions. The question here mainly is, how relevant is support for the posix standard? Hadoop for example isn't posix compliant by design, what are the pro's and con's? what about the difference between synchronous and asynchronous opeartion of a distributed file system. Though a synchronous distributed file system has the preference because of reliability it also imposes certain limitations with respect to scalability. What would be, from your expertise, the way to go on this? I'm looking forward to your replies. Thanks in advance! :) With kind regards, Tim van Elteren

Read the article
Elastic versus Distributed in caching.

- by Mike Reys

Until now, I hadn't heard about Elastic Caching yet. Today I read Mike Gualtieri's Blog entry. I immediately thought about Oracle Coherence and got a little scare throughout the reading. Elastic Caching is the next step after Distributed Caching. As we've always positioned Coherence as a Distributed Cache, I thought for a brief instance that Oracle had missed a new trend/technology. But then I started reading the characteristics of an Elastic Cache. Forrester definition: Software infrastructure that provides application developers with data caching services that are distributed across two or more server nodes that consistently perform as volumes grow can be scaled without downtime provide a range of fault-tolerance levels Hey wait a minute, doesn't Coherence fullfill all these requirements? Oh yes, I think it does! The next defintion in the article is about Elastic Application Platforms. This is mainly more of the same with the addition of code execution. Now there is analytics functionality in Oracle Coherence. The analytics capability provides data-centric functions like distributed aggregation, searching and sorting. Coherence also provides continuous querying and event-handling. I think that when it comes to providing an Elastic Application Platform (as in the Forrester definition), Oracle is close, nearly there. And what's more, as Elastic Platform is the next big thing towards the big C word, Oracle Coherence makes you cloud-ready ;-) There you go! Find more info on Oracle Coherence here.

Read the article
Building a distributed system on Amazon Web Services

- by Songo

Would simply using AWS to build an application make this application a distributed system? For example if someone uses RDS for the database server, EC2 for the application itself and S3 for hosting user uploaded media, does that make it a distributed system? If not, then what should it be called and what is this application lacking for it to be distributed? Update Here is my take on the application to clarify my approach to building the system: The application I'm building is a social game for Facebook. I developed the application locally on a LAMP stack using Symfony2. For production I used an a single EC2 Micro instance for hosting the app itself, RDS for hosting my database, S3 for the user uploaded files and CloudFront for hosting static content. I know this may sound like a naive approach, so don't be shy to express your ideas.

Read the article
Design a Distributed System

- by Bonton255

I am preparing for an interview on Distributed Systems. I have gone through a lot of text and understand the basics of the area. However, I need some examples of discussions on designing a distributed system given a scenario. For example, if I were to design a distributed system to calculate if a number N is primary or not, what will the be design of the system, what will be the impact of network latency, CPU performance, node failure, addition of nodes, time synchronization etc. If you guys could present your in-depth thoughts on this example, or point me to some similar discussion, that would be really helpful.

Read the article
Can an issue tracking system be distributed?

- by Klaim

I was thinking about issue tracking software like Redmine, Trac or even the one that is in Fossil and something hit me: Is there a reason why Redmine and Trac are not possible to be distributed? Or maybe it's possible and I just don't know how it's possible? If it's not possible, why? By distributed I mean like Facebook or Google or other applications that effectively runs on multiple hardware a the same time but share data.

Read the article
Distributed Computing Framework (.NET) - Specifically for CPU Instensive operations

- by StevenH

I am currently researching the options that are available (both Open Source and Commercial) for developing a distributed application. "A distributed system consists of multiple autonomous computers that communicate through a computer network." Wikipedia The application is focused on distributing highly cpu intensive operations (as opposed to data intensive) so I'm sure MapReduce solutions don't fit the bill. Any framework that you can recommend ( + give a brief summary of any experience or comparison to other frameworks ) would be greatly appreciated. Thanks.

Read the article
Distributed Development Tools -- (Version control and Project Management)

- by Macy Abbey

Hello, I've recently become responsible for choosing which source control and project management software to use for a company that employs me. Currently it uses Jira (project management) and Subversion (version control). I know there are many other options out there -- the ones I know about are all in this article http://mashable.com/2010/07/14/distributed-developer-teams/ . I'm leaning towards recommending they just stay with what they have as it seems workable and any change would have to be worth the cost of switching to say github/basecamp or some other solution. Some details on the team: It's a distributed development shop. Meetings of the whole team in one room are rare. It's currently a very small development team (three developers). The project management software is used by developers and a product manager or two. What are you experiences with version control and project management web applications? Are there any you would recommend and you think are worth the switching cost of time to learn new services / implementing the change? Edit: After educating myself further on the options it appears DVCS offer powerful benefits that may be worth investing in now as opposed to later in the company's lifetime when the switching cost is higher: I'm a Subversion geek, why I should consider or not consider Mercurial or Git or any other DVCS?

Read the article
Distributed Development Tools -- (Version control and Project Management)

- by Macy Abbey

I've recently become responsible for choosing which source control and project management software to use for a company that employs me. Currently it uses Jira (project management) and Subversion (version control). I know there are many other options out there -- the ones I know about are all in this article http://mashable.com/2010/07/14/distributed-developer-teams/ . I'm leaning towards recommending they just stay with what they have as it seems workable and any change would have to be worth the cost of switching to say github/basecamp or some other solution. Some details on the team: It's a distributed development shop. Meetings of the whole team in one room are rare. It's currently a very small development team (three developers). The project management software is used by developers and a product manager or two. What are you experiences with version control and project management web applications? Are there any you would recommend and you think are worth the switching cost of time to learn new services / implementing the change? Edit: After educating myself further on the options it appears DVCS offer powerful benefits that may be worth investing in now as opposed to later in the company's lifetime when the switching cost is higher: I'm a Subversion geek, why I should consider or not consider Mercurial or Git or any other DVCS?

Read the article
JavaScript distributed computing project

- by Ben L.

I made a website that does absolutely nothing, and I've proven to myself that people like to stay there - I've already logged 11+ hours worth of cumulative time on the page. My question is whether it would be possible (or practical) to use the website as a distributed computing site. My first impulse was to find out if there were any JavaScript distributed computing projects already active, so that I could put a piece of code on the page and be done. Unfortunately, all I could find was a big list of websites that thought it might be a cool idea. I'm thinking that I might want to start with something like integer factorization - in this case, RSA numbers. It would be easy for the server to check if an answer was correct (simply test for modulus equals zero), and also easy to implement. Is my idea feasible? Is there already a project out there that I can use?

Read the article
Ceph: open-source petabyte scale distributed storage

- by Gregory Burd

You might want to read this interesting research paper titled "Object Storage on Berkeley DB". The Ceph project is a distributed network file system designed to provide excellent performance, reliability, and scalability... and it uses Oracle Berkeley DB for storage.

Read the article
Synchronizing local and remote cache in distributed caching

- by ltfishie

With a distributed cache, a subset of the cache is kept locally while the rest is held remotely. In a get operation, if the entry is not available locally, the remote cache will be used and and the entry is added to local cache. In a put operation, both the local cache and remote cache are updated. Other nodes in the cluster also need to be notified to invalidate their local cache as well. What's a simplest way to achieve this if you implemented it yourself, assuming that nodes are not aware of each other. Edit My current implementation goes like this: Each cache entry contains a time stamp. Put operation will update local cache and remote cache Get operation will try local cache then remote cache A background thread on each node will check remote cache periodically for each entry in local cache. If the timestamp on remote is newer overwrite the local. If entry is not found in remote, delete it from local.

Read the article
Scalable distributed file system for blobs like images and other documents

- by Pinnacle

Cassandra & HBase both do not efficiently support storage of blobs like images. Storing directly on HDFS stresses the Namenode because of huge number of files. Facebook uses Haystack for images and attachments storage, but this is not open source. So is Lustre a good choice for distributed blob storage? I have read that Amazon S3 is used by many, but this would cost money and personally, I would not like to rely on third party system. What are other suggestions?

Read the article
Helper library for distributed algorithms programming?

- by anon

When you code a distributed algorithm, do you use any library to model abstract things like processor, register, message, link, etc.? Is there any library that does that? I'm thinking about e.g. self-stabilizing algorithms, like self-stabilizing minimum spanning-tree algorithms.

Read the article
Distributed transactions

- by javi

Hello! I've a question regarding distributed transactions. Let's assume I have 3 transaction programs: Transaction A begin a=read(A) b=read(B) c=a+b write(C,c) commit Transaction B begin a=read(A) a=a+1 write(A,a) commit Transaction C begin c=read(C) c=c*2 write(A,c) commit So there are 5 pairs of critical operations: C2-A5, A2-B4, B4-C4, B2-C4, A2-C4. I should ensure integrity and confidentiality, do you have any idea of how to achieve it? Thank you in advance!

Read the article
Optimistic work sharing on sparsely distributed systems

- by Asti

What would a system like BOINC look like if it were written today? At the time BOINC was written, databases were the primary choice for maintaining a shared state and concurrency among nodes. Since then, many approaches have been developed for tasking with optimistic concurrency (OT, partial synchronization primitives, shared iterators etc.) Is there an optimal paradigm for optimistically distributing units of work on sparsely distributing systems which communicate through message passing? Sorry if this is a bit vague. P.S. The concept of Tuple-spaces is great, but locking is inherent to its definition. Edit: I already have a federation system which works very well. I have a reactive OT system is implemented on top of it. I'm looking to extend it to get clients to do units of work.

Read the article
Distributed Transaction Framework across webservices

- by John Petrak

I am designing a new system that has one central web service and several site web services which are spread across the country and some overseas. It has some data that must be identical on all sites. So my plan is to maintain that data in the central web service and then "sync" the data to sites. This includes inserts, edits and deletes. I see a problem when deleting, if one site has used the record, then I need to undo the delete that has happened on the other servers. This lead me to idea that I need some sort of transaction system that can work across different web servers. Before I design one from scratch, I would like to know if anyone has come across this sort of problem and if there are any frame works or even design patterns that might aid me?

Read the article
Interconnect nodes in a Java distributed infrastructure for tweet processing

- by David Moreno García

I'm working in a new version of an old project that I used to download and process user statuses from Twitter. The main problem of that project was its infrastructure. I used multiple instances of a java application (trackers) to download from Twitter given an specific task (basically terms to search for), connected with a central node (a web application) that had to process all tweets once per day and generate a new task for each trackers once each 15 minutes. The central node also had to monitor all trackers and enable/disable them under user petition. This, as I said, was too slow because I had multiple bottlenecks, so in this new version I want to improve the infrastructure and isolate all functionalities in specific nodes. I also need a good notification system to receive notifications for any node. So, in the next diagram I show the components that I'll need in this new version: As you can see, there are more nodes. Here are some notes about them: Dashboard: Controls trackers statuses and send a single task to each of them (under user request). The trackers will use this task until replaced with a new one (if done, not each 15 minutes like before). Search engine: I need to store all the tweets. They are firstly stored in a local database for each tracker but after that I'm thinking on using something like Elasticsearch to be able to do fast searches. Tweet processor: Just and isolated component with its own database (maybe something like the search engine to have fast access to info generated by the module). In the future more could be added. Application UI: A web application with a shared database with the Dashboard (mainly to store users information and preferences). Indeed, both could be merged into a single web. The main difference with the previous version of the project is that now they will be isolated and they will only show information and send requests. I will not do any heavy task in them (like process tweets as I did before). So, having this components, my main headache is how to structure all to not have to rewrite a lot of code every time I need to access any new data. Another headache is how can I interconnect nodes. I could use sockets but that is a pain in the ass. Maybe a REST layer? And finally, if all the nodes are isolated, how could I generate notifications for each user which info is only in the database used by the Application UI? I'm programming this using Java and Spring (at least I used them in the last version) but I have no problems with changing the language if I can take advantage of a tool/library/engine to make my life easier and have a better platform. Any comment will be appreciated.

Read the article
I've totally missed the point of distributed vcs [closed]

- by NimChimpsky

I thought the major benefit of it was that each developers code gets stored within each others repository. My impression was that each developer has their working directory, their own repository, and then a copy of the other developers repository. Removing the need for central server, as you have as many backups as you have developers/repositories Turns out this is nto the case, and your code is only backed up (somewhere other than locally) when you push, the same as a commit in subversions. I am bit disappointed ... hopefully I will be pleasantly surprised when it handles merges better and there are less conflicts ?

Read the article
Building a Redundant / Distrubuted Application

- by MattW

This is more of a "point me in the right direction" question. I (and my team of 3) have built a hosted web app that queues and routes customer chat requests to available customer service agents (It does other things as well, but this is enough background to illustrate the issue). The basic dev architecture today is: a single page ajax web UI (ASP.NET MVC) with floating chat windows (think Gmail) a backend Windows service to queue and route the chat requests this service also logs the chats, calculates service levels, etc a Comet server product that routes data between the web frontend and the backend Windows service this also helps us detect which Agents are still connected (online) And our hardware architecture today is: 2 servers to host the web UI portion of the application a load balancer to route requests to the 2 different web app servers a third server to host the SQL Server DB and the backend Windows service responsible for queuing / delivering chats So as it stands today, one of the web app servers could go down and we would be ok. However, if something would happen to the SQL Server / Windows Service server we would be boned. My question - how can I make this backend Windows service logic be able to be spread across multiple machines (distributed)? The Windows service is written to accept requests from the Comet server, check for available Agents, and route the chat to those agents. How can I make this more distributed? How can I make it so that I can distribute the work of the backend Windows service can be spread across multiple machines for redundancy and uptime purposes? Will I need to re-write it with distributed computing in mind? I should also note that I am hosting all of this on Rackspace Cloud instances - so maybe it is something I should be less concerned about? Thanks in advance for any help!

Read the article
open source gossip-based membership protocol?

- by Aaron

I am looking for a library which I can plug into a distributed application which implements any gossip-based membership protocol. Such a library would allow me to send/receive membership lists, merge received membership lists, etc... Even better would be if the library implemented a protocol with performance O(logn) performance guarantees. Does anyone know of any open source library like this? It doesn't need to meet all of the aforementioned requirements; even something partially implemented would be helpful.

Read the article
Distributed website server redundancy

- by Keith Lion

Assume a website infrastructure is very complicated and is fully distributed (probably like most large web companies). Am I right in thinking that although there are all these extra web servers to handle multiple client requests, there is still a single "machine" whereby users must enter? I am guessing this machine will be the one physically associated to the IP address? I ask because I need to know whether, in places where distributed systems exist, there is still a single point of failure- usually the control node or, in this example, the machine connected to the public internet? Surely there cannot be two machines connected to the internet, as they would have to have different IP addresses? This "machine" may not be a server per se, but maybe it is a piece of cisco equipment. I just need to know whether, in the real world, these distributed systems still have a particular section where they depend on the integrity of one electronic device?

Read the article
What are the functionalities of Distributed File systems and Distributed Storage Systems?

- by Berkay

i'm reading cloud vendors solutions for the distributed storage systems such as Amazon Dynamo and Google Big Table. and really confused in two terms : what is Distrubuted file systems for in cloud ? what is Distributed storage systems for? what are differences of these terms and functionalities ? if i understand these terms i will create the general architecture of the cloud vendors, any good tutorial or web page will be appreciated. Thanks

Read the article

Search Results

Search found 1706 results on 69 pages for 'distributed'.

Page 1/69 | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >

- by Eddie

- by MattW

- by sunny

- by Tim van Elteren

- by Mike Reys

- by Songo

- by Bonton255

- by Klaim

- by StevenH

- by Macy Abbey

- by Macy Abbey

- by Ben L.

- by Gregory Burd

- by ltfishie

- by Pinnacle

- by anon

- by javi

- by Asti

- by John Petrak

- by David Moreno García

- by NimChimpsky

- by MattW

- by Aaron

- by Keith Lion

- by Berkay

1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >