Search Results

Search found 1706 results on 69 pages for 'distributed'.

Page 4/69 | < Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >

Experience with MooseFS?

- by brown.2179

Anyone have any experience using MooseFS? I want an easy distributed storage platform to store static data archive of about 10 TB and serve it to 20-40 nodes. Also I want to be able to add storage as the archive grows without having to rebuild the filesystem. I don't care if it's a bit slow. I just want it to be simple and stable. Basically from what I can see for OS X it's between MooseFS and Gluster. Any other suggestions?

Read the article
Tuning Distributed Applications to Access Big Data

Distributed applications are just that: distributed across one or more hardware platforms across the enterprise. The database administrator (DBA) has the unenviable task of monitoring these environments and configuring and tuning the database server to meet multiple needs. As multiple distributed applications now require access to a very large data store, what tuning options are available to help? Get your SQL Server database under version control now!Version control is standard for applications, but databases haven’t caught up. So how can you bring database development up to speed? Why should you start? Find out…

Read the article
Looking for efficient scaling patterns for Silverlight application with distributed text-file data s

- by Edward Tanguay

I'm designing a Silverlight software solution for students and teachers to record flashcards, e.g. words and phrases that students find while reading and errors that teachers notice while teaching. Requirements are: each person publishes his own flashcards in a file on a web server, e.g. http://:www.mywebserver.com/flashcards.txt other people subscribe to that person's flashcards by using a Silverlight flashcard reader that I have developed and entering the URLs of flashcard files they want to subscribe to, URLs and imported flashcards being saved in IsolatedStorage the flashcards.txt file has the following simple format: title, then blocks of question/answers: Jim Smith's flashcards from English class 53-222, winter semester 2009 ==fla Das kann nicht sein. That can't be. ==fla Es sei denn, er kommt nicht. Unless he doesn't come. The user then makes public the URL to his flashcard file and other readers begin reading in his flashcards. In order to lower the bar for non-technical users to contribute, it will even be possible for them to save this text in a Google Document, which they publish and distribute the URL. The flashcard readers will then recognize it is a google document and perform the necessary screen scraping to get at the raw text. I have two technical questions about this approach: What is a best way to plan now for scalability issues: e.g. if your reader is subscribed to 10 flashcard files that are each 200K, it will have to download 2MB of text just to find out if any new flashcards are available. Or can I somehow accurately and consistently get at the last update date/time of text files on servers and published google docs? Each reader will have the ability to allow the person to test himself on imported flashcards and add meta information to them, e.g. categorize them, edit them, etc. This information will be stored in IsolatedStorage along with the important flashcards themselves. What is a good pattern to allow these readers to share and synchronize this meta data, e.g. so when you are looking at a flashcard you can see that 5 other people have made corrections to it. The best solution I can think of now is that the Silverlight readers will have to republish their data to a central database, but then there is the problem of uniquely identifying each flashcard, the best approach seems to be URL + position-in-file, or even better URL + original text of both question and answer fields, but both of these have their obvious drawbacks. The main requirement is that the bar for participation is kept as low as possible, i.e. type text in a google document, publish it, distribute the URL, and you're publishing within the flashcard community. So I want to come up with the most efficient technical solutions in order to compensate for the lack of database, lack of unique ids, etc. For those who have designed or developed similar non-traditional, distributed database projects like this, what advice, experience or best-practice tips you can share on the above two points?

Read the article
Choosing a distributed shared memory solution

- by mindas

I have a task to build a prototype for a massively scalable distributed shared memory (DSM) app. The prototype would only serve as a proof-of-concept, but I want to spend my time most effectively by picking the components which would be used in the real solution later on. The aim of this solution is to take data input from an external source, churn it and make the result available for a number of frontends. Those "frontends" would just take the data from the cache and serve it without extra processing. The amount of frontend hits on this data can literally be millions per second. The data itself is very volatile; it can (and does) change quite rapidly. However the frontends should see "old" data until the newest has been processed and cached. The processing and writing is done by a single (redundant) node while other nodes only read the data. In other words: no read-through behaviour. I was looking into solutions like memcached however this particular one doesn't fulfil all our requirements which are listed below: The solution must at least have Java client API which is reasonably well maintained as the rest of app is written in Java and we are seasoned Java developers; The solution must be totally elastic: it should be possible to add new nodes without restarting other nodes in the cluster; The solution must be able to handle failover. Yes, I realize this means some overhead, but the overall served data size isn't big (1G max) so this shouldn't be a problem. By "failover" I mean seamless execution without hardcoding/changing server IP address(es) like in memcached clients when a node goes down; Ideally it should be possible to specify the degree of data overlapping (e.g. how many copies of the same data should be stored in the DSM cluster); There is no need to permanently store all the data but there might be a need of post-processing of some of the data (e.g. serialization to the DB). Price. Obviously we prefer free/open source but we're happy to pay a reasonable amount if a solution is worth it. In any way, paid 24hr/day support contract is a must. The whole thing has to be hosted in our data centers so SaaS offerings like Amazon SimpleDB are out of scope. We would only consider this if no other options would be available. Ideally the solution would be strictly consistent (as in CAP); however, eventual consistence can be considered as an option. Thanks in advance for any ideas.

Read the article
Release management with a distributed version control system

- by See Sharp Cheddar

We're considering a switch from SVN to a distributed VCS at my workplace. I'm familiar with all the reasons for wanting to using a DVCS for day-to-day development: local version control, easier branching and merging, etc., but I haven't seen that much that's compelling in terms of managing software releases. Here's our release process: Discover what changes are available for merging. Run a query to find the defects/tickets associated with these changes. Filter out changes associated with "open" tickets. In our environment, tickets must be in a closed state in order to merged with a release branch. Filter out changes we don't want in the release branch. We are very conservative when it comes to merging changes. If a change isn't absolutely necessary, it doesn't get merged. Merge available changes, preferably in chronological order. We group changes together if they're associated with the same ticket. Block unwanted changes from the release branch (svnmerge block) so we don't have to deal with them again. Sometimes we can be juggling 3-5 different milestones at a time. Some milestones have very different constraints, and the block list can get quite long. I've been messing around with git, mercurial and plastic, and as far as I can tell none of them address this model very well. It seems like they would work very well when you have only one product you're releasing, but I can't imagine using them for juggling multiple, very different products from the same codebase. For example, cherry-picking seems to be an afterthought in mercurial. (You have to use the 'transplant' command). After you cherry-pick a change into a branch it still shows up as an available integration. Cherry-picking breaks the mercurial way of working. DVCS seems to be better suited for feature branches. There's no need for cherry-picking if you merge directly from a feature branch to trunk and the release branch. But who wants to do all that merging all the time? And how do you query for what's available to merge? And how do you make sure all the changes in a feature branch belong together? It sounds like total chaos. I'm torn because the coder in me wants DVCS for day-to-day work. I really want it. But I fear the day when I have to put the release manager hat and sort out what needs to be merged and what doesn't. I want to write code, I don't want to be a merge monkey.

Read the article
How to rewrite a TCP MMOG server designed to run in a single machine, in a distributed way?

- by Dokkat

I have a MMOG server running on C++, using winsockets. My server won't support more than 200 players. I had the idea of redesigning it so it will use multiple servers instead of one, so, maybe, for example, each server could take care of a number of players, and, if it was too laggy, it could transfer the responsability of that player to other server. I'm not sure of how to program a consistent game logic like that, though. Are there techniques for this?

Read the article
How DBAs Can Tune Distributed IBM DB2 Applications

Many critical business applications now execute in an environment separate from that of the enterprise database server. The database administrator often finds monitoring and performance tuning of these "distributed" applications to be especially difficult. This article looks at common performance issues of distributed applications and presents advice to assist the IBM DB2 database administrator in mitigating performance problems.

Read the article
MSDTC attempts to enlist client machine in a distributed transaction

- by Ken

Hi there We're seeing the following intermittent warning logged by MSDTC: A caller has attempted to propagate a transaction to a remote system, but MSDTC network DTC access is currently disabled on machine 'X'. Please review the MS DTC configuration settings. However, MSDTC is disabled on machine X by design - it's a client machine, and has no business being enlisted in the transaction! Several windows service endpoints hosting WCF services over TCP Single SQL Server 2005 instance beneath Linq to Sql Remote client receives event callbacks over WCF/TCP The issue is tricky to reproduce - usually following restart of services. We suspect a callback to the client machine is occurring within the context of a transaction. Just wondering if anyone has seen similar issues?? Ken

Read the article
Ehcache - Distributed RMI not working

- by Ted

Hi, I have this strange problem with ehcache 2.0 that I hope someone can help me with. I have set up a cluster of two hosts, A and B. I can see that heartbeats are received at both ends, so I'm pretty sure the networking and multicast stuff is working. The problem is that is I put an element into the cache at host A, I can see in the logs of host B that it receives a remote put. But when I request the same element from host B, it runs off to the data base and performs a query nonetheless. What may be the cause of this? Thankful for any pointers!

Read the article
running a python script where dependencies are not avail: distributed computing

- by sadhu_

Hi, I have access to a grid (running condor) that would (potentially) allow to very substantially reduce how long by nltk based nlp tasks take. unfortunately, i dont have root access on the cluster so cannot install new packages, only run whatever is available on the linux boxes. python is of course available, but nltk isnt - i was wondering however, if there might be a way around this somehow ? is there a way i can somehow still distribute the task in a self-contained 'package' of some sort? Thanks for your hel

Read the article
Sphinx search distributed index tuning

- by Andriy Bohdan

I'm deciding how to split 3 large sphinx indexes between 3 servers. Each of the 3 indexes is searched separately. What's more effective: to host each index on separate machine Example machine1 - index1 machine2 - index2 machine3 - index3 or to split each index into 3 parts and host each part of the same index on separate machine. Example machine1 - index1_chunk1, index2_chunk1, index3_chunk1 machine2 - index1_chunk2, index2_chunk2, index3_chunk2 machine3 - index1_chunk3, index2_chunk3, index3_chunk3 ?

Read the article
Advice on designing and building distributed application to track vehicles

- by dario-g

I'm working on application for tracking vehicles. There will be about 10k or more vehicles. Each will be sending ~250bytes in each minute. Data contains gps location and everything from CAN Bus (every data that we can read from vehicle computer and dashboard). Data are sent by GSM/GPRS (using UDP protocol). Estimated rows with this data per day is ~2000k. I see there 3 main blocks. 1. Multithreaded Socket Server (MSS) - I have it. MSS stores received data to the queue (using NServiceBus). 2. Rule Processor Server (RPS) - this is core of this system. This block is responsible for parsing received data, storing in the database, processing rules, sending messages to Notifier Server (this will be sending e-mails/sms texts). Rule example. As I said between received bytes there will be information about current speed. When speed will be above 120 then: show alert in web application for specified users, send e-mail, send sms text. (There can be more than one instance of RPS). 3. Web application - allows reporting and defining rules by users, monitoring alerts, etc. I'm looking for advice how to design communication between RPS and Web application. Some questions: - Should Web application and RPS have separated databases or one central database will be enough? I have one domain model in web application. If there will be one central database then can I use the same model (objects) on RPS? So, how to send changed rules to RPS? I try to decouple this blocks as much as possible. I'm planning to create different instance of application for each client (each client will have separated database). One client will be have 10k vehicles, others only 100 vehicles.

Read the article
Distributed Lucene.NET

- by user72185

Hi, I have a Terabyte of data, maybe more, which I'd like to index and search with Lucene. I'd like to be able to split the index out to different machines, similar to what Solr does (if I understand Solr correctly). Are there any existing tools to do this on the Windows platform? Thanks!

Read the article
Distributed Message Ordering

- by sbanwart

I have an architectural question on handling message ordering. For purposes of this question, the transport is irrelevant, so I'm not going to specify one. Say we have three systems, a website, a CRM and an ERP. For this example, the ERP will be the "master" system in terms of data ownership. The website and the CRM can both send a new customer message to the ERP system. The ERP system then adds a customer and publishes the customer with the newly assigned account number so that the website and CRM can add the account number to their local customer records. This is a pretty straight forward process. Next we move on to placing orders. The account number is required in order for the CRM or website to place an order with the ERP system. However the CRM will permit the user to place an order even if the customer lacks an account number. (For this example assume we can't modify the CRM behavior) This creates the possibility that a user could create a new customer, and place an order before the account number gets updated in the CRM. What is the best way to handle this scenario? Would it be best to send the order message sans account number and let it go to an error queue? Would it be better to have the CRM endpoint hold the message and wait until the account number is updated in the CRM? Maybe something completely different that I haven't thought of? Thanks in advance for any help.

Read the article
Best Work Queue service for distributed clusters

- by onewheelgood

Hi there. I require a simple work queue type system for asynchronous task management. I have looked at both beanstalkd and gearman. However, both these seem to assume that the client and the queue server are on the same network, and therefore that there will always be a reliable network between them. I need one that can support the client and server being in different places in the world, and be able to manage temporary loss of network connection between clusters. Ideally, this would work in such a way where I post a job to a local proxy that attempts to send it to the main queue server. If there is no network connection, it would try again later, however it would not lose the job or delay the client. Any recommendations?

Read the article
JPA in distributed Java EE configuration

- by sof

Hello, I'm developing a JEE application to run on Glassfish: Database (javaDB, MS SQL, MySQL or Oracle) EJB layer with JPA (Toplink essentials - from Glassfish) for database access JSF/Icefaces based web UI accessing the EJB layer The application will have a lot of concurrent web client, so I want to run it on different physical servers and use a load-balancer. My problem is now how to keep the applications synchronized. I intend to set up multiple servers, each running Glassfish with my EAR app installed. Whenever on one of the servers data is added to or removed from the database (via JPA, no direct SQL queries), this change should be reflected in the JPA layer on the other servers. I've been looking around for solutions to this, but couldn't find anything I really like (the full Toplink from Oracle claims to have a solution, but don't know). Doing a refresh before every access to a JPA entity could work, but is far from efficient. Are there any patterns, libraries, ... that could help here? Thanks a lot!

Read the article
hibernate distributed 2nd level cache options

- by ishmeister

Not really a question but I'm looking for comments/suggestions from anyone who has experiences using one or more of the following: EhCache with RMI EhCache with JGroups EhCache with Terracotta Gigaspaces Data Grid A bit of background: our applications is read only for the most part but there is some user data that is read-write and some that is only written (and can also be reasonably inaccurate). In addition, it would be nice to have tools that enable us to flush and fill the cache at intervals or by admin intervention. Regarding the first option - are there any concerns about the overhead of RMI and performance of Java serialization?

Read the article
Understanding omission failure in distributed systems

- by karthik A

The following text says this which I'm not able to quite agree : client C sends a request R to server S. The time taken by a communication link to transport R over the link is D. P is the maximum time needed by S to recieve , process and reply to R. If omission failure is assumed ; then if no reply to R is received within 2(D+P) , then C will never recieve a reply to R . Why is the time here 2(D+P). As I understand shouldn't it be 2D+P ?

Read the article
High performance distributed asynchronous RPC in java

- by unludo

I would like to do RPC to a list of clients with the following requirements: the server does not know the clients (implies a kind of broker?) and the cleints do not know the server there may be several clients - they share the load to treat the RPC The RPC is asynchronous very fast (round-trip < 1ms) optional : offers a fail-over mechanism. It can be done with underlying tools which are not really intended for that (Hazelcast is an example). What would you use for such requirements? Thanks!

Read the article
Generic DRM (Distributed resource management) wrapper

- by Pavel Bernshtam

I need to write a software, which launches DRM jobs in a customer environment and monitors those jobs status. It should work with various customer environments and DRMs - like LSF, Sun Grid and others. Can you recommend some 3rd party library, which hides DRM differences from me and has API like "launch job", "get list of jobs", "get job status" etc. ? Both Java and native libraries are good for me.

Read the article
Versioning millions of files with distributed SCM

- by C. Lawrence Wenham

I'm looking into the feasibility of using off-the-shelf distributed SCMs such as Git or Mercurial to manage millions of XML files. Each file would be a commercial transaction, such as a purchase order, that would be updated perhaps 10 times during the lifecycle of the transaction until it is "done" and changes no more. And by "manage", I mean that the SCM would be used to not just version the files, but also to replicate them to other machines for redundancy and transfer of IP. Lets suppose, for the sake of example, that a goal is to provide good performance if it was handling the volume of orders that Amazon.com claimed to have at its peak in December 2010: about 150,000 orders per minute. We're expecting the system to be distributed over many servers in order to get reasonable performance. We're also planning to use solid-state drives exclusively. There is a reason why we don't want to use an RDBMS for primary storage, but it's a bit beyond the scope of this question. Does anyone have first-hand experience with the performance of distributed SCMs under such a load, and what strategies were used? Open-source preferred, since the final product is to be FOSS, too.

Read the article
Standard Network Tiers in a Distributed N-Tier System

Distributed N-Tier client/server architecture allows for segments of an application to be broken up and distributed across multiple locations on a network. Listed below are standard tiers in a Distributed N-Tier System. End-User Client Tier The End-User Client is responsible for sending and receiving requests from web servers and other applications servers and translating the responses so that the End-User can interpret the data effectively. The primary roles for this tier are to communicate with servers and translate server responses back to the end-user to interpret. Business-Specific Functions Validate Data Display Data Send Data to Webserver Web Server Tier The Web server tier processes new requests for information coming in from the HTTP and HTTPS ports. This primarily handles the generation of user interfaces and calls the application server when needed to access Data and business logic when needed. Business-specific functions Send Data to application server Format Data for Display Validate Data Application Server Tier The application server stores and executes predefined business logic that is applied to various pieces of data as the business determines. The processed data is then returned back to the Webserver. Additionally, this server directly calls the database to obtain and store any data used by the system Business-Specific Functions Validate Data Process Data Send Data to Database Server Database Server Tier The Database Server is responsible for storing and returning all data need by the calling applications. The primary role for this this server is storage. Data is stored as needed and can be recalled at any point later in time. Business-Specific Functions Insert Data Delete Data Return Data to Application Server

Read the article
Are there any viable DNS or LDAP alternatives for distributed key/value storage and retrieval?

- by makerofthings7

I'm working on a software app that needs distributed decentralized name resolution, and isn't bound to TCP/IP. Or more precisely, I need to store a "key" and look up it's value, and the key may be a string, a number, or any other realistic data type. Examples: With a phone number, look up a name. (or with an area code, redirect to the server that handles that exchange) With an IP Address get a DNS name, or a Whois contact (string value) With a string, get an IP, ( like a DNS TXT or SRV record). I'm thinking out of the box here and looking for any software that allows for this. (more info below) Are there any secure, scalable DNS alternatives that have gained notoriety? I could ask on StackOverflow, but think the infrastructure groups would have better insight on this. Edit More info: I'm looking at "Namecoin" the DNS version of Bitcoin, and since that project is faltering, I'm looking at alternative ways to store name-value pairs, with an optional qualifier. I think a name value pair is of global interest is useful, but on a limited scale. Namecoin tried to be too much, and ended up becoming nothing. I'm trying to solve that problem in researching alternatives and applying distributed technologies where applicable. Bitcoin/Namecoin offers a Distributed Hash Table, which has some positive aspects, but not useful for DNS, except for root servers.

Read the article
SQLAuthority News – Mark the Date: October 16, 2013 – Introducing NuoDB Blackbirds: THE Distributed Database

- by Pinal Dave

I am very excited to announce first on this blog about the release of NuoDB Blackbirds (NuoDB Release 2.0). NuoDB is my favorite application to work with data now a days. They are increasingly gaining market share as well as brining out new features with their every new release. I was very excited when I learned that NuoDB is releasing their flagship release of 2.0 on October 16, 2013. Interesting enough I will be in USA while this release happens and I will be watching it live during my day time. Even though if I had to stay up the entire night to just watch this release, I would do it. Here is the details of the announcements: Introducing NuoDB Blackbirds: THE Distributed Database Date: October 16, 2013 Time: 1:00 PM EDT Location: Online Registration Link What is the best DBMS architecture to handle today’s and tomorrow’s evolving needs? The days of shared disk are over. The times are “a-changin” and IT infrastructure has to change with them. Join NuoDB live for the introduction of our latest major product release, NuoDB Blackbirds, and take a look at why the NuoDB distributed database architecture is the only answer for customers like Fathom Voice, a leading provider of Voice Over IP (VoIP). NuoDB CEO, Barry Morris, welcomes Cameron Weeks, CEO of Fathom Voice to discuss how his company is using DBMS to break away from the pack and become the hottest player in VoIP. The webcast will include demonstrations of a single, logical database running in multiple geographies and a live Q&A. If due to any reason, you cannot watch it live, do not worry at all, just register at this Registration Link, as after the event you will get the link to watch the event on-demand. You can watch the launch event at any time if you have registered for the launch. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL Tagged: NuoDB

Read the article
Test Data in a Distributed System

- by Davin Tryon

A question that has been vexing me lately has been about how to effectively test (end-to-end) features in a distributed system. Particuarly, how to effectively manage (through time) test data for feature testing. The system in question is a typical SOA setup. The composition is done in JavaScript when call to several REST APIs. Each service is built as an independent block. Each service has some kind of persistent storage (SQL Server in most cases). The main issue at the moment is how to approach test data when testing end-to-end features. Functional end-to-end testing occurs through the UI, and it is therefore necessary for test data to be set up before the test run (this could be manual or automated testing). As is typical in a distributed system, identifiers from one service are used as a link in another service. So, some level of synchronization needs to be present in the data to effectively test. What is the best way to manage and set up this data after a successful deployment to a test environment? For example, is it better to manage this test data inside each service? Or package it together with the testing suite? Does that testing suite exist as a separate project? I'm interested in design guidance about how to store and manage this test data as the application features evolve.

Read the article

Search Results

Search found 1706 results on 69 pages for 'distributed'.

Page 4/69 | < Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >

- by brown.2179

- by Edward Tanguay

- by mindas

- by See Sharp Cheddar

- by Dokkat

- by Ken

- by Ted

- by sadhu_

- by Andriy Bohdan

- by dario-g

- by user72185

- by sbanwart

- by onewheelgood

- by sof

- by ishmeister

- by karthik A

- by unludo

- by Pavel Bernshtam

- by C. Lawrence Wenham

- by makerofthings7

- by Pinal Dave

- by Davin Tryon

< Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >