Search Results

Search found 1706 results on 69 pages for 'distributed'.


How to distribute multiple executions of an app across many machines

- by Salec

I've got a simulation app (64-bit Windows) that runs without any user interaction. The app gathers information and pushes it to a remote MS SQL Server. I'd like to run this simulation as many times as I can across multiple machines after our nightly build has finished and passed the test suite. If possible, I'd love to be able to configure it to stop after x total runs, or once the entire batch has taken more than y hours. I've tried using Visual Studio's built-in test framework, since we already have a test lab set up with multiple agents. I created a single unit test that simply runs the simulation, then created an ordered test and added that single test to it multiple times (from what I gather, this is the only way to execute the same unit test more than once). I found that ordered tests run on a single agent only and are not distributed, which is very limiting. We use TeamCity for our nightly builds and I suspect it's possible to implement this on top of that, but I'm fairly new to TeamCity. We also have Jenkins and Bamboo available, and I'm open to any other software that would get the job done, provided it runs on a 64-bit Windows OS. Any suggestions?
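
The stopping rule described above (at most x total runs, or y hours for the whole batch) can be sketched independently of any CI product. This is a minimal illustration only: `dispatch_runs`, `run_on_agent` and the agent names are hypothetical, and each "agent" is simulated by a local thread rather than by a TeamCity, Jenkins or Visual Studio test agent.

```python
import threading
import time
from typing import Callable, Dict, List


def dispatch_runs(agents: List[str],
                  run_on_agent: Callable[[str, int], None],
                  max_runs: int,
                  max_hours: float,
                  clock: Callable[[], float] = time.monotonic) -> Dict[str, int]:
    """Hand out run ids to agents until either budget is exhausted.

    `run_on_agent(agent, run_id)` is a hypothetical callback that would
    start one simulation on the named machine and block until it ends.
    Returns the number of runs each agent completed.
    """
    deadline = clock() + max_hours * 3600.0
    lock = threading.Lock()
    started = 0
    completed = {agent: 0 for agent in agents}

    def worker(agent: str) -> None:
        nonlocal started
        while True:
            with lock:
                # No new run starts once either limit has been reached.
                if started >= max_runs or clock() >= deadline:
                    return
                run_id = started
                started += 1
            run_on_agent(agent, run_id)
            with lock:
                completed[agent] += 1

    threads = [threading.Thread(target=worker, args=(a,)) for a in agents]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return completed
```

Note that in this sketch the time limit only prevents new runs from starting; a run already in flight when the deadline passes is allowed to finish.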

Nagios DNX plugins

- by danneh3826

I'm toying with the idea of setting up multiple Nagios instances to monitor our infrastructure. I've looked at the various methods of distributing Nagios checks, and I think DNX comes closest. DNX handles failure of worker nodes, and that's fine. But what happens if the main DNX server fails? Is there a way to replicate the server too? I'm primarily using AWS EC2, so I can use Elastic Load Balancing for the web UI, but I need to handle failure of the AZ where the monitoring server lives, so that a second server picks up the checking load (active/passive or active/active, so long as monitoring doesn't fail completely). The other thing I'm trying to solve is an issue with routing. I'd like multiple nodes to report a fault before Nagios confirms it as critical. Not the NRPE checks, as they're pretty self-explanatory, but things more like check_ping. I often have routing issues out of AWS to certain datacenters, so Nagios can report a bad ping, no ping, or a timeout as a critical issue even though the machine in question is working fine. Would it be possible to have a setup where one worker reports a service check as critical, and a second worker node (positioned in another datacenter/AZ) must also report the service as critical before the central Nagios server issues a critical alert? I realise I might be asking a bit much (how far down the line do you go setting up failover systems before it starts to get ridiculous?), but surely someone must have thought of this scenario when developing DNX?
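
The "second worker must agree" rule asked about above amounts to a small quorum check. The sketch below only illustrates that logic; it is not DNX or Nagios configuration, and `confirmed_state`, the worker names and the state strings are hypothetical.

```python
from typing import Dict

OK = "OK"
CRITICAL = "CRITICAL"


def confirmed_state(reports: Dict[str, str], quorum: int = 2) -> str:
    """Combine per-worker results for one service check.

    `reports` maps a worker name (for example one per datacenter/AZ) to
    the state that worker observed. The check is only treated as
    CRITICAL when at least `quorum` workers independently say so.
    """
    critical_votes = sum(1 for state in reports.values() if state == CRITICAL)
    return CRITICAL if critical_votes >= quorum else OK
```

With a quorum of 2, a routing problem seen from a single location is not enough to raise an alert, while a host that is unreachable from two locations is.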

Is LLVM suitable for parallel languages?

- by DSblizzard

Which properties of LLVM make it a good choice for implementing a parallel, concurrent, or distributed language, and which make it a bad one?

No recent books on MPI: is it dying?

- by Jono

I've never used Message Passing Interface (MPI), but I've heard its name thrown about, most recently with Windows HPC Server. I had a quick look on Amazon to see if there were any books on it, but they all date from 7 or more years ago. Is MPI still a valid technology choice for new applications, or has it been largely superseded by other distributed programming alternatives (e.g. DataSynapse GridServer)? As it's not really an implementation but rather a standard, what is the likelihood (assuming it's not dead) that learning it will result in better design of distributed systems? Is there something else I should be looking at instead?

Distributing cpu-bound compression jobs to multiple computers?

- by barnaby

The other day I needed to archive a lot of data on our network, and I was frustrated that I had no immediate way to harness the power of multiple machines to speed up the process. I understand that creating a distributed job management system is a leap from a command-line archiving tool. I'm now wondering what the simplest solution to this type of distributed performance scenario could be. Would a custom tool always be required, or are there ways to use standard utilities and somehow distribute their load transparently at a higher level? Thanks for any suggestions.

Algorithms for Data Redundancy and Failover for distributed storage system?

- by kennetham

I'm building a distributed storage system that works with devices of different sizes. For instance, one system contains 5 storage devices with sizes of 50GB, 70GB, 150GB, 250GB and 1000GB. My application will store any kind of file on the storage system. Question: how can I build distributed storage with data redundancy and fail-over for documents, videos and any other type of file, ensuring that if any one storage device fails there is another copy of its files on another storage device? The concern is that a 50GB device can hold far fewer files than a 70GB or 150GB one. Treating the 5 devices as a single storage pool, like cloud storage, is there a logical way to distribute the files through my application? How do I ensure data redundancy across different storage sizes? Is there an algorithm to collate multiple blob files into a single file archive? What is the best solution for one cloud storage built from devices of different sizes? I'm opening this topic to discuss the best way to implement this idea, favouring simplicity, along with the issues of the implementation, performance measurements and its limitations.
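
One simple way to reason about redundancy across unequal devices is to place every file on the two devices that currently have the most free space. The sketch below is an assumption-laden illustration, not a recommendation of a specific product or algorithm: `place_file` and the device names are hypothetical, and sizes are tracked in GB in a plain dictionary.

```python
from typing import Dict, List


def place_file(free_gb: Dict[str, float], size_gb: float,
               replicas: int = 2) -> List[str]:
    """Pick `replicas` distinct devices for one file and reserve space.

    Devices that cannot fit the file are skipped; among the rest, the
    ones with the most free space are chosen so that small devices are
    not filled first. `free_gb` is updated in place.
    """
    candidates = sorted(
        (dev for dev, free in free_gb.items() if free >= size_gb),
        key=lambda dev: free_gb[dev],
        reverse=True,
    )
    if len(candidates) < replicas:
        raise RuntimeError("not enough devices with free space for all replicas")
    chosen = candidates[:replicas]
    for dev in chosen:
        free_gb[dev] -= size_gb
    return chosen
```

With two copies per file on distinct devices, the five devices from the question cannot hold more than 520GB of unique data: every file needs a copy somewhere other than the 1000GB device, and the other four devices total 520GB.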

Delivery of JMS message before the transaction is committed

- by ewernli

Hi, I have a very simple scenario involving a database and JMS in an application server (Glassfish):
1. An EJB inserts a row in the database and sends a message.
2. When the message is delivered to an MDB, the row is read and updated.
The problem is that sometimes the message is delivered before the insert has been committed in the database. This is actually understandable if we consider the two-phase commit protocol:
1. prepare JMS
2. prepare database
3. commit JMS
4. (tiny gap where the message can be delivered before the insert has been committed)
5. commit database
I've discussed this problem with others, but the answer was always: "Strange, it should work out of the box". My questions are then:
- How could it work out of the box?
- My scenario sounds fairly simple, so why aren't there more people with similar troubles?
- Am I doing something wrong?
- Is there a way to solve this issue correctly?
Here are a few more details about my understanding of the problem:
- This timing issue exists only if the participants are treated in this order. If the 2PC treats the participants in the reverse order (database first, then message broker), it should be fine.
- The problem happens at random but is completely reproducible.
- I found no way to control the order of the participants in a distributed transaction in the JTA, JCA or JPA specifications, nor in the Glassfish documentation.
- We could assume participants are enlisted in the distributed transaction in the order in which they are used, but with an ORM such as JPA it's difficult to know when the data is flushed and when the database connection is really used.
Any ideas?
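
If the commit order cannot be controlled, one possible workaround is to make the consumer tolerate the gap by retrying the read for a short time. The sketch below shows only that retry logic in Python rather than in an actual MDB; `handle_message` and `read_row` are hypothetical names, and in a real container the same effect might instead come from rolling back the delivery so that the broker redelivers the message.

```python
import time
from typing import Callable, Dict, Optional


def handle_message(row_id: str,
                   read_row: Callable[[str], Optional[Dict]],
                   max_attempts: int = 5,
                   delay_s: float = 0.05,
                   sleep: Callable[[float], None] = time.sleep) -> Dict:
    """Process a message whose row may not be committed yet.

    `read_row` is a hypothetical lookup that returns None while the
    inserting transaction has not committed. The handler retries a few
    times before giving up, which covers the short window between
    "commit JMS" and "commit database".
    """
    for attempt in range(1, max_attempts + 1):
        row = read_row(row_id)
        if row is not None:
            row["status"] = "processed"
            row["attempts"] = attempt
            return row
        if attempt < max_attempts:
            sleep(delay_s)
    raise LookupError(f"row {row_id!r} not visible after {max_attempts} attempts")
```

The retry count and delay are arbitrary here; they would need to be sized against how long the commit gap can actually last.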

How do you save the state of a computation while using SNOW (or Multicore or ...)

- by James

From hard experience I've found it useful to occasionally save the state of my long computations to disk to start them up later if something fails. Can I do this in a distributed computation package in R (like SNOW or multicore)? It does not seem clear how this could be done since the master is collecting things from the slaves in a non-transparent way.

Which is the more robust and scalable method?

- by Dhruv Arya

I am implementing a distributed chat system. In this system we have the following options:
- Run the client and server at each node as separate threads. The server, acting as the receiver, runs as a daemon thread, and the client, taking the user input, runs as a normal thread.
- Fork two processes, one for the client and one for the server.
I am not able to reason out which one to proceed with. Any insight would be great!

Definition of a simple MAP Reduce API

- by Zubair

I am developing a distributed processing API in Java, Erlang and Ruby. What basic commands should I include so that I can build MapReduce, pipelining, and the most commonly used parallel algorithms on top of it?

Has anyone produced an in-memory Git repository?

- by Andrew Matthews

I would like to take advantage of the benefits of Git (and its workflows) without the cost of disk access. I'd just like to leverage Git's distributed revision control capabilities to produce something like a hybrid of memcached and Git (preferably in .NET). Is there such a beast out there?

Possible to distribute or parallel process a sequential program?

- by damigu

In C++, I've written a mathematical program (for diffusion limited aggregation) where each new point calculated is dependent on all of the preceding points. Is it possible to have such a program work in a parallel or distributed manner to increase computing speed? If so, what type of modifications to the code would I need to look into?

What electronic scrum/kanban board do you use and recommend for distributed teams?

- by Derick Bailey

I have a coworker on a team that is fairly distributed and fairly large (for our company), and that wants to take advantage of visual management tools like scrum / kanban boards. Since they are a somewhat distributed team, though, all of the issue management / work management must be done via an electronic tool (we currently use Trac). What issue / work management tools, with a visualization of a scrum / kanban board, do you use for your distributed scrum / kanban teams? Would you recommend it, and if so, why?

What electronic scrum/kanban board do you use and recommend for distributed teams?

- by Derick Bailey

I have a coworker on a team that is fairly distributed and fairly large (for our company), and that wants to take advantage of visual management tools like scrum / kanban boards. Since they are a somewhat distributed team, though, all of the issue management / work management must be done via an electronic tool (we currently use Trac). What issue / work management tools, with a visualization of a scrum / kanban board, do you use for your distributed scrum / kanban teams? Would you recommend it, and if so, why? Thanks.

GlusterFS vs Ceph, which is better for production use for the moment?

- by Mickey Shine

I am evaluating GlusterFS and Ceph. It seems Gluster is FUSE-based, which means it may not be as fast as Ceph, but Gluster appears to have a very friendly control panel and is easy to use. Ceph was merged into the Linux kernel a few days ago, which suggests it has much more potential and may be a good choice in the future. I am wondering which (or even something other than these two?) is the better choice for production use. It would be nice if you could share your practical experiences.

How stable is POHMELFS?

- by Ztyx

I am currently looking into POHMELFS because of its ability to scale reads. Does anyone have it in production and could tell me how stable it is?

GPFS: adding a new NSD server to a cluster

- by alessandra

I have a GPFS cluster composed of 10 Linux nodes, managed by a primary server A, which also acts as the NSD server for a first stack of disks. I attached a new JBOD to one of the nodes (call it node B), which I would like to become an NSD server for this new stack of disks while staying in the cluster, so that the disks are available to all the nodes. Node B is connected to the cluster via Ethernet. How can I make the new NSD visible to all the nodes of the cluster? I can create the new NSD, but when I try to create the filesystem on node B the mmcrfs command times out. It looks like the nodes of the cluster cannot work out where the filesystem is, even though I specify the disks as attached to server B in the description file. Would it be better to remove node B from the cluster, create a separate cluster with its attached filesystem, and connect it remotely to the previous cluster? Or would a clustered NFS solution be a better fit? Can you please give me any suggestions?

Windows DFS - file locking & replication?

- by Adam Salkin

I'm in a small company that has offices on the east and west coasts of America, plus various people working from home. There are Windows Servers already in the offices. I think that Microsoft Windows DFS will do what I want, but despite reading the web site I'm really not sure, so I'm hoping that someone can confirm whether it will do all of the following. (For various personnel / political reasons, I know that a proposal for a Microsoft Windows system has more chance of being accepted than any *nix system.)
- Creation of a folder such that any files in it are automatically available on the servers in all the offices.
- When anyone opens one of these shared files on any of the servers, the copies on all the servers are automatically locked. When they close the file, the updates are automatically copied to the file on all the servers.
- VPN access to these folders for people working outside the offices.
Bandwidth at the main offices varies from 6 Mb/s to 20 Mb/s. Files are Excel / Word / AutoCAD, ranging in size from 100KB to 4MB. Thank you.

Umount stale glusterfs partition

- by Khaled

I am using glusterfs on several Ubuntu servers; two of them are running glusterfs servers in replication mode. Without any clear error, the glusterfs partition became stale, and the system shows this error when I try to access the stale partition:
    Transport endpoint is not connected
Also, when running ls -l on the parent folder I get:
    d????????? ? ? ? ? ? myfolder
I tried every command I could find to umount this partition, but I could not get it done:
    umount -l /path/to/mount/point
    umount -f /path/to/mount/point
Using the fuser command to show processes accessing this folder did not work either. Unloading the fuse kernel module cannot be done, as it is clear from the kernel config that fuse is built into the kernel and is not a loadable module. I found this line in /boot/config-2.6.32-24-server:
    CONFIG_FUSE_FS=y
I have been left with two options:
1. Reboot the system.
2. Create another mount point like myfolder2 and mount the volume again using sudo glusterfs -f /etc/glustefs/glusterfs.vol /path/to/folder2.
Of course, I have chosen to go with option 2. Has anyone faced such an issue before? Does anyone have a better solution for such a case?

Cloud Computing - Multiple Physical Computers, One Logical Computer

- by Koobz

I know that you can set up multiple virtual machines per physical computer. I'm wondering if it's possible to make multiple physical computers behave as one logical unit. Fundamentally, the way I imagine it working is that you can throw 10 computers into a facility one day. You've got one client that requires the equivalent of two computers' worth of resources, and 100 others that eat up the remaining 8. As demands change you just reallocate logical resources; maybe the 2-computer client now requires a third physical system. You just add it to the cloud, and don't worry about sharding the database or migrating data over to a new server. Can it work this way? If yes, why would anyone ever do things like partition their database servers anymore? Just add more computing resources. You scale horizontally with the hardware, but your server appears to scale vertically. There's no need to modify your application's infrastructure to support multiple databases, etc.

How does NFS read cache work on Debian?

- by Ztyx

I am planning to use NFS to serve out many small files. They will be read very often so client side caching is crucial. Does NFS handle this? Is there a way to increase the client side caching in some way? ...or should I look at another solution? Syncing using rsync or unison periodically is not an option since the files are modified on the client side from time to time.

DFS - Stop sync of large folder that has since been removed

- by g18c

We have a site-to-site DFSR setup on Windows Server 2008 R2 that had been running perfectly between site A and site B until someone dumped a 20GB folder into it. This has overwhelmed the upload link and made the internet almost useless at site A (upload bandwidth is low at the branch office). We have removed this folder from the DFS share on site A, but the internet is still really slow. Is there any way to cancel this sync, or some other way to get DFSR back into a happy state?

Is there any way to distribute x264 encoding jobs across multiple computers (to increase the encoding speed)?

- by Breakthrough

Does anyone know of a current, active solution for encoding x264 videos across many computers (via the network) to increase encoding FPS? Brownie points for cross-platform and open source, but just so you all know, I usually use Windows. Programs that I have heard of, and why I do not believe they are suitable:
- x264farm: Not actively developed. Good interface, but does not support two-pass encoding, and fails with newer x264 builds.
- ELDER: Again, not actively developed, but my issue was that it didn't work with new x264 builds, and it was very difficult to configure (read: randomly stopped working).
While I don't absolutely need a program which is being actively developed, I would like one that supports two-pass encoding and works with new(er) x264 builds.
Additional information: So far, I've offered (and awarded!) two separate bounties on this question since I first posted it over two years ago, and I still haven't found a solution to this problem. What I'm looking for, basically, is a simple program that lets me encode x264 videos using the processing power of multiple computers connected over a LAN. Furthermore, it would be nice if it worked with new(er) x264 builds and supported two-pass encoding. If at any time someone has an updated answer, or a new solution to this problem, please post it and it will be given some consideration.

Problems when looping over a series of ssh-ed commands

- by Jack Medley

I have a series of server machines on which I want to run the same command. Each command takes hours and (even though I am running the commands using nohup and setting them to run in the background) I have to wait for each to finish before the next starts. Here is roughly how I have set it up. On the host machine:
    for i in {1..9}; do ssh RemoteMachine${i} ./RunJobs.sh; done
where RunJobs.sh on each remote machine is:
    source ~/.bash_profile
    cd AriadneMatching
    for file in FileDirectory/Input_*; do
        nohup ./Executable ${file} &
    done
    exit
Does anyone know of a way to avoid waiting for each job to finish before the next starts? Or, alternatively, a better way of doing this? I have a feeling what I am doing is fairly sub-optimal. Cheers, Jack
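
The loop above starts each ssh only after the previous one has returned, so one way to avoid the wait is to start all nine sessions first and collect their exit codes afterwards. The sketch below shows that pattern with Python's subprocess module; `ssh_command` and `run_on_all` are hypothetical helper names, and the host names and script path are taken from the question.

```python
import subprocess
from typing import Callable, List, Sequence


def ssh_command(host: str, remote_cmd: str = "./RunJobs.sh") -> List[str]:
    """Build the argument list for running one command on one host."""
    return ["ssh", host, remote_cmd]


def run_on_all(hosts: Sequence[str],
               popen: Callable = subprocess.Popen) -> List[int]:
    """Start one ssh session per host, then wait for all of them.

    Every session is launched before any of them is waited on, so the
    remote jobs run concurrently instead of one machine at a time.
    `popen` is injectable only so the function can be tried without
    real machines.
    """
    procs = [popen(ssh_command(host)) for host in hosts]
    return [proc.wait() for proc in procs]


if __name__ == "__main__":
    machines = [f"RemoteMachine{i}" for i in range(1, 10)]
    print(run_on_all(machines))
```

It is also likely, though not verified against this setup, that each ssh session stays open because the backgrounded jobs still hold its output streams; redirecting their output to a file inside RunJobs.sh usually lets ssh return immediately.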

Why doesn't SSHFS let me look into a mounted directory?

- by Jan

I use SSHFS to mount a directory on a remote server. There is a user xxx on both client and server. UID and GID are identical on both boxes. I use
    sshfs -o kernel_cache -o auto_cache -o reconnect -o compression=no \
        -o cache_timeout=600 -o ServerAliveInterval=15 \
        [email protected]:/mnt/content /home/xxx/path_to/content
to mount the directory on the remote server. When I log in as xxx on the client I have no problems; I can cd into /home/xxx/path_to/content. But when I log in on the client as another user zzz and then run
    $ ls -l /home/xxx/path_to
I get this:
    d????????? ? ? ? ? ? content
and on
    $ ls -l /home/xxx/path_to/content
I get
    ls: cannot access content: Permission denied
When I do
    $ ls -l /mnt
on the remote server I get
    drwxr-xr-x 6 xxx xxx 4096 2011-07-25 12:51 content
What am I doing wrong? The permissions seem correct to me. Am I wrong?
