openmpi - Developer IT

Openmpi 1.6.3 on ubuntu 12.10

- by torem

I manually installed openmpi 1.6.3 from the tar.gz on Ubuntu 12.10. But now mpif90.openmpi returns the following:

    Cannot open configuration file /usr/local/share/openmpi/mpif90.openmpi-wrapper-data.txt
    Error parsing data file mpif90.openmpi: Not found

How can I get mpif90.openmpi running again? It ran fine when I installed openmpi using apt-get install, but that way I only get version 1.6.1. Thanks.


Running OpenMPI on Windows XP

- by iamweird

Hi there. I'm trying to build a simple cluster based on Windows XP. I compiled OpenMPI-1.4.2 successfully, and tools like mpicc and ompi_info work too, but I can't get my mpirun working properly. The only output I can see is:

    Z:\orterun --hostfile z:\hosts.txt -np 2 hostname
    [host0:04728] Failed to initialize COM library. Error code = -2147417850
    [host0:04728] [[8946,0],0] ORTE_ERROR_LOG: Error in file ..\..\openmpi-1.4.2\orte\mca\ess\hnp\ess_hnp_module.c at line 218
    --------------------------------------------------------------------------
    It looks like orte_init failed for some reason; your parallel process is
    likely to abort. There are many reasons that a parallel process can
    fail during orte_init; some of which are due to configuration or
    environment problems. This failure appears to be an internal failure;
    here's some additional information (which may only be relevant to an
    Open MPI developer):

    orte_plm_init failed
    -- Returned value Error (-1) instead of ORTE_SUCCESS
    --------------------------------------------------------------------------
    [host0:04728] [[8946,0],0] ORTE_ERROR_LOG: Error in file ..\..\openmpi-1.4.2\orte\runtime\orte_init.c at line 132
    --------------------------------------------------------------------------
    It looks like orte_init failed for some reason; your parallel process is
    likely to abort. There are many reasons that a parallel process can
    fail during orte_init; some of which are due to configuration or
    environment problems. This failure appears to be an internal failure;
    here's some additional information (which may only be relevant to an
    Open MPI developer):

    orte_ess_set_name failed
    -- Returned value Error (-1) instead of ORTE_SUCCESS
    --------------------------------------------------------------------------
    [host0:04728] [[8946,0],0] ORTE_ERROR_LOG: Error in file ..\..\..\..\openmpi-1.4.2\orte\tools\orterun\orterun.c at line 543

Where z:\hosts.txt appears as follows:

    host0
    host1

Z: is a shared network drive available to both host0 and host1. What is my problem and how do I fix it?

Upd: OK, this problem seems to be fixed. It seems to me that the WideCap driver and/or its software components cause this error to appear. A "clean" machine runs a local task successfully. However, I still cannot run a task across 2 machines; I get the following message:

    Z:\mpirun --hostfile z:\hosts.txt -np 2 hostname
    connecting to host1
    username:cluster
    password:********
    Save Credential?(Y/N) y
    [host0:04728] This feature hasn't been implemented yet.
    [host0:04728] Could not connect to namespace cimv2 on node host1. Error code =-2147024891
    --------------------------------------------------------------------------
    mpirun was unable to start the specified application as it encountered an error.
    More information may be available above.
    --------------------------------------------------------------------------

I googled a little and did everything described here: http://www.open-mpi.org/community/lists/users/2010/03/12355.php but I'm still getting the same error. Can anyone help me?

Upd2: Error code -2147024891 might be the WMI error WBEM_E_INVALID_PARAMETER (0x80041008), which occurs when one of the parameters passed to the WMI call is not correct. Does this mean that the problem is in the OpenMPI source code itself? Or maybe it's because of the wrong/outdated wincred.h and credui.lib I used while building OpenMPI from source?
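The two negative error codes quoted in this question are signed 32-bit renderings of Windows HRESULT values, which are usually looked up in hexadecimal. The sketch below only does that conversion; the function name is made up for illustration. Worked through, -2147024891 comes out as 0x80070005, which the Windows headers list as E_ACCESSDENIED (general access denied) and not as 0x80041008, and -2147417850 comes out as 0x80010106 (RPC_E_CHANGED_MODE). Whether access rights are the actual cause here is an assumption the post does not confirm.

```python
def hresult_hex(code: int) -> str:
    """Return the unsigned 32-bit hex form of a signed HRESULT value."""
    # Masking with 0xFFFFFFFF reinterprets the two's-complement value as unsigned.
    return f"0x{code & 0xFFFFFFFF:08X}"


# The two codes that appear in the output quoted in the question.
for code in (-2147417850, -2147024891):
    print(code, "->", hresult_hex(code))
```

For comparison, 0x80041008 would be printed by the tools as -2147217400, which is not the value shown in the mpirun output.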


MPICH vs OpenMPI

- by lava

Can someone elaborate on the differences between the OpenMPI and MPICH implementations of MPI? Which of the two is the better implementation?


Unable to run OpenMPI across more than two machines

- by rcollyer

When attempting to run the first example in the boost::mpi tutorial, I was unable to run across more than two machines. Specifically, this seemed to run fine:

    mpirun -hostfile hostnames -np 4 boost1

with each hostname in hostnames as <node_name> slots=2 max_slots=2. But when I increase the number of processes to 5, it just hangs. I have decreased the number of slots/max_slots to 1 with the same result when I exceed 2 machines. On the nodes, this shows up in the job list:

    <user> Ss orted --daemonize -mca ess env -mca orte_ess_jobid 388497408 \
        -mca orte_ess_vpid 2 -mca orte_ess_num_procs 3 -hnp-uri \
        388497408.0;tcp://<node_ip>:48823

Additionally, when I kill it, I get this message:

    node2- daemon did not report back when launched
    node3- daemon did not report back when launched

The cluster is set up with the mpi and boost libs accessible on an NFS mounted drive. Am I running into a deadlock with NFS? Or is something else going on?
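To make the slot arithmetic in this question concrete, here is a minimal sketch of by-slot placement: each host's slots are filled before the next host is used. It assumes three hypothetical hosts (node1 to node3) with slots=2 each, as in the hostfile described, and it only models the counting; it is not Open MPI's actual mapper, and the function name is invented for illustration.

```python
def place_by_slot(hosts, np):
    """Assign np ranks to hosts, filling each host's slots before moving on.

    hosts is a list of (name, slots) pairs; the result lists the host
    chosen for rank 0, rank 1, and so on.
    """
    placement = []
    for name, slots in hosts:
        for _ in range(slots):
            if len(placement) == np:
                return placement
            placement.append(name)
    if len(placement) < np:
        raise ValueError("not enough slots for the requested process count")
    return placement


# Hypothetical hostfile contents: three nodes with slots=2 each.
hosts = [("node1", 2), ("node2", 2), ("node3", 2)]
print(place_by_slot(hosts, 4))
print(place_by_slot(hosts, 5))
```

Under these assumptions, 4 processes fit on the first two hosts and the fifth is the first one that needs a third host, which matches the observation that the hang appears only once more than two machines are involved.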


matrix multiplication with MPI [on hold]

- by user3695701

I'm working on an assignment on matrix multiplication with MPI: A*B=C. The requirement is that B should be vertically partitioned. Here's what I intend to do: broadcast matrix A to all processes and scatter B into several slices, with each slice containing n/p columns. The following code only works when the number of processes (p) is 1. When p > 1 (say 2), I get:

    [cluster2:21080] *** Process received signal ***
    [cluster2:21080] Signal: Segmentation fault (11)
    [cluster2:21080] Signal code: Address not mapped (1)
    [cluster2:21080] Failing at address: (nil)
    [cluster2:21080] [ 0] /lib/libpthread.so.0(+0xf8f0) [0x7f49f38108f0]
    [cluster2:21080] [ 1] /lib/libc.so.6(memcpy+0xe1) [0x7f49f35024c1]
    [cluster2:21080] [ 2] /usr/lib/libmpi.so.0(ompi_convertor_unpack+0x121)[0x7f49f47c88e1]
    [cluster2:21080] [ 3] /usr/lib/openmpi/lib/openmpi/mca_pml_ob1.so(+0x8a26) [0x7f49f0dcea26]
    [cluster2:21080] [ 4] /usr/lib/openmpi/lib/openmpi/mca_btl_tcp.so(+0x662c) [0x7f49efce462c]
    [cluster2:21080] [ 5] /usr/lib/libopen-pal.so.0(+0x1ede8) [0x7f49f42e0de8]
    [cluster2:21080] [ 6] /usr/lib/libopen-pal.so.0(opal_progress+0x99) [0x7f49f42d5369]
    [cluster2:21080] [ 7] /usr/lib/openmpi/lib/openmpi/mca_pml_ob1.so(+0x5585) [0x7f49f0dcb585]
    [cluster2:21080] [ 8] /usr/lib/openmpi/lib/openmpi/mca_coll_tuned.so(+0xcc01) [0x7f49eeeb1c01]
    [cluster2:21080] [ 9] /usr/lib/openmpi/lib/openmpi/mca_coll_tuned.so(+0x266c) [0x7f49eeea766c]
    [cluster2:21080] [10] /usr/lib/openmpi/lib/openmpi/mca_coll_sync.so(+0x1388) [0x7f49ef0c0388]
    [cluster2:21080] [11] /usr/lib/libmpi.so.0(MPI_Bcast+0x10e) [0x7f49f47d025e]
    [cluster2:21080] [12] ./out(main+0x259) [0x401571]
    [cluster2:21080] [13] /lib/libc.so.6(__libc_start_main+0xfd) [0x7f49f3498c8d]
    [cluster2:21080] [14] ./out() [0x400f29]
    [cluster2:21080] *** End of error message ***

Can someone help me? Thanks.

    //matrices A and B
    //double* A =(double *)malloc(n*n*sizeof(double));
    //double* B =(double *)malloc(n*n*sizeof(double));
    //code initializing A,B...
    //n is the size of the matrix
    //p is the number of processes
    //myrank is the rank of calling process
    MPI_Init (&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
    MPI_Comm_size(MPI_COMM_WORLD, &p);

    //broadcast A to all processes
    MPI_Bcast (A, n*n, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    MPI_Datatype tmp_type, col_type;
    // extract a slice from B
    MPI_Type_vector(n, num_of_col_per_slice, n, MPI_DOUBLE, &tmp_type);
    // position of the first (0) and each next (stride * sizeof(double) ) slice
    MPI_Type_create_resized(tmp_type, 0, n * sizeof(double), &col_type);
    MPI_Type_commit(&col_type);

    //scatter a slice of B to each process
    MPI_Scatter(B, 1, col_type, B+myrank*n/p, n * n/p, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    //use blas function to calculate A*sliceOfB and store the resulting slice to C
    cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, n, n/p, n, 1.0, A, n, B+myrank*n/p, n, 0.0, C+myrank*n/p, n);

    //gather all those resulting slices into C
    MPI_Gather (C+myrank*n/p, n*n/p, MPI_DOUBLE, C, n*n/p, MPI_DOUBLE, 0, MPI_COMM_WORLD);
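The partitioning plan in this question (every process holds all of A plus n/p columns of B and computes the matching n/p columns of C) can be checked without MPI. The following is a minimal pure-Python sketch under stated assumptions: row-major flat arrays as in the C code, n divisible by p, and hypothetical helper names (column_slice, multiply_slice, gather_columns) that stand in for the scatter, the dgemm call, and the gather. It does not reproduce the MPI datatypes or diagnose the segmentation fault.

```python
def column_slice(B, n, p, rank):
    """Copy columns [rank*n/p, (rank+1)*n/p) of the row-major n x n matrix B
    into a contiguous row-major n x (n/p) buffer."""
    w = n // p
    first = rank * w
    return [B[i * n + first + j] for i in range(n) for j in range(w)]


def multiply_slice(A, B_slice, n, w):
    """Return A (n x n) times B_slice (n x w), both row-major flat lists."""
    return [sum(A[i * n + k] * B_slice[k * w + j] for k in range(n))
            for i in range(n) for j in range(w)]


def gather_columns(slices, n, p):
    """Place p row-major n x (n/p) result slices back into an n x n matrix."""
    w = n // p
    C = [0.0] * (n * n)
    for rank, s in enumerate(slices):
        for i in range(n):
            for j in range(w):
                C[i * n + rank * w + j] = s[i * w + j]
    return C


n, p = 4, 2
A = [float(i + 1) for i in range(n * n)]
B = [float((i * 7) % 5) for i in range(n * n)]

# Each "rank" multiplies A by its own column slice of B.
slices = [multiply_slice(A, column_slice(B, n, p, r), n, n // p) for r in range(p)]
C = gather_columns(slices, n, p)

# Compare against the product computed in one piece.
print(C == multiply_slice(A, B, n, n))
```

Two things the sketch makes visible: each rank's slice holds n*(n/p) elements, so every rank needs buffers of that size for its B and C slices (and a full n*n buffer for A) before any collective call; and gather_columns interleaves the slices back into columns, whereas concatenating the slices end to end would give a different layout from the row-major C.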


Possible to distribute an MPI (C++) program across the internet rather than within a LAN cluster?

- by Ben

Hi there, I've written some MPI code which works flawlessly on large clusters. Each node in the cluster has the same CPU architecture and has access to a networked (i.e. 'common') file system (so that each node can execute the actual binary). But consider this scenario: I have a machine in my office with a dual core processor (Intel). I have a machine at home with a dual core processor (AMD). Both machines run Linux, and both machines can successfully compile and run the MPI code locally (i.e. using 2 cores). Now, is it possible to link the two machines together via MPI, so that I can utilise all 4 cores, bearing in mind the different architectures, and bearing in mind the fact that there are no shared (networked) filesystems? If so, how? Thanks, Ben.


Looking for mpic++

- by unknownthreat

I am following the instructions at http://www.boost.org/doc/libs/1_43_0/doc/html/mpi/getting_started.html#mpi.config to build the Boost MPI .lib files, but I have one problem: I do not have mpic++. Looking at the files of MPI implementations such as MPICH2 and Open MPI, I see no mpic++ included at all. Where can I find mpic++?


Error mpicc command not found [closed]

- by skn

I want to compile hdf5 but I get the following error:

    /hdf5/hdf5-1.6.9 CC=/usr/local/openmpi/bin/mpicc ./configure /home/sknandi/Research/Simulation/hdf5/parallel_fdf5
    CC=/usr/local/openmpi/bin/mpicc: Command not found.

The result of echo $PATH is:

    /hdf5/hdf5-1.6.9 echo $PATH
    /priv/myriad3/ayw/research/COALA/visit/bin:/usr/local/bin:/usr/bin:/bin:/pkg/linux/intel/composerxe-2011.3.174/composerxe-2011.3.174/bin/intel64:/pkg/linux/casa/x86_64:/usr/local/bin:/bin:/usr/bin:/usr/local/openmpi/bin:/pkg/linux/intel/composerxe-2011.3.174/composerxe-2011.3.174/mpirt/bin/intel64:/pkg/linux/SS12/solstudio12.2/bin:/usr/local/vanilla-pds/bin

and the result of which mpicc is:

    /hdf5/hdf5-1.6.9 which mpicc
    /usr/local/openmpi/bin/mpicc
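A message of the form `CC=...: Command not found.` is what a csh-family shell prints when it is given a `VAR=value command` prefix, because that syntax belongs to Bourne-style shells. That is one plausible reading of the output in this question, not something the post confirms. A shell-independent way to hand CC to a child process is to set it in the child's environment explicitly. The sketch below assumes a stand-in child command that merely prints the CC it received; it is not the real ./configure, and run_with_cc is a hypothetical helper name.

```python
import os
import subprocess
import sys


def run_with_cc(cc_path, command):
    """Run command with CC set in its environment and return its stdout."""
    # Copy the current environment and add (or override) CC for the child only.
    env = dict(os.environ, CC=cc_path)
    result = subprocess.run(command, env=env, capture_output=True,
                            text=True, check=True)
    return result.stdout


# Stand-in child: a Python one-liner that prints the CC value it was given.
child = [sys.executable, "-c", "import os; print(os.environ['CC'])"]
print(run_with_cc("/usr/local/openmpi/bin/mpicc", child), end="")
```

For a real build the command list would name ./configure and its arguments. In a csh or tcsh session the equivalent is `env CC=/usr/local/openmpi/bin/mpicc ./configure ...`, or a `setenv CC ...` before running configure.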


Emacs CEDET and system include paths

- by synasius

Hello everyone, I'd like to add the path to the openMPI library headers. After I found that all the openMPI headers are in /usr/lib/openmpi/include/*, I added these two lines to my .emacs:

    (semantic-add-system-include "/usr/lib/openmpi/include" 'c-mode)
    (semantic-add-system-include "/usr/lib/openmpi/include" 'c++-mode)

I think this is OK, but it's not working! This is the result of the semantic-c-describe-environment command:

    This file's system include path is:
    /usr/include
    /usr/local/include/
    /usr/lib/gcc/i486-linux-gnu/4.4.3/include/
    /usr/lib/gcc/i486-linux-gnu/4.4.3/include-fixed/
    /usr/include/

I can't figure out what's wrong or what I'm missing. Thanks


Error while compiling Cuda Accelerated Linpack hpl_2.0_FERMI

- by ghostrustam

I use Ubuntu 11.04 x86_64, CUDA 4.0, OpenMPI 1.4 stable, and MKL. When I compile, I get this error:

    ar r -L/home/limksadmin/hpl-2.0_FERMI_v13/lib/CUDA/libhpl.a HPL_dlacpy.o HPL_dlatcpy.o HPL_fprintf.o HPL_warn.o HPL_abort.o HPL_dlaprnt.o HPL_dlange.o HPL_dlamch.o
    ar: -L/home/limksadmin/hpl-2.0_FERMI_v13/lib/CUDA/libhpl.a: No such file or directory
    make[2]: *** [lib.grd] Error 9
    make[2]: Leaving directory `/home/limksadmin/hpl-2.0_FERMI_v13/src/auxil/CUDA'
    make[1]: *** [build_src] Error 2
    make[1]: Leaving directory `/home/limksadmin/hpl-2.0_FERMI_v13'
    make: *** [build] Error 2

Make.CUDA:

    LAdir = /opt/intel/mkl/lib/intel64
    LAlib = -L $(TOPdir)/src/cuda -ldgemm -L/usr/local/cuda/lib64 -lcuda -lcudart -lcublas -L$(LAdir) -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5
    MPdir = /usr/local/mpi/openmpi
    MPinc = -I$(MPdir)/include
    MPlib = -L$(MPdir)/lib/libmpi.so
    CC = /usr/local/mpi/openmpi/bin/mpicc

What could be the problem?


Configuring MPI on 2 nodes

- by Wysek

I'm trying to create a really simple "cluster" from 2 multicore computers using openmpi. My problem is that I can't find any tutorials on the matter. I don't want to use torque because it's not necessary in my case; nevertheless, all the tutorials give configuration details for either torque or mpd (which doesn't exist in the openmpi implementation). Could you give me some tips or links to appropriate manuals?

Steps I've already completed:
- openmpi installation
- network configuration (computers see each other)
- ssh password-less login to second computer

I tried using machinefiles without further configuration and with just 2 IPs in it. But jobs don't seem to start at all after the initialization part. (MPI seems to work because I'm able to scatter jobs on multiple cores of both computers without communication between them.)


openmp vs opencl for computer vision

- by user1235711

I am creating a computer vision application that detects objects via a web camera. I am currently focusing on the performance of the application. My problem is in the part of the application that generates the XML cascade file using Haartraining. This is very slow and takes about 6 days. To get around this problem I decided to use multiprocessing to minimize the total time needed to generate the Haartraining XML file. I found two solutions: opencl and (openMp and openMPI). Now I'm confused about which one to use. I read that opencl is for using multiple CPUs and GPUs, but on the same machine. Is that so? On the other hand, OpenMP is for multi-processing, and using openmpi we can use multiple CPUs over the network. But OpenMP has no GPU support. Can you please suggest the pros and cons of using either of the libraries?


Linking Error: undefined reference to `MPI_Init' on Windows 7

- by fatpipp

I am using the OpenMPI library to write a program to run on Windows 7. I compile and build with C Free 4.0 and MinGW. Compiling is OK, but when the compiler links the objects, "undefined reference to ..." errors occur. I have set up the environment already: I added the OpenMPI lib, include and bin folders to the C Free Build Directories. I added them to the Windows environment variables too. But the error still occurs. Can anyone tell me how to fix it? Thanks a lot.


c++ programming for clusters and HPC

- by Abruzzo Forte e Gentile

Hi all, I need to write a scientific application in C++ that does a lot of computation and uses a lot of memory. I have part of the job done, but because of the high resource requirements I was thinking of moving to OpenMPI. Before doing that, I have a simple curiosity: if I understood the principle of OpenMPI correctly, it is the developer who has the task of splitting the jobs over different nodes, calling SEND and RECEIVE based on the nodes available at that time. Do you know whether some library or OS or whatever exists that has this capability while letting my code remain as it is now? Basically, something that connects all the computers and lets them share their memory and CPU as one? I am a bit confused because of the large amount of material available on the topic. Should I look at cloud computing? Or Distributed Shared Memory? Can you help me or point me in the right direction? Thanks


trying to build Boost MPI, but the lib files are not created. What's going on?

- by unknownthreat

I am trying to run a program with Boost MPI, but the thing is I don't have the .lib. So I tried to create one by following the instructions at http://www.boost.org/doc/libs/1_43_0/doc/html/mpi/getting_started.html#mpi.config

The instructions say "For many users using LAM/MPI, MPICH, or OpenMPI, configuration is almost automatic". I got myself OpenMPI in C:\, but I didn't do anything more with it. Do we need to do anything with it? Besides that, another statement from the instructions: "If you don't already have a file user-config.jam in your home directory, copy tools/build/v2/user-config.jam there." Well, I simply did what it says. I put "user-config.jam" in C:\boost_1_43_0 and added "using mpi ;" to the file. Next, this is what I've done: bjam --with-mpi

    C:\boost_1_43_0>bjam --with-mpi
    WARNING: No python installation configured and autoconfiguration
    failed. See http://www.boost.org/libs/python/doc/building.html
    for configuration instructions or pass --without-python to
    suppress this message and silently skip all Boost.Python targets

    Building the Boost C++ Libraries.

    warning: skipping optional Message Passing Interface (MPI) library.
    note: to enable MPI support, add "using mpi ;" to user-config.jam.
    note: to suppress this message, pass "--without-mpi" to bjam.
    note: otherwise, you can safely ignore this message.
    warning: Unable to construct ./stage-unversioned
    warning: Unable to construct ./stage-unversioned

    Component configuration:

    - date_time : not building
    - filesystem : not building
    - graph : not building
    - graph_parallel : not building
    - iostreams : not building
    - math : not building
    - mpi : building
    - program_options : not building
    - python : not building
    - random : not building
    - regex : not building
    - serialization : not building
    - signals : not building
    - system : not building
    - test : not building
    - thread : not building
    - wave : not building

    ...found 1 target...

    The Boost C++ Libraries were successfully built!

    The following directory should be added to compiler include paths:

    C:\boost_1_43_0

    The following directory should be added to linker library paths:

    C:\boost_1_43_0\stage\lib

    C:\boost_1_43_0>

I see that there are many libs in C:\boost_1_43_0\stage\lib, but I see no trace of libboost_mpi-vc100-mt-1_43.lib or libboost_mpi-vc100-mt-gd-1_43.lib at all. These are the libraries required for linking MPI applications. What could possibly have gone wrong when the libraries are not being built?


What is the best MPI implementation

- by pvsnp

I have to set up an MPI system on a cluster. If anyone here has any experience with MPI (MPICH/OpenMPI), I'd like to know which is better and how performance can be boosted on a cluster of x86_64 boxes.


Sharing information between nodes in Beowulf Cluster

- by Alejandro Sazo

I am setting up a Beowulf cluster and I've been reading that it might be necessary to share the home directory of the cluster users between the nodes (assuming these users are local to each machine). The other option is to leave each user with its own home on each machine, with communication left up to the master node. Another idea that came up was to use a single LDAP user logged in on each machine in the cluster, which keeps the idea of a home shared between nodes (but it is only one home of one user). Which approach is better for this kind of cluster?

Edit: The cluster is running openmpi and it will support cuda and opencl


How to solve package issues/dependencies

- by Wolfgang Kuehne

Background info

I am trying to install the Veins simulation environment by following the tutorial provided by the author. Step 1 requires installing some packages on Linux; the tutorial suggests running this command in the Terminal:

    sudo apt-get install build-essential gcc g++ bison flex perl tcl-dev tk-dev blt libxml2-dev zlib1g-dev default-jre doxygen graphviz libwebkitgtk-1.0-0 openmpi-bin libopenmpi-dev libpcap-dev autoconf automake libtool libxerces-c2-dev proj libgdal1-dev libfox-1.6-dev

When I execute this command, I immediately get:

    E: Package 'proj' has no installation candidate

Then I remove proj from the command and execute it again, and next I get:

    The following packages have unmet dependencies:
     libgdal1-dev : Depends: libgdal-dev but it is not going to be installed
    E: Unable to correct problems, you have held broken packages.

So I remove libgdal1-dev from the command as well. Then it executes fine, downloading the remaining packages. To troubleshoot the problem with proj and libgdal1-dev I go to the Synaptic Package Manager.

libgdal1-dev

I search for libgdal1-dev in Synaptic Package Manager and I get an entry. I Mark for Installation and then Synaptic Package Manager suggests removing libxerces-c2-dev, which was actually added via the initial command. Should I trust Synaptic Package Manager with this suggestion, and proceed further?

proj

What should I do about proj? There are some packages in Synaptic Package Manager such as proj-bin or libproj-dev. Should I install them? I think proj has to do with this and this.

What should I do to make sure that this simulation tool works fine?


How to manage several Linux workstations like a cluster?

- by Richard Zak

How does one go about managing a lab of Linux workstations? I'd like for users to be able to log in, run their GUI apps (LibreOffice, Firefox, Eclipse, etc), and for the computers to be able to be used as compute nodes (OpenMPI). This part I'm fine with. But how can I centrally deploy a new software package or upgrade an installed package? How can I reload the entire OS on a given node, as if these workstations were part of a super computing cluster? Is there a nice program to help with setting up PXE booting and image management, and remotely managing packages? Ideally such a system would work with Ubuntu. If there isn't a nice package, how could this be set up manually?


Python error after installing libboost-all-dev on debian [migrated]

- by Cameron Metzke

A friend of mine wanted the libboost libraries installed on our shared computer (a Debian wheezy machine), so I installed libboost-all-dev 1.49.0.1. Now I get an error when using the "pydoc modules" command on the command line. It spits out the following:

    root@debian:/usr/include/c++/4.7# pydoc modules

    Please wait a moment while I gather a list of all available modules...

    [debian:49065] [[INVALID],INVALID] ORTE_ERROR_LOG: A system-required executable either could not be found or was not executable by this user in file ../../../../../../orte/mca/ess/singleton/ess_singleton_module.c at line 357
    [debian:49065] [[INVALID],INVALID] ORTE_ERROR_LOG: A system-required executable either could not be found or was not executable by this user in file ../../../../../../orte/mca/ess/singleton/ess_singleton_module.c at line 230
    [debian:49065] [[INVALID],INVALID] ORTE_ERROR_LOG: A system-required executable either could not be found or was not executable by this user in file ../../../orte/runtime/orte_init.c at line 132
    --------------------------------------------------------------------------
    It looks like orte_init failed for some reason; your parallel process is
    likely to abort. There are many reasons that a parallel process can
    fail during orte_init; some of which are due to configuration or
    environment problems. This failure appears to be an internal failure;
    here's some additional information (which may only be relevant to an
    Open MPI developer):

    orte_ess_set_name failed
    --> Returned value A system-required executable either could not be found or was not executable by this user (-127) instead of ORTE_SUCCESS
    --------------------------------------------------------------------------
    --------------------------------------------------------------------------
    It looks like MPI_INIT failed for some reason; your parallel process is
    likely to abort. There are many reasons that a parallel process can
    fail during MPI_INIT; some of which are due to configuration or environment
    problems. This failure appears to be an internal failure; here's some
    additional information (which may only be relevant to an Open MPI
    developer):

    ompi_mpi_init: orte_init failed
    --> Returned "A system-required executable either could not be found or was not executable by this user" (-127) instead of "Success" (0)
    --------------------------------------------------------------------------
    *** The MPI_Init() function was called before MPI_INIT was invoked.
    *** This is disallowed by the MPI standard.
    *** Your MPI job will now abort.
    [debian:49065] Abort before MPI_INIT completed successfully; not able to guarantee that all other processes were killed!
    root@debian:/usr/include/c++/4.7#

I tried looking into the problem and ended up uninstalling the following to get it to work again:

    openmpi common all 1.4.5-1
    libibverbs-dev amd64 1.1.6-1
    libopenmpi-dev amd64 1.4.5-1
    mpi-default-dev amd64 1.0.1
    libboost-mpi-python1.49.0

Although pydoc works again, I'm assuming the packages I removed are going to hurt something else down the track? As you guessed, I'm not a C/C++ programmer. So I guess my question is: will this hurt something later? Is there a way to install those packages without hurting Python?

Search Results

Search found 20 results on 1 page for 'openmpi'.
