Search Results

Search found 6716 results on 269 pages for 'distributed algorithm'.

Page 138/269 | < Previous Page | 134 135 136 137 138 139 140 141 142 143 144 145  | Next Page >

  • Nagios: Central Monitoring

    <b>Begin Linux:</b> "This is part two of a three part series on distributed monitoring. You can usehttp://newsadmin.linuxtoday.com/ passive service and host checks to allow non-central Nagios servers to collect data from a network of machines and then transfer that information to a central Nagios server."

    Read the article

  • DRBD and MySQL - Virtualbox Setup

    DRBD is a Linux project that provides a real-time distributed filesystem. Sean Hull demonstrates how to use Sun's virtualbox software to create a pair of VMs, then configure those VMs with DRBD, and finally install and test MySQL running on volumes sitting on DRBD.

    Read the article

  • DRBD and MySQL - Virtualbox Setup

    DRBD is a Linux project that provides a real-time distributed filesystem. Sean Hull demonstrates how to use Sun's virtualbox software to create a pair of VMs, then configure those VMs with DRBD, and finally install and test MySQL running on volumes sitting on DRBD.

    Read the article

  • Article Sharing &ndash; Windows Azure Memcached Plugin

    - by Shaun
    I just found that David Aiken, a windows azure developer and evangelist, wrote a cool article about how to use Memcached in Windows Azure through the new feature Azure Plugin. http://www.davidaiken.com/2011/01/11/windows-azure-memcached-plugin/ I think the best solution for distributed cache in Azure would be the Windows Azure AppFabric Caching but since it’s only in CTP and avaiable in the US data center David’s solution would be the best. Only one thing I’m concerning about, is the stability of windows verion Memcached.

    Read the article

  • Tellago releases a RESTful API for BizTalk Server business rules

    - by Charles Young
    Jesus Rodriguez has blogged recently on Tellago Devlabs' release of an open source RESTful API for BizTalk Server Business Rules.   This is an excellent addition to the BizTalk ecosystem and I congratulate Tellago on their work.   See http://weblogs.asp.net/gsusx/archive/2011/02/08/tellago-devlabs-a-restful-api-for-biztalk-server-business-rules.aspx   The Microsoft BRE was originally designed to be used as an embedded library in .NET applications. This is reflected in the implementation of the Rules Engine Update (REU) Service which is a TCP/IP service that is hosted by a Windows service running locally on each BizTalk box. The job of the REU is to distribute rules, managed and held in a central database repository, across the various servers in a BizTalk group.   The engine is therefore distributed on each box, rather than exploited behind a central rules service.   This model is all very well, but proves quite restrictive in enterprise environments. The problem is that the BRE can only run legally on licensed BizTalk boxes. Increasingly we need to deliver rules capabilities across a more widely distributed environment. For example, in the project I am working on currently, we need to surface decisioning capabilities for use within WF workflow services running under AppFabric on non-BTS boxes. The BRE does not, currently, offer any centralised rule service facilities out of the box, and hence you have to roll your own (and then run your rules services on BTS boxes which has raised a few eyebrows on my current project, as all other WCF services run on a dedicated server farm ).   Tellago's API addresses this by providing a RESTful API for querying the rules repository and executing rule sets against XML passed in the request payload. As Jesus points out in his post, using a RESTful approach hugely increases the reach of BRE-based decisioning, allowing simple invocation from code written in dynamic languages, mobile devices, etc.   We developed our own SOAP-based general-purpose rules service to handle scenarios such as the one we face on my current project. SOAP is arguably better suited to enterprise service bus environments (please don't 'flame' me - I refuse to engage in the RESTFul vs. SOAP war). For example, on my current project we use claims based authorisation across the entire service bus and use WIF and WS-Federation for this purpose.   We have extended this to the rules service. I can't release the code for commercial reasons :-( but this approach allows us to legally extend the reach of BRE far beyond the confines of the BizTalk boxes on which it runs and to provide general purpose decisioning capabilities on the bus.   So, well done Tellago.   I haven't had a chance to play with the API yet, but am looking forward to doing so.

    Read the article

  • Oracle Fusion Applications Data Sheets Are Now Available

    - by David Hope-Ross
    For customers chomping at the bit for more information on Oracle Fusion applications, there is good news. We’ve just published  a complete library of data sheets for Oracle Fusion Applications. Included are SCM applications like Fusion Distributed Order Orchestration, Fusion Inventory Management, and Fusion Product Hub. And customers interested in sourcing and procurement should review documents that address Oracle Fusion Sourcing ,Oracle Fusion Procurement Contracts, Oracle Fusion Purchasing, Oracle Fusion Self Service Procurement, and Oracle Fusion Supplier Portal.

    Read the article

  • Move Data into the grid for scalable, predictable response times

    - by JuergenKress
    CloudTran is pleased to introduce the availability of the CloudTran Transaction and Persistence Manager for creating scalable, reliable data services on the Oracle Coherence In-Memory Data Grid (IMDG). Use of IMDG architectures has been key to handling today’s web-scale loads because it eliminates database latency by storing important and frequently access data in memory instead of on disk. The CloudTran product lets developers easily use an IMDG for full ACID-compliant transactions without having to be concerned about the location or spread of data. The system has its own implementation of fast, scalable distributed transactions that does NOT depend on XA protocols but still guarantees all ACID properties. Plus, CloudTran asynchronously replicates data going into the IMDG to back-end datastores and back-up data centers, again ensuring ACID properties. CloudTran can be accessed through Java Persistence API (JPA via TopLink Grid) and now, through a new Low-Level API, or LLAPI. This is ideal for use in SOA applications that need data reliability, high availability, performance, and scalability. It is still in its limited beta release, the LLAPI gives developers the ability to use standard put/remove logic available in Coherence and then wrap logic with simple Spring annotations or XML+AspectJ to start transactions. An important feature of LLAPI is the ability to join transactions. This is a common outcome for SOA applications that need to reduce network traffic by aggregating data into single cache entries and then doing SOA service processing in the node holding the data. This results in the need to orchestrate transaction processing across multiple service calls. CloudTran has the capability to handle these “multi-client” transactions at speed with no loss in ACID properties. Developing software around an IMDG like Oracle Coherence is an important choice for today’s web-scale applications and services. But this introduces new architectural considerations to maintain scalability in light of increased network loads and data movement. Without using CloudTran, developers are faced with an incredibly difficult task to ensure data reliability, availability, performance, and scalability when working with an IMDG. Working with highly distributed data that is entirely volatile while stored in memory presents numerous edge cases where failures can result in data loss. The CloudTran product takes care of all of this, leaving developers with the confidence and peace of mind that all data is processed correctly. For those interested in evaluating the CloudTran product and IMDGs, take a look at this link for more information: http://www.CloudTran.com/downloadAPI.ph , or send your questions to [email protected]. SOA & BPM Partner Community For regular information on Oracle SOA Suite become a member in the SOA & BPM Partner Community for registration please visit  www.oracle.com/goto/emea/soa (OPN account required) If you need support with your account please contact the Oracle Partner Business Center. Blog Twitter LinkedIn Mix Forum Technorati Tags: CloudTran,data grid,M,SOA Community,Oracle SOA,Oracle BPM,BPM,Community,OPN,Jürgen Kress

    Read the article

  • How to exclude copy local referenced assemblies from a VSIX

    - by Daniel Cazzulino
    When you add library references to project that are not reference assemblies or installed in the GAC, Visual Studio defaults to setting Copy Local to True: If, however, those dependencies are distributed by some other means (i.e. another extension, or are part of VS private assemblies, or whatever) and you want to avoid including them in your VSIX, you can add the following property to the project file: &lt;PropertyGroup&gt; ... &lt;IncludeCopyLocalReferencesInVSIXContainer&gt;false&lt;/IncludeCopyLocalReferencesInVSIXContainer&gt;Read full article

    Read the article

  • MapRedux - PowerShell and Big Data

    - by Dittenhafer Solutions
    MapRedux – #PowerShell and #Big Data Have you been hearing about “big data”, “map reduce” and other large scale computing terms over the past couple of years and been curious to dig into more detail? Have you read some of the Apache Hadoop online documentation and unfortunately concluded that it wasn't feasible to setup a “test” hadoop environment on your machine? More recently, I have read about some of Microsoft’s work to enable Hadoop on the Azure cloud. Being a "Microsoft"-leaning technologist, I am more inclinded to be successful with experimentation when on the Windows platform. Of course, it is not that I am "religious" about one set of technologies other another, but rather more experienced. Anyway, within the past couple of weeks I have been thinking about PowerShell a bit more as the 2012 PowerShell Scripting Games approach and it occured to me that PowerShell's support for Windows Remote Management (WinRM), and some other inherent features of PowerShell might lend themselves particularly well to a simple implementation of the MapReduce framework. I fired up my PowerShell ISE and started writing just to see where it would take me. Quite simply, the ScriptBlock feature combined with the ability of Invoke-Command to create remote jobs on networked servers provides much of the plumbing of a distributed computing environment. There are some limiting factors of course. Microsoft provided some default settings which prevent PowerShell from taking over a network without administrative approval first. But even with just one adjustment, a given Windows-based machine can become a node in a MapReduce-style distributed computing environment. Ok, so enough introduction. Let's talk about the code. First, any machine that will participate as a remote "node" will need WinRM enabled for remote access, as shown below. This is not exactly practical for hundreds of intended nodes, but for one (or five) machines in a test environment it does just fine. C:> winrm quickconfig WinRM is not set up to receive requests on this machine. The following changes must be made: Set the WinRM service type to auto start. Start the WinRM service. Make these changes [y/n]? y Alternatively, you could take the approach described in the Remotely enable PSRemoting post from the TechNet forum and use PowerShell to create remote scheduled tasks that will call Enable-PSRemoting on each intended node. Invoke-MapRedux Moving on, now that you have one or more remote "nodes" enabled, you can consider the actual Map and Reduce algorithms. Consider the following snippet: $MyMrResults = Invoke-MapRedux -MapReduceItem $Mr -ComputerName $MyNodes -DataSet $dataset -Verbose Invoke-MapRedux takes an instance of a MapReduceItem which references the Map and Reduce scriptblocks, an array of computer names which are the remote nodes, and the initial data set to be processed. As simple as that, you can start working with concepts of big data and the MapReduce paradigm. Now, how did we get there? I have published the initial version of my PsMapRedux PowerShell Module on GitHub. The PsMapRedux module provides the Invoke-MapRedux function described above. Feel free to browse the underlying code and even contribute to the project! In a later post, I plan to show some of the inner workings of the module, but for now let's move on to how the Map and Reduce functions are defined. Map Both the Map and Reduce functions need to follow a prescribed prototype. The prototype for a Map function in the MapRedux module is as follows. A simple scriptblock that takes one PsObject parameter and returns a hashtable. It is important to note that the PsObject $dataset parameter is a MapRedux custom object that has a "Data" property which offers an array of data to be processed by the Map function. $aMap = { Param ( [PsObject] $dataset ) # Indicate the job is running on the remote node. Write-Host ($env:computername + "::Map"); # The hashtable to return $list = @{}; # ... Perform the mapping work and prepare the $list hashtable result with your custom PSObject... # ... The $dataset has a single 'Data' property which contains an array of data rows # which is a subset of the originally submitted data set. # Return the hashtable (Key, PSObject) Write-Output $list; } Reduce Likewise, with the Reduce function a simple prototype must be followed which takes a $key and a result $dataset from the MapRedux's partitioning function (which joins the Map results by key). Again, the $dataset is a MapRedux custom object that has a "Data" property as described in the Map section. $aReduce = { Param ( [object] $key, [PSObject] $dataset ) Write-Host ($env:computername + "::Reduce - Count: " + $dataset.Data.Count) # The hashtable to return $redux = @{}; # Return Write-Output $redux; } All Together Now When everything is put together in a short example script, you implement your Map and Reduce functions, query for some starting data, build the MapReduxItem via New-MapReduxItem and call Invoke-MapRedux to get the process started: # Import the MapRedux and SQL Server providers Import-Module "MapRedux" Import-Module “sqlps” -DisableNameChecking # Query the database for a dataset Set-Location SQLSERVER:\sql\dbserver1\default\databases\myDb $query = "SELECT MyKey, Date, Value1 FROM BigData ORDER BY MyKey"; Write-Host "Query: $query" $dataset = Invoke-SqlCmd -query $query # Build the Map function $MyMap = { Param ( [PsObject] $dataset ) Write-Host ($env:computername + "::Map"); $list = @{}; foreach($row in $dataset.Data) { # Write-Host ("Key: " + $row.MyKey.ToString()); if($list.ContainsKey($row.MyKey) -eq $true) { $s = $list.Item($row.MyKey); $s.Sum += $row.Value1; $s.Count++; } else { $s = New-Object PSObject; $s | Add-Member -Type NoteProperty -Name MyKey -Value $row.MyKey; $s | Add-Member -type NoteProperty -Name Sum -Value $row.Value1; $list.Add($row.MyKey, $s); } } Write-Output $list; } $MyReduce = { Param ( [object] $key, [PSObject] $dataset ) Write-Host ($env:computername + "::Reduce - Count: " + $dataset.Data.Count) $redux = @{}; $count = 0; foreach($s in $dataset.Data) { $sum += $s.Sum; $count += 1; } # Reduce $redux.Add($s.MyKey, $sum / $count); # Return Write-Output $redux; } # Create the item data $Mr = New-MapReduxItem "My Test MapReduce Job" $MyMap $MyReduce # Array of processing nodes... $MyNodes = ("node1", "node2", "node3", "node4", "localhost") # Run the Map Reduce routine... $MyMrResults = Invoke-MapRedux -MapReduceItem $Mr -ComputerName $MyNodes -DataSet $dataset -Verbose # Show the results Set-Location C:\ $MyMrResults | Out-GridView Conclusion I hope you have seen through this article that PowerShell has a significant infrastructure available for distributed computing. While it does take some code to expose a MapReduce-style framework, much of the work is already done and PowerShell could prove to be the the easiest platform to develop and run big data jobs in your corporate data center, potentially in the Azure cloud, or certainly as an academic excerise at home or school. Follow me on Twitter to stay up to date on the continuing progress of my Powershell MapRedux module, and thanks for reading! Daniel

    Read the article

  • Découvrir Hadoop, avec un tutoriel traduit par Stéphane Dupont

    Bonjour,Je vous présente ce tutoriel traduit par Stéphane Dupont intitulé : Tutoriel Hadoop Hadoop est un système distribué, tolérant aux pannes, pour le stockage de données et qui est hautement scalable. Cette capacité de monter en charge est le résultat d'un stockage en cluster à haute bande passante et répliqué, connu sous l'acronyme de HDFS (Hadoop Distributed File System) et d'un traitement distribué...

    Read the article

  • CentOS / Redhat: Setup NFS v4.0 File Server

    <b>nixCraft: </b>"How do I setup NFS v4.0 distributed file system access server under CentOS / RHEL v5.x for sharing files with UNIX and Linux workstations? How to export a directory with NFSv4? How to mount a directory with NFSv4?"

    Read the article

  • WildPackets Monitors Diverse Networks

    WildPackets offers portable network analysis products which are designed for use on enterprise networks and in test and measurement labs, plus distributed network analysis solutions for enterprise-wide applications.

    Read the article

  • Best Ruby Git library?

    - by Jeff Welling
    Which is the best Git library in Ruby to use? Git, Grit, Rugged, Other? Background: I'm the current maintainer of TicGit-ng which is a distributed offline ticket system built on git, and I've read and heard over and over again that Grit is the one I should use because it supersedes the Git gem, but there seems to be either a lack of documentation or a lack of features because myself and others have failed in trying to switch from the deprecated-but-functional Git to the newer Grit gem.

    Read the article

  • Obfuscating Silverlight with SmartAssembly

    If you are in the .NET Software business, you have a problem. .NET assemblies can be read, and debugged, by the purchaser with almost the same ease as if you'd distributed the source code. This isn't always what you wanted or intended, so you'll need an application such as Smart Assembly. Khawar explains the simple process of protecting your company's assets.

    Read the article

  • Move Data into the Grid for Scalable, Predictable Response Times

    - by JuergenKress
    CloudTran is pleased to introduce the availability of the CloudTran Transaction and Persistence Manager for creating scalable, reliable data services on the Oracle Coherence In-Memory Data Grid (IMDG). Use of IMDG architectures has been key to handling today’s web-scale loads because it eliminates database latency by storing important and frequently access data in memory instead of on disk. The CloudTran product lets developers easily use an IMDG for full ACID-compliant transactions without having to be concerned about the location or spread of data. The system has its own implementation of fast, scalable distributed transactions that does NOT depend on XA protocols but still guarantees all ACID properties. Plus, CloudTran asynchronously replicates data going into the IMDG to back-end datastores and back-up data centers, again ensuring ACID properties. CloudTran can be accessed through Java Persistence API (JPA via TopLink Grid) and now, through a new Low-Level API, or LLAPI. This is ideal for use in SOA applications that need data reliability, high availability, performance, and scalability. Still in limited beta release, the LLAPI gives developers the ability to use standard put/remove logic available in Coherence and then wrap logic with simple Spring annotations or XML+AspectJ to start transactions. An important feature of LLAPI is the ability to join transactions. This is a common outcome for SOA applications that need to reduce network traffic by aggregating data into single cache entries and then doing SOA service processing in the node holding the data. This results in the need to orchestrate transaction processing across multiple service calls. CloudTran has the capability to handle these “multi-client” transactions at speed with no loss in ACID properties. Developing software around an IMDG like Oracle Coherence is an important choice for today’s web-scale applications and services. But this introduces new architectural considerations to maintain scalability in light of increased network loads and data movement. Without using CloudTran, developers are faced with an incredibly difficult task to ensure data reliability, availability, performance, and scalability when working with an IMDG. Working with highly distributed data that is entirely volatile while stored in memory presents numerous edge cases where failures can result in data loss. The CloudTran product takes care of all of this, leaving developers with the confidence and peace of mind that all data is processed correctly. For those interested in evaluating the CloudTran product and IMDGs, take a look at this link for more information: http://www.CloudTran.com/downloadAPI.php, or, send your questions to [email protected]. WebLogic Partner Community For regular information become a member in the WebLogic Partner Community please visit: http://www.oracle.com/partners/goto/wls-emea ( OPN account required). If you need support with your account please contact the Oracle Partner Business Center. BlogTwitterLinkedInMixForumWiki Technorati Tags: Coherence,cloudtran,cache,WebLogic Community,Oracle,OPN,Jürgen Kress

    Read the article

  • Zenoss Setup for Windows Servers

    - by Jay Fox
    Recently I was saddled with standing up Zenoss for our enterprise.  We're running about 1200 servers, so manually touching each box was not an option.  We use LANDesk for a lot of automated installs and patching - more about that later.The steps below may not necessarily have to be completed in this order - it's just the way I did it.STEP ONE:Setup a standard AD user.  We want to do this so there's minimal security exposure.  Call the account what ever you want "domain/zenoss" for our examples.***********************************************************STEP TWO:Make the following local groups accessible by your zenoss account.Distributed COM UsersPerformance Monitor UsersEvent Log Readers (which doesn't exist on pre-2008 machines)Here's the Powershell script I used to setup access to these local groups:# Created to add Active Directory account to local groups# Must be run from elevated prompt, with permissions on the remote machine(s).# Create txt file should contain the names of the machines that need the account added, one per line.# Script will process machines line by line.foreach($i in (gc c:\tmp\computers.txt)){# Add the user to the first group$objUser=[ADSI]("WinNT://domain/zenoss")$objGroup=[ADSI]("WinNT://$i/Distributed COM Users")$objGroup.PSBase.Invoke("Add",$objUser.PSBase.Path)# Add the user to the second group$objUser=[ADSI]("WinNT://domain/zenoss")$objGroup=[ADSI]("WinNT://$i/Performance Monitor Users")$objGroup.PSBase.Invoke("Add",$objUser.PSBase.Path)# Add the user to the third group - Group doesn't exist on < Server 2008#$objUser=[ADSI]("WinNT://domain/zenoss")#$objGroup=[ADSI]("WinNT://$i/Event Log Readers")#$objGroup.PSBase.Invoke("Add",$objUser.PSBase.Path)}**********************************************************STEP THREE:Setup security on the machines namespace so our domain/zenoss account can access itThe default namespace for zenoss is:  root/cimv2Here's the Powershell script:#Grant account defined below (line 11) access to WMI Namespace#Has to be run as account with permissions on remote machinefunction get-sid{Param ($DSIdentity)$ID = new-object System.Security.Principal.NTAccount($DSIdentity)return $ID.Translate( [System.Security.Principal.SecurityIdentifier] ).toString()}$sid = get-sid "domain\zenoss"$SDDL = "A;;CCWP;;;$sid" $DCOMSDDL = "A;;CCDCRP;;;$sid"$computers = Get-Content "c:\tmp\computers.txt"foreach ($strcomputer in $computers){    $Reg = [WMIClass]"\\$strcomputer\root\default:StdRegProv"    $DCOM = $Reg.GetBinaryValue(2147483650,"software\microsoft\ole","MachineLaunchRestriction").uValue    $security = Get-WmiObject -ComputerName $strcomputer -Namespace root/cimv2 -Class __SystemSecurity    $converter = new-object system.management.ManagementClass Win32_SecurityDescriptorHelper    $binarySD = @($null)    $result = $security.PsBase.InvokeMethod("GetSD",$binarySD)    $outsddl = $converter.BinarySDToSDDL($binarySD[0])    $outDCOMSDDL = $converter.BinarySDToSDDL($DCOM)    $newSDDL = $outsddl.SDDL += "(" + $SDDL + ")"    $newDCOMSDDL = $outDCOMSDDL.SDDL += "(" + $DCOMSDDL + ")"    $WMIbinarySD = $converter.SDDLToBinarySD($newSDDL)    $WMIconvertedPermissions = ,$WMIbinarySD.BinarySD    $DCOMbinarySD = $converter.SDDLToBinarySD($newDCOMSDDL)    $DCOMconvertedPermissions = ,$DCOMbinarySD.BinarySD    $result = $security.PsBase.InvokeMethod("SetSD",$WMIconvertedPermissions)     $result = $Reg.SetBinaryValue(2147483650,"software\microsoft\ole","MachineLaunchRestriction", $DCOMbinarySD.binarySD)}***********************************************************STEP FOUR:Get the SID for our zenoss account.Powershell#Provide AD User get SID$objUser = New-Object System.Security.Principal.NTAccount("domain", "zenoss") $strSID = $objUser.Translate([System.Security.Principal.SecurityIdentifier]) $strSID.Value******************************************************************STEP FIVE:Modify the Service Control Manager to allow access to the zenoss AD account.This command can be run from an elevated command line, or through Powershellsc sdset scmanager "D:(A;;CC;;;AU)(A;;CCLCRPRC;;;IU)(A;;CCLCRPRC;;;SU)(A;;CCLCRPWPRC;;;SY)(A;;KA;;;BA)(A;;CCLCRPRC;;;PUT_YOUR_SID_HERE_FROM STEP_FOUR)S:(AU;FA;KA;;;WD)(AU;OIIOFA;GA;;;WD)"******************************************************************In step two the script plows through a txt file that processes each computer listed on each line.  For the other scripts I ran them on each machine using LANDesk.  You can probably edit those scripts to process a text file as well.That's what got me off the ground monitoring the machines using Zenoss.  Hopefully this is helpful for you.  Watch the line breaks when copy the scripts.

    Read the article

  • Understanding WCF Hosting

     WCF is a flagship product from  Microsoft for developing distributed application using SOA. Prior to WCF   traditional ASMX Web services were hosted only on Internet Information Services (IIS). The hosting options for WCF services are significantly enhanced from ... [Read Full Article]

    Read the article

  • Implementing a Custom Coherence PartitionAssignmentStrategy

    - by jpurdy
    A recent A-Team engagement required the development of a custom PartitionAssignmentStrategy (PAS). By way of background, a PAS is an implementation of a Java interface that controls how a Coherence partitioned cache service assigns partitions (primary and backup copies) across the available set of storage-enabled members. While seemingly straightforward, this is actually a very difficult problem to solve. Traditionally, Coherence used a distributed algorithm spread across the cache servers (and as of Coherence 3.7, this is still the default implementation). With the introduction of the PAS interface, the model of operation was changed so that the logic would run solely in the cache service senior member. Obviously, this makes the development of a custom PAS vastly less complex, and in practice does not introduce a significant single point of failure/bottleneck. Note that Coherence ships with a default PAS implementation but it is not used by default. Further, custom PAS implementations are uncommon (this engagement was the first custom implementation that we know of). The particular implementation mentioned above also faced challenges related to managing multiple backup copies but that won't be discussed here. There were a few challenges that arose during design and implementation: Naive algorithms had an unreasonable upper bound of computational cost. There was significant complexity associated with configurations where the member count varied significantly between physical machines. Most of the complexity of a PAS is related to rebalancing, not initial assignment (which is usually fairly simple). A custom PAS may need to solve several problems simultaneously, such as: Ensuring that each member has a similar number of primary and backup partitions (e.g. each member has the same number of primary and backup partitions) Ensuring that each member carries similar responsibility (e.g. the most heavily loaded member has no more than one partition more than the least loaded). Ensuring that each partition is on the same member as a corresponding local resource (e.g. for applications that use partitioning across message queues, to ensure that each partition is collocated with its corresponding message queue). Ensuring that a given member holds no more than a given number of partitions (e.g. no member has more than 10 partitions) Ensuring that backups are placed far enough away from the primaries (e.g. on a different physical machine or a different blade enclosure) Achieving the above goals while ensuring that partition movement is minimized. These objectives can be even more complicated when the topology of the cluster is irregular. For example, if multiple cluster members may exist on each physical machine, then clearly the possibility exists that at certain points (e.g. following a member failure), the number of members on each machine may vary, in certain cases significantly so. Consider the case where there are three physical machines, with 3, 3 and 9 members each (respectively). This introduces complexity since the backups for the 9 members on the the largest machine must be spread across the other 6 members (to ensure placement on different physical machines), preventing an even distribution. For any given problem like this, there are usually reasonable compromises available, but the key point is that objectives may conflict under extreme (but not at all unlikely) circumstances. The most obvious general purpose partition assignment algorithm (possibly the only general purpose one) is to define a scoring function for a given mapping of partitions to members, and then apply that function to each possible permutation, selecting the most optimal permutation. This would result in N! (factorial) evaluations of the scoring function. This is clearly impractical for all but the smallest values of N (e.g. a partition count in the single digits). It's difficult to prove that more efficient general purpose algorithms don't exist, but the key take away from this is that algorithms will tend to either have exorbitant worst case performance or may fail to find optimal solutions (or both) -- it is very important to be able to show that worst case performance is acceptable. This quickly leads to the conclusion that the problem must be further constrained, perhaps by limiting functionality or by using domain-specific optimizations. Unfortunately, it can be very difficult to design these more focused algorithms. In the specific case mentioned, we constrained the solution space to very small clusters (in terms of machine count) with small partition counts and supported exactly two backup copies, and accepted the fact that partition movement could potentially be significant (preferring to solve that issue through brute force). We then used the out-of-the-box PAS implementation as a fallback, delegating to it for configurations that were not supported by our algorithm. Our experience was that the PAS interface is quite usable, but there are intrinsic challenges to designing PAS implementations that should be very carefully evaluated before committing to that approach.

    Read the article

< Previous Page | 134 135 136 137 138 139 140 141 142 143 144 145  | Next Page >