Search Results

Search found 282 results on 12 pages for 'aggregation'.

Page 1/12 | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >

Solaris 11 Update 1 - Link Aggregation

- by Wesley Faria

Solaris 11.1 No início desse mês em um evento mundial da Oracle chamado Oracle Open World foi lançada a nova release do Solaris 11. Ela chega cheia de novidades, são aproximadamente 300 novas funcionalidade em rede, segurança, administração e outros. Hoje vou falar de uma funcionalidade de rede muito interessante que é o Link Aggregation. O Solaris já suporta Link Aggregation desde Solaris 10 Update 1 porem no Solaris 11 Update 1 tivemos incrementos significantes. O Link Aggregation como o próprio nome diz, é a agregação de mais de uma inteface física de rede em uma interface lógica .Veja agumas funcionalidade do Link Aggregation: · Aumentar a largura da banda; · Imcrementar a segurança fazendo Failover e Failback; · Melhora a administração da rede; O Solaris 11.1 suporta 2(dois) tipos de Link Aggregation o Trunk aggregation e o Datalink Multipathing aggregation, ambos trabalham fazendo com que o pacote de rede seja distribuído entre as intefaces da agregação garantindo melhor utilização da rede.vamos ver um pouco melhor cada um deles. Trunk Aggregation O Trunk Aggregation tem como objetivo aumentar a largura de banda, seja para aplicações que possue um tráfego de rede alto seja para consolidação. Por exemplo temos um servidor que foi adquirido para comportar várias máquinas virtuais onde cada uma delas tem uma demanda e esse servidor possue 2(duas) placas de rede. Podemos então criar uma agregação entre essas 2(duas) placas de forma que o Solaris 11.1 vai enchergar as 2(duas) placas como se fosse 1(uma) fazendo com que a largura de banda duplique, veja na figura abaixo: A figura mostra uma agregação com 2(duas) placas físicas NIC 1 e NIC 2 conectadas no mesmo switch e 2(duas) interfaces virtuais VNIC A e VNIC B. Porem para que isso funcione temos que ter um switch com suporte a LACP ( Link Aggregation Control Protocol ). A função do LACP é fazer a aggregação na camada do switch pois se isso não for feito o pacote que sairá do servidor não poderá ser montado quando chegar no switch. Uma outra forma de configuração do Trunk Aggregation é o ponto-a-ponto onde ao invéz de se usar um switch, os 2 servidores são conectados diretamente. Nesse caso a agregação de um servidor irá falar diretamente com a agregação do outro garantindo uma proteção contra falhas e tambem uma largura de banda maior. Vejamos como configurar o Trunk Aggregation: 1 – Verificando quais intefaces disponíveis # dladm show-link 2 – Verificando interfaces # ipadm show-if 3 – Apagando o endereçamento das interfaces existentes # ipadm delete-ip <interface> 4 – Criando o Trunk aggregation # dladm create-aggr -L active -l <interface> -l <interface> aggr0 5 – Listando a agregação criada # dladm show-aggr Data Link Multipath Aggregation Como vimos anteriormente o Trunk aggregation é implementado apenas 1(um) switch que possua suporte a LACP portanto, temos um ponto único de falha que é o switch. Para solucionar esse problema no Solaris 10 utilizavamos o IPMP ( IP Multipathing ) que é a combinação de 2(duas) agregações em um mesmo link ou seja, outro camada de virtualização. Agora com o Solaris 11 Update 1 isso não é mais necessário, voce pode ter uma agregação de 2(duas) interfaces físicas e cada uma conectada a 1(um) swtich diferente, veja a figura abaixo: Temos aqui uma agregação chamada aggr contendo 4(quatro) interfaces físicas sendo que as interfaces NIC 1 e NIC 2 estão conectadas em um Switch e as intefaces NIC 3 e NIC 4 estão conectadas em outro Swicth. Além disso foram criadas mais 4(quatro) interfaces virtuais vnic A, vnic B, vnic C e vnic D que podem ser destinadas a diferentes aplicações/zones. Com isso garantimos alta disponibilidade em todas a camadas pois podemos ter falhas tanto em switches, links como em interfaces de rede físicas. Para configurar siga os mesmo passos da configuração do Trunk Aggregation até o passo 3 depois faça o seguinte: 4 – Criando o Trunk aggregation # dladm create-aggr -m haonly -l <interface> -l <interface> aggr0 5 – Listando a agregação criada # dladm show-aggr Depois de configurado seja no modo Trunk aggregation ou no modo Data Link Multipathing aggregation pode ser feito a troca de um modo para o outro, pode adcionar e remover interfaces físicas ou vituais. Bem pessoal, era isso que eu tinha para mostar sobre a nova funcionalidade do Link Aggregation do Solaris 11 Update 1 espero que tenham gostado, até uma próxima novidade.

Read the article
Log Aggregation solutions

- by pdaddy

Hello, We are currently evaluating log aggregation solutions at my company. I understand that Splunk is one of the best solutions, but what are some of the "negatives" with using Splunk? Is there anything else out there that maybe does a better job of log aggregation?

Read the article
Generic Aggregation of C++ Objects by Attribute When Attribute Name is Unknown at Runtime

- by stretch

I'm currently implementing a system with a number of class's representing objects such as client, business, product etc. Standard business logic. As one might expect each class has a number of standard attributes. I have a long list of essentially identical requirements such as: the ability to retrieve all business' whose industry is manufacturing. the ability to retrieve all clients based in London Class business has attribute sector and client has attribute location. Clearly this a relational problem and in pseudo SQL would look something like: SELECT ALL business in business' WHERE sector == manufacturing Unfortunately plugging into a DB is not an option. What I want to do is have a single generic aggregation function whose signature would take the form: vector<generic> genericAggregation(class, attribute, value); Where class is the class of object I want to aggregate, attribute and value being the class attribute and value of interest. In my example I've put vector as return type, but this wouldn't work. Probably better to declare a vector of relevant class type and pass it as an argument. But this isn't the main problem. How can I accept arguments in string form for class, attribute and value and then map these in a generic object aggregation function? Since it's rude not to post code, below is a dummy program which creates a bunch of objects of imaginatively named classes. Included is a specific aggregation function which returns a vector of B objects whose A object is equal to an id specified at the command line e.g. .. $ ./aggregations 5 which returns all B's whose A objects 'i' attribute is equal to 5. See below: #include <iostream> #include <cstring> #include <sstream> #include <vector> using namespace std; //First imaginativly names dummy class class A { private: int i; double d; string s; public: A(){} A(int i, double d, string s) { this->i = i; this->d = d; this->s = s; } ~A(){} int getInt() {return i;} double getDouble() {return d;} string getString() {return s;} }; //second imaginativly named dummy class class B { private: int i; double d; string s; A *a; public: B(int i, double d, string s, A *a) { this->i = i; this->d = d; this->s = s; this->a = a; } ~B(){} int getInt() {return i;} double getDouble() {return d;} string getString() {return s;} A* getA() {return a;} }; //Containers for dummy class objects vector<A> a_vec (10); vector<B> b_vec;//100 //Util function, not important.. string int2string(int number) { stringstream ss; ss << number; return ss.str(); } //Example function that returns a new vector containing on B objects //whose A object i attribute is equal to 'id' vector<B> getBbyA(int id) { vector<B> result; for(int i = 0; i < b_vec.size(); i++) { if(b_vec.at(i).getA()->getInt() == id) { result.push_back(b_vec.at(i)); } } return result; } int main(int argc, char** argv) { //Create some A's and B's, each B has an A... //Each of the 10 A's are associated with 10 B's. for(int i = 0; i < 10; ++i) { A a(i, (double)i, int2string(i)); a_vec.at(i) = a; for(int j = 0; j < 10; j++) { B b((i * 10) + j, (double)j, int2string(i), &a_vec.at(i)); b_vec.push_back(b); } } //Got some objects so lets do some aggregation //Call example aggregation function to return all B objects //whose A object has i attribute equal to argv[1] vector<B> result = getBbyA(atoi(argv[1])); //If some B's were found print them, else don't... if(result.size() != 0) { for(int i = 0; i < result.size(); i++) { cout << result.at(i).getInt() << " " << result.at(i).getA()->getInt() << endl; } } else { cout << "No B's had A's with attribute i equal to " << argv[1] << endl; } return 0; } Compile with: g++ -o aggregations aggregations.cpp If you wish :) Instead of implementing a separate aggregation function (i.e. getBbyA() in the example) I'd like to have a single generic aggregation function which accounts for all possible class attribute pairs such that all aggregation requirements are met.. and in the event additional attributes are added later, or additional aggregation requirements, these will automatically be accounted for. So there's a few issues here but the main one I'm seeking insight into is how to map a runtime argument to a class attribute. I hope I've provided enough detail to adequately describe what I'm trying to do...

Read the article
Setup link aggregation and jumbo frames on VMware ESXi 4

- by Sysadminicus

I'm setting up an ESXi 4 server to connect to an NFS datastore. I'd like to bond two of the NICs together and use jumbo frames for the NFS connection on a private (non-management) network. I setup a new switch with the 2 NICs and am able to connect to the NFS share over it, but could use some guidance on getting jumbo frames and link aggregation/bonding/teaming working.

Read the article
Low-cost, Flexible Log Aggregation [closed]

- by Dan McClain

I'm starting to have quite the collection of Ubuntu VMs that I must manage. I'm starting to investigate Puppet for managing the configuration of all of them, and apticron to let me know what's out of date. But the issue I feel I should deal with sooner than later is log aggregation. I'd like to stay in the free/open source realm for now, seeing that we don't have much budget for something like splunk yet. In addition to syslog, I would like to collect application specific logs (We are running different apps on different machines, from nginx+passenger for rails, to Apache+Tomcat for java, to PHP for expression engine, and mysql/postgresql database server), so that we can analyze the relavent data. For now, I'm just looking to get all the logs one place.

Read the article
Dependency injection and aggregation/association

- by Abhijeet Kashnia

In both association and aggregation, one class maintains a reference to another class. Then, does constructor injection imply composition? Going by the same logic, is it safe to say that setter injection leads to an association, and not an aggregation?

Read the article
Data aggregation mongodb vs mysql

- by Dimitris Stefanidis

I am currently researching on a backend to use for a project with demanding data aggregation requirements. The main project requirements are the following. Store millions of records for each user. Users might have more than 1 million entries per year so even with 100 users we are talking about 100 million entries per year. Data aggregation on those entries must be performed on the fly. The users need to be able to filter on the entries by a ton of available filters and then present summaries (totals , averages e.t.c) and graphs on the results. Obviously I cannot precalculate any of the aggregation results because the filter combinations (and thus the result sets) are huge. Users are going to have access on their own data only but it would be nice if anonymous stats could be calculated for all the data. The data is going to be most of the time in batch. e.g the user will upload the data every day and it could like 3000 records. In some later version there could be automated programs that upload every few minutes in smaller batches of 100 items for example. I made a simple test of creating a table with 1 million rows and performing a simple sum of 1 column both in mongodb and in mysql and the performance difference was huge. I do not remember the exact numbers but it was something like mysql = 200ms , mongodb = 20 sec. I have also made the test with couchdb and had much worse results. What seems promising speed wise is cassandra which I was very enthusiastic about when I first discovered it. However the documentation is scarce and I haven't found any solid examples on how to perform sums and other aggregate functions on the data. Is that possible ? As it seems from my test (Maybe I have done something wrong) with the current performance its impossible to use mongodb for such a project although the automated sharding functionality seems like a perfect fit for it. Does anybody have experience with data aggregation in mongodb or have any insights that might be of help for the implementation of the project ? Thanks, Dimitris

Read the article
Netgear GS724Tv3 and link aggregation Mac OS X Server 10.6.8

- by Manca Weeks

I need to link aggregate 2 sets of ports on the Netgear GS724T with my Apple server tower (latest generation). I have 2 built in ports and 2 ports on a PCIe ethernet card. It is not obvious to me how to properly configure the Netgear end. I have access to the Netgear box through its web interface, just don't know how to properly set the settings. I tried going to Netgear for help, but they said my software support has expired. I bought this unit on their recommendation - they say it is compatible with 802.3ad protocol. I cannot locate any references to this protocol in the manual and I noticed some people in formus say that this device is actually not compatible with 802.3ad and that Netgear is misleading potential customers by saying it is. Any help will be appreciated. Thanks, M My own answer - posted as edit because of restrictions on my user: OK folks, turns out one must use a Windows machine on this one or nothing makes sense. I was unable to get much farther than viewing the default inactive LAGs because in Firefox and Safari on Mac things don't make much sense - i.e. the Apply buttons (supposedly JavaScript) don't work. You can view the configurations, but none of the modifications you make stick. Then, in Switching - LAGs, choose the ports to include and make sure you switch the LAG type from Static to LACP and all is well. Haven't tested the performance of the config yet, but both sides appear to be happy with the configuration. Apple server says link active and so does the Netgear. Will report if any other discoveries. Thanks for all who read and to user84104 for responding. M

Read the article
Podcast aggregation - please recommend for Windows 7

- by bazza-formez

Hi, I need a good podcast aggregator. I was using Juice 2.2, but that will not work on my new pc which is running Windows 7 (64 bit). Could anyone recommend one please ? I need the following functionality : 1- Subscribe to podcasts (mp3's from radio shows) using rss 2- Have OPML support so I can load up my old subscriptions easily 3- That runs quietly in the backgrond and looks after itself 4- That deletes old episodes automatically after a set time 5- Isn't just designed for an IPOD (I use a simple generic mp3 player to listen). Any ideas? Thanks! Bazza

Read the article
what does composition example vs aggregation

- by meWantToLearn

Composition and aggregation both are confusion to me. Does my code sample below indicate composition or aggregation? class A { public static function getData($id) { //something } public static function checkUrl($url) { // something } class B { public function executePatch() { $data = A::getData(12); } public function readUrl() { $url = A::checkUrl('http/erere.com'); } public function storeData() { //something not related to class A at all } } } Is class B a composition of class A or is it aggregation of class A? Does composition purely mean that if class A gets deleted class B does not works at all and aggregation if class A gets deleted methods in class B that do not use class A will work?

Read the article
Sumproduct using Django's aggregation

- by Matthew Rankin

Question Is it possible using Django's aggregation capabilities to calculate a sumproduct? Background I am modeling an invoice, which can contain multiple items. The many-to-many relationship between the Invoice and Item models is handled through the InvoiceItem intermediary table. The total amount of the invoice—amount_invoiced—is calculated by summing the product of unit_price and quantity for each item on a given invoice. Below is the code that I'm currently using to accomplish this, but I was wondering if there is a better way to handle this using Django's aggregation capabilities. Current Code class Item(models.Model): item_num = models.SlugField(unique=True) description = models.CharField(blank=True, max_length=100) class InvoiceItem(models.Model): item = models.ForeignKey(Item) invoice = models.ForeignKey('Invoice') unit_price = models.DecimalField(max_digits=10, decimal_places=2) quantity = models.DecimalField(max_digits=10, decimal_places=4) class Invoice(models.Model): invoice_num = models.SlugField(max_length=25) invoice_items = models.ManyToManyField(Item,through='InvoiceItem') def _get_amount_invoiced(self): invoice_items = self.invoiceitem_set.all() amount_invoiced = 0 for invoice_item in invoice_items: amount_invoiced += (invoice_item.unit_price * invoice_item.quantity) return amount_invoiced amount_invoiced = property(_get_amount_invoiced)

Read the article
Microsoft DBMS on-the-fly aggregation

- by ILya

Some time ago i was reading an article about new MS DBMS technology. It's some kind of OLAP but on the fly. This technology can bind to data flows and then provide a real time aggregation. So the question is "what is it's name?". I need such a technology now but can't remember it's name... Or maybe there are some similar technologies?

Read the article
SMS aggregation service provider

- by Mponnada

Hi, Can someone please tell me what are the pre-requisites for establishing an SMS Aggregation service (as a business), I am after the technology and implementation, a rough overview of what is involved (what components, ex. Gateway, carrier, etc) will be great help. Regards

Read the article
What is the use of Association, Aggregation and Composition (Encapsulation) in Classes

- by SahilMahajanMj

I have gone through lots of theories about what is encapsulation and the three techniques of implementing it, which are Association, Aggregation and Composition. What i found is, Encapsulation Encapsulation is the technique of making the fields in a class private and providing access to the fields via public methods. If a field is declared private, it cannot be accessed by anyone outside the class, thereby hiding the fields within the class. For this reason, encapsulation is also referred to as data hiding. Encapsulation can be described as a protective barrier that prevents the code and data being randomly accessed by other code defined outside the class. Access to the data and code is tightly controlled by an interface. The main benefit of encapsulation is the ability to modify our implemented code without breaking the code of others who use our code. With this feature Encapsulation gives maintainability, flexibility and extensibility to our code. Association Association is a relationship where all object have their own lifecycle and there is no owner. Let’s take an example of Teacher and Student. Multiple students can associate with single teacher and single student can associate with multiple teachers but there is no ownership between the objects and both have their own lifecycle. Both can create and delete independently. Aggregation Aggregation is a specialize form of Association where all object have their own lifecycle but there is ownership and child object can not belongs to another parent object. Let’s take an example of Department and teacher. A single teacher can not belongs to multiple departments, but if we delete the department teacher object will not destroy. We can think about “has-a” relationship. Composition Composition is again specialize form of Aggregation and we can call this as a “death” relationship. It is a strong type of Aggregation. Child object dose not have their lifecycle and if parent object deletes all child object will also be deleted. Let’s take again an example of relationship between House and rooms. House can contain multiple rooms there is no independent life of room and any room can not belongs to two different house if we delete the house room will automatically delete. The question is: Now these all are real world examples. I am looking for some description about how to use these techniques in actual class code. I mean what is the point for using three different techniques for encapsulation, How these techniques could be implemented and How to choose which technique is applicable at time.

Read the article
Application log aggregation, management and notifications...

- by Matthew Savage

I'm wondering what everyone is using for logging, log management and log aggregation on their systems. I am working in a company which uses .NET for all it's applications and all systems are Windows based. Currently each application looks after its own logging and notifications of failures (e.g. if app A fails it will send out its own 'call for help' to an admin). While this current practice works its a bit hacky and hard to manage. I've been trying to find some options for making this work better and I've come up with the following: log4net & Chainsaw (ah, if it works). Logging via log4net or another framework into a central database & rolling our own management tool. Logging to the Windows event log and using MOM or System Center Operations Manager to aggregate and manage each of these servers & their apps. A hand-rolled solution to suck all the log files into one point and work some magic across them. Essentially what we are after is something which can pull log entries all together and allow for some analytics to be run across them, plus use a kind of event based system to, for example, send out a warning email when there have been 30+ warning level logs for an application in the last x minutes. So is there anything I've missed, or something someone else can suggest?

Read the article
Creating Entity as an aggregation

- by Jamie Dixon

I recently asked about how to separate entities from their behaviour and the main answer linked to this article: http://cowboyprogramming.com/2007/01/05/evolve-your-heirachy/ The ultimate concept written about here is that of: OBJECT AS A PURE AGGREGATION. I'm wondering how I might go about creating game entities as pure aggregation using C#. I've not quite grasped the concept of how this might work yet. (Perhaps the entity is an array of objects implementing a certain interface or base type?) My current thinking still involves having a concrete class for each entity type that then implements the relevant interfaces (IMoveable, ICollectable, ISpeakable etc). How can I go about creating an entity purely as an aggregation without having any concrete type for that entity?

Read the article
Hierarchy based aggregation

- by Ganapathy Subramaniam

I have a hierarchy table in SQL Server 2005 which contains employees - managers - department - location - state. Sample table for hierarchy table: ID Name ParentID Type 1 PA NULL 0 (group) 2 Pittsburgh 1 1 (subgroup) 3 Accounts 2 1 4 Alex 3 2 (employee) 5 Robin 3 2 6 HR 2 1 7 Robert 6 2 Second one is fact table which contains employee salary details ID and Salary. Sample data for fact table: ID Salary 4 6000 5 5000 7 4000 Is there any good to way to display the hierarchy from hierarchy table with aggregated sum of salary based on employees. Expected result is like Name Salary PA 15000 (Pittsburgh + others(if any)) Pittusburgh 15000 (Accounts + HR) Accounts 11000 (Alex + Robin) Alex 6000 (direct values) Robin 5000 HR 4000 Robert 4000 In my production environment, hierarchy table may contain 23000+ rows and fact table may contain 300,000+ rows. So, I thought of providing any level of groupid to the query to retrieve just its children and its corresponding aggregated value. Any better solution?

Read the article
Django aggregation query on related one-to-many objects

- by parxier

Here is my simplified model: class Item(models.Model): pass class TrackingPoint(models.Model): item = models.ForeignKey(Item) created = models.DateField() data = models.IntegerField() In many parts of my application I need to retrieve a set of Item's and annotate each item with data field from latest TrackingPoint from each item ordered by created field. For example, instance i1 of class Item has 3 TrackingPoint's: tp1 = TrackingPoint(item=i1, created=date(2010,5,15), data=23) tp2 = TrackingPoint(item=i1, created=date(2010,5,14), data=21) tp3 = TrackingPoint(item=i1, created=date(2010,5,12), data=120) I need a query to retrieve i1 instance annotated with tp1.data field value as tp1 is the latest tracking point ordered by created field. That query should also return Item's that don't have any TrackingPoint's at all. If possible I prefer not to use QuerySet's extra method to do this. That's what I tried so far... and failed :( Item.objects.annotate(max_created=Max('trackingpoint__created'), data=Avg('trackingpoint__data')).filter(trackingpoint__created=F('max_created')) Any ideas?

Read the article
Real-time aggregation of files from multiple machines to one

- by dmitry-kay

I need a tool which gets a list of machine names and file wildcards. Then it connects to all these machines (SSH) and begins to monitor changes (appendings to the end) in each file matched by wildcards. New lines in each such file are saved to the local machine to the file with the same name. (This is a task of real-time log files collecting.) I could use ssh + tail -f, of course, but it is not very robust: if a monitoring process dies and then restarts, some data from remote files may be lost (because tail -f does not save the position at which it is finished before). I may write this tool manually, but before - I'd like to know if such tool already exists or not.

Read the article
Parallelism in .NET – Part 4, Imperative Data Parallelism: Aggregation

- by Reed

In the article on simple data parallelism, I described how to perform an operation on an entire collection of elements in parallel. Often, this is not adequate, as the parallel operation is going to be performing some form of aggregation. Simple examples of this might include taking the sum of the results of processing a function on each element in the collection, or finding the minimum of the collection given some criteria. This can be done using the techniques described in simple data parallelism, however, special care needs to be taken into account to synchronize the shared data appropriately. The Task Parallel Library has tools to assist in this synchronization. The main issue with aggregation when parallelizing a routine is that you need to handle synchronization of data. Since multiple threads will need to write to a shared portion of data. Suppose, for example, that we wanted to parallelize a simple loop that looked for the minimum value within a dataset: double min = double.MaxValue; foreach(var item in collection) { double value = item.PerformComputation(); min = System.Math.Min(min, value); } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } This seems like a good candidate for parallelization, but there is a problem here. If we just wrap this into a call to Parallel.ForEach, we’ll introduce a critical race condition, and get the wrong answer. Let’s look at what happens here: // Buggy code! Do not use! double min = double.MaxValue; Parallel.ForEach(collection, item => { double value = item.PerformComputation(); min = System.Math.Min(min, value); }); This code has a fatal flaw: min will be checked, then set, by multiple threads simultaneously. Two threads may perform the check at the same time, and set the wrong value for min. Say we get a value of 1 in thread 1, and a value of 2 in thread 2, and these two elements are the first two to run. If both hit the min check line at the same time, both will determine that min should change, to 1 and 2 respectively. If element 1 happens to set the variable first, then element 2 sets the min variable, we’ll detect a min value of 2 instead of 1. This can lead to wrong answers. Unfortunately, fixing this, with the Parallel.ForEach call we’re using, would require adding locking. We would need to rewrite this like: // Safe, but slow double min = double.MaxValue; // Make a "lock" object object syncObject = new object(); Parallel.ForEach(collection, item => { double value = item.PerformComputation(); lock(syncObject) min = System.Math.Min(min, value); }); This will potentially add a huge amount of overhead to our calculation. Since we can potentially block while waiting on the lock for every single iteration, we will most likely slow this down to where it is actually quite a bit slower than our serial implementation. The problem is the lock statement – any time you use lock(object), you’re almost assuring reduced performance in a parallel situation. This leads to two observations I’ll make: When parallelizing a routine, try to avoid locks. That being said: Always add any and all required synchronization to avoid race conditions. These two observations tend to be opposing forces – we often need to synchronize our algorithms, but we also want to avoid the synchronization when possible. Looking at our routine, there is no way to directly avoid this lock, since each element is potentially being run on a separate thread, and this lock is necessary in order for our routine to function correctly every time. However, this isn’t the only way to design this routine to implement this algorithm. Realize that, although our collection may have thousands or even millions of elements, we have a limited number of Processing Elements (PE). Processing Element is the standard term for a hardware element which can process and execute instructions. This typically is a core in your processor, but many modern systems have multiple hardware execution threads per core. The Task Parallel Library will not execute the work for each item in the collection as a separate work item. Instead, when Parallel.ForEach executes, it will partition the collection into larger “chunks” which get processed on different threads via the ThreadPool. This helps reduce the threading overhead, and help the overall speed. In general, the Parallel class will only use one thread per PE in the system. Given the fact that there are typically fewer threads than work items, we can rethink our algorithm design. We can parallelize our algorithm more effectively by approaching it differently. Because the basic aggregation we are doing here (Min) is communitive, we do not need to perform this in a given order. We knew this to be true already – otherwise, we wouldn’t have been able to parallelize this routine in the first place. With this in mind, we can treat each thread’s work independently, allowing each thread to serially process many elements with no locking, then, after all the threads are complete, “merge” together the results. This can be accomplished via a different set of overloads in the Parallel class: Parallel.ForEach<TSource,TLocal>. The idea behind these overloads is to allow each thread to begin by initializing some local state (TLocal). The thread will then process an entire set of items in the source collection, providing that state to the delegate which processes an individual item. Finally, at the end, a separate delegate is run which allows you to handle merging that local state into your final results. To rewriting our routine using Parallel.ForEach<TSource,TLocal>, we need to provide three delegates instead of one. The most basic version of this function is declared as: public static ParallelLoopResult ForEach<TSource, TLocal>( IEnumerable<TSource> source, Func<TLocal> localInit, Func<TSource, ParallelLoopState, TLocal, TLocal> body, Action<TLocal> localFinally ) The first delegate (the localInit argument) is defined as Func<TLocal>. This delegate initializes our local state. It should return some object we can use to track the results of a single thread’s operations. The second delegate (the body argument) is where our main processing occurs, although now, instead of being an Action<T>, we actually provide a Func<TSource, ParallelLoopState, TLocal, TLocal> delegate. This delegate will receive three arguments: our original element from the collection (TSource), a ParallelLoopState which we can use for early termination, and the instance of our local state we created (TLocal). It should do whatever processing you wish to occur per element, then return the value of the local state after processing is completed. The third delegate (the localFinally argument) is defined as Action<TLocal>. This delegate is passed our local state after it’s been processed by all of the elements this thread will handle. This is where you can merge your final results together. This may require synchronization, but now, instead of synchronizing once per element (potentially millions of times), you’ll only have to synchronize once per thread, which is an ideal situation. Now that I’ve explained how this works, lets look at the code: // Safe, and fast! double min = double.MaxValue; // Make a "lock" object object syncObject = new object(); Parallel.ForEach( collection, // First, we provide a local state initialization delegate. () => double.MaxValue, // Next, we supply the body, which takes the original item, loop state, // and local state, and returns a new local state (item, loopState, localState) => { double value = item.PerformComputation(); return System.Math.Min(localState, value); }, // Finally, we provide an Action<TLocal>, to "merge" results together localState => { // This requires locking, but it's only once per used thread lock(syncObj) min = System.Math.Min(min, localState); } ); Although this is a bit more complicated than the previous version, it is now both thread-safe, and has minimal locking. This same approach can be used by Parallel.For, although now, it’s Parallel.For<TLocal>. When working with Parallel.For<TLocal>, you use the same triplet of delegates, with the same purpose and results. Also, many times, you can completely avoid locking by using a method of the Interlocked class to perform the final aggregation in an atomic operation. The MSDN example demonstrating this same technique using Parallel.For uses the Interlocked class instead of a lock, since they are doing a sum operation on a long variable, which is possible via Interlocked.Add. By taking advantage of local state, we can use the Parallel class methods to parallelize algorithms such as aggregation, which, at first, may seem like poor candidates for parallelization. Doing so requires careful consideration, and often requires a slight redesign of the algorithm, but the performance gains can be significant if handled in a way to avoid excessive synchronization.

Read the article
ScaleMP Steps Up Server Virtualization Aggregation Software

The company started out with a server virtualization aggregation tool that makes many boxes look like one. Now, it's enhancing its formula and splitting that image into a second virtual machine.

Read the article
How to Setup Network Link aggregation (802.3ad) on Ubuntu

- by Sysadmin Geek

Do you need to pump large amounts of data to a multitude of clients simultaneously, while only using a single IP address? By using “link aggregation” we can join several separate network cards on the system into one humongous NIC. Latest Features How-To Geek ETC Learn To Adjust Contrast Like a Pro in Photoshop, GIMP, and Paint.NET Have You Ever Wondered How Your Operating System Got Its Name? Should You Delete Windows 7 Service Pack Backup Files to Save Space? What Can Super Mario Teach Us About Graphics Technology? Windows 7 Service Pack 1 is Released: But Should You Install It? How To Make Hundreds of Complex Photo Edits in Seconds With Photoshop Actions Super-Charge GIMP’s Image Editing Capabilities with G’MIC [Cross-Platform] Access and Manage Your Ubuntu One Account in Chrome and Iron Mouse Over YouTube Previews YouTube Videos in Chrome Watch a Machine Get Upgraded from MS-DOS to Windows 7 [Video] Bring the Whole Ubuntu Gang Home to Your Desktop with this Mascots Wallpaper Hack Apart a Highlighter to Create UV-Reactive Flowers [Science]

Read the article
Data Aggregation of CSV files java

- by royB

I have k csv files (5 csv files for example), each file has m fields which produce a key and n values. I need to produce a single csv file with aggregated data. I'm looking for the most efficient solution for this problem, speed mainly. I don't think by the way that we will have memory issues. Also I would like to know if hashing is really a good solution because we will have to use 64 bit hashing solution to reduce the chance for a collision to less than 1% (we are having around 30000000 rows per aggregation). For example file 1: f1,f2,f3,v1,v2,v3,v4 a1,b1,c1,50,60,70,80 a3,b2,c4,60,60,80,90 file 2: f1,f2,f3,v1,v2,v3,v4 a1,b1,c1,30,50,90,40 a3,b2,c4,30,70,50,90 result: f1,f2,f3,v1,v2,v3,v4 a1,b1,c1,80,110,160,120 a3,b2,c4,90,130,130,180 algorithm that we thought until now: hashing (using concurentHashTable) merge sorting the files DB: using mysql or hadoop or redis. The solution needs to be able to handle Huge amount of data (each file more than two million rows) a better example: file 1 country,city,peopleNum england,london,1000000 england,coventry,500000 file 2: country,city,peopleNum england,london,500000 england,coventry,500000 england,manchester,500000 merged file: country,city,peopleNum england,london,1500000 england,coventry,1000000 england,manchester,500000 The key is: country,city. This is just an example, my real key is of size 6 and the data columns are of size 8 - total of 14 columns. We would like that the solution will be the fastest in regard of data processing.

Read the article
Link aggregation with freebsd8 and a cicso 3550, what am i doing wrong?

- by Flamewires

Hey, I am trying to setup Link Aggrigation with LACP (well, anything that provides increased bandwidth and failover using my setup will work). I'm running FreeBSD 8.0 on 3 machines. M1 is running 2 10/100 ethernetcards setup for link aggrigation using lagg. for reference: ifconfig em0 up ifconfig tx0 up ifconfig create lagg0 ifconfig lagg0 laggproto lacp laggport tx0 laggport em0 192.168.1.16 netmask 255.255.255.0 I plugged them into ports 1 and 2 of a Cicso 3550. then ran: configure terminal interface range Fa0/1 - 2 switchport mode access switchport access vlan 1 channel-group 1 mode active (everythings in vlan 1) Now Im able to connect the other computers to other ports on the switch and failover works great, i can unplug cables in the middle of a transfer and the traffic gets rerouted. However, im not noticing any speed increase. My test setup: load balancing: i tried dst and src on the switch, neither seemed to give me a speed increase. I am SCPing 2 500 meg files from the lagg computer to other computers (one each) which are also running 10/100 full duplex cards. I get transfer speeds of about 11.2-11.4 Mbps to a single host, and about half that (5.9-6.2) Mbps when transferring to both at the same time. From what I understood with destination load balancing the router was suppose to balance traffic headed for 1 computer over 1 port and traffic headed for another over a diff(in this case) the other port. With destination-MAC address forwarding, when packets are forwarded to an EtherChannel, the packets are distributed across the ports in the channel based on the destination host MAC address of the incoming packet. Therefore, packets to the same destination are forwarded over the same port, and packets to a different destination are sent on a different port in the channel. For the 3550 series switch, when source-MAC address forwarding is used, load distribution based on the source and destination IP address is also enabled for routed IP traffic. All routed IP traffic chooses a port based on the source and destination IP address. Packets between two IP hosts always use the same port in the channel, and traffic between any other pair of hosts can use a different port in the channel. (Link) What am i doing wrong/what would i need to do to see a speed increase beyond what i could do with just a single card?

Read the article
Bing brings Twitter aggregation to search results

- by jamiet

I read with interest today a post on the Bing blog Get the Latest on Twitter with Bing Social Search which describes how tweets are soon going to show up in Bing search results. On the surface that isn’t very interesting, Google has been doing this for a while, but of particular interest to myself was the following screenshot: We can see at the bottom of a search result for “TMZ” that Bing is showing us the most popular TMZ stories as determined by the number of tweets that contain links to them. This is great. Bing are applying a principle that those of us in the Business Intelligence (BI) trade have known for ages: a piece of data in isolation is not very interesting but when you aggregate a lot of that data you find the trends that actually matter and when you surface that data in a meaningful way then people can derive real value from it. That sounds obvious but this new Bing feature is the first time I have seen the principle applied in a useful way to tweets and I applaud them for that; its certainly a lot more useful than the pointless constant tweet scroll that you see on Google. What a shame its going to be, yet again from Bing, a US-only feature. @Jamiet Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

Read the article

Search Results

Search found 282 results on 12 pages for 'aggregation'.

Page 1/12 | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >

- by Wesley Faria

- by pdaddy

- by stretch

- by Sysadminicus

- by Dan McClain

- by Abhijeet Kashnia

- by Dimitris Stefanidis

- by Manca Weeks

- by bazza-formez

- by meWantToLearn

- by Matthew Rankin

- by ILya

- by Mponnada

- by SahilMahajanMj

- by Matthew Savage

- by Jamie Dixon

- by Ganapathy Subramaniam

- by parxier

- by dmitry-kay

- by Reed

- by Sysadmin Geek

- by royB

- by Flamewires

- by jamiet

1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >