performance dashboard - Page 486

Doctrine - get the offset of an object in a collection (implementing an infinite scroll)

- by dan

I am using Doctrine and trying to implement an infinite scroll on a collection of notes displayed on the user's browser. The application is very dynamic, therefore when the user submits a new note, the note is added to the top of the collection straightaway, besides being sent (and stored) to the server. Which is why I can't use a traditional pagination method, where you just send the page number to the server and the server will figure out the offset and the number of results from that. To give you an example of what I mean, imagine there are 20 notes displayed, then the user adds 2 more notes, therefore there are 22 notes displayed. If I simply requests "page 2", the first 2 items of that page will be the last two items of the page currently displayed to the user. Which is why I am after a more sophisticated method, which is the one I am about to explain. Please consider the following code, which is part of the server code serving an AJAX request for more notes: // $lastNoteDisplayedId is coming from the AJAX request $lastNoteDisplayed = $repository->findBy($lastNoteDisplayedId); $allNotes = $repository->findBy($filter, array('createdAt' => 'desc')); $offset = getLastNoteDisplayedOffset($allNotes, $lastNoteDisplayedId); // retrieve the page to send back so that it can be appended to the listing $notesPerPage = 30 $notes = $repository->findBy( array(), array('createdAt' => 'desc'), $notesPerPage, $offset ); $response = json_encode($notes); return $response; Basically I would need to write the method getLastNoteDisplayedOffset, that given the whole set of notes and one particoular note, it can give me its offset, so that I can use it for the pagination of the previous Doctrine statement. I know probably a possible implementation would be: getLastNoteDisplayedOffset($allNotes, $lastNoteDisplayedId) { $i = 0; foreach ($allNotes as $note) { if ($note->getId() === $lastNoteDisplayedId->getId()) { break; } $i++; } return $i; } I would prefer not to loop through all notes because performance is an important factor. I was wondering if Doctrine has got a method itself or if you can suggest a different approach.

Read the article

Google Adwords API response parse

- by Yun Ling

I am trying to figure out how to parse the Adword API query response without exceptions and one issue that i came across is that sometimes, the data itself contains comma besides the comma between each column. Say i do a query on Adroup, campaign and impression by using <reportDefinition xmlns="https://adwords.google.com/api/adwords/cm/v201209"> <selector> <fields>CampaignName</fields> <fields>AdgroupName</fields> <fields>Impressions</fields> <predicates> <field>Status</field> <operator>IN</operator> <values>ENABLED</values> <values>PAUSED</values> </predicates> </selector> <reportName>Custom Adgroup Performance Report</reportName> <reportType>ADGROUP_PERFORMANCE_REPORT</reportType> <dateRangeType>LAST_7_DAYS</dateRangeType> <downloadFormat>CSV</downloadFormat> </reportDefinition> Since my campaign has comma within the string like below: "Adroup,Campaign,Impressions, Premiun Beer, Beer, Chicago, 1000" where the adgroup is "premium beer" and campaign is "Beer,Chicago". that will cause an issue if we parse this information by using comma. Does anyone know how to solve this problem?

Read the article

Converting ntext to nvcharmax(max) - Getting around size limitation

- by Overflew

Hi all, I'm trying to change an existing SQL NText column to nvcharmax(max), and encountering an error on the size limit. There's a large amount of existing data, some of which is more than the 8k limit, I believe. We're looking to convert this, so that the field is searchable in LINQ. The 2x SQL statements I've tried are: update Table set dataNVarChar = convert(nvarchar(max), dataNtext) where dataNtext is not null update Table set dataNVarChar = cast(dataNtext as nvarchar(max)) where dataNtext is not null And the error I get is: Cannot create a row of size 8086 which is greater than the allowable maximum row size of 8060. This is using SQL Server 2008. Any help appreciated, Thanks. Update / Solution: The marked answer below is correct, and SQL 2008 can change the column to the correct data type in my situation, and there are no dramas with the LINQ-utilising application we use on top of it: alter table [TBL] alter column [COL] nvarchar(max) I've also been advised to follow it up with: update [TBL] set [COL] = [COL] Which completes the conversion by moving the data from the lob structure to the table (if the length in less than 8k), which improves performance / keeps things proper.

Read the article

How can I handle dynamic calculated attributes in a model in Django?

- by bullfish

In Django I calculate the breadcrumb (a list of fathers) for an geographical object. Since it is not going to change very often, I am thinking of pre calculating it once the object is saved or initialized. 1.) What would be better? Which solution would have a better performance? To calculate it at _init_ or to calculate it when the object is saved (the object takes about 500-2000 characters in the DB)? 2.) I tried to overwrite the _init_ or save() methods but I don't know how to use attributes of the just saved object. Accessing *args, **kwargs did not work. How can I access them? Do I have to save, access the father and then save again? 3.) If I decide to save the breadcrumb. Whats the best way to do it? I used http://www.djangosnippets.org/snippets/1694/ and have crumb = PickledObjectField(). Thats the method to calculate the attribute crumb() def _breadcrumb(self): breadcrumb = [ ] x = self while True: x = x.father try: if hasattr(x, 'country'): breadcrumb.append(x.country) elif hasattr(x, 'region'): breadcrumb.append(x.region) elif hasattr(x, 'city'): breadcrumb.append(x.city) else: break except: break breadcrumb.reverse() return breadcrumb Thats my save-Method: def save(self,*args, **kwargs): # how can I access the father ob the object? father = self.father # does obviously not work father = kwargs['father'] # does not work either # the breadcrumb gets calculated here self.crumb = self._breadcrumb(father) super(GeoObject, self).save(*args,**kwargs) Please help me out. I am working on this for days now. Thank you.

Read the article

Architecture for database analytics

- by David Cournapeau

Hi, We have an architecture where we provide each customer Business Intelligence-like services for their website (internet merchant). Now, I need to analyze those data internally (for algorithmic improvement, performance tracking, etc...) and those are potentially quite heavy: we have up to millions of rows / customer / day, and I may want to know how many queries we had in the last month, weekly compared, etc... that is the order of billions entries if not more. The way it is currently done is quite standard: daily scripts which scan the databases, and generate big CSV files. I don't like this solutions for several reasons: as typical with those kinds of scripts, they fall into the write-once and never-touched-again category tracking things in "real-time" is necessary (we have separate toolset to query the last few hours ATM). this is slow and non-"agile" Although I have some experience in dealing with huge datasets for scientific usage, I am a complete beginner as far as traditional RDBM go. It seems that using column-oriented database for analytics could be a solution (the analytics don't need most of the data we have in the app database), but I would like to know what other options are available for this kind of issues.

Read the article

How many users are sufficient to make a heavy load for web application

- by galymzhan

I have a web application, which has been suffering high load recent days. The application runs on single server which has 8-core Intel CPU and 4gb of RAM. Software: Drupal 5 (Apache 2, PHP5, MySQL5) running on Debian. After reaching 500 authenticated and 200 anonymous users (simultaneous), the application drastically decreases its performance up to total failure. The biggest load comes from authenticated users, who perform activities, causing insert/update/deletes on db. I think mysql is a bottleneck. Is it normal to slow down on such number of users? EDIT: I forgot to mention that I did some kind of profiling. I runned commands top, htop and they showed me that all memory was being used by MySQL! After some time MySQL starts to perform terribly slow, site goes down, and we have to restart/stop apache to reduce load. Administrators said that there was about 200 active mysql connections at that moment. The worst point is that we need to solve this ASAP, I can't do deep profiling analysis/code refactoring, so I'm considering 2 ways: my tables are MyIsam, I heard they use table-level locking which is very slow, is it right? could I change it to Innodb without worry? what if I take MySQL, and move it to dedicated machine with a lot of RAM?

Read the article

Research and replace Word Rtf

- by Perello

I'm working on an application which has a workflow for postal mails. These postal mails are generated according to my application business rules. Models are in html or Rtf and it works perfectly as long the user do not create the rtf with word. This is not within the specs, but my hierarchy would welcome a Word compatibility if it don't involve too much work, and it would please and ease the life of our customer. The Rtf models have tags which are replaced by application values. In most RTF, tags are not splitted, so the search and replace works perfectly. I wish to be handle word with few modifications. Example data : [[FooBuzz]] in most rtf it's not splited. In word 2003 : {\rtlch\fcs1 \af0 \ltrch\fcs0 \insrsid5517131 [[}{\rtlch\fcs1 \af0 \ltrch\fcs0 \insrsid2708730 FooBuzz}{\rtlch\fcs1 \af0 \ltrch\fcs0 \insrsid5517131 ]]} And their word (word 2007) splitted also Foo{garbage inside} Buzz. So i wish to be able to handle common RTF perfectly, and detect tags even if they are splitted. I have 2 constraints. First no regression, second it has to stay simple. Performance is not an issue here. I'm using symfony 1.4. The actual relevant research code part : $regExpression = '/\[\[([^\[\]]*)\]\]/'; preg_match_all($regExpression, $sTemplate, $outKeys);

Read the article

How to efficiently convert String, integer, double, datetime to binary and vica versa?

- by Ben

Hi, I'm quite new to C# (I'm using .NET 4.0) so please bear with me. I need to save some object properties (their properties are int, double, String, boolean, datetime) to a file. But I want to encrypt the files using my own encryption, so I can't use FileStream to convert to binary. Also I don't want to use object serialization, because of performance issues. The idea is simple, first I need to somehow convert objects (their properties) to binary (array), then encrypt (some sort of xor) the array and append it to the end of the file. When reading first decrypt the array and then somehow convert the binary array back to object properties (from which I'll generate objects). I know (roughly =) ) how to convert these things by hand and I could code it, but it would be useless (too slow). I think the best way would be just to get properties' representation in memory and save that. But I don't know how to do it using C# (maybe using pointers?). Also I though about using MemoryStream but again I think it would be inefficient. I am thinking about class Converter, but it does not support toByte(datetime) (documentation says it always throws exception). For converting back I think the only options is class Converter. Note: I know the structure of objects and they will not change, also the maximum String length is also known. Thank you for all your ideas and time. EDIT: I will be storing only parts of objects, in some cases also parts of different objects (a couple of properties from one object type and a couple from another), thus I think that serialization is not an option for me.

Read the article

How can I optimize the SELECT statement running on an Oracle database?

- by Elvis Lou

I have a SELECT statement in ORACLE: SELECT COUNT(DISTINCT ds1.endpoint_msisdn) multiple30, dss1.service, dss1.endpoint_provisioning_id, dss1.company_scope, Nvl(x.subscription_status, dss1.subscription_status) subscription_status FROM daily_summary ds1 join daily_summary ds2 ON ds1.endpoint_msisdn = ds2.endpoint_msisdn, daily_summary_static dss1, daily_summary_static dss2, (SELECT NULL subscription_status FROM dual UNION ALL SELECT -2 subscription_status FROM dual) x WHERE ds1.summary_ts >= To_date('10-04-2012', 'dd-mm-yyyy') - 30 AND ds1.summary_ts <= To_date('10-04-2012', 'dd-mm-yyyy') AND dss1.last_active >= To_date('10-04-2012', 'dd-mm-yyyy') - 30 AND dss1.last_active <= To_date('10-04-2012', 'dd-mm-yyyy') AND dss2.last_active >= To_date('10-04-2012', 'dd-mm-yyyy') - 30 AND dss2.last_active <= To_date('10-04-2012', 'dd-mm-yyyy') AND dss1.service <> dss2.service AND ( dss1.company_scope = 2 OR dss1.company_scope = 5 ) AND ( dss2.company_scope = 2 OR dss2.company_scope = 5 ) AND dss1.company_scope = dss2.company_scope AND ds1.endpoint_noc_id = dss1.endpoint_noc_id AND ds1.endpoint_host_id = dss1.endpoint_host_id AND ds1.endpoint_instance_id = dss1.endpoint_instance_id AND ds2.endpoint_noc_id = dss2.endpoint_noc_id AND ds2.endpoint_host_id = dss2.endpoint_host_id AND ds2.endpoint_instance_id = dss2.endpoint_instance_id AND dss1.endpoint_provisioning_id = dss2.endpoint_provisioning_id AND Least(1, ds1.total_actions) = 1 AND Least(1, ds2.total_actions) = 1 GROUP BY dss1.service, dss1.endpoint_provisioning_id, dss1.company_scope, Nvl(x.subscription_status, dss1.subscription_status); This query took about 26 minutes to return in my environment, but if I remove the section: dss1.last_active >= to_date('10-04-2012','dd-mm-yyyy') - 30 AND dss1.last_active <= to_date('10-04-2012','dd-mm-yyyy') AND dss2.last_active >= to_date('10-04-2012','dd-mm-yyyy') - 30 AND dss2.last_active <= to_date('10-04-2012','dd-mm-yyyy') AND it only took 20 seconds to run. We have index on the column last_active, I don't know why the section slow down the performance so much? any ideas?

Read the article

Dangers when deploying Flash/Flex UI test automation hooks to production?

- by Merlyn Morgan-Graham

I am interested in doing automated testing against a Flex based UI. I have found out that my best options for UI automation (due to being C# controllable, good licensing conditions, etc) all seem to require that I compile test hooks into my application. Because of this, I am thinking of recommending that these hooks be compiled into our build. I have found a few places on the net that recommend not deploying bits with this instrumentation enabled, and I'd like to know why. Is it a performance drain, or a security risk? If it is a security risk, can you explain how the attack surface is increased? I am not a Flash or Flex developer, though I have some experience with threat modeling. For reference, here's the tools I'm specifically considering: QTP Selenium-Flex API I am having problems finding all the warnings/suggestions I found last night, but here's an example that I can find: http://www.riatest.com/products/getting-started.html Warning! Automation enabled applications expose all properties of all GUI components. This makes them vulnerable to malicious use. Never make automation enabled application publicly available. Always restrict access to such applications and to RIATest Loader to trusted users only. Related question (how to do conditional compilation to insert/remove those hooks): Conditionally including Flex libraries (SWCs) in mxmlc/compc ant tasks

Read the article

How to code a 'Next in Results' within search results in PHP

- by thebluefox

Right, bit of a head scratcher, although I've got a feeling there's an obvious answer and I'm just not seeing the wood for the trees. Baiscally, using Solr as a search engine for my site, bringing back 15 results per page. When you click on a result, you get a detail page, that has a "Next in Results" link on it, which obviously forwards you on to the next result. Whats the best way of doing this? I've come up with a few solutions but they're either too inpractical or just don't work. I could store all the ids in a session array, then grab the one after the current one and put that in the link. But with possibly hundreds/thousands of results, the memory that array would need, and the performance hit of dealing with it isn't practical. I could take the same approach and put it into the db, but I'll still have to deal with a potentially huge array when I grab them out of the db. Or; I could do the search again, only returning the id's, and grab the one after the one we're currently looking at. I think this could be the best option? Although it does seem kind of messy, namely because of when I have to select the id thats on a different 'page' (ie the 16th, 31st etc result). Unless I pass through where it was in the results, and select from there, but that still doesn't seem like the right way to do it. I'm really sorry if this is just complete nonsense, any help is massively appreciated as always, Cheers guys!

Read the article

Drawbacks with using Class Methods in Objective C.

- by RickiG

Hi I was wondering if there are any memory/performance drawbacks, or just drawbacks in general, with using Class Methods like: + (void)myClassMethod:(NSString *)param { // much to be done... } or + (NSArray*)myClassMethod:(NSString *)param { // much to be done... return [NSArray autorelease]; } It is convenient placing a lot of functionality in Class Methods, especially in an environment where I have to deal with memory management(iPhone), but there is usually a catch when something is convenient? An example could be a thought up Web Service that consisted of a lot of classes with very simple functionality. i.e. TomorrowsXMLResults; TodaysXMLResults; YesterdaysXMLResults; MondaysXMLResults; TuesdaysXMLResults; . . . n I collect a ton of these in my Web Service Class and just instantiate the web service class and let methods on this class call Class Methods on the 'Results' Classes. The classes are simple but they handle large amount of Xml, instantiate lots of objects etc. I guess I am asking if Class Methods lives or are treated different on the stack and in memory than messages to instantiated objects? Or are they just instantiated and pulled down again behind the scenes and thus, just a way of saving a few lines of code?

Read the article

MySQL Database Design with Internationalization

- by Some name

Hello, I'm going to start work on a medium sized application, and i'm planning it's db design. One thing that I'm not sure about is this. I will have many tables which will need internationalization, such as: "membership_options, gender_options, language_options etc" Each of these tables will share common i18n fields, like: "title, alternative_title, short_description, description" In your opinion which is the best way to do it? Have an i18n table with the same fields for each of the tables that will need them? or do something like: Membership table Gender table ---------------- -------------- id | created_at id | created_at 1 - 22.03.2001 1 - 14.08.2002 2 - 22.03.2001 2 - 14.08.2002 General translation table ------------------------- record_id | table_name | string_name | alternative_title| .... |id_language 1 - membership regular null 1 (english) 1 - membership normale null 2 (italian) 1 - gender man null 1(english) 1 -gender uomo null 2(italian) This would avoid me repeating something like: membership_translation table ----------------------------- membership_id | name | alternative_title | id_lang 1 regular null 1 1 normale null 2 gender_translation table ----------------------------- gender_id | name | alternative_title | id_lang 1 man null 1 1 uomo null 2 and so on, so i would probably reduce the number of db tables, but i'm not sure about performance.I'm not much of a DB designer, so please let me know.

Read the article

How to insert zeros between bits in a bitmap?

- by anatolyg

I have some performance-heavy code that performs bit manipulations. It can be reduced to the following well-defined problem: Given a 13-bit bitmap, construct a 26-bit bitmap that contains the original bits spaced at even positions. To illustrate: 0000000000000000000abcdefghijklm (input, 32 bits) 0000000a0b0c0d0e0f0g0h0i0j0k0l0m (output, 32 bits) I currently have it implemented in the following way in C: if (input & (1 << 12)) output |= 1 << 24; if (input & (1 << 11)) output |= 1 << 22; if (input & (1 << 10)) output |= 1 << 20; ... My compiler (MS Visual Studio) turned this into the following: test eax,1000h jne 0064F5EC or edx,1000000h ... (repeated 13 times with minor differences in constants) I wonder whether i can make it any faster. I would like to have my code written in C, but switching to assembly language is possible. Can i use some MMX/SSE instructions to process all bits at once? Maybe i can use multiplication? (multiply by 0x11111111 or some other magical constant) Would it be better to use condition-set instruction (SETcc) instead of conditional-jump instruction? If yes, how can i make the compiler produce such code for me? Any other idea how to make it faster? Any idea how to do the inverse bitmap transformation (i have to implement it too, bit it's less critical)?

Read the article

Is it possible to definitively identify whether a DML command was issued from a stored procedure?

- by Ed Harper

I have inherited a SQL Server 2008 database to which calling applications have access through stored procedures. Each table in the database has a shadow audit table into which Insert/Update/Delete operations for are logged. Performance testing on populating the audit tables showed that inserting the audit records using OUTPUT clauses was 20% or so faster than using triggers, so this has been implemented in the stored procedures. However, because this design cannot track changes made directly to the tables through DML statements issued directly against the tables, triggers have also been implemented which use the value of @@NESTLEVEL to determine whether or not to run the trigger (the assumption being that all DML run through stored procedures will have @@NESTLEVEL 1). i.e. the body of the trigger code looks something like: IF @@NESTLEVEL = 1 -- implies call is direct sql so generate history from here BEGIN ... insert into audit table This design is flawed because it won't track updates where DML statements are executed in dynamic SQL, or any other context where @@NESTLEVEL is raised above 1. Can anyone suggest a completely reliable method we can use in the triggers to execute them only if not triggered by a stored procedure? Or is this (as I suspect) not possible?

Read the article

PHP Socket Server vs node.js: Web Chat

- by Eliasdx

I want to program a HTTP WebChat using long-held HTTP requests (Comet), ajax and websockets (depending on the browser used). Userdatabase is in mysql. Chat is written in PHP except maybe the chat stream itself which could also be written in javascript (node.js): I don't want to start a php process per user as there is no good way to send the chat messages between these php childs. So I thought about writing an own socket server in either PHP or node.js which should be able to handle more then 1000 connections (chat users). As a purely web developer (php) I'm not much familiar with sockets as I usually let web server care about connections. The chat messages won't be saved on disk nor in mysql but in RAM as an array or object for best speed. As far as I know there is no way to handle multiple connections at the same time in a single php process (socket server), however you can accept a great amount of socket connections and process them successive in a loop (read and write; incoming message - write to all socket connections). The problem is that there will most-likely be a lag with ~1000 users and mysql operations could slow the whole thing down which will then affect all users. My question is: Can node.js handle a socket server with better performance? Node.js is event-based but I'm not sure if it can process multiple events at the same time (wouldn't that need multi-threading?) or if there is just an event queue. With an event queue it would be just like php: process user after user. I could also spawn a php process per chat room (much less users) but afaik there are singlethreaded IRC servers which are also capable to handle thousands of users. (written in c++ or whatever) so maybe it's also possible in php. I would prefer PHP over Node.js because then the project would be php-only and not a mixture of programming languages. However if Node can process connections simultaneously I'd probably choose it.

Read the article

Generic InBetween Function.

- by Luiscencio

I am tired of writing x > min && x < max so i wawnt to write a simple function but I am not sure if I am doing it right... actually I am not cuz I get an error: bool inBetween<T>(T x, T min, T max) where T:IComparable { return (x > min && x < max); } errors: Operator '>' cannot be applied to operands of type 'T' and 'T' Operator '<' cannot be applied to operands of type 'T' and 'T' may I have a bad understanding of the where part in the function declaring note: for those who are going to tell me that I will be writing more code than before... think on readability =) any help will be appreciated EDIT deleted cuz it was resolved =) ANOTHER EDIT so after some headache I came out with this (ummm) thing following @Jay Idea of extreme readability: public static class test { public static comparision Between<T>(this T a,T b) where T : IComparable { var ttt = new comparision(); ttt.init(a); ttt.result = a.CompareTo(b) > 0; return ttt; } public static bool And<T>(this comparision state, T c) where T : IComparable { return state.a.CompareTo(c) < 0 && state.result; } public class comparision { public IComparable a; public bool result; public void init<T>(T ia) where T : IComparable { a = ia; } } } now you can compare anything with extreme readability =) what do you think.. I am no performance guru so any tweaks are welcome

Read the article

Why would this Lua optimization hack help?

- by Ian Boyd

i'm looking over a document that describes various techniques to improve performance of Lua script code, and i'm shocked that such tricks would be required. (Although i'm quoting Lua, i've seen similar hacks in Javascript). Why would this optimization be required: For instance, the code for i = 1, 1000000 do local x = math.sin(i) end runs 30% slower than this one: local sin = math.sin for i = 1, 1000000 do local x = sin(i) end They're re-declaring sin function locally. Why would this be helpful? It's the job of the compiler to do that anyway. Why is the programmer having to do the compiler's job? i've seen similar things in Javascript; and so obviously there must be a very good reason why the interpreting compiler isn't doing its job. What is it? i see it repeatedly in the Lua environment i'm fiddling in; people redeclaring variables as local: local strfind = strfind local strlen = strlen local gsub = gsub local pairs = pairs local ipairs = ipairs local type = type local tinsert = tinsert local tremove = tremove local unpack = unpack local max = max local min = min local floor = floor local ceil = ceil local loadstring = loadstring local tostring = tostring local setmetatable = setmetatable local getmetatable = getmetatable local format = format local sin = math.sin What is going on here that people have to do the work of the compiler? Is the compiler confused by how to find format? Why is this an issue that a programmer has to deal with? Why would this not have been taken care of in 1993? i also seem to have hit a logical paradox: Optimizatin should not be done without profiling Lua has no ability to be profiled Lua should not be optimized

Read the article

Non standard interaction among two tables to avoid very large merge

- by riko

Suppose I have two tables A and B. Table A has a multi-level index (a, b) and one column (ts). b determines univocally ts. A = pd.DataFrame( [('a', 'x', 4), ('a', 'y', 6), ('a', 'z', 5), ('b', 'x', 4), ('b', 'z', 5), ('c', 'y', 6)], columns=['a', 'b', 'ts']).set_index(['a', 'b']) AA = A.reset_index() Table B is another one-column (ts) table with non-unique index (a). The ts's are sorted "inside" each group, i.e., B.ix[x] is sorted for each x. Moreover, there is always a value in B.ix[x] that is greater than or equal to the values in A. B = pd.DataFrame( dict(a=list('aaaaabbcccccc'), ts=[1, 2, 4, 5, 7, 7, 8, 1, 2, 4, 5, 8, 9])).set_index('a') The semantics in this is that B contains observations of occurrences of an event of type indicated by the index. I would like to find from B the timestamp of the first occurrence of each event type after the timestamp specified in A for each value of b. In other words, I would like to get a table with the same shape of A, that instead of ts contains the "minimum value occurring after ts" as specified by table B. So, my goal would be: C: ('a', 'x') 4 ('a', 'y') 7 ('a', 'z') 5 ('b', 'x') 7 ('b', 'z') 7 ('c', 'y') 8 I have some working code, but is terribly slow. C = AA.apply(lambda row: ( row[0], row[1], B.ix[row[0]].irow(np.searchsorted(B.ts[row[0]], row[2]))), axis=1).set_index(['a', 'b']) Profiling shows the culprit is obviously B.ix[row[0]].irow(np.searchsorted(B.ts[row[0]], row[2]))). However, standard solutions using merge/join would take too much RAM in the long run. Consider that now I have 1000 a's, assume constant the average number of b's per a (probably 100-200), and consider that the number of observations per a is probably in the order of 300. In production I will have 1000 more a's. 1,000,000 x 200 x 300 = 60,000,000,000 rows may be a bit too much to keep in RAM, especially considering that the data I need is perfectly described by a C like the one I discussed above. How would I improve the performance?

Read the article

How to control virtual memory management in linux?

- by chmike

I'm writing a program that uses an mmap file to hold a huge buffer organized as an array of 64 MB blocks. The blocks are used to aggregate data received from different hosts through the network. As a consequence the total data size written in each block is not known in advance. Most of the time it is only 2MB but in some cases it can be up to 20MB or more. The data doesn't stay long in the buffer. 90% is deleted after less than a second and the rest is transmitted to another host. I would like to know if there is a way to tell the virtual memory manager that ram pages are not dirty anymore when data is deleted. Should I use mmap and munmap when a block is used and released to control the virtual memory ? What would be the overhead of doing this ? Also, some colleagues expressed concerns about the performance impact of allocating such a big mmap space. I expect it to behave like a swap file so that only dirty pages are to be considered.

Read the article

AppFabric caching's local cache isnt working for us... What are we doing wrong?

- by Olly

We are using appfabric as the 2ndlevel cache for an NHibernate asp.net application comprising a customer facing website and an admin website. They are both connected to the same cache so when admin updates something, the customer facing site is updated. It seems to be working OK - we have a CacheCLuster on a seperate server and all is well but we want to enable localcache to get better performance, however, it dosnt seem to be working. We have enabled it like this... bool UseLocalCache = int LocalCacheObjectCount = int.MaxValue; TimeSpan LocalCacheDefaultTimeout = TimeSpan.FromMinutes(3); DataCacheLocalCacheInvalidationPolicy LocalCacheInvalidationPolicy = DataCacheLocalCacheInvalidationPolicy.TimeoutBased; if (UseLocalCache) { configuration.LocalCacheProperties = new DataCacheLocalCacheProperties( LocalCacheObjectCount, LocalCacheDefaultTimeout, LocalCacheInvalidationPolicy ); // configuration.NotificationProperties = new DataCacheNotificationProperties(500, TimeSpan.FromSeconds(300)); } Initially we tried using a timeout invalidation policy (3mins) and our app felt like it was running faster. HOWEVER, we noticed that if we changed something in the admin site, it was immediatley updated in the live site. As we are using timeouts not notifications, this demonstrates that the local cache isnt being queried (or is, but is always missing). The cache.GetType().Name returns "LocalCache" - so the factory has made a local cache. Running "Get-Cache-Statistics MyCache" in PS on my dev environment (asp.net app running local from vs2008, cache cluster running on a seperate w2k8 machine) show a handful of Request Counts. However, on the Production environment, the Request Count increases dramaticaly. We tried following the method here to se the cache cliebt-server traffic... http://blogs.msdn.com/b/appfabriccat/archive/2010/09/20/appfabric-cache-peeking-into-client-amp-server-wcf-communication.aspx but the log file had nothing but the initial header in it - i.e no loggin either. I cant find anything in SO or Google. Have we done something wrong? Have we got a screwy install of AppFabric - we installed it via WebPlatform Installer - I think? (note: the IIS box running ASp.net isnt in yhe cluster - it is just the client). Any insights greatfully received!

Read the article

Heavy Mysql operation & Time Constraints [closed]

- by Rahul Jha

There is a performance issue where that I have stuck with my application which is based on PHP & MySql. The application is for Data Migration where data has to be uploaded and after various processes (Cleaning from foreign characters, duplicate check, id generation) it has to be inserted into one central table and then to 5 different tables. There, an id is generated and that id has to be updated to central table. There are different sets of records and validation rules. The problem I am facing is that when I insert say(4K) rows file (containing 20 columns) it is working fine within 15 min it gets inserted everywhere. But, when I insert the same records again then at this time it is taking one hour to insert (ideally it should get inserted by marking earlier inserted data as duplicate). After going through the log file, I noticed is that there is a Mysql select statement where I am checking the duplicates and getting ID which are duplicates. Then I am calling a function inside for loop which is basically inserting records into 5 tables and updates id to central table. This Calling function is major time of whole process. P.S. The records has to be inserted record by record.. Kindly Suggest some solution.. //This is that sample code $query=mysql_query("SELECT DISTINCT p1.ID FROM table1 p1, table2 p2, table3 a WHERE p2.datatype =0 AND (p1.datatype =1 || p1.datatype=2) AND p2.ID =0 AND p1.ID = a.ID AND p1.coulmn1 = p2.column1 AND p1.coulmn2 = p2.coulmn2 AND a.coulmn3 = p2.column3"); $num=mysql_num_rows($query); for($i=0;$i<$num;$i++) { $f=mysql_result($query,$i,"ID"); //calling function RecordInsert($f); }

Read the article

How to justify using a scripting language as part of a project

- by sylvanaar

I have a specific project in which I want to use either a scripting language + C, or as an alternative a 100% Java solution. The program adapts a legacy system for use with other moderns systems. Basically, I have few choices as to what language I can use. I have C/C++, Java 1.4, and I have also compiled the Lua for this environment. The program does 'screen scraping' and has to deal with alot of strings. That part of the code is highly variable. Most of the developers at my company use C, so - my original design was to write some portions in C, and use Lua for the part that dealt with strings and changed freqently. I was told 'You have to justify your use of the scripting language.' So i reworked my design using 100% Java, and was told - Java wont have enough performance. You should do the whole thing in C. I'm not controlling lasers or doing image processing - just some screen scraping. I still have to provide justification for using anything but C - so what justification can I provide?

Read the article

Mamimum page fetch with maximum bandwith

- by Ehsan

Hi I want to create an application like a spider I've implement fetching page as the following code in multi-thread application but there is two problem 1) I want to use my maximum bandwidth to send/receive request, how should I config my request to do so (Like Download Accelerator application and the like) cause I heard the normal application will use 66% of the available bandwidth. 2) I don't know what exactly HttpWebRequest.KeepAlive do, but as its name implies I think i can create a connection to a website and without closing the connection sending another request to that web site using existing connection. does it boost performance or Im wrong public PageFetchResult Fetch() { PageFetchResult fetchResult = new PageFetchResult(); try { HttpWebRequest req = (HttpWebRequest)HttpWebRequest.Create(URLAddress); HttpWebResponse resp = (HttpWebResponse)req.GetResponse(); Uri requestedURI = new Uri(URLAddress); Uri responseURI = resp.ResponseUri; if (Uri.Equals(requestedURI, responseURI)) { string resultHTML = ""; byte[] reqHTML = ResponseAsBytes(resp); if (!string.IsNullOrEmpty(FetchingEncoding)) resultHTML = Encoding.GetEncoding(FetchingEncoding).GetString(reqHTML); else if (!string.IsNullOrEmpty(resp.CharacterSet)) resultHTML = Encoding.GetEncoding(resp.CharacterSet).GetString(reqHTML); req.Abort(); resp.Close(); fetchResult.IsOK = true; fetchResult.ResultHTML = resultHTML; } else { URLAddress = responseURI.AbsoluteUri; relayPageCount++; if (relayPageCount > 5) { fetchResult.IsOK = false; fetchResult.ErrorMessage = "Maximum page redirection occured."; relayPageCount = 0; return fetchResult; } req.Abort(); resp.Close(); return Fetch(); } } catch (Exception ex) { fetchResult.IsOK = false; fetchResult.ErrorMessage = ex.Message; } return fetchResult; }

Read the article

Rewriting a for loop in pure NumPy to decrease execution time

- by Statto

I recently asked about trying to optimise a Python loop for a scientific application, and received an excellent, smart way of recoding it within NumPy which reduced execution time by a factor of around 100 for me! However, calculation of the B value is actually nested within a few other loops, because it is evaluated at a regular grid of positions. Is there a similarly smart NumPy rewrite to shave time off this procedure? I suspect the performance gain for this part would be less marked, and the disadvantages would presumably be that it would not be possible to report back to the user on the progress of the calculation, that the results could not be written to the output file until the end of the calculation, and possibly that doing this in one enormous step would have memory implications? Is it possible to circumvent any of these? import numpy as np import time def reshape_vector(v): b = np.empty((3,1)) for i in range(3): b[i][0] = v[i] return b def unit_vectors(r): return r / np.sqrt((r*r).sum(0)) def calculate_dipole(mu, r_i, mom_i): relative = mu - r_i r_unit = unit_vectors(relative) A = 1e-7 num = A*(3*np.sum(mom_i*r_unit, 0)*r_unit - mom_i) den = np.sqrt(np.sum(relative*relative, 0))**3 B = np.sum(num/den, 1) return B N = 20000 # number of dipoles r_i = np.random.random((3,N)) # positions of dipoles mom_i = np.random.random((3,N)) # moments of dipoles a = np.random.random((3,3)) # three basis vectors for this crystal n = [10,10,10] # points at which to evaluate sum gamma_mu = 135.5 # a constant t_start = time.clock() for i in range(n[0]): r_frac_x = np.float(i)/np.float(n[0]) r_test_x = r_frac_x * a[0] for j in range(n[1]): r_frac_y = np.float(j)/np.float(n[1]) r_test_y = r_frac_y * a[1] for k in range(n[2]): r_frac_z = np.float(k)/np.float(n[2]) r_test = r_test_x +r_test_y + r_frac_z * a[2] r_test_fast = reshape_vector(r_test) B = calculate_dipole(r_test_fast, r_i, mom_i) omega = gamma_mu*np.sqrt(np.dot(B,B)) # write r_test, B and omega to a file frac_done = np.float(i+1)/(n[0]+1) t_elapsed = (time.clock()-t_start) t_remain = (1-frac_done)*t_elapsed/frac_done print frac_done*100,'% done in',t_elapsed/60.,'minutes...approximately',t_remain/60.,'minutes remaining'

Search Results

Search found 13608 results on 545 pages for 'performance dashboard'.

Page 486/545 | < Previous Page | 482 483 484 485 486 487 488 489 490 491 492 493 | Next Page >

- by dan

- by Yun Ling

- by Overflew

- by bullfish

- by David Cournapeau

- by galymzhan

- by Perello

- by Ben

- by Elvis Lou

- by Merlyn Morgan-Graham

- by thebluefox

- by RickiG

- by Some name

- by anatolyg

- by Ed Harper

- by Eliasdx

- by Luiscencio

- by Ian Boyd

- by riko

- by chmike

- by Olly

- by Rahul Jha

- by sylvanaar

- by Ehsan

- by Statto

< Previous Page | 482 483 484 485 486 487 488 489 490 491 492 493 | Next Page >