Search Results

Search found 28685 results on 1148 pages for 'query performance'.

Page 305/1148 | < Previous Page | 301 302 303 304 305 306 307 308 309 310 311 312  | Next Page >

  • Efficient alternative to merge() when building dataframe from json files with R?

    - by Bryan
    I have written the following code which works, but is painfully slow once I start executing it over thousands of records: require("RJSONIO") people_data <- data.frame(person_id=numeric(0)) json_data <- fromJSON(json_file) n_people <- length(json_data) for(lender in 1:n_people) { person_dataframe <- as.data.frame(t(unlist(json_data[[person]]))) people_data <- merge(people_data, person_dataframe, all=TRUE) } output_file <- paste("people_data",".csv") write.csv(people_data, file=output_file) I am attempting to build a unified data table from a series of json-formated files. The fromJSON() function reads in the data as lists of lists. Each element of the list is a person, which then contains a list of the attributes for that person. For example: [[1]] person_id name gender hair_color [[2]] person_id name location gender height [[...]] structure(list(person_id = "Amy123", name = "Amy", gender = "F", hair_color = "brown"), .Names = c("person_id", "name", "gender", "hair_color")) structure(list(person_id = "matt53", name = "Matt", location = structure(c(47231, "IN"), .Names = c("zip_code", "state")), gender = "M", height = 172), .Names = c("person_id", "name", "location", "gender", "height")) The end result of the code above is matrix where the columns are every person-attribute that appears in the structure above, and the rows are the relevant values for each person. As you can see though, some data is missing for some of the people, so I need to ensure those show up as NA and make sure things end up in the right columns. Further, location itself is a vector with two components: state and zip_code, meaning it needs to be flattened to location.state and location.zip_code before it can be merged with another person record; this is what I use unlist() for. I then keep the running master table in people_data. The above code works, but do you know of a more efficient way to accomplish what I'm trying to do? It appears the merge() is slowing this to a crawl... I have hundreds of files with hundreds of people in each file. Thanks! Bryan

    Read the article

  • Eclipse JUnit Plugin Test very slow to re-execute Test Suite on Windows

    - by soundasleepful
    I'm having an odd, and stressing, problem with running a large JUnit Plugin test suite in Eclipse. When I try to re-run a JUnit plugin suite that has just been executed, Eclipse hangs for quite some time before it eventually wakes up and launches. It can take up to 5 minutes sometimes, and increases with the size of the suite. Visually, it appears as a GC cleanup, except that I have plenty of GC space available (400 MB freely allocated). The size of the workspace that is has to delete is well under 1 GB, and there are not too many files - definitely less than 20,000. While I was waiting for a new run to start, I decided to manually kill explorer.exe to see if it had any effect. Surprisingly, Eclipse instantly fell out of its freeze and ran as normal. This makes me think that Windows is somehow interfering with the deletion of these workspace files. They're not being put into the Recycle Bin though. The workspace is in C: which I think is out of the range of any workspace/domain stuff. Any ideas?

    Read the article

  • SQL Query - group by more than one column, but distinct

    - by Ranhiru
    I have a bidding table, as follows: SellID INT FOREIGN KEY REFERENCES SellItem(SellID), CusID INT FOREIGN KEY REFERENCES Customer(CusID), Amount FLOAT NOT NULL, BidTime DATETIME DEFAULT getdate() Now in my website I need to show the user the current bids; only the highest bid but without repeating the same user. SELECT CusID, Max(Amount) FROM Bid WHERE SellID = 10 GROUP BY CusID ORDER BY Max(Amount) DESC This is the best I have achieved so far. This gives the CusID of each user with the maximum bid and it is ordered ascending. But I need to get the BidTime for each result as well. When I try to put the BidTime in to the query: SELECT CusID, Max(Amount), BidTime FROM Bid WHERE SellID = 10 GROUP BY CusID ORDER BY Max(Amount) DESC I am told that "Column 'Bid.BidTime' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause." Thus I tried: SELECT CusID, Max(Amount), BidTime FROM Bid WHERE SellID = 10 GROUP BY CusID, BidTime ORDER BY Max(Amount) DESC But this returns all the rows. No distinction. Any suggestions on solving this issue?

    Read the article

  • yuicompressor error, not sure what is wrong?

    - by mrblah
    Hi, Very confused here, trying out the yuicompressor on a simple javascript file. My js file looks like: function splitText(text) { return text.split('-')[1]; } The error is: [INFO] Using charset Cp1252 [Error] 1:20:illegal character [Error] 1:20:syntax error [Error] 1:40:illegal character [Error] 1:49:missing ; before statement [Error] 1:50:illegal character .. .. [Error] 7:3:missing | in compound statement [error] 1:0:compilation produced 38 syntax errors ... Can someone please explain to me what is wrong?

    Read the article

  • Searching 2 fields at the same time

    - by donpal
    I have a table of first and last names firstname lastname --------- --------- Joe Robertson Sally Robert Jim Green Sandra Jordan I'm trying to search this table based on an input that consists of the full name. For example: input: Joe Robert I thought about using SELECT * FROM tablename WHERE firstname LIKE BUT the table stores the first and last name separately, so I'm not sure how to do the search in this case

    Read the article

  • fast algorithm implementation to sort very small set

    - by aaa
    hello. This is the problem I ran into long time ago. I thought I may ask your for your ideas. assume I have very small set of numbers (integers), 4 or 8 elements, that need to be sorted, fast. what would be the best approach/algorithm? my approach was to use the max/min functions. I guess my question pertains more to implementation, rather than type of algorithm. At this point it becomes somewhat hardware dependent , so let us assume Intel 64-bit processor with SSE3 . Thanks

    Read the article

  • Query optimization (OR based)

    - by john194
    I have googled but I can't find answers for these questions. Your advice is appreciated. centOS on vps with 512MB RAM, nginx, php5 (fastcgi), mysql5 (myisam, not innodb). I need to optimize this app created by some ex-employee. This app is working, but it's slow. Table: t1(id[bigint(20)],c1[mediumtext],c2[mediumtext],c3[mediumtext],c4[mediumtext]) id is some random big number, and is PK Those mediumtext rows look like this: c1="|box-002877|" c2="|ct-2348|rd-11124854|hw-3949|wd-8872|hw-119037736|...etc.. " c3="|fg-2448|wd-11172|hw-1656|...etc.. " c4="|hg-2448|qd-16667|...etc." (some columns contain a lot of data, around 900 KiB, database around 300 MiB) Yes, mediumtext "is bad", and (20) is too big... but I didn't create this. Those codes can be found on any of those 4 mediumtext's... //he needs all the columns of the row containing $code, so he wrote this: function f1($code) { SELECT * FROM t1 WHERE c1 LIKE '%$code%' OR c2 LIKE '%$code%' OR c3 LIKE '%$code%' OR c4 LIKE '%$code%'; Questions: Q1. If $code is found on c1... mysql automatically stops checking and returns row=id+c1+c2+c3+c4? or it will continue (wasting time) checking c2, c3 and c4?... Q2. Mysql is working with this table on disk (not RAM) because of the mediumtext, right? is this the primary cause of slowness? Q3. That query can be cached by mysql (if using a big query_cache_size=128M value on the my.cnf)? or that's not cacheable due to the mediumtexts, or due to the "OR LIKE"...? Q4. Do you recommend rewriting this with mysql's INSTR() / LOCATE() / MATCH..AGAINST [FULLTEXT]?

    Read the article

  • What can cause legit MySql INSERT INTO command to fail?

    - by Makis
    I can't figure out what's causing my INSERT INTO's to fail to certain table in MySql. I can manage them to other tables. The table looks like: CREATE TABLE IF NOT EXISTS `Match` ( `id` int(11) NOT NULL AUTO_INCREMENT, `match_no` int(11) NOT NULL, `season` int(11) NOT NULL, `hometeam` int(11) NOT NULL, `awayteam` int(11) NOT NULL, PRIMARY KEY (`id`), KEY `match_no` (`match_no`), KEY `season` (`season`), KEY `hometeam` (`hometeam`), KEY `awayteam` (`awayteam`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=1 ; And the command is INSERT INTO Match (`match_no`, `season`, `hometeam`, `awaytem`) VALUES (1, 1, 2, 3) All I get is: 1064 - You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'Match (match_no, season, hometeam, awaytem) VALUES (1, 1, 2, 3)' at line 1 I have checked the manual and half-a-dozen examples from the web and whatnought and tried all sorts of changes to the syntax in case there is some MySql specific oddity, but nothing seems to work.

    Read the article

  • Good way to make animations with cocos2d?

    - by Johnny Oin
    Hi there, I'm making a little iphone game, and I would get some clues. Let's imagine: Two background sprites moving pretty fast from right to left, and moving up and down with accelerometer. I guess I can't use animations here, cause the movement of the background is recalculated at each frame. So I use a schedule with an interval of 0.025s and move my sprites at each clock with a : sprite.position = ccp(x, y); So here is my problem: the result is laggy, with only these two sprites. I tried both declaring sprites in the header, and getting them with CCNodes and Tags. It's quite the same. So if someone can give me a hint on what is the best way to do that, that would be so nice. I wonder if the problem can't be the fact that sprites are moving very fast, but i'm not sure. Anyway, thanks for your time. J.

    Read the article

  • Why is numpy's einsum faster than numpy's built in functions?

    - by Ophion
    Lets start with three arrays of dtype=np.double. Timings are performed on a intel CPU using numpy 1.7.1 compiled with icc and linked to intel's mkl. A AMD cpu with numpy 1.6.1 compiled with gcc without mkl was also used to verify the timings. Please note the timings scale nearly linearly with system size and are not due to the small overhead incurred in the numpy functions if statements these difference will show up in microseconds not milliseconds: arr_1D=np.arange(500,dtype=np.double) large_arr_1D=np.arange(100000,dtype=np.double) arr_2D=np.arange(500**2,dtype=np.double).reshape(500,500) arr_3D=np.arange(500**3,dtype=np.double).reshape(500,500,500) First lets look at the np.sum function: np.all(np.sum(arr_3D)==np.einsum('ijk->',arr_3D)) True %timeit np.sum(arr_3D) 10 loops, best of 3: 142 ms per loop %timeit np.einsum('ijk->', arr_3D) 10 loops, best of 3: 70.2 ms per loop Powers: np.allclose(arr_3D*arr_3D*arr_3D,np.einsum('ijk,ijk,ijk->ijk',arr_3D,arr_3D,arr_3D)) True %timeit arr_3D*arr_3D*arr_3D 1 loops, best of 3: 1.32 s per loop %timeit np.einsum('ijk,ijk,ijk->ijk', arr_3D, arr_3D, arr_3D) 1 loops, best of 3: 694 ms per loop Outer product: np.all(np.outer(arr_1D,arr_1D)==np.einsum('i,k->ik',arr_1D,arr_1D)) True %timeit np.outer(arr_1D, arr_1D) 1000 loops, best of 3: 411 us per loop %timeit np.einsum('i,k->ik', arr_1D, arr_1D) 1000 loops, best of 3: 245 us per loop All of the above are twice as fast with np.einsum. These should be apples to apples comparisons as everything is specifically of dtype=np.double. I would expect the speed up in an operation like this: np.allclose(np.sum(arr_2D*arr_3D),np.einsum('ij,oij->',arr_2D,arr_3D)) True %timeit np.sum(arr_2D*arr_3D) 1 loops, best of 3: 813 ms per loop %timeit np.einsum('ij,oij->', arr_2D, arr_3D) 10 loops, best of 3: 85.1 ms per loop Einsum seems to be at least twice as fast for np.inner, np.outer, np.kron, and np.sum regardless of axes selection. The primary exception being np.dot as it calls DGEMM from a BLAS library. So why is np.einsum faster that other numpy functions that are equivalent? The DGEMM case for completeness: np.allclose(np.dot(arr_2D,arr_2D),np.einsum('ij,jk',arr_2D,arr_2D)) True %timeit np.einsum('ij,jk',arr_2D,arr_2D) 10 loops, best of 3: 56.1 ms per loop %timeit np.dot(arr_2D,arr_2D) 100 loops, best of 3: 5.17 ms per loop The leading theory is from @sebergs comment that np.einsum can make use of SSE2, but numpy's ufuncs will not until numpy 1.8 (see the change log). I believe this is the correct answer, but have not been able to confirm it. Some limited proof can be found by changing the dtype of input array and observing speed difference and the fact that not everyone observes the same trends in timings.

    Read the article

  • Slow SelectSingleNode

    - by Simon
    I have a simple structured XML file like this: <ttest ID="ttest00001", NickName="map00001"/> <ttest ID="ttest00002", NickName="map00002"/> <ttest ID="ttest00003", NickName="map00003"/> <ttest ID="ttest00004", NickName="map00004"/> ..... This xml file can be around 2.5MB. In my source code I will have a loop to get nicknames In each loop, I have something like this: nickNameLoopNum = MyXmlDoc.SelectSingleNode("//ttest[@ID=' + testloopNum + "']").Attributes["NickName"].Value This single line will cost me 30 to 40 millisecond. I searched some old articles (dated back to 2002) saying, use some sort of compiled "xpath" can help the situation, but that was 5 years ago. I wonder is there a mordern practice to make it faster? (I'm using .NET 3.5)

    Read the article

  • 2 approaches for tracking online users with Redis. Which one is faster?

    - by Stanislav
    Recently I found an nice blog post presenting 2 approaches for tracking online users of a web site with the help of Redis. 1) Smart-keys and setting their expiration http://techno-weenie.net/2010/2/3/where-s-waldo-track-user-locations-with-node-js-and-redis 2) Set-s and intersects http://www.lukemelia.com/blog/archives/2010/01/17/redis-in-practice-whos-online/ Can you judge which one should be faster and why?

    Read the article

  • Tuning MySQL to take advantage of a 4GB VPS

    - by alistair.mp
    Hello, We're running a large site at the moment which has a dedicated VPS for it's database server which is running MySQL and nothing else. At the moment all four CPU cores are running at close to 100% all of the time but the memory usage sticks at around 268MB out of an available 4096MB. I'm wondering what we can do to better utilise the memory and reduce the CPU load by tweaking MySQL's settings? Here is what we currently have in my.cnf: http://pastie.org/private/hxeji9o8n3u9up9mvtinbq Thanks

    Read the article

  • Is there a fast way to jump to element using XMLReader?

    - by Derk
    I am using XMLReader to read a large XML file with about 1 million elements on the level I am reading from. However, I've calculated it will take over 10 seconds when I jump to -for instance- element 500.000 using XMLReader::next ([ string $localname ] ) or XMLReader::read ( void ) This is not very usable. Is there a faster way to do this?

    Read the article

  • Problem with SQL, ResultSet in java

    - by aphex
    How can I iterate ResultSet ? I've tried with the following code, but i get the error java.sql.SQLException: Illegal operation on empty result set. while ( !rs.isLast()) { rs.next(); int id = rs.getInt("person_id"); SQL.getInstance().getSt().execute("INSERT ref_person_pub(person_id) VALUES(" + id + ")"); }

    Read the article

  • fastest (low latency) method for Inter Process Communication between Java and C/C++

    - by Bastien
    Hello, I have a Java app, connecting through TCP socket to a "server" developed in C/C++. both app & server are running on the same machine, a Solaris box (but we're considering migrating to Linux eventually). type of data exchanged is simple messages (login, login ACK, then client asks for something, server replies). each message is around 300 bytes long. Currently we're using Sockets, and all is OK, however I'm looking for a faster way to exchange data (lower latency), using IPC methods. I've been researching the net and came up with references to the following technologies: - shared memory - pipes - queues but I couldn't find proper analysis of their respective performances, neither how to implement them in both JAVA and C/C++ (so that they can talk to each other), except maybe pipes that I could imagine how to do. can anyone comment about performances & feasibility of each method in this context ? any pointer / link to useful implementation information ? thanks for your help

    Read the article

  • High CPU usage when running several "java -version" in parallel

    - by Prateesh
    This is just out of curiosity to understand i have a small shell script for ((i = 0; i < 50; i++)) do java -version & done when i run this my CPU usage report by sar is as below 07:51:25 PM CPU %user %nice %system %iowait %steal %idle 07:51:30 PM all 6.98 0.00 1.75 1.00 0.00 90.27 07:51:31 PM all 43.00 0.00 12.00 0.00 0.00 45.00 07:51:32 PM all 86.28 0.00 13.72 0.00 0.00 0.00 07:51:33 PM all 5.25 0.00 1.75 0.50 0.00 92.50 As you can see, on the third line the CPU is at 100% My java version is 1.5.0_22-b03.

    Read the article

  • SQL Joining Two or More from Table B with Common Data in Table A

    - by Matthew Frederick
    The real-world situation is a series of events that each have two or more participants (like sports teams, though there can be more than two in an event), only one of which is the host of the event. There is an Event db table for each unique event and a Participant db table with unique participants. They are joined together using a Matchup table. They look like this: Event EventID (PK) (other event data like the date, etc.) Participant ParticipantID (PK) Name Matchup EventID (FK to Event table) ParicipantID (FK to Participant) Host (1 or 0, only 1 host = 1 per EventID) What I'd like to get as a result is something like this: EventID ParticipantID where host = 1 Participant.Name where host = 1 ParticipantID where host = 0 Participant.Name where host = 0 ParticipantID where host = 0 Participant.Name where host = 0 ... Where one event has 2 participants and another has 3 participants, for example, the third participant column data would be null or otherwise noticeable, something like (PID = ParticipantID): EventID PID-1(host) Name-1 (host) PID-2 Name-2 PID-3 Name-3 ------- ----------- ------------- ----- ------ ----- ------ 1 7 Lions 8 Tigers 12 Bears 2 11 Dogs 9 Cats NULL NULL I suspect the answer is reasonably straightforward but for some reason I'm not wrapping my head around it. Alternately it's very difficult. :) I'm using MYSQL 5 if that affects the available SQL.

    Read the article

  • WPF Dragging causes renderer to stop

    - by Cameron MacFarland
    I'm having a problem with my WPF app, where any sort of drag operation stops the UI from updating. The issue seems periodic, as in, the item drags, stops, drags again, stops, etc. in 2 second intervals. It's affecting all controls, including scroll bars. If checked this question as well as this one, and it doesn't seem to be caused by window transparencies. I'm running Win7 x64 with .NET 3.5sp1. Does anyone know what might be causing this, or a way of figuring out what might be causing this?

    Read the article

  • nested sql statment using and

    - by klay
    hi guys, how to make this work in mysql? select ID,COMPANY_NAME,contact1, SUBURB, CATEGORY, PHONE from Victoria where (city in ( select suburb from allsuburbs)) and CATEGORY='Banks' this below statement is working: select ID,COMPANY_NAME,contact1, SUBURB, CATEGORY, PHONE from Victoria where city in ( select suburb from allsuburbs) if I add "and" , it gives me an empty resultset, thanks

    Read the article

  • How to merge two tables based on common column and sort the results by date

    - by techiepark
    Hello friends, I have two mysql tables and i want to merge the results of these two tables based on the common column rev_id. The merged results should be sorted by the date of two tables. Please help me. CREATE TABLE `reply` ( `id` int(3) NOT NULL auto_increment, `name` varchar(25) NOT NULL default '', `member_id` varchar(45) NOT NULL, `rev_id` int(3) NOT NULL default '0', `description` text, `post_date` timestamp NOT NULL default CURRENT_TIMESTAMP on update CURRENT_TIMESTAMP, `flag` char(2) NOT NULL default 'N', PRIMARY KEY (`id`), KEY `member_id` (`member_id`) ) ENGINE=MyISAM; CREATE TABLE `comment` ( `com_id` int(8) NOT NULL auto_increment, `rev_id` int(5) NOT NULL default '0', `member_id` varchar(50) NOT NULL, `comm_desc` text NOT NULL, `date_created` timestamp NOT NULL default CURRENT_TIMESTAMP on update CURRENT_TIMESTAMP, PRIMARY KEY (`com_id`), KEY `member_id` (`member_id`) ) ENGINE=MyISAM;

    Read the article

  • Perf4J Logging Config Help

    - by manyxcxi
    I currently have a long running process that I am trying to analyze with Perf4J. I currently have it writing results in CSV format to its own log file using the AsyncCoalescingStatisticsAppender and a StatisticsCsvLayout on the file appender. My question is; when I try and use the --graph option from the command line (using the perf4j jar) it isn't populating the data points- it isn't populating anything. Are my appenders set incorrectly? The log file contains hundreds (sometimes thousands) of data points of about 10 different tag names. <appender name="perfAppender" class="org.apache.log4j.FileAppender"> <param name="File" value="perfStats.log"/> <layout class="org.perf4j.log4j.StatisticsCsvLayout"> </layout> </appender> <appender name="CoalescingStatistics" class="org.perf4j.log4j.AsyncCoalescingStatisticsAppender"> <!-- The TimeSlice option is used to determine the time window for which all received StopWatch logs are aggregated to create a single GroupedTimingStatistics log. Here we set it to 10 seconds, overriding the default of 30000 ms --> <param name="TimeSlice" value="10000"/> <appender-ref ref="ConsoleAppender"/> <appender-ref ref="CompositeRollingFileAppender"/> <appender-ref ref="perfAppender"/> </appender>

    Read the article

  • Loading animation Memory leak

    - by Ayaz Alavi
    Hi, I have written network class that is managing all network calls for my application. There are two methods showLoadingAnimationView and hideLoadingAnimationView that will show UIActivityIndicatorView in a view over my current viewcontroller with fade background. I am getting memory leaks somewhere on these two methods. Here is the code -(void)showLoadingAnimationView { textmeAppDelegate *textme = (textmeAppDelegate *)[[UIApplication sharedApplication] delegate]; [[UIApplication sharedApplication] setNetworkActivityIndicatorVisible:YES]; if(wrapperLoading != nil) { [wrapperLoading release]; } wrapperLoading = [[UIView alloc] initWithFrame:CGRectMake(0.0, 0.0, 320.0, 480.0)]; wrapperLoading.backgroundColor = [UIColor clearColor]; wrapperLoading.alpha = 0.8; UIView *_loadingBG = [[UIView alloc] initWithFrame:CGRectMake(0.0, 0.0, 320.0, 480.0)]; _loadingBG.backgroundColor = [UIColor blackColor]; _loadingBG.alpha = 0.4; circlingWheel = [[UIActivityIndicatorView alloc] initWithActivityIndicatorStyle:UIActivityIndicatorViewStyleWhiteLarge]; CGRect wheelFrame = circlingWheel.frame; circlingWheel.frame = CGRectMake(((320.0 - wheelFrame.size.width) / 2.0), ((480.0 - wheelFrame.size.height) / 2.0), wheelFrame.size.width, wheelFrame.size.height); [wrapperLoading addSubview:_loadingBG]; [wrapperLoading addSubview:circlingWheel]; [circlingWheel startAnimating]; [textme.window addSubview:wrapperLoading]; [_loadingBG release]; [circlingWheel release]; } -(void)hideLoadingAnimationView { [[UIApplication sharedApplication] setNetworkActivityIndicatorVisible:NO]; wrapperLoading.alpha = 0.0; [self.wrapperLoading removeFromSuperview]; //[NSTimer scheduledTimerWithTimeInterval:0.8 target:wrapperLoading selector:@selector(removeFromSuperview) userInfo:nil repeats:NO]; } Here is how I am calling these two methods [NSThread detachNewThreadSelector:@selector(showLoadingAnimationView) toTarget:self withObject:nil]; and then somewhere later in the code i am using following function call to hide animation. [self hideLoadingAnimationView]; I am getting memory leaks when I call showLoadingAnimationView function. Anything wrong in the code or is there any better technique to show loading animation when we do network calls?

    Read the article

  • Image 8-connectivity without excessive branching?

    - by shoosh
    I'm writing a low level image processing algorithm which needs to do alot of 8-connectivity checks for pixels. For every pixel I often need to check the pixels above it, below it and on its sides and diagonals. On the edges of the image there are special cases where there are only 5 or 3 neighbors instead of 8 neighbors for a pixels. The naive way to do it is for every access to check if the coordinates are in the right range and if not, return some default value. I'm looking for a way to avoid all these checks since they introduce a large overhead to the algorithm. Are there any tricks to avoid it altogether?

    Read the article

< Previous Page | 301 302 303 304 305 306 307 308 309 310 311 312  | Next Page >