median - Page 4 - Developer IT

Google Analytics: How long does it take users to trigger an event

- by Stephen Ostermiller

I implemented Google Analytics event tracking on my currency conversion website. The typical user flow is: The user lands on a page about two currencies. The user enters an amount to be converted. The site shows the user the value in the other currency. The JavaScript sends Google Analytics a "converted" event when the currency conversion is done. Because most of the sessions on my site are single-page, event tracking is very important for knowing whether users find my page useful. I'm looking for a way to figure out how long it typically takes users to enter a value in the form. I expect this data would form a bell curve centered on a specific amount of time after page load. If I can't get a graph, I could make do with a median value. I would like to use this as a core metric for usability testing. Is there a way to get this information out of Google Analytics?

Read the article

Finding maximum number of congruent numbers

- by Stefan Czarnecki

Let's say we have a multiset (a set with possible duplicates) of integers. We would like to find the size of the largest subset of the multiset such that all numbers in the subset are congruent to each other modulo some m > 1. For example, take 1 4 7 7 8 10:

for m = 2 the subsets are (1, 7, 7) and (4, 8, 10), both having size 3;
for m = 3 the subsets are (1, 4, 7, 7, 10) and (8), the larger having size 5;
for m = 4 the subsets are (1), (4, 8), (7, 7), (10), the largest having size 2.

At this point it is evident that the best answer is 5, for m = 3. Given m, we can find the size of the largest subset in linear time. Because the answer is always at least half the size of the set, it is enough to check values of m up to the median of the set. I also noticed that it is only necessary to check prime values of m. However, if the values in the set are large, the algorithm is still rather slow. Does anyone have any ideas on how to improve it?
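The linear-time check for a single modulus described in the question could be sketched as follows. This only illustrates that one step and is not a faster algorithm; the function names are made up for the example, and the outer loop simply tries every m from 2 up to a caller-supplied bound rather than applying the median or prime-only pruning mentioned above.

```python
from collections import Counter


def largest_congruent_subset(values, m):
    """Size of the largest group of values that share one residue modulo m."""
    residues = Counter(v % m for v in values)
    return max(residues.values())


def best_modulus(values, max_m):
    """Try every m in 2..max_m and return (best size, smallest m achieving it)."""
    best_size, best_m = 0, None
    for m in range(2, max_m + 1):
        size = largest_congruent_subset(values, m)
        if size > best_size:
            best_size, best_m = size, m
    return best_size, best_m


numbers = [1, 4, 7, 7, 8, 10]
print(largest_congruent_subset(numbers, 2))  # 3
print(largest_congruent_subset(numbers, 3))  # 5
print(largest_congruent_subset(numbers, 4))  # 2
print(best_modulus(numbers, 10))             # (5, 3)
```

Each call is one pass over the multiset, so the total cost is dominated by how many candidate moduli have to be tried, which is exactly the part the question asks how to reduce.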

Read the article

Location Change Salary Differences [closed]

- by GameDev

DISCLAIMER: I know that this might be a "regional" question but I'm also asking for help as far as what resources to use to determine my decision. I'm currently talking to a recruiter for a game developer in the SF Bay area. I work in a relatively low-cost area in the south. I really want to get into game development but my current career is general web development. I'm very interested in taking the job, but my concern is that the amount they're willing to pay might be a relative pay cut. Here are some factors: It's not an entry-level position, the title is Senior Software Engineer. I have 5+ years of experience. The calculators online tell me that I should be expecting around 2x my current pay rate (http://www.bestplaces.net/col/). My current pay is in the mid $60k/yr, so that's like $120-130k. The recruiter told me that at my experience level I can expect about $90-100k/yr, and that those cost of living calculators were way off. The benefits will definitely be better, since it's a much larger company (help with commuting, catered meals, etc). But is the recruiter trying to give me a snow job on the pay scale, or is that a reasonable change from a smallish town in the south to somewhere in the SF Bay area? How can I find this out? Glassdoor and Payscale seem to say "senior software developers" in that area make around $110k in median salary, but Payscale says it's closer to $135k; that range seems pretty large.

Read the article

Node.js vs PHP processing speed

- by Cody Craven

I've been looking into node.js recently and wanted to see a true comparison of processing speed for PHP vs Node.js. In most of the comparisons I had seen, Node trounced Apache/PHP setups handily. However, all of the tests were small 'hello worlds' that would not accurately reflect any webpage's markup. So I decided to create a basic HTML page with 10,000 hello world paragraph elements. In these tests Node with Cluster was beaten to a pulp by PHP on Nginx utilizing PHP-FPM. So I'm curious if I am misusing Node somehow or if Node is really just this bad at processing power. Note that my results were equivalent when outputting "Hello world\n" with text/plain instead of the HTML, but I only included the HTML as it's closer to the use case I was investigating.

My testing box:

- Core i7-2600 Intel CPU (has 8 threads with 4 cores)
- 8GB DDR3 RAM
- Fedora 16 64bit
- Node.js v0.6.13
- Nginx v1.0.13
- PHP v5.3.10 (with PHP-FPM)

My test scripts:

Node.js script

    var cluster = require('cluster');
    var http = require('http');
    var numCPUs = require('os').cpus().length;

    if (cluster.isMaster) {
      // Fork workers.
      for (var i = 0; i < numCPUs; i++) {
        cluster.fork();
      }

      cluster.on('death', function (worker) {
        console.log('worker ' + worker.pid + ' died');
      });
    } else {
      // Worker processes have an HTTP server.
      http.Server(function (req, res) {
        res.writeHead(200, {'Content-Type': 'text/html'});
        res.write('<html>\n<head>\n<title>Speed test</title>\n</head>\n<body>\n');
        for (var i = 0; i < 10000; i++) {
          res.write('<p>Hello world</p>\n');
        }
        res.end('</body>\n</html>');
      }).listen(80);
    }

This script is adapted from Node.js' documentation at http://nodejs.org/docs/latest/api/cluster.html

PHP script

    <?php
    echo "<html>\n<head>\n<title>Speed test</title>\n</head>\n<body>\n";
    for ($i = 0; $i < 10000; $i++) {
      echo "<p>Hello world</p>\n";
    }
    echo "</body>\n</html>";

My results

Node.js

    $ ab -n 500 -c 20 http://speedtest.dev/
    This is ApacheBench, Version 2.3 <$Revision: 655654 $>
    Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
    Licensed to The Apache Software Foundation, http://www.apache.org/

    Benchmarking speedtest.dev (be patient)
    Completed 100 requests
    Completed 200 requests
    Completed 300 requests
    Completed 400 requests
    Completed 500 requests
    Finished 500 requests

    Server Software:
    Server Hostname:        speedtest.dev
    Server Port:            80

    Document Path:          /
    Document Length:        190070 bytes

    Concurrency Level:      20
    Time taken for tests:   14.603 seconds
    Complete requests:      500
    Failed requests:        0
    Write errors:           0
    Total transferred:      95066500 bytes
    HTML transferred:       95035000 bytes
    Requests per second:    34.24 [#/sec] (mean)
    Time per request:       584.123 [ms] (mean)
    Time per request:       29.206 [ms] (mean, across all concurrent requests)
    Transfer rate:          6357.45 [Kbytes/sec] received

    Connection Times (ms)
                  min  mean[+/-sd] median   max
    Connect:        0    0   0.2      0       2
    Processing:    94  547 405.4    424    2516
    Waiting:        0  331 399.3    216    2284
    Total:         95  547 405.4    424    2516

    Percentage of the requests served within a certain time (ms)
      50%    424
      66%    607
      75%    733
      80%    813
      90%   1084
      95%   1325
      98%   1843
      99%   2062
     100%   2516 (longest request)

PHP/Nginx

    $ ab -n 500 -c 20 http://speedtest.dev/test.php
    This is ApacheBench, Version 2.3 <$Revision: 655654 $>
    Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
    Licensed to The Apache Software Foundation, http://www.apache.org/

    Benchmarking speedtest.dev (be patient)
    Completed 100 requests
    Completed 200 requests
    Completed 300 requests
    Completed 400 requests
    Completed 500 requests
    Finished 500 requests

    Server Software:        nginx/1.0.13
    Server Hostname:        speedtest.dev
    Server Port:            80

    Document Path:          /test.php
    Document Length:        190070 bytes

    Concurrency Level:      20
    Time taken for tests:   0.130 seconds
    Complete requests:      500
    Failed requests:        0
    Write errors:           0
    Total transferred:      95109000 bytes
    HTML transferred:       95035000 bytes
    Requests per second:    3849.11 [#/sec] (mean)
    Time per request:       5.196 [ms] (mean)
    Time per request:       0.260 [ms] (mean, across all concurrent requests)
    Transfer rate:          715010.65 [Kbytes/sec] received

    Connection Times (ms)
                  min  mean[+/-sd] median   max
    Connect:        0    0   0.2      0       1
    Processing:     3    5   0.7      5       7
    Waiting:        1    4   0.7      4       7
    Total:          3    5   0.7      5       7

    Percentage of the requests served within a certain time (ms)
      50%      5
      66%      5
      75%      5
      80%      6
      90%      6
      95%      6
      98%      6
      99%      6
     100%      7 (longest request)

Additional details

Again, what I'm looking for is to find out if I'm doing something wrong with Node.js or if it is really just that slow compared to PHP on Nginx with FPM. I certainly think Node has a real niche that it could fit well. However, these test results (which I really hope I made a mistake with, as I like the idea of Node) lead me to believe that it is a horrible choice for even a modest processing load when compared to PHP (let alone the JVM or various other fast solutions). As a final note, I also tried running an Apache Bench test against Node with $ ab -n 20 -c 20 http://speedtest.dev/ and consistently received a total test time of greater than 0.900 seconds.

Read the article

What is the difference between Multiple R-squared and Adjusted R-squared in a single-variate least s

- by fmark

Could someone explain to the statistically naive what the difference between Multiple R-squared and Adjusted R-squared is? I am doing a single-variate regression analysis as follows:

    v.lm <- lm(epm ~ n_days, data=v)
    print(summary(v.lm))

Results:

    Call:
    lm(formula = epm ~ n_days, data = v)

    Residuals:
        Min      1Q  Median      3Q     Max
    -693.59 -325.79   53.34  302.46  964.95

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)  2550.39      92.15  27.677   <2e-16 ***
    n_days        -13.12       5.39  -2.433   0.0216 *
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    Residual standard error: 410.1 on 28 degrees of freedom
    Multiple R-squared: 0.1746,  Adjusted R-squared: 0.1451
    F-statistic: 5.921 on 1 and 28 DF,  p-value: 0.0216

Apologies for the newbiness of this question.
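As a rough illustration of how the two figures in the output relate: the adjusted value is usually computed from the multiple R-squared by penalising for the number of predictors, adj = 1 - (1 - R^2)(n - 1)/(n - p - 1). The sketch below is written in Python rather than R, the function name is made up, and n = 30 is inferred from the 28 residual degrees of freedom plus the two estimated coefficients; it is not stated in the question.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Standard adjustment: shrink R^2 according to the degrees of freedom used."""
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


# 28 residual degrees of freedom + intercept + one slope (assumed n = 30).
n_obs = 28 + 2
print(round(adjusted_r_squared(0.1746, n_obs, 1), 4))  # 0.1451
```

With a single predictor the penalty is small, which is why the two numbers in the summary are close; adding predictors that explain little would widen the gap.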

Read the article

A database of questions with unambiguous numeric answers.

- by dreeves

I (and co-hackers) are building a sort of trivia game inspired by this blog post: http://messymatters.com/calibration. The idea is to give confidence intervals and learn how to be calibrated (when you're "90% sure" you should be right 90% of the time). We're thus looking for, ideally, thousands of questions with unambiguous numerical answers. Also, they shouldn't be too boring. There are a lot of random statistics out there -- eg, enclosed water area in different countries -- that would make the game mind-numbing. Things like release dates of classic movies are more interesting (to most people). Other interesting ones we've found include Olympic records, median incomes for different professions, dates of famous inventions, and celebrity ages. Scraping things like above, by the way, was my reason for asking this question: http://stackoverflow.com/questions/2611418/scrape-html-tables So, if you know of other sources of interesting numerical facts (in a parsable form) I'm eager for pointers to them. Thanks!

Read the article

R Question. Numeric variable vs. Non-numeric and "names" function

- by Michael

> scores=cbind(UNCA.score, A.score, B.score, U.m.A, U.m.B) > names(scores)=c('UNCA.scores', 'A.scores', 'B.scores','UNCA.minus.A', 'UNCA.minus.B') > names(scores) [1] "UNCA.scores" "A.scores" "B.scores" "UNCA.minus.A" "UNCA.minus.B" > summary(UNCA.scores) X6.69230769230769 Min. : 4.154 1st Qu.: 7.333 Median : 8.308 Mean : 8.451 3rd Qu.: 9.538 Max. :12.000 > is.numeric(UNCA.scores) [1] FALSE > is.numeric(scores[,1]) [1] TRUE My question is, what is the difference between UNCA.scores and scores[,1]? UNCA.scores is the first column in the data.frame 'scores', but they are not the same thing, since one is numeric and the other isn't. If UNCA.scores is just a label here how can I make it be equivalent to 'scores[,1]? Thanks!

Read the article

filter that uses elements from two arrays at the same time

- by Gacek

Let's assume we have two arrays of the same size, A and B. Now we need a filter that, for a given mask size, selects elements from A but removes the central element of the mask and inserts the corresponding element from B in its place. So the 3x3 "pseudo mask" will look similar to this:

    A A A
    A B A
    A A A

Doing something like this for an averaging filter is quite simple. We can compute the mean value of the elements from A without the central element, and then combine it in the proper proportion with the elements from B:

    h = ones(3,3);
    h(2,2) = 0;
    h = h/sum(h(:));
    A_ave = filter2(h, A);
    C = (8/9) * A_ave + (1/9) * B;

But how can we do something similar for a median filter (medfilt2, or even better, ordfilt2)?
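One way to picture the filter being asked for is a direct loop that builds each window by hand. The sketch below is plain Python on nested lists, not MATLAB, and it is only an illustration of the idea; the function name is hypothetical, and the border handling (using only the neighbours that exist) is an assumption the question does not specify.

```python
from statistics import median


def mixed_median_filter(a, b, radius=1):
    """Median over a window whose neighbours come from a and whose centre comes from b."""
    rows, cols = len(a), len(a[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [b[r][c]]  # central element is taken from b
            for dr in range(-radius, radius + 1):
                for dc in range(-radius, radius + 1):
                    if dr == 0 and dc == 0:
                        continue  # skip a's own central element
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        window.append(a[rr][cc])
            out[r][c] = median(window)
    return out


A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
B = [[0, 0, 0], [0, 100, 0], [0, 0, 0]]
print(mixed_median_filter(A, B)[1][1])  # 6
```

At the centre pixel the window is the eight neighbours from A plus 100 from B, so the median is 6. A vectorised version would be needed for real images; this form is only meant to make the window construction explicit.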

Read the article

Methodologies or algorithms for filling in missing data

- by tbone

I am dealing with datasets with missing data and need to be able to fill forward, backward, and gaps. So, for example, if I have data from Jan 1, 2000 to Dec 31, 2010, and some days are missing, then when a user requests a timespan that begins before, ends after, or encompasses the missing data points, I need to "fill in" these missing values. Is there a proper term for this concept of filling in data? Imputation is one term; I don't know if it is "the" term for it, though. I presume there are multiple algorithms and methodologies for filling in missing data (use the last measured value, use the median/average/moving average between two known numbers, etc.). Does anyone know the proper term for this problem, any online resources on this topic, or ideally links to open source implementations of some algorithms (C# preferably, but any language would be useful)?
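The three kinds of filling mentioned in the question (forward, backward, and gaps between two known values) can be sketched in a few lines. This is a minimal illustration in Python rather than C#, with missing values represented as None and evenly spaced observations assumed; the function names are made up for the example.

```python
def forward_fill(values):
    """Carry the last known value forward; leading gaps stay None."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


def backward_fill(values):
    """Carry the next known value backward; trailing gaps stay None."""
    return forward_fill(values[::-1])[::-1]


def interpolate_gaps(values):
    """Linearly interpolate gaps that lie between two known values."""
    result = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            result[i] = values[left] + step * (i - left)
    return result


series = [None, 10.0, None, None, 16.0, None]
print(forward_fill(series))      # [None, 10.0, 10.0, 10.0, 16.0, 16.0]
print(backward_fill(series))     # [10.0, 10.0, 16.0, 16.0, 16.0, None]
print(interpolate_gaps(series))  # [None, 10.0, 12.0, 14.0, 16.0, None]
```

In practice the three are often combined, for instance interpolating interior gaps and then filling the edges forward and backward.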

Read the article

Calculating percentiles in Excel with "buckets" data instead of the data list itself

- by G B

I have a bunch of data in Excel that I need to get certain percentile information from. The problem is that instead of having a data set made up of each value, I have info on the number of occurrences of each value, or "bucket" data. For example, imagine that my actual data set looks like this:

    1,1,2,2,2,2,3,3,4,4,4

The data set that I have is this:

    Value   No. of occurrences
    1       2
    2       4
    3       2
    4       3

Is there an easy way for me to calculate percentile information (as well as the median) without having to explode the summary data out to the full data set? (Once I did that, I know that I could just use the Percentile(A1:A5, p) function.) This is important because my data set is very large. If I exploded the data out, I would have hundreds of thousands of rows, and I would have to do it for a couple of hundred data sets. Help!
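The underlying idea, independent of Excel, is that a running total of the counts tells you which bucket any position in the sorted data falls into, so the full list never has to be built. The sketch below is Python, not a worksheet formula. It assumes the values are sorted ascending and uses the interpolating rank rule p * (n - 1), which to my understanding is the rule Excel's PERCENTILE follows. The function name is hypothetical.

```python
import math
from bisect import bisect_right
from itertools import accumulate


def bucket_percentile(values, counts, p):
    """Percentile (0 <= p <= 1) of the expanded data, computed from value/count pairs."""
    cumulative = list(accumulate(counts))  # running totals, e.g. [2, 6, 8, 11]
    n = cumulative[-1]

    def value_at(k):
        # Value at 0-based position k of the (never materialised) sorted data.
        return values[bisect_right(cumulative, k)]

    rank = p * (n - 1)
    lower, upper = math.floor(rank), math.ceil(rank)
    fraction = rank - lower
    return value_at(lower) + fraction * (value_at(upper) - value_at(lower))


values = [1, 2, 3, 4]
counts = [2, 4, 2, 3]
print(bucket_percentile(values, counts, 0.5))   # 2.0 (the median)
print(bucket_percentile(values, counts, 0.75))  # 3.5
```

The same cumulative-count lookup can be expressed in a spreadsheet with a running-total column and a lookup against it.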

Read the article

small string optimization for vector?

- by BuschnicK

I know several (all?) STL implementations implement a "small string" optimization where instead of storing the usual 3 pointers for begin, end and capacity a string will store the actual character data in the memory used for the pointers if sizeof(characters) <= sizeof(pointers). I am in a situation where I have lots of small vectors with an element size <= sizeof(pointer). I cannot use fixed size arrays, since the vectors need to be able to resize dynamically and may potentially grow quite large. However, the median (not mean) size of the vectors will only be 4-12 bytes. So a "small string" optimization adapted to vectors would be quite useful to me. Does such a thing exist? I'm thinking about rolling my own by simply brute force converting a vector to a string, i.e. providing a vector interface to a string. Good idea?

Read the article

How do you calculate expanding mean on time series using pandas?

- by mlo

How would you create one or more columns in the pandas DataFrame below, where the new columns are the expanding mean/median of 'val' for each 'Mod_ID_x'? Imagine this as if it were time series data, where 'ID' 1-2 was on Day 1 and 'ID' 3-4 was on Day 2. I have tried every way I could think of but just can't seem to get it right.

    left4 = pd.DataFrame({'ID': [1,2,3,4],
                          'val': [10000, 25000, 20000, 40000],
                          'Mod_ID': [15, 35, 15, 42],
                          'car': ['ford','honda', 'ford', 'lexus']})
    right4 = pd.DataFrame({'ID': [3,1,2,4],
                           'color': ['red', 'green', 'blue', 'grey'],
                           'wheel': ['4wheel','4wheel', '2wheel', '2wheel'],
                           'Mod_ID': [15, 15, 35, 42]})
    df1 = pd.merge(left4, right4, on='ID').drop('Mod_ID_y', axis=1)
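One possible approach, sketched under the assumption that the rows are already in time order, is to group by 'Mod_ID_x' and apply an expanding window inside each group. The new column names ('val_exp_mean', 'val_exp_median') are made up for the example.

```python
import pandas as pd

left4 = pd.DataFrame({'ID': [1, 2, 3, 4],
                      'val': [10000, 25000, 20000, 40000],
                      'Mod_ID': [15, 35, 15, 42],
                      'car': ['ford', 'honda', 'ford', 'lexus']})
right4 = pd.DataFrame({'ID': [3, 1, 2, 4],
                       'color': ['red', 'green', 'blue', 'grey'],
                       'wheel': ['4wheel', '4wheel', '2wheel', '2wheel'],
                       'Mod_ID': [15, 15, 35, 42]})
df1 = pd.merge(left4, right4, on='ID').drop('Mod_ID_y', axis=1)

# Expanding statistics are computed separately within each Mod_ID_x group;
# transform() keeps the result aligned with the original row order.
grouped = df1.groupby('Mod_ID_x')['val']
df1['val_exp_mean'] = grouped.transform(lambda s: s.expanding().mean())
df1['val_exp_median'] = grouped.transform(lambda s: s.expanding().median())

print(df1[['ID', 'Mod_ID_x', 'val', 'val_exp_mean', 'val_exp_median']])
```

For Mod_ID_x 15 the values arrive as 10000 then 20000, so the expanding mean at ID 3 is 15000. The other two groups have a single row each and simply repeat their own value.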

Read the article

SQL SERVER – Introduction to PERCENTILE_DISC() – Analytic Functions Introduced in SQL Server 2012

- by pinaldave

SQL Server 2012 introduces a new analytic function, PERCENTILE_DISC(). Books Online gives the following definition of this function: Computes a specific percentile for sorted values in an entire rowset or within distinct partitions of a rowset in Microsoft SQL Server 2012 Release Candidate 0 (RC 0). For a given percentile value P, PERCENTILE_DISC sorts the values of the expression in the ORDER BY clause and returns the value with the smallest CUME_DIST value (with respect to the same sort specification) that is greater than or equal to P.

If the definition is clear to you, there is no need to read further. If you got lost, here is the same thing in simple words: find the value of the column whose CUME_DIST is the smallest one that is equal to or greater than the given percentile. Before you continue reading this blog I strongly suggest you read about the CUME_DIST function here: Introduction to CUME_DIST – Analytic Functions Introduced in SQL Server 2012.

Now let's have fun with the following query:

    USE AdventureWorks
    GO
    SELECT SalesOrderID, OrderQty, ProductID,
           CUME_DIST() OVER(PARTITION BY SalesOrderID ORDER BY ProductID) AS CDist,
           PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY ProductID)
               OVER (PARTITION BY SalesOrderID) AS PercentileDisc
    FROM Sales.SalesOrderDetail
    WHERE SalesOrderID IN (43670, 43669, 43667, 43663)
    ORDER BY SalesOrderID DESC
    GO

The above query gives us the result discussed below. You can see that I have used PERCENTILE_DISC(0.5) in the query, which is similar to finding the median, but not exactly the same. PERCENTILE_DISC() takes a percentile as its parameter and returns the value whose CUME_DIST() is equal to or greater than the percentile passed in. In the above example we pass 0.5 to PERCENTILE_DISC(). It goes through the result set and identifies which rows have CUME_DIST() values equal to or greater than 0.5. In the first example it found two rows whose values are equal to 0.5, and the ProductID of that row is the answer of PERCENTILE_DISC(). In the third windowed result set there is only a single row, with a CUME_DIST() value of 1, which is certainly higher than 0.5, making it the answer.

To make sure this example is properly clear, here is one more where I pass 0.6 as the percentile:

    USE AdventureWorks
    GO
    SELECT SalesOrderID, OrderQty, ProductID,
           CUME_DIST() OVER(PARTITION BY SalesOrderID ORDER BY ProductID) AS CDist,
           PERCENTILE_DISC(0.6) WITHIN GROUP (ORDER BY ProductID)
               OVER (PARTITION BY SalesOrderID) AS PercentileDisc
    FROM Sales.SalesOrderDetail
    WHERE SalesOrderID IN (43670, 43669, 43667, 43663)
    ORDER BY SalesOrderID DESC
    GO

The result of PERCENTILE_DISC(0.6) is the ProductID whose CUME_DIST() is more than 0.6. This means that for SalesOrderID 43670 the row with CUME_DIST() 0.75 is the qualifying row, giving 773 as the answer for ProductID. I hope this explanation makes it clearer.

Reference: Pinal Dave (http://blog.SQLAuthority.com)
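The rule quoted from the definition (return the value with the smallest CUME_DIST that is greater than or equal to P) can be mimicked outside the database to check one's understanding. The following is a small Python sketch of that rule for a single partition, not SQL Server's implementation; the function names and the sample values are hypothetical and are not taken from AdventureWorks.

```python
from bisect import bisect_right


def cume_dist(ordered, value):
    """Fraction of rows whose value is less than or equal to the given value."""
    return bisect_right(ordered, value) / len(ordered)


def percentile_disc(values, p):
    """First value, in sort order, whose cumulative distribution reaches p."""
    ordered = sorted(values)
    for value in ordered:
        if cume_dist(ordered, value) >= p:
            return value


partition = [40, 10, 30, 20]  # made-up values for one partition
print(percentile_disc(partition, 0.5))  # 20 (CUME_DIST 0.5)
print(percentile_disc(partition, 0.6))  # 30 (CUME_DIST 0.75)
print(percentile_disc([7], 0.5))        # 7  (single row, CUME_DIST 1.0)
```

With four rows the cumulative distribution steps through 0.25, 0.5, 0.75 and 1.0, which mirrors the 0.75 row being selected for P = 0.6 in the article's second example.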

Read the article

Apache https is slow

- by raucous12

Hey, I've set apache up to use SSL with a self signed certificate. With http (KeepAlive off), I can get over 5000 requests per second. However, with https, I can only get 13 requests per second. I know there is supposed to be a bit of an overhead, but this seems abnormal. Can anyone suggest how I might go about debugging this? Here is the ab log for https:

    Server Software:        Apache/2.2.3
    Server Hostname:        127.0.0.1
    Server Port:            443
    SSL/TLS Protocol:       TLSv1/SSLv3,DHE-RSA-AES256-SHA,4096,256

    Document Path:          /hello.html
    Document Length:        29 bytes

    Concurrency Level:      5
    Time taken for tests:   30.49425 seconds
    Complete requests:      411
    Failed requests:        0
    Write errors:           0
    Total transferred:      119601 bytes
    HTML transferred:       11919 bytes
    Requests per second:    13.68 [#/sec] (mean)
    Time per request:       365.565 [ms] (mean)
    Time per request:       73.113 [ms] (mean, across all concurrent requests)
    Transfer rate:          3.86 [Kbytes/sec] received

    Connection Times (ms)
                  min  mean[+/-sd] median   max
    Connect:      190  347  74.3    333     716
    Processing:     0   14  24.0      1     166
    Waiting:        0   11  21.6      0     165
    Total:        191  361  80.8    345     716

    Percentage of the requests served within a certain time (ms)
      50%    345
      66%    377
      75%    408
      80%    421
      90%    468
      95%    521
      98%    578
      99%    596
     100%    716 (longest request)

Read the article

Apache https is slow

- by raucous12

Hey, I've set apache up to use SSL with a self signed certificate. With https (KeepAlive on), I can get over 3000 requests per second. However, with https (KeepAlive off), I can only get 13 requests per second. I know there is supposed to be a bit of an overhead, but this seems abnormal. Can anyone suggest how I might go about debugging this? Here is the ab log for https:

    Server Software:        Apache/2.2.3
    Server Hostname:        127.0.0.1
    Server Port:            443
    SSL/TLS Protocol:       TLSv1/SSLv3,DHE-RSA-AES256-SHA,4096,256

    Document Path:          /hello.html
    Document Length:        29 bytes

    Concurrency Level:      5
    Time taken for tests:   30.49425 seconds
    Complete requests:      411
    Failed requests:        0
    Write errors:           0
    Total transferred:      119601 bytes
    HTML transferred:       11919 bytes
    Requests per second:    13.68 [#/sec] (mean)
    Time per request:       365.565 [ms] (mean)
    Time per request:       73.113 [ms] (mean, across all concurrent requests)
    Transfer rate:          3.86 [Kbytes/sec] received

    Connection Times (ms)
                  min  mean[+/-sd] median   max
    Connect:      190  347  74.3    333     716
    Processing:     0   14  24.0      1     166
    Waiting:        0   11  21.6      0     165
    Total:        191  361  80.8    345     716

    Percentage of the requests served within a certain time (ms)
      50%    345
      66%    377
      75%    408
      80%    421
      90%    468
      95%    521
      98%    578
      99%    596
     100%    716 (longest request)

Read the article

amazon ec2-medium apache requests per second terrible

- by TheDayIsDone

EDITED -- test running from localhost now to rule out network...

I have a c1.medium using EBS. When I do an apache benchmark where I'm just printing a "hello" for the test from localhost, with no database hits, it's very slow. I can repeat this test many times with the same results. Any thoughts? Thanks in advance.

    ab -n 1000 -c 100 http://localhost/home/test/

    Benchmarking localhost (be patient)
    Completed 100 requests
    Completed 200 requests
    Completed 300 requests
    Completed 400 requests
    Completed 500 requests
    Completed 600 requests
    Completed 700 requests
    Completed 800 requests
    Completed 900 requests
    Completed 1000 requests
    Finished 1000 requests

    Server Software:        Apache/2.2.23
    Server Hostname:        localhost
    Server Port:            80

    Document Path:          /home/test/
    Document Length:        5 bytes

    Concurrency Level:      100
    Time taken for tests:   25.300 seconds
    Complete requests:      1000
    Failed requests:        0
    Write errors:           0
    Total transferred:      816000 bytes
    HTML transferred:       5000 bytes
    Requests per second:    39.53 [#/sec] (mean)
    Time per request:       2530.037 [ms] (mean)
    Time per request:       25.300 [ms] (mean, across all concurrent requests)
    Transfer rate:          31.50 [Kbytes/sec] received

    Connection Times (ms)
                  min  mean[+/-sd] median   max
    Connect:        0    7  21.0      0      73
    Processing:    81 2489 665.7   2500    4057
    Waiting:       80 2443 654.0   2445    4057
    Total:         85 2496 653.5   2500    4057

    Percentage of the requests served within a certain time (ms)
      50%   2500
      66%   2651
      75%   2842
      80%   2932
      90%   3301
      95%   3506
      98%   3762
      99%   3838
     100%   4057 (longest request)
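When reading a report like this it can help to see how the headline figures relate to each other. The sketch below recomputes them from the request count, concurrency level and total time, using the formulas as I understand ab to apply them; the function name is made up, and small differences from the report come from the total time being rounded to three decimals.

```python
def ab_summary(complete_requests, concurrency, seconds):
    """Recompute ab's derived figures from its three basic measurements."""
    requests_per_second = complete_requests / seconds
    # Mean time per request as seen by one of the concurrent clients.
    per_request_ms = concurrency * seconds * 1000 / complete_requests
    # Mean time per request across all concurrent clients.
    per_request_across_ms = seconds * 1000 / complete_requests
    return requests_per_second, per_request_ms, per_request_across_ms


rps, per_req, per_req_across = ab_summary(1000, 100, 25.300)
print(round(rps, 2))             # 39.53
print(round(per_req))            # 2530
print(round(per_req_across, 1))  # 25.3
```

The two "Time per request" lines therefore differ exactly by the concurrency level (100 here), which is worth keeping in mind before comparing them with runs made at a different -c value.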

Read the article

Strange 3-second tcp connection latencies (Linux, HTTP)

- by user25417

Our webservers with static content are occasionally experiencing strange 3-second latencies. Typically, an ApacheBench run (10000 requests, concurrency 1 or 40, no difference, but keepalive off) looks like this:

    Connection Times (ms)
                  min  mean[+/-sd] median   max
    Connect:        2   10 152.8      3    3015
    Processing:     2    8  34.7      3     663
    Waiting:        2    8  34.7      3     663
    Total:          4   19 157.2      6    3222

    Percentage of the requests served within a certain time (ms)
      50%      6
      66%      7
      75%      7
      80%      7
      90%      9
      95%     11
      98%    223
      99%    225
     100%   3222 (longest request)

I have tried many things:

- Apache2 2.2.9 with worker or prefork MPM, no difference (with KeepAliveTimeout 10-15)
- Nginx 0.6.32
- various tcp parameters (net.core.somaxconn=3000, net.ipv4.tcp_sack=0, net.ipv4.tcp_dsack=0)
- putting the files/DocumentRoot on tmpfs
- shorewall on or off (i.e. empty iptables or not)
- AllowOverride None is on for /, so no .htaccess checks (verified with strace)
- the problem persists whether the webservers are accessed directly or through a Foundry load balancer

Kernel is 2.6.32 (Debian Lenny backports), but it occurred with 2.6.26 also. IPv6 is enabled, but not used. Does the issue look familiar to anyone? Help/suggestions are much appreciated. It sounds a bit like a SYN,ACK packet getting lost or ignored.

Read the article

Intern Screening - Software 'Quiz'

- by Jeremy1026

I am in charge of selecting a new software development intern for a company that I work with. I wanted to throw a little 'quiz' at the applicants before moving on to interviews, so as to narrow down the group a little and find some people who can demonstrate some skill. I put together the following quiz to send to applicants. It focuses only on PHP, because that is what about 95% of the work will be done in. I'm hoping to get some feedback on A. whether it's a good idea to send this to applicants and B. whether it can be improved upon.

    # 1. FizzBuzz
    # Write a small application that does the following:
    # Counts from 1 to 100
    # For multiples of 3 output "Fizz"
    # For multiples of 5 output "Buzz"
    # For multiples of 3 and 5 output "FizzBuzz"
    # For numbers that are not multiples of 3 nor 5 output the number.
    <?php

    ?>

    # 2. Arrays
    # Create a multi-dimensional array that contains
    # keys for 'id', 'lot', 'car_model', 'color', 'price'.
    # Insert three sets of data into the array.
    <?php

    ?>

    # 3. Comparisons
    # Without executing the code, tell if the expressions
    # below will return true or false.
    <?php
    if ((strpos("a","abcdefg")) == TRUE) echo "True"; else echo "False"; //True or False?
    if ((012 / 4) == 3) echo "True"; else echo "False"; //True or False?
    if (strcasecmp("abc","ABC") == 0) echo "True"; else echo "False"; //True or False?
    ?>

    # 4. Bug Checking
    # The code below is flawed. Fix it so that the code
    # runs properly without producing any Errors, Warnings
    # or Notices, and returns the proper value.
    <?php
    //Determine how many parts are needed to create a 3D pyramid.
    function find_3d_pyramid($rows) {
        //Loop through each row.
        for ($i = 0; $i < $rows; $i++) {
            $lastRow++;
            //Append the latest row to the running total.
            $total = $total + (pow($lastRow,3));
        }
        //Return the total.
        return $total;
    }

    $i = 3;
    echo "A pyramid consisting of $i rows will have a total of ".find_3d_pyramid($i)." pieces.";
    ?>

    # 5. Quick Examples
    # Create a small example to complete the task
    # for each of the following problems.
    # Create an md5 hash of "Hello World";
    # Replace all occurrences of "_" with "-" in the string "Welcome_to_the_universe."
    # Get the current date and time, in the following format, YYYY/MM/DD HH:MM:SS AM/PM
    # Find the sum, average, and median of the following set of numbers. 1, 3, 5, 6, 7, 9, 10.
    # Randomly roll a six-sided die 5 times. Store the 5 rolls into an array.
    <?php

    ?>

Read the article

GPGPU programming with OpenGL ES 2.0

- by Albus Dumbledore

I am trying to do some image processing on the GPU, e.g. median, blur, brightness, etc. The general idea is to do something like this framework from GPU Gems 1. I am able to write the GLSL fragment shader for processing the pixels, as I've been trying out different things in an effect designer app. However, I am not sure how I should do the other part of the task. That is, I'd like to work on the image in image coordinates and then output the result to a texture. I am aware of the gl_FragCoord variable. As far as I understand, it goes like this: I need to set up a view (an orthographic one, maybe?) and a quad in such a way that the pixel shader is applied once to each pixel in the image and renders to a texture or something similar. But how can I achieve that, considering there's depth that may make things somewhat awkward for me? I'd be very grateful if anyone could help me with this rather simple task, as I am really frustrated with myself.

UPDATE: It seems I'll have to use an FBO, getting one like this: glBindFramebuffer(...)

Read the article

Worse is better. Is there an example?

- by J.F. Sebastian

Is there a widely-used algorithm that has worse time complexity than another known algorithm but is a better choice in all practical situations (worse complexity but better otherwise)? An acceptable answer might take this form:

There are algorithms A and B that have O(N**2) and O(N) time complexity respectively, but B has such a big constant that it has no advantage over A for inputs smaller than the number of atoms in the Universe.

Example highlights from the answers:

- Simplex algorithm -- worst-case is exponential time -- vs. known polynomial-time algorithms for convex optimization problems.
- A naive median of medians algorithm -- worst-case O(N**2) vs. known O(N) algorithm.
- Backtracking regex engines -- worst-case exponential vs. O(N) Thompson NFA-based engines.

All these examples exploit worst-case vs. average scenarios. Are there examples that do not rely on the difference between the worst-case and average-case scenarios?

Related:

- The Rise of ``Worse is Better''. (For the purpose of this question the "Worse is Better" phrase is used in a narrower sense, namely algorithmic time complexity, than in the article.)
- Python's Design Philosophy: The ABC group strived for perfection. For example, they used tree-based data structure algorithms that were proven to be optimal for asymptotically large collections (but were not so great for small collections). This example would be the answer if there were no computers capable of storing these large collections (in other words, large is not large enough in this case).
- Coppersmith–Winograd algorithm for square matrix multiplication is a good example (it is the fastest (2008) but it is inferior to worse algorithms). Any others? From the Wikipedia article: "It is not used in practice because it only provides an advantage for matrices so large that they cannot be processed by modern hardware (Robinson 2005)."

Read the article

how to define fill colours in ggplot histogram?

- by Andreas

I have the following simple data:

    data <- structure(list(status = c(9, 5, 9, 10, 11, 10, 8, 6, 6, 7, 10, 10, 7, 11, 11, 7, NA, 9, 11, 9, 10, 8, 9, 10, 7, 11, 9, 10, 9, 9, 8, 9, 11, 9, 11, 7, 8, 6, 11, 10, 9, 11, 11, 10, 11, 10, 9, 11, 7, 8, 8, 9, 4, 11, 11, 8, 7, 7, 11, 11, 11, 6, 7, 11, 6, 10, 10, 9, 10, 10, 8, 8, 10, 4, 8, 5, 8, 7), statusgruppe = c(0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1, 0, 1, 1, 0, NA, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0)), .Names = c("status", "statusgruppe"), class = "data.frame", row.names = c(NA, -78L ))

From that I'd like to make a histogram:

    ggplot(data, aes(status))+
      geom_histogram(aes(y=..density..), binwidth=1, colour = "black", fill="white")+
      theme_bw()+
      scale_x_continuous("Staus", breaks=c(min(data$status,na.rm=T), median(data$status, na.rm=T), max(data$status, na.rm=T)), labels=c("Low", "Middle", "High"))+
      scale_y_continuous("Percent", formatter="percent")

Now I'd like the bins to take a colour according to their value, e.g. bins with value 9 get dark grey and everything else should be light grey. I have tried "fill=statusgruppe", scale_fill_grey(breaks=9) etc., but I can't get it to work. Any ideas?

Read the article

jitter if multiple outliers in ggplot2 boxplot

- by Andreas

I am trying to find a suitable display to illustrate various properties within and across school classes. For each class there are only 15-30 data points (pupils). Right now I am leaning towards a whisker-less boxplot, showing only the 1st, 2nd and 3rd quartiles plus data points more than e.g. 1 population SD above or below the sample median. This I can do. However, I need to show this graph to some teachers in order to gauge what they like most, and I'd like to compare my graph with a normal boxplot. But the normal boxplot looks the same whether there is only one outlier or e.g. 5 outliers at the same value. In this case that would be a deal-breaker. For example:

    test <-structure(list(value = c(3, 5, 3, 3, 6, 4, 5, 4, 6, 4, 6, 4, 4, 6, 5, 3, 3, 4, 4, 4, 3, 4, 4, 4, 3, 4, 5, 6, 6, 4, 3, 5, 4, 6, 5, 6, 4, 5, 5, 3, 4, 4, 6, 4, 4, 5, 5, 3, 4, 5, 8, 8, 8, 8, 9, 6, 6, 7, 6, 9), places = structure(c(1L, 2L, 1L, 1L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 1L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 1L, 2L, 2L, 2L, 1L, 2L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 2L, 1L), .Label = c("a", "b"), class = "factor")), .Names = c("value", "places"), row.names = c(NA, -60L), class = "data.frame")

    ggplot(test, aes(x=places,y=value))+geom_boxplot()

Here there are two outliers at ("a",9), but only one "dot" is shown. So my questions are: how can I jitter the outliers? And what kind of display would you suggest for this kind of data?

Read the article

Memory problems while code is running (Python, Networkx)

- by MIN SU PARK

I wrote code to generate a graph with 379613734 edges, but it could not finish because it ran out of memory. It was using about 97% of the server's memory by the time it had gone through 62 million lines, so I killed it. Do you have any idea how to solve this problem? My code is like this:

import os, sys
import time
import networkx as nx

G = nx.Graph()
ptime = time.time()
j = 1
for line in open("./US_Health_Links.txt", 'r'):
#for line in open("./test_network.txt", 'r'):
    follower = line.strip().split()[0]
    followee = line.strip().split()[1]
    G.add_edge(follower, followee)
    if j%1000000 == 0:
        print j*1.0/1000000, "million lines done", time.time() - ptime
        ptime = time.time()
    j += 1

DG = G.to_directed()
# P = nx.path_graph(DG)
Nn_G = G.number_of_nodes()
N_CC = nx.number_connected_components(G)
LCC = nx.connected_component_subgraphs(G)[0]
n_LCC = LCC.nodes()
Nn_LCC = LCC.number_of_nodes()
inDegree = DG.in_degree()
outDegree = DG.out_degree()
Density = nx.density(G)
# Diameter = nx.diameter(G)
# Centrality = nx.betweenness_centrality(PDG, normalized=True, weighted_edges=False)
# Clustering = nx.average_clustering(G)
print "number of nodes in G\t" + str(Nn_G) + '\n' + "number of CC in G\t" + str(N_CC) + '\n' + "number of nodes in LCC\t" + str(Nn_LCC) + '\n' + "Density of G\t" + str(Density) + '\n'
# sys.exit()
# j += 1

The edge data is like this:

1000 1001 1000245 1020191 1000 10267352 1000653 10957902 1000 11039092 1000 1118691 10346 11882 1000 1228281 1000 1247041 1000 12965332 121340 13027572 1000 13075072 1000 13183162 1000 13250162 1214 13326292 1000 13452672 1000 13844892 1000 14061830 12340 1406481 1000 14134703 1000 14216951 1000 14254402 12134 14258044 1000 14270791 1000 14278978 12134 14313332 1000 14392970 1000 14441172 1000 14497568 1000 14502775 1000 14595635 1000 14620544 1000 14632615 10234 14680596 1000 14956164 10230 14998341 112000 15132211 1000 15145450 100 15285998 1000 15288974 1000 15300187 1000 1532061 1000 15326300

Lastly, does anybody have experience analyzing Twitter link data? I find it quite hard to build a directed graph and calculate the average/median indegree and outdegree of nodes. Any help or ideas?
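
For the average/median indegree and outdegree part of the question, one memory-light option is to skip the networkx graph entirely and count degrees per node while streaming the file. The following is only a sketch under assumptions, not a tested fix for the full 379613734-edge file: `degree_stats` is a hypothetical helper name, each line is assumed to be `follower followee` with integer IDs, an edge is assumed to point from follower to followee, and repeated lines are counted as separate edges. It does not cover the connected-component or density calculations.

```python
from collections import Counter
from statistics import mean, median


def degree_stats(lines):
    """Stream "follower followee" pairs and summarise in/out degree.

    Only two per-node counters are kept in memory, never the edges.
    Assumption: an edge points from follower to followee.
    """
    in_deg, out_deg = Counter(), Counter()
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        # Integer keys are more compact than the original ID strings.
        follower, followee = int(parts[0]), int(parts[1])
        out_deg[follower] += 1
        in_deg[followee] += 1

    # A node that appears on only one side has degree 0 on the other side;
    # Counter returns 0 for missing keys without inserting them.
    nodes = set(in_deg) | set(out_deg)
    ins = [in_deg[n] for n in nodes]
    outs = [out_deg[n] for n in nodes]
    return {
        "nodes": len(nodes),
        "mean_in": mean(ins),
        "median_in": median(ins),
        "mean_out": mean(outs),
        "median_out": median(outs),
    }


# Usage with the file from the question (not executed here):
# with open("./US_Health_Links.txt") as f:
#     print(degree_stats(f))
```

Memory use then grows with the number of distinct nodes rather than the number of edges, which may or may not be enough depending on how many distinct users the file contains.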

Read the article

incremental way of counting quantiles for large set of data

- by Gacek

I need to compute quantiles for a large set of data. Let's assume we can get the data only in portions (e.g. one row of a large matrix at a time). To compute the Q3 quantile, one needs to get all the portions of the data and store them somewhere, then sort them and pick the quantile:

List<double> allData = new List<double>();
foreach(var row in matrix) // this is only an example. In fact the portions of data are not rows of some matrix
{
    allData.AddRange(row);
}
allData.Sort();
double p = 0.75*allData.Count;
int idQ3 = (int)Math.Ceiling(p) - 1;
double Q3 = allData[idQ3];

Now I would like to find a way of computing this without storing all the data in a separate variable. The best solution would be to compute some parameters or intermediate results for the first row and then adjust them step by step for the following rows.

Note: These datasets are really big (ca. 5000 elements in each row). The Q3 can be estimated; it doesn't have to be an exact value. I call the portions of data "rows", but they can have different lengths! Usually the length does not vary much (+/- a few hundred samples), but it does vary!

This question is similar to this one: http://stackoverflow.com/questions/1058813/on-line-iterator-algorithms-for-estimating-statistical-median-mode-skewness But I need to compute quantiles. There are also a few articles on this topic, e.g.: http://web.cs.wpi.edu/~hofri/medsel.pdf http://portal.acm.org/citation.cfm?id=347195&dl But before I try to implement these, I wanted to ask whether there are any other, quicker ways of computing the 0.25/0.75 quantiles.
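
Since an estimate is acceptable here, one simple option (swapped in for illustration; it is not the method from either linked article) is reservoir sampling: keep a fixed-size uniform random sample of everything seen so far and read the quantile off the sample. The sketch below is a minimal Python version under assumptions: the class name `ReservoirQuantile`, its `capacity` and `seed` parameters, and the method names are all hypothetical, and the index rule simply mirrors the `Math.Ceiling(p) - 1` rule from the question.

```python
import math
import random


class ReservoirQuantile:
    """Estimate quantiles of a stream from a fixed-size uniform sample."""

    def __init__(self, capacity=10000, seed=None):
        self.capacity = capacity  # maximum number of values kept in memory
        self.sample = []
        self.seen = 0             # total number of values observed so far
        self._rng = random.Random(seed)

    def add_row(self, row):
        """Consume one portion of data; rows may have different lengths."""
        for x in row:
            self.seen += 1
            if len(self.sample) < self.capacity:
                self.sample.append(x)
            else:
                # Algorithm R: the new value replaces a random slot with
                # probability capacity / seen, keeping the sample uniform.
                j = self._rng.randrange(self.seen)
                if j < self.capacity:
                    self.sample[j] = x

    def quantile(self, q):
        """Return the q-quantile of the sample (same index rule as the question)."""
        if not self.sample:
            raise ValueError("no data seen yet")
        ordered = sorted(self.sample)
        idx = max(math.ceil(q * len(ordered)) - 1, 0)
        return ordered[idx]
```

While fewer than `capacity` values have been seen, the result is exact; after that the accuracy depends on the sample size, so `capacity` trades memory for precision. Calling `add_row` once per portion and `quantile(0.75)` at the end gives the Q3 estimate.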

Read the article

APC decreasing php performance??? (php 5.3, apache 2.2, windows vista 64bit)

- by M.M.

Hi, I have Apache/2.2.15 (VC9) and PHP/5.3.2 (VC9 thread safe) running as an Apache module on a Vista 64-bit machine. Everything runs fine. The project that I'm benchmarking (with Apache's ab utility) is basically a standard Zend Framework project with no db connection involved. The average (median) Apache response time is about 0.15 seconds. After I installed APC (3.1.4-dev VC9 thread safe) with standard settings, the response time suddenly rose to 1.3 seconds (!), which is unacceptable... All APC settings looked good the whole time (through the apc.php script: enough shm memory, no cache full, fragmentation 0%). The only change that made a difference was disabling the stat lookup (apc.stat = 0). Then the response time dropped to 0.09 seconds, which was finally better than without APC. IIRC, it's expected and obvious that the stat lookup creates some overhead, but shouldn't it still be far more performant than running without the APC extension at all? Or, to put it differently, why is apc.stat creating so much overhead? Apparently something is not working as it should, and I don't really know where to start looking. Thank you in advance for your time/answers/direction. Cheers, m.

Search Results

Search found 114 results on 5 pages for 'median'.
