Search Results

Search found 2042 results on 82 pages for 'average'.

  • ping alternative to measure routing distance (on Windows)

    - by Marco Demaio
    Hello, in order to measure approximately the routing distance (to see whether a server is close to my country or far away) I usually use the ping command. I'm in Italy: when I ping Italian servers I get 36 ms, when I ping US East servers I get an average of 120 ms, when I ping US West servers I get an average of 200 ms, and so on. Unfortunately some web hosts turn off ping replies on their servers, so my question is: how do I detect the routing distance? Is there another easy-to-use command in Windows that accomplishes the same task? Thanks!
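
    When ICMP echo is blocked, one rough substitute is to time a TCP handshake against a port the server does answer on, since the handshake takes about one round trip. The following is only a minimal sketch of that idea; the function names, the default port 80, and the sample count are assumptions made for illustration, not something taken from the question.

```python
import socket
import time


def tcp_connect_ms(host, port=80, timeout=3.0):
    """Return the time in milliseconds needed to open a TCP connection."""
    start = time.perf_counter()
    # create_connection returns once the TCP handshake has completed
    with socket.create_connection((host, port), timeout=timeout):
        elapsed = time.perf_counter() - start
    return elapsed * 1000.0


def average_connect_ms(host, port=80, samples=5):
    """Average several handshake timings to smooth out jitter."""
    times = [tcp_connect_ms(host, port) for _ in range(samples)]
    return sum(times) / len(times)
```

    Calling `average_connect_ms("example.com", 80)` would give a figure comparable to, though usually a little higher than, a ping average, because it also includes connection setup on both ends.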

    Read the article

  • How to know if my nginx is in good health?

    - by Howard
    I am running nginx on EC2 (m1.small) for SSL termination. I am using 2 workers on Ubuntu with the latest stable nginx; the network throughput is around 2 Mbps and the system load average is around 2 to 3. I am wondering whether this system is in good health for now. For example: what is the queue length (I know nginx can handle a lot of concurrent requests, but I mean how many requests have to wait before being served), and what is the average queue time before a given request is served? I want to know because if my nginx is CPU-bound (e.g. due to SSL), I will need to upgrade to a faster instance. My current nginx status:

        Active connections: 4076
        server accepts handled requests
         90664283 90664283 104117012
        Reading: 525 Writing: 81 Waiting: 3470
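
    The status text above does not report a queue length or queue time directly, but it can be turned into a few derived figures. The sketch below is one possible way to read it, assuming the seven numbers always appear in the order shown; the function name and dictionary keys are made up for illustration.

```python
import re


def parse_stub_status(text):
    """Parse nginx stub_status plain-text output into a dict of integers."""
    numbers = [int(n) for n in re.findall(r"\d+", text)]
    keys = ["active", "accepts", "handled", "requests",
            "reading", "writing", "waiting"]
    if len(numbers) != len(keys):
        raise ValueError("unexpected stub_status format")
    return dict(zip(keys, numbers))


status = parse_stub_status(
    "Active connections: 4076 server accepts handled requests "
    "90664283 90664283 104117012 Reading: 525 Writing: 81 Waiting: 3470"
)

# accepts - handled counts connections nginx accepted but then dropped
dropped = status["accepts"] - status["handled"]
# average number of requests served per connection (keep-alive reuse)
requests_per_connection = status["requests"] / status["handled"]
# connections doing work right now, as opposed to idle keep-alive ones
busy = status["reading"] + status["writing"]

print(dropped, round(requests_per_connection, 3), busy)
```

    Here accepts and handled are equal, so no connections were dropped, and most of the 4076 active connections are idle keep-alive connections in the Waiting state rather than requests queued for service.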

    Read the article

  • Decreasing lagging on router, while gaming

    - by user2699451
    I had absolutely no idea where to post this question and get a professional answer for it, but here goes... Okay, so I guess everyone who is reading this has played online. I was playing LoL again tonight and my brother decided that now was a great time to go on YouTube and start watching a movie. My ping (connecting from South Africa to the EU West server) is normally around 190-220 on average, but it started spiking to 2000 and the average was 600-800. So the question arose: how the hell can I "kick" him off for the time being? I tried reasoning it out with him, but it's like playing chess with a pigeon; he's studying to be an engineer, and I just can't win an argument with him, so I need to step it up a level... In the past I have used the aireplay method of sending deauth packets, but it only helped so much. Is there another way of either kicking a peer off the local Wi-Fi, decreasing the lag spikes while in session, or even splitting the bandwidth equally in 2 or 3, etc.? What do I do? P.S. Sorry if this is off topic; if it is not appropriate, just say which website will be able to help or assist me...

    Read the article

  • When is it time to buy a new hard drive, and what considerations go into buying a new hard drive?

    - by user1125620
    I've had my current hard drive for about 4-5 years now, and I've never had a problem with it before, but now it's making whirring noises. It's done this before and, last time, the noise did go away the next day, but I have accumulated quite a bit of information on the drive that I wouldn't want to lose. HD Tune Pro and Belarc Advisor both said the drive was healthy, and I wouldn't want to get a new one unless it was absolutely necessary or was going to show drastic performance improvements. My only knock against the drive would be that Visual Studio takes longer to load than I'd like it to. HD Tune Pro says the average read speed is 54.3 MB/s. I'm not sure if that's good or bad, but it seems about average compared to similar drives on http://www.hdtune.com/testresults.html. Model #: WDC WD5000AAJS-22YFA0. So, should hard drives be replaced after a certain amount of time? Has mine reached that point? Would a new hard drive be any faster?

    Read the article

  • Video encoding is very slow on Amazon EC2 instance

    - by Timka
    We are using an Amazon EC2 m1.xlarge instance for video re-encoding, and the actual encoding process takes a very long time: about an hour for an average 250 MB video file.

        Instance: m1.xlarge (Xeon E5645, 15 GB RAM)
        Windows Server 2008 R2 64-bit
        AviSynth version 2.5 (32bit) + ffms2 plugin (FFmpegSource 1.21)
        FFmpeg SVN-r13712
        libavutil 3213056
        libavcodec 3356930
        libavformat 3411456
        libavdevice 3407872
        Number of parallel jobs: 3
        Average CPU utilization: ~96%

    Update #1. Source video: mp4/h.264. Parameters for ffmpeg:

        --enable-memalign-hack --enable-avisynth --enable-libxvid --enable-libx264
        --enable-libgsm --enable-libfaac --enable-libfaad --enable-liba52
        --enable-libmp3lame --enable-libvorbis --enable-libtheora --enable-pthreads
        --enable-swscale --enable-gpl

    Video files are encoded to mp4/h.264 with the following extra command-line options:

        -threads 0 -coder 0 -bf 0 -refs 1 -level 30 -maxrate 10000000 -bufsize 10000000

    Read the article

  • TCP 30 small packets per second flood connection with server

    - by Denis Ermolin
    I'm testing a connection between a Flash client and a cloud server (using boost::asio on the server side) over TCP. My connection to the server is already really poor: 120 ms ping on average. I found that when I start to send 2-byte packets (not counting the TCP header) at a rate of 30 packets/s, the ping grows to 170-200 ms on average. I think that's really bad, and that my bad connection and bad cloud provider are the reason for this high ping without any real load. What do you think? (I tested my software: it can process about 50k small packets/s, so the software is not the problem.) I measure ping through the Flash client: I send a packet with a timestamp and the server immediately sends it back to the client.

    Read the article

  • Mounting Solaris UFS partition on Debian(with FreeBSD kernel)

    - by hayalci
    I have some disks that were being used on a Solaris system. The disks are formatted as UFS. I attached them to a Debian system (with a FreeBSD kernel, Debian/kFreeBSD), but I cannot mount them.

        $ mount -t ufs /dev/da2s1 /mnt/diska
        mount: /dev/da2s1 : Invalid argument

    Also, tunefs.ufs does not work:

        $ tunefs.ufs -p /dev/da2s1
        tunefs.ufs: /dev/da2s1: could not read superblock to fill out disk

    Is there an incompatibility between FreeBSD UFS and Solaris UFS? Is it possible to mount one under the other OS? Note: tunefs.ufs works on the root partition:

        $ tunefs.ufs -p /dev/da7s2
        tunefs.ufs: ACLs: (-a) disabled
        tunefs.ufs: MAC multilabel: (-l) disabled
        tunefs.ufs: soft updates: (-n) disabled
        tunefs.ufs: gjournal: (-J) disabled
        tunefs.ufs: maximum blocks per file in a cylinder group: (-e) 2048
        tunefs.ufs: average file size: (-f) 16384
        tunefs.ufs: average number of files in a directory: (-s) 64
        tunefs.ufs: minimum percentage of free space: (-m) 8%
        tunefs.ufs: optimization preference: (-o) time
        tunefs.ufs: volume label: (-L)

    Read the article

  • How to measure that a host is good for users in Egypt ?

    - by Sherif Buzz
    Hi all, I currently have a site that's hosted in Texas. The majority of my users are from Egypt, and I'm a bit concerned that the current hosting is not optimal in terms of performance. The site is not slow, but how can I know whether, for example, hosting it in Europe or Asia would be better? To clarify, I need to know whether there is a way to test different hosting options. For example, how can I measure the average response time between Egypt and a host in Texas, and the average response time between Egypt and a host in the UK?

    Read the article

  • Server Requirement and Cost for an android Application [duplicate]

    - by CagkanToptas
    This question already has an answer here: How do you do load testing and capacity planning for web sites? (3 answers); Can you help me with my capacity planning? (2 answers)

    I am working on a project which is an Android application. For my project proposal, I need to calculate the server requirements for handling the traffic described below and, if possible, learn the approximate cost of such a server. I am giving the maximum expected values for the calculation:

        - Database will be in MySQL (average service time of the DB is 100-110 ms on my computer [i5, 4 GB RAM])
        - A request will transfer 150Kb of data on average
        - Total user count: 1m
        - Active user count: 50k
        - Estimated requests/sec for 1 active user: 0.06
        - Total expected requests/second to the server: ~5000

    I am expecting this traffic between 20:00 and 1:00 every day, and then these values will decrease to 1/10 for the rest of the day. Is there any solution to this (e.g. increasing server capacity during a specific time period every day to reduce cost)?
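
    The figures in the question can be turned into a first back-of-the-envelope estimate. The sketch below is only illustrative and rests on assumptions not stated in the post: "150Kb" is read as 150 kilobytes, the database service time is taken as the 105 ms midpoint, and concurrency is estimated with Little's law (average concurrency = arrival rate x service time). Note that 50k active users at 0.06 requests/s gives about 3000 requests/s, so the sketch evaluates both that derived rate and the ~5000 quoted in the post.

```python
def estimate_load(requests_per_second, payload_kb=150, db_service_time_s=0.105):
    """Return (bandwidth in MB/s, average number of concurrent DB operations)."""
    # payload_kb is assumed to be kilobytes; 1 MB is taken as 1000 KB
    bandwidth_mb_s = requests_per_second * payload_kb / 1000
    # Little's law: average concurrency = arrival rate * time in system
    concurrent_db_ops = requests_per_second * db_service_time_s
    return bandwidth_mb_s, concurrent_db_ops


active_users = 50_000
requests_per_user_per_second = 0.06
derived_rps = active_users * requests_per_user_per_second

scenarios = (
    ("derived peak", derived_rps),
    ("quoted peak", 5000),
    ("quoted off-peak", 5000 / 10),
)
for label, rps in scenarios:
    bandwidth, concurrency = estimate_load(rps)
    print(f"{label}: {rps:.0f} req/s, {bandwidth:.0f} MB/s, "
          f"~{concurrency:.0f} concurrent DB operations")
```

    The tenfold gap between peak and off-peak load is what makes time-based scaling attractive: capacity sized for the off-peak row is enough for most of the day.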

    Read the article

  • How many websites can my server potentially hold?

    - by Daniel Kindler
    Sorry for the "noob" question, but... About how many medium-sized websites with average traffic could this server hold? Just like the average website, kind of like a small business site. How many sites could this server hold, but still maintain nice, decent speed?

        PowerEdge R510: PE R510 Chassis for Up to Four 3.5" Cabled Hard Drives, LED
        Processor: Intel® Xeon® E5630 2.53Ghz, 12M Cache, Turbo, HT, 1066MHz Max Mem
        Memory: 8GB Memory (4x2GB), 1333MHz Single Ranked UDIMMs for 1 Procs, Optimized
        Operating System: SUSE Linux Enterprise Server 10, SP3, Up To 32 CPU Lic, 1 YR Sub, DIB, Media
        Red Hat Enterprise Linux Licensing
        Hard Drives: 250GB 7.2K RPM SATA 3.5" Cabled Hard Drive
        Hard Drives: 1TB 7.2K RPM SATA 3.5" Cabled Hard Drive
        Hard Drives: 2 X 2TB 7.2K RPM SATA 3.5in Cabled Hard Drive
        Hard Drive Configuration: No RAID, Embedded SATA Controller for x4 Chassis
        Power Supply: 480 Watt Non-Redundant Power Supply

    Thank you!

    Read the article

  • How to make variable range of cells?

    - by Ertai
    In column A I have a set of numbers (over 1,000). I want to get the average of the first ten of them (A1:A10) and write it into the next column (B). Then I want to take the next ten numbers and get their average (A11:A20), and so on. How can I do this if C1 holds the number that defines the size of the range (e.g. 10 means A1:A10, A11:A20, ...; 25 means A1:A25, A26:A50, ...)? When I change the value in C1, I want column B to update automatically. Is this possible?
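
    The computation being asked for is a block-wise average with a configurable block size. As a minimal sketch of the logic outside the spreadsheet, the following takes the column A values as a list and the C1 value as `size`; the function name is made up for illustration.

```python
def chunk_averages(values, size):
    """Average consecutive blocks of `size` values; the last block may be shorter."""
    if size <= 0:
        raise ValueError("size must be a positive integer")
    averages = []
    for start in range(0, len(values), size):
        block = values[start:start + size]
        averages.append(sum(block) / len(block))
    return averages


column_a = list(range(1, 21))          # stand-in for the numbers in column A
print(chunk_averages(column_a, 10))    # blocks A1:A10 and A11:A20
print(chunk_averages(column_a, 5))     # changing the block size recomputes everything
```

    Inside the spreadsheet itself, the same block can usually be addressed with a formula such as =AVERAGE(OFFSET($A$1,(ROW()-1)*$C$1,0,$C$1,1)) entered in B1 and filled down, which recalculates whenever C1 changes.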

    Read the article

  • Progress bar in a Flash MP3 Player

    - by Deryck
    Hi, I have coded a simple XML-driven MP3 player. I have used the Sound and SoundChannel objects and methods, but I can't find a way to make a progress bar. I don't need a loading progress bar; I need a song progress bar. Can anybody help me? Thanks.

    UPDATE: Here is the code.

        var musicReq:URLRequest;
        var thumbReq:URLRequest;
        var music:Sound = new Sound();
        var sndC:SoundChannel;
        var currentSnd:Sound = music;
        var position:Number;
        var currentIndex:Number = 0;
        var songPaused:Boolean;
        var songStopped:Boolean;
        var lineClr:uint;
        var changeClr:Boolean;
        var xml:XML;
        var songList:XMLList;
        var loader:URLLoader = new URLLoader();
        loader.addEventListener(Event.COMPLETE, Loaded);
        loader.load(new URLRequest("musiclist.xml"));
        var thumbHd:MovieClip = new MovieClip();
        thumbHd.x = 50;
        thumbHd.y = 70;
        addChild(thumbHd);

        function Loaded(e:Event):void {
            xml = new XML(e.target.data);
            songList = xml.song;
            musicReq = new URLRequest(songList[0].url);
            thumbReq = new URLRequest(songList[0].thumb);
            music.load(musicReq);
            sndC = music.play();
            title_txt.text = songList[0].title + " - " + songList[0].artist;
            loadThumb();
            sndC.addEventListener(Event.SOUND_COMPLETE, nextSong);
        }

        function loadThumb():void {
            var thumbLoader:Loader = new Loader();
            thumbReq = new URLRequest(songList[currentIndex].thumb);
            thumbLoader.load(thumbReq);
            thumbLoader.contentLoaderInfo.addEventListener(Event.COMPLETE, thumbLoaded);
        }

        function thumbLoaded(e:Event):void {
            var thumb:Bitmap = (Bitmap)(e.target.content);
            var holder:MovieClip = thumbHd;
            holder.addChild(thumb);
        }

        prevBtn.addEventListener(MouseEvent.CLICK, prevSong);
        nextBtn.addEventListener(MouseEvent.CLICK, nextSong);
        playBtn.addEventListener(MouseEvent.CLICK, playSong);

        function prevSong(e:Event):void {
            if (currentIndex > 0) {
                currentIndex--;
            } else {
                currentIndex = songList.length() - 1;
            }
            var prevReq:URLRequest = new URLRequest(songList[currentIndex].url);
            var prevPlay:Sound = new Sound(prevReq);
            sndC.stop();
            title_txt.text = songList[currentIndex].title + " - " + songList[currentIndex].artist;
            sndC = prevPlay.play();
            currentSnd = prevPlay;
            songPaused = false;
            loadThumb();
            sndC.addEventListener(Event.SOUND_COMPLETE, nextSong);
        }

        function nextSong(e:Event):void {
            if (currentIndex [...]

    And here is the code for the length and position. It's inside a MovieClip, which is why I use an absolute path to find the Sound object.

        this.addEventListener(Event.ENTER_FRAME, moveSpeaker);
        var initWidth:Number = this.SpkCone.width;
        var initHeight:Number = this.SpkCone.height;
        var rootObj:Object = root;

        function moveSpeaker(eventArgs:Event) {
            var average:Number = ((rootObj.audioPlayer_mc.sndC.leftPeak + rootObj.audioPlayer_mc.sndC.rightPeak) / 2) * 10;
            // trace(average);
            // trace(initWidth + ":" + initHeight);
            trace(rootObj.audioPlayer_mc.sndC.position + "/" + rootObj.audioPlayer_mc.music.length);
            this.SpkCone.width = initWidth + average;
            this.SpkCone.height = initHeight + average;
        }

    Read the article

  • Getting timing consistency in Linux

    - by Jim Hunziker
    I can't seem to get a simple program (with lots of memory access) to achieve consistent timing in Linux. I'm using a 2.6 kernel, and the program is being run on a dual-core processor with realtime priority. I'm trying to disable cache effects by declaring the memory arrays as volatile. Below are the results and the program. What are some possible sources of the outliers?

    Results:

        Number of trials: 100
        Range: 0.021732s to 0.085596s
        Average Time: 0.058094s
        Standard Deviation: 0.006944s
        Extreme Outliers (2 SDs away from mean): 7
        Average Time, excluding extreme outliers: 0.059273s

    Program:

        #include <stdio.h>
        #include <stdlib.h>
        #include <math.h>
        #include <sched.h>
        #include <unistd.h>     /* getpid() */
        #include <sys/time.h>

        #define NUM_POINTS 5000000
        #define REPS 100

        /* Current wall-clock time in microseconds. */
        unsigned long long getTimestamp() {
            unsigned long long usecCount;
            struct timeval timeVal;
            gettimeofday(&timeVal, 0);
            usecCount = timeVal.tv_sec * (unsigned long long) 1000000;
            usecCount += timeVal.tv_usec;
            return (usecCount);
        }

        double convertTimestampToSecs(unsigned long long timestamp) {
            return (timestamp / (double) 1000000);
        }

        int main(int argc, char* argv[]) {
            unsigned long long start, stop;
            double times[REPS];
            double sum = 0;
            double avg, newavg;
            double stddev = 0;
            double maxval = -1.0, minval = 1000000.0;
            int i, j;
            int outliers = 0;
            struct sched_param sparam;

            /* Request the highest SCHED_FIFO (realtime) priority. */
            sched_getparam(getpid(), &sparam);
            sparam.sched_priority = sched_get_priority_max(SCHED_FIFO);
            sched_setscheduler(getpid(), SCHED_FIFO, &sparam);

            volatile float* data;
            volatile float* results;
            data = calloc(NUM_POINTS, sizeof(float));
            results = calloc(NUM_POINTS, sizeof(float));

            /* Time REPS passes of a plain array copy. */
            for (i = 0; i < REPS; ++i) {
                start = getTimestamp();
                for (j = 0; j < NUM_POINTS; ++j) {
                    results[j] = data[j];
                }
                stop = getTimestamp();
                times[i] = convertTimestampToSecs(stop - start);
            }
            free((void*) data);
            free((void*) results);

            for (i = 0; i < REPS; i++) {
                sum += times[i];
                if (times[i] > maxval) maxval = times[i];
                if (times[i] < minval) minval = times[i];
            }
            avg = sum / REPS;

            /* Population standard deviation. */
            for (i = 0; i < REPS; i++)
                stddev += (times[i] - avg) * (times[i] - avg);
            stddev /= REPS;
            stddev = sqrt(stddev);

            /* Remove samples more than 2 standard deviations from the mean. */
            for (i = 0; i < REPS; i++) {
                if (times[i] > avg + 2*stddev || times[i] < avg - 2*stddev) {
                    sum -= times[i];
                    outliers++;
                }
            }
            newavg = sum / (REPS - outliers);

            printf("Number of trials: %d\n", REPS);
            printf("Range: %fs to %fs\n", minval, maxval);
            printf("Average Time: %fs\n", avg);
            printf("Standard Deviation: %fs\n", stddev);
            printf("Extreme Outliers (2 SDs away from mean): %d\n", outliers);
            printf("Average Time, excluding extreme outliers: %fs\n", newavg);
            return 0;
        }

    Read the article

  • How to gain accurate results with Painter's algorithm?

    - by pimvdb
    A while ago I asked how to determine when a face is overlapping another. The advice was to use a Z-buffer. However, I cannot use a Z-buffer in my current project, and hence I would like to use the Painter's algorithm. I have no good clue as to when a surface is behind or in front of another, though. I've tried numerous methods, but they all fail in edge cases, or even in general cases. This is a list of the sorting methods I've tried so far:

        - Distance to the midpoint of each face
        - Average distance to each vertex of each face
        - Average z value of each vertex
        - Highest z value of the vertices of each face, drawing those first
        - Lowest z value of the vertices of each face, drawing those last

    The problem is that a face might have a closer distance but still be further away. All these methods seem unreliable.

    Edit: For example, in the following image the surface with the blue point as midpoint is painted over the surface with the red point as midpoint, because the blue point is closer. However, this is because the surface of the red point is larger and its midpoint is further away. The surface with the red point should be painted over the blue one, because it is closer, whilst the midpoint distance says the opposite. What exactly is used in the Painter's algorithm to determine the order in which objects should be drawn?

    Read the article

  • What's the greatest, most impressive programming feat you ever witnessed? [closed]

    - by David Reis
    Everyone knows the old adage that the best programmers can be orders of magnitude better than the average. I've personally seen good code and good programmers, but never something so absurd. So the question is: what is the most impressive feat of programming you ever witnessed or heard of? You can define impressive by:

        - The scope of the task at hand, e.g. John single-handedly developed the framework for his company, a work comparable in scope to what the other 200 employees were doing combined.
        - Speed, e.g. Stu programmed an entire real-time multitasking OS in a weekend, including its own C compiler and shell command-line tools.
        - Complexity, e.g. Jane rearchitected our entire 10 million LOC app to work in a cluster of servers. And she did it in an afternoon.
        - Quality, e.g. Charles's code had a rate of defects per LOC 100 times lower than the company average. Furthermore, his code was clean and understandable by all.

    Obviously, the more of these characteristics combined, and the more extreme each of them, the more impressive the feat. So, let me have it. What's the most absurd feat you can recount? Please provide as much detail as possible and try to avoid urban legends or exaggerations. Post only what you can actually vouch for. Bonus questions: Was the herculean task a one-off, or did the individual regularly amaze people? How do you explain such impressive performance? How was the programmer recognized for such awesome work?

    Read the article

  • How to manage a developer who has poor communication skills

    - by djcredo
    I manage a small team of developers on an application which is in the mid-point of its lifecycle, within a big firm. This unfortunately means there is commonly a 30/70 split of programming tasks to "other technical work". This work includes:

        - Working with DBA / Unix / Network / Loadbalancer teams on various tasks
        - Placing & managing orders for hardware or infrastructure in different regions
        - Running tests that have not yet been migrated to CI
        - Analysis
        - Support / Investigation

    It's fair to say that the developers would all prefer to be coding rather than doing these more mundane tasks, so I try to hand out the fun programming jobs evenly amongst the team. Most of the team was hired because, though they may not have the elite programming skills to write their own compiler / game engine / high-frequency trading system etc., they are good communicators who "can get stuff done", work with other teams, and somewhat navigate the complex bureaucracy here. They are good developers, but they are also good all-round technical staff.

    However, one member of the team probably has above-average coding skills, but below-average communication skills. Traditionally, the previous Development Manager tended to give him the programming tasks and not the more mundane tasks listed above. However, I don't feel that this is fair to the rest of the team, who have shown an aptitude for developing the well-rounded skillset that is commonly required in a big-business IT department. What should I do in this situation? If I continue to give him more programming work, I know that it will be done faster (and conversely, I would expect him to complete the other work more slowly). But it goes against my principles, and promotes the idea that you can carve out a "comfortable niche" for yourself simply by being bad at the tasks you don't like.

    Read the article

  • Solving Big Problems with Oracle R Enterprise, Part II

    - by dbayard
    Part II – Solving Big Problems with Oracle R Enterprise In the first post in this series (see https://blogs.oracle.com/R/entry/solving_big_problems_with_oracle), we showed how you can use R to perform historical rate of return calculations against investment data sourced from a spreadsheet.  We demonstrated the calculations against sample data for a small set of accounts.  While this worked fine, in the real-world the problem is much bigger because the amount of data is much bigger.  So much bigger that our approach in the previous post won’t scale to meet the real-world needs. From our previous post, here are the challenges we need to conquer: The actual data that needs to be used lives in a database, not in a spreadsheet The actual data is much, much bigger- too big to fit into the normal R memory space and too big to want to move across the network The overall process needs to run fast- much faster than a single processor The actual data needs to be kept secured- another reason to not want to move it from the database and across the network And the process of calculating the IRR needs to be integrated together with other database ETL activities, so that IRR’s can be calculated as part of the data warehouse refresh processes In this post, we will show how we moved from sample data environment to working with full-scale data.  This post is based on actual work we did for a financial services customer during a recent proof-of-concept. Getting started with the Database At this point, we have some sample data and our IRR function.  We were at a similar point in our customer proof-of-concept exercise- we had sample data but we did not have the full customer data yet.  So our database was empty.  But, this was easily rectified by leveraging the transparency features of Oracle R Enterprise (see https://blogs.oracle.com/R/entry/analyzing_big_data_using_the).  
The following code shows how we took our sample data SimpleMWRRData and easily turned it into a new Oracle database table called IRR_DATA via ore.create().  The code also shows how we can access the database table IRR_DATA as if it was a normal R data.frame named IRR_DATA. If we go to sql*plus, we can also check out our new IRR_DATA table: At this point, we now have our sample data loaded in the database as a normal Oracle table called IRR_DATA.  So, we now proceeded to test our R function working with database data. As our first test, we retrieved the data from a single account from the IRR_DATA table, pull it into local R memory, then call our IRR function.  This worked.  No SQL coding required! Going from Crawling to Walking Now that we have shown using our R code with database-resident data for a single account, we wanted to experiment with doing this for multiple accounts.  In other words, we wanted to implement the split-apply-combine technique we discussed in our first post in this series.  Fortunately, Oracle R Enterprise provides a very scalable way to do this with a function called ore.groupApply().  You can read more about ore.groupApply() here: https://blogs.oracle.com/R/entry/analyzing_big_data_using_the1 Here is an example of how we ask ORE to take our IRR_DATA table in the database, split it by the ACCOUNT column, apply a function that calls our SimpleMWRR() calculation, and then combine the results. (If you are following along at home, be sure to have installed our myIRR package on your database server via  “R CMD INSTALL myIRR”). The interesting thing about ore.groupApply is that the calculation is not actually performed in my desktop R environment from which I am running.  What actually happens is that ore.groupApply uses the Oracle database to perform the work.  And the Oracle database is what actually splits the IRR_DATA table by ACCOUNT.  
Then the Oracle database takes the data for each account and sends it to an embedded R engine running on the database server to apply our R function.  Then the Oracle database combines all the individual results from the calls to the R function. This is significant because now the embedded R engine only needs to deal with the data for a single account at a time.  Regardless of whether we have 20 accounts or 1 million accounts or more, the R engine that performs the calculation does not care.  Given that normal R has a finite amount of memory to hold data, the ore.groupApply approach overcomes the R memory scalability problem since we only need to fit the data from a single account in R memory (not all of the data for all of the accounts). Additionally, the IRR_DATA does not need to be sent from the database to my desktop R program.  Even though I am invoking ore.groupApply from my desktop R program, because the actual SimpleMWRR calculation is run by the embedded R engine on the database server, the IRR_DATA does not need to leave the database server- this is both a performance benefit because network transmission of large amounts of data take time and a security benefit because it is harder to protect private data once you start shipping around your intranet. Another benefit, which we will discuss in a few paragraphs, is the ability to leverage Oracle database parallelism to run these calculations for dozens of accounts at once. From Walking to Running ore.groupApply is rather nice, but it still has the drawback that I run this from a desktop R instance.  This is not ideal for integrating into typical operational processes like nightly data warehouse refreshes or monthly statement generation.  But, this is not an issue for ORE.  Oracle R Enterprise lets us run this from the database using regular SQL, which is easily integrated into standard operations.  That is extremely exciting and the way we actually did these calculations in the customer proof. 
As part of Oracle R Enterprise, it provides a SQL equivalent to ore.groupApply which it refers to as “rqGroupEval”.  To use rqGroupEval via SQL, there is a bit of simple setup needed.  Basically, the Oracle Database needs to know the structure of the input table and the grouping column, which we are able to define using the database’s pipeline table function mechanisms. Here is the setup script: At this point, our initial setup of rqGroupEval is done for the IRR_DATA table.  The next step is to define our R function to the database.  We do that via a call to ORE’s rqScriptCreate. Now we can test it.  The SQL you use to run rqGroupEval uses the Oracle database pipeline table function syntax.  The first argument to irr_dataGroupEval is a cursor defining our input.  You can add additional where clauses and subqueries to this cursor as appropriate.  The second argument is any additional inputs to the R function.  The third argument is the text of a dummy select statement.  The dummy select statement is used by the database to identify the columns and datatypes to expect the R function to return.  The fourth argument is the column of the input table to split/group by.  The final argument is the name of the R function as you defined it when you called rqScriptCreate(). The Real-World Results In our real customer proof-of-concept, we had more sophisticated calculation requirements than shown in this simplified blog example.  For instance, we had to perform the rate of return calculations for 5 separate time periods, so the R code was enhanced to do so.  In addition, some accounts needed a time-weighted rate of return to be calculated, so we extended our approach and added an R function to do that.  And finally, there were also a few more real-world data irregularities that we needed to account for, so we added logic to our R functions to deal with those exceptions.  
    For the full-scale customer test, we loaded the customer data onto a Half-Rack Exadata X2-2 Database Machine.  As our half-rack had 48 physical cores (and 96 threads if you consider hyperthreading), we wanted to take advantage of that CPU horsepower to speed up our calculations.  To do so with ORE, it is as simple as leveraging the Oracle Database Parallel Query features.  Let’s look at the SQL used in the customer proof:

    Notice that we use a parallel hint on the cursor that is the input to our rqGroupEval function.  That is all we need to do to enable Oracle to use parallel R engines.

    Here are a few screenshots of what this SQL looked like in the Real-Time SQL Monitor when we ran this during the proof of concept:

    From the above, you can notice a few things (numbers 1 through 5 below correspond to the highlighted numbers on the images above):

        1. The SQL completed in 110 seconds (1.8 minutes)
        2. We calculated rates of return for 5 time periods for each of 911k accounts (the number of actual rows returned by the IRRSTAGEGROUPEVAL operation)
        3. We accessed 103m rows of detailed cash flow/market value data (the number of actual rows returned by the IRR_STAGE2 operation)
        4. We ran with a degree of parallelism of 72, spread across 4 database servers
        5. Most of our 110 seconds was spent in the “External Procedure call” event

        - On average, we performed 8,200 executions of our R function per second (911k accounts/110s)
        - On average, each execution was passed 110 rows of data (103m detail rows/911k accounts)
        - On average, we did 41,000 single time period rate of return calculations per second (each of the 8,200 executions of our R function did rate of return calculations for 5 time periods)
        - On average, we processed over 900,000 rows of database data in R per second (103m detail rows/110s)

    R + Oracle R Enterprise: Best of R + Best of Oracle Database

    This blog post series started by describing a real customer problem: how to perform a lot of calculations on a lot of data in a short period of time.  While standard R proved to be a very good fit for writing the necessary calculations, the challenge of working with a lot of data in a short period of time remained. This blog post series showed how Oracle R Enterprise enables R to be used in conjunction with the Oracle Database to overcome the data volume and performance issues (as well as simplifying the operations and security issues).  It also showed that we could calculate 5 time periods of rates of return for almost a million individual accounts in less than 2 minutes.

    In a future post, we will take the same R function and show how Oracle R Connector for Hadoop can be used in the Hadoop world.  In that next post, instead of having our data in an Oracle database, our data will live in Hadoop, and we will show how to use the Oracle R Connector for Hadoop and other Oracle Big Data Connectors to move data between Hadoop, R, and the Oracle Database easily.
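
    The per-second averages quoted in the results can be re-derived from the three headline figures. The short sketch below does that arithmetic using the rounded inputs from the post (110 s, 911k accounts, 103m rows, 5 time periods), so its results land close to, but not exactly on, the rounded numbers in the text; the variable names are illustrative.

```python
elapsed_seconds = 110
accounts = 911_000            # one R function execution per account
detail_rows = 103_000_000     # cash flow / market value rows read
time_periods = 5              # rate of return calculations per execution

executions_per_second = accounts / elapsed_seconds
rows_per_execution = detail_rows / accounts
calculations_per_second = executions_per_second * time_periods
rows_per_second = detail_rows / elapsed_seconds

print(f"R function executions per second: {executions_per_second:,.0f}")
print(f"rows passed to each execution:    {rows_per_execution:,.0f}")
print(f"rate of return calcs per second:  {calculations_per_second:,.0f}")
print(f"database rows processed per sec:  {rows_per_second:,.0f}")
```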

  • Building Queries Systematically

    - by Jeremy Smyth
    The SQL language is a bit like a toolkit for data. It consists of lots of little fiddly bits of syntax that, taken together, allow you to build complex edifices and return powerful results. For the uninitiated, the many tools can be quite confusing, and it's sometimes difficult to decide how to go about building non-trivial queries, that is, queries that are more than a simple SELECT a, b FROM c;

    A System for Building Queries

    When you're building queries, you could use a system like the following:

    1. Decide which fields contain the values you want to use in your output, and how you wish to alias those fields:
       - Values you want to see in your output.
       - Values you want to use in calculations. For example, to calculate margin on a product, you could calculate price - cost and give it the alias margin.
       - Values you want to filter with. For example, you might only want to see products that weigh more than 2Kg or that are blue. The weight or colour columns could contain that information.
       - Values you want to order by. For example, you might want the most expensive products first, and the least expensive last. You could use the price column in descending order to achieve that.
    2. Assuming the fields you've picked in point 1 are in multiple tables, find the connections between those tables:
       - Look for relationships between tables and identify the columns that implement those relationships. For example, the Orders table could have a CustomerID field referencing the same column in the Customers table.
       - Sometimes the problem doesn't use relationships but rests on a different field; sometimes the query is looking for a coincidence of fact rather than a foreign key constraint. For example, you might have sales representatives who live in the same state as a customer; this information is normally not used in relationships, but if your query is for organizing events where sales representatives meet customers, it's useful in that query. In such a case you would record the names of the columns at either end of such a connection.
       - Sometimes relationships require a bridge, a junction table that wasn't identified in point 1 above but is needed to connect tables you need; these are used in "many-to-many relationships". In these cases you need to record the columns in each table that connect to similar columns in other tables.
    3. Construct a join or series of joins using the fields and tables identified in point 2 above. This becomes your FROM clause.
    4. Filter using some of the fields in point 1 above. This becomes your WHERE clause.
    5. Construct an ORDER BY clause using values from point 1 above that are relevant to the desired order of the output rows.
    6. Project the result using the remainder of the fields in point 1 above. This becomes your SELECT clause.

    A Worked Example

    Let's say you want to query the world database to find a list of countries (with their capitals) and the change in GNP, using the difference between the GNP and GNPOld columns, and that you only want to see results for countries with a population greater than 100,000,000. Using the system described above, we could do the following:

    1. Identify the fields:
       - The Country.Name and City.Name columns contain the name of the country and city respectively.
       - The change in GNP comes from the calculation GNP - GNPOld. Both those columns are in the Country table. This calculation is also used to order the output, in descending order.
       - To see only countries with a population greater than 100,000,000, you need the Population field of the Country table. There is also a Population field in the City table, so you'll need to specify the table name to disambiguate. You can also represent a number like 100 million as 100e6 instead of 100000000 to make it easier to read.
    2. Because the fields come from the Country and City tables, you'll need to join them. There are two relationships between these tables: each city is hosted within a country, and the city's CountryCode column identifies that country. Also, each country has a capital city, whose ID is contained within the country's Capital column. This latter relationship is the one to use, so the relevant columns and the condition that uses them are represented by the following FROM clause:

        FROM Country JOIN City ON Country.Capital = City.ID

    3. The statement should only return countries with a population greater than 100,000,000. Country.Population is the relevant column, so the WHERE clause becomes:

        WHERE Country.Population > 100e6

    4. To sort the result set in reverse order of difference in GNP, you could use either the calculation or the position in the output (it's the third column):

        ORDER BY GNP - GNPOld DESC
        or
        ORDER BY 3 DESC

    5. Finally, project the columns you wish to see by constructing the SELECT clause:

        SELECT Country.Name AS Country, City.Name AS Capital,
               GNP - GNPOld AS `Difference in GNP`

    The whole statement ends up looking like this:

        mysql> SELECT Country.Name AS Country, City.Name AS Capital,
            ->        GNP - GNPOld AS `Difference in GNP`
            -> FROM Country JOIN City ON Country.Capital = City.ID
            -> WHERE Country.Population > 100e6
            -> ORDER BY 3 DESC;
        +--------------------+------------+-------------------+
        | Country            | Capital    | Difference in GNP |
        +--------------------+------------+-------------------+
        | United States      | Washington |         399800.00 |
        | China              | Peking     |          64549.00 |
        | India              | New Delhi  |          16542.00 |
        | Nigeria            | Abuja      |           7084.00 |
        | Pakistan           | Islamabad  |           2740.00 |
        | Bangladesh         | Dhaka      |            886.00 |
        | Brazil             | Brasília   |         -27369.00 |
        | Indonesia          | Jakarta    |        -130020.00 |
        | Russian Federation | Moscow     |        -166381.00 |
        | Japan              | Tokyo      |        -405596.00 |
        +--------------------+------------+-------------------+
        10 rows in set (0.00 sec)

    Queries with Aggregates and GROUP BY

    While this system might work well for many queries, it doesn't cater for situations where you have complex summaries and aggregation. For aggregation, you'd start by choosing which columns to view in the output, but this time you'd construct them as aggregate expressions. For example, you could look at the average population, or the count of distinct regions. You could also perform more complex aggregations, such as the average of GNP per head of population, calculated as AVG(GNP/Population).

    Having chosen the values to appear in the output, you must choose how to aggregate those values. A useful way to think about this is that every aggregate query is of the form X, Y per Z. The SELECT clause contains the expressions for X and Y, as already described, and Z becomes your GROUP BY clause. Ordinarily you would also include Z in the SELECT clause so you can see how you are grouping, so the output becomes Z, X, Y per Z.

    As an example, consider the following, which shows a count of countries and the average population per continent:

        mysql> SELECT Continent, COUNT(Name), AVG(Population)
            -> FROM Country
            -> GROUP BY Continent;
        +---------------+-------------+-----------------+
        | Continent     | COUNT(Name) | AVG(Population) |
        +---------------+-------------+-----------------+
        | Asia          |          51 |   72647562.7451 |
        | Europe        |          46 |   15871186.9565 |
        | North America |          37 |   13053864.8649 |
        | Africa        |          58 |   13525431.0345 |
        | Oceania       |          28 |    1085755.3571 |
        | Antarctica    |           5 |          0.0000 |
        | South America |          14 |   24698571.4286 |
        +---------------+-------------+-----------------+
        7 rows in set (0.00 sec)

    In this case, X is the number of countries, Y is the average population, and Z is the continent. Of course, you could have more fields in the SELECT clause, and more fields in the GROUP BY clause, as you require. You would also normally alias columns to make the output more suited to your requirements.

    More Complex Queries

    Queries can get considerably more interesting than this. You could also add joins and other expressions to your aggregate query, as in the earlier part of this post. You could have more complex conditions in the WHERE clause. Similarly, you could use queries such as these in subqueries of yet more complex super-queries. Each technique becomes another tool in your toolbox, until before you know it you're writing queries across 15 tables that take two pages to write out. But that's for another day...
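
To try the two query shapes described above without a MySQL server, here is a minimal sketch using Python's built-in sqlite3 module. The schema is a hypothetical miniature of the world database (only the columns the post uses), the rows are made up for illustration, and the aliases CountryName, CapitalName and gnp_change are assumptions rather than the post's own names.

```python
import sqlite3

# Hypothetical miniature of the world database: only the columns the worked
# example touches, filled with made-up rows (not real country data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE City (ID INTEGER PRIMARY KEY, Name TEXT, CountryCode TEXT);
    CREATE TABLE Country (
        Code TEXT PRIMARY KEY, Name TEXT, Continent TEXT,
        Population INTEGER, GNP REAL, GNPOld REAL, Capital INTEGER
    );
    INSERT INTO City VALUES
        (1, 'Capital A', 'AAA'), (2, 'Capital B', 'BBB'), (3, 'Capital C', 'CCC');
    INSERT INTO Country VALUES
        ('AAA', 'Country A', 'East', 250000000, 900.0, 700.0, 1),
        ('BBB', 'Country B', 'East', 150000000, 400.0, 450.0, 2),
        ('CCC', 'Country C', 'West',  50000000, 100.0,  90.0, 3);
""")

# Join on the capital relationship, filter on population, sort by the third
# output column (the GNP difference), then project the wanted columns.
join_rows = conn.execute("""
    SELECT Country.Name AS CountryName, City.Name AS CapitalName,
           GNP - GNPOld AS gnp_change
    FROM Country JOIN City ON Country.Capital = City.ID
    WHERE Country.Population > 100e6
    ORDER BY 3 DESC
""").fetchall()

# "X, Y per Z": count of countries and average population per continent.
aggregate_rows = conn.execute("""
    SELECT Continent, COUNT(Name), AVG(Population)
    FROM Country
    GROUP BY Continent
    ORDER BY Continent
""").fetchall()

print(join_rows)
print(aggregate_rows)
```

Country C falls below the 100e6 population filter, so only two rows come back from the join query, while the aggregate query still counts all three countries.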

  • Should we choose Java over C# or we should consider using Mono?

    - by A. Karimi
    We are a small team of independent developers with an average of 7 years of experience on the C#/.NET platform. We mostly work on small to average-sized web application projects, which allows us to choose our favorite platform. I believe that our current platform (C#/.NET) makes us more productive than we would be in Java, but what makes me think about choosing Java over C# is the cost and the open source community. Our projects even allow us to work with various frameworks as well as various platforms; for example, we can even use Nancy. So we are able to decrease costs by using Mono, which can be deployed on Linux servers. But I'm looking for a complete ecosystem (IDE/platform/production environment) that decreases our costs and makes us feel fully supported by the community. As an example of the issues I've experienced with MonoDevelop, I can point to its poor support for the Razor syntax. As another example, we are using "VS 2012 Express for Web" as our IDE to decrease costs, but as you know it doesn't support plugins, and I have serious problems with XML comments (I miss GhostDoc). We strongly believe in strongly-typed programming languages, so please don't suggest other languages and platforms such as Ruby, PHP, etc. Now I want to choose between: (1) staying with C#, buying some products, and remaining hopeful about the openness of the .NET ecosystem and its open source community; or (2) changing platform and starting to use the Java open source ecosystem.

  • Information about how much time is spent in a function, based on the input of this function

    - by olchauvin
    Is there a (quantitative) tool to measure the performance of a function based on its input? So far, the tools I have used to measure the performance of my code tell me how much time I spend in functions (like JetBrains dotTrace for .NET), but I'd like to have more information about the parameters passed to the function, in order to know which parameters impact performance the most. Let's say that I have a function like this:

        int myFunction(int myParam1, int myParam2)
        {
            // Do and return something based on the value of myParam1 and myParam2.
            // The code is likely to use if, for, while, switch, etc.
        }

    I would like a tool that tells me how much time is spent in myFunction based on the values of myParam1 and myParam2. For example, the tool would give me a result looking like this:

        For "myFunction":

        value    | value    | Number of | Average
        myParam1 | myParam2 | call      | time
        ---------|----------|-----------|--------
        1        | 5        | 500       | 301 ms
        2        | 5        | 250       | 1253 ms
        3        | 7        | 1268      | 538 ms
        ...

    That would mean that myFunction has been called 500 times with myParam1=1 and myParam2=5, and that with those parameters it took on average 301 ms to return a value. The idea behind this is to do some statistical optimization by organizing my code so that the blocks of code that are most likely to be executed are tested before those that are less likely to be executed. To put it bluntly, if I know which values are used the most, I can reorganize the if/while/for etc. structure of the function (and the whole program) to optimize it. I'd like to find such tools for C++, Java or .NET. Note: I am not looking for technical tips to optimize the code (like passing parameters as const, inlining functions, initializing the capacity of vectors and the like).
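
The kind of report described above can be approximated by hand-instrumenting a function. The following is only a rough Python sketch of the idea, not an existing profiler for C++, Java or .NET: the decorator name profile_by_args, the report helper and the sample my_function are all hypothetical, and the sketch assumes positional, hashable arguments.

```python
import time
from collections import defaultdict
from functools import wraps

# (function name, positional args) -> [call count, total seconds]
_stats = defaultdict(lambda: [0, 0.0])

def profile_by_args(func):
    """Record call count and total time per distinct tuple of arguments."""
    @wraps(func)
    def wrapper(*args):
        start = time.perf_counter()
        try:
            return func(*args)
        finally:
            entry = _stats[(func.__name__, args)]
            entry[0] += 1
            entry[1] += time.perf_counter() - start
    return wrapper

def report(name):
    """Return sorted rows of (args, calls, average seconds) for one function."""
    return sorted(
        (args, calls, total / calls)
        for (fname, args), (calls, total) in _stats.items()
        if fname == name
    )

@profile_by_args
def my_function(my_param1, my_param2):
    # Stand-in workload whose cost depends on the first parameter.
    return sum(i * my_param2 for i in range(my_param1 * 1000))

for _ in range(3):
    my_function(1, 5)
my_function(2, 5)

for args, calls, avg_s in report("my_function"):
    print(f"{args}: {calls} calls, {avg_s * 1000:.3f} ms average")
```

Grouping by exact argument values only stays manageable when the parameters take few distinct values; for wide ranges the values would need to be bucketed first.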

  • Software vs Network Engineer (Salary, Difficulty, Learning, Happiness)

    - by B Z
    What are your thoughts on being a Software Engineer vs a Network Engineer? I've been in the software field for almost 10 years now, and although I still have a great deal of fun (and challenges), I am starting to think it could be better on the "other" side. Not to degrade network engineers (I know there are many great ones out there), but it seems that, in general, their job is easier, the learning curve from average to good is not as steep, the job is less stressful, and the pay is better on average. I think that as a software developer I could make the switch to networking and still enjoy working with computers and feel productive. I spend an enormous amount of time learning about software, practices, new technologies, new patterns, etc. I think I could spend a much smaller amount of time learning about networking and be just as "good". What are your thoughts? EDIT: This is not about making easy money. Networking and software are closely related. I love computers and programming, but if I can work with both, make more money, have less stress in my life and spend more time with my family, then I am willing to consider a change, and hence I am looking for advice that does or doesn't support this view.

  • How to sync client and server at the first frame

    - by wheelinlight
    I'm making a game where an authoritative server sends information to all clients about states and positions for objects in a 3D world. The player can control his character by clicking on the screen to set a destination for the character, much like in the Diablo series. I've read most of the information I can find online about interpolation, reconciliation, and general networking architecture (Valve's, for instance). I think I understand everything, but one thing seems to be missing in every article I read. Let's say we have an interpolation delay of 100ms, a server tick rate of 50ms and a latency of 200ms. How do I know when 100ms have passed on the client? If the server sends the first update at t=0, can I assume it arrives at t=200, thereby assuming that all packets take the same amount of time to reach the client? What if the first packet arrives a little early, for instance at t=150? I would then be starting the client at t=150, and at t=250 it would think that 100ms have passed since it connected to the server when in fact only 50ms have passed. Hopefully the above paragraph is understandable. The summarized question would be: how do I know at what tick to start simulating the client? EDIT: This is how I ended up doing it: the client keeps a clock (approximately) in sync with the server. The client then simulates the world at simulationTime = syncedTime - avg(RTT)/2 - interpolationTime. The round-trip time can fluctuate, so I average it out over time. By only keeping the most recent values when calculating the average, I hope to adapt to more permanent changes in latency. It's still too early to draw any conclusions. I'm currently simulating bad network connections, but it's looking good so far. Does anyone see any possible problems?
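
The formula in the edit above, with the average taken over only the most recent round-trip samples, could be sketched as follows. This is an illustrative Python sketch rather than the poster's actual code: the class name SimulationClock, the window size of 3 and the sample values are assumptions, and all times are in milliseconds.

```python
from collections import deque

class SimulationClock:
    """Derive the client's simulation time from a server-synced clock."""

    def __init__(self, interpolation_time, window=10):
        self.interpolation_time = interpolation_time
        # A bounded deque drops the oldest sample automatically, so the
        # average follows longer-lasting changes in latency.
        self.rtt_samples = deque(maxlen=window)

    def add_rtt_sample(self, rtt):
        self.rtt_samples.append(rtt)

    def average_rtt(self):
        if not self.rtt_samples:
            return 0.0
        return sum(self.rtt_samples) / len(self.rtt_samples)

    def simulation_time(self, synced_time):
        # simulationTime = syncedTime - avg(RTT)/2 - interpolationTime
        return synced_time - self.average_rtt() / 2 - self.interpolation_time

clock = SimulationClock(interpolation_time=100, window=3)
for rtt in (380, 420, 400):
    clock.add_rtt_sample(rtt)

print(clock.average_rtt())          # average of the three samples
print(clock.simulation_time(1000))  # synced time minus half RTT minus delay
```

A short window reacts quickly to lasting latency changes but lets single spikes move the simulation time noticeably, so the window size is a trade-off to tune.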

  • Randomly and uniquely iterating over a range

    - by Synetech
    Say you have a range of values (or anything else) and you want to iterate over the range and stop at some indeterminate point. Because the stopping value could be anywhere in the range, iterating sequentially is no good: it causes the early values to be accessed more often than later values (which is bad for things that wear out), and it also reduces performance, since it must traverse extra values. Randomly iterating is better because it will (on average) increase the hit rate, so that fewer values have to be accessed before finding the right one, and it will also distribute the accesses more evenly (again, on average). The problem is that the standard method of randomly jumping around will result in values being accessed multiple times, and has no automatic way of determining when each value has been checked and thus the whole range has been exhausted. One simplified and contrived solution could be to make a list of every value, pick one at random, then remove it. Each time through the loop, you pick one from the set of remaining items. Unfortunately this only works for small lists. As a (forced) example, say you are creating a game where the program tries to guess what number you picked and shows how many guesses it took. The range is 0-255, and instead of asking Is it 0? Is it 1? Is it 2?…, you have it guess randomly. You could create a list of 256 numbers, pick one randomly and remove it. But what if the range was 0-2^32? You can't really create a 4-billion item list. I've seen a couple of implementations of RNGs that are supposed to provide a uniform distribution, but none that are also supposed to be unique, i.e., with no repeated values. So is there a practical way to randomly and uniquely iterate over a range?
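
One possible approach, which is not taken from the question itself, is a full-period linear congruential generator: for a power-of-two modulus, an odd increment and a multiplier congruent to 1 modulo 4 satisfy the Hull-Dobell conditions, so the sequence visits every value exactly once before repeating and needs no list in memory. The Python sketch below is a hedged illustration; the function name unique_random_order and its parameters are hypothetical.

```python
import random

def unique_random_order(bits, seed=None):
    """Yield every integer in [0, 2**bits) exactly once, in scrambled order."""
    rng = random.Random(seed)
    modulus = 1 << bits
    # Hull-Dobell conditions for a power-of-two modulus:
    # multiplier % 4 == 1 and an odd increment give a full-period sequence.
    multiplier = (rng.randrange(modulus) & ~3) | 1
    increment = rng.randrange(modulus) | 1
    state = rng.randrange(modulus)
    for _ in range(modulus):
        yield state
        state = (multiplier * state + increment) % modulus

# The 0-255 guessing game from the question: 256 guesses, none repeated.
guesses = list(unique_random_order(8, seed=42))
print(len(guesses), len(set(guesses)))
```

The order produced this way is scrambled rather than statistically strong (the low bits of such a generator follow short cycles), so it suits spreading accesses evenly more than anything that needs unpredictability. For a range whose size is not a power of two, one option is to use the next power of two and skip values that fall outside the range.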
