statistics - Page 16 - Developer IT

Proper use of use of "cor" function in R

- by order

I am interested to know what a proper x (vector matrix or data frame) input looks like. I am currently using the function in two different sorts of matrices. However, I am not sure how R would interpret my data the way I intend. I will explain the types of matrix by example. Type 1 Gene1 Gene2 Gene3 sample1 sample2 Type 2 Sample1 Sample2 Sample3 gene 1 gene 2 gene 3 Are either of these formats valid x parameters? I input both of types of matrices and get some results, but without knowing whether or not this a proper use the function, these are just random numbers. Thank you for your time. I apologize that this isn't more interesting.

Read the article

Which reporting tool would you recommend?

- by grady

Hi, I have some reporting functionality in my app and I want to improve it a bit. Its only SQL in a XML file which is read by some parser. There can be some params for that SQL and when its parsed and the params are injected, it is executed against my DB (SQL Server). I want to improve that a bit so that the results look better and are more flexible. This are the most important points I need to have: subtotals layout that can change dynamically according to settings in DB (like logo, slogan) possibility to use the same report template for several customers (same fields, but different logos,colors, slogans etc.) should run from an ASP.NET application It should be as dynamic as possible. I know of Crystal Reports and the Microsoft Reporting Tool. Are there any others that might be of interest and are my above points possible at all? Thanks for some ideas and hints :-)...

Read the article

How can I neatly clean my R workspace while preserving certain objects?

- by briandk

Suppose I'm messing about with some data by binding vectors together, as I'm wont to do on a lazy sunday afternoon. x <- rnorm(25, mean = 65, sd = 10) y <- rnorm(25, mean = 75, sd = 7) z <- 1:25 dd <- data.frame(mscore = x, vscore = y, caseid = z) I've now got my new dataframe dd, which is wonderful. But there's also still the detritus from my prior slicings and dicings: > ls() [1] "dd" "x" "y" "z" What's a simple way to clean up my workspace if I no longer need my "source" columns, but I want to keep the dataframe? That is, now that I'm done manipulating data I'd like to just have dd and none of the smaller variables that might inadvertently mask further analysis: > ls() [1] "dd" I feel like the solution must be of the form rm(ls[ -(dd) ]) or something, but I can't quite figure out how to say "please clean up everything BUT the following objects."

Read the article

Usage Tracking for Windows desktop applications ...

- by sdaas

Hi, I am looking for some frameworks that can be used to collect usage information for Windows desktop application and analyze it. For example, I would like to be able to answer questions like (a) how many times do people use this application in a day (b) which are their favorite menu items, etc. I looked briefly at Google Analytics and Omniture SiteCatalyst but they seem to work only on web applications. Thanks SD

Read the article

Cosmic Rays: what is the probability they will affect a program?

- by Mark Harrison

Once again I was in a design review, and encountered the claim that the probability of a particular scenario was "less than the risk of cosmic rays" affecting the program, and it occurred to me that I didn't have the faintest idea what that probability is. What is the probability of a cosmic ray hitting a computer and affecting the execution of the program?

Read the article

R: Given a set of random numbers drawn from a continuous univariate distribution, find the distribut

- by knorv

Given a set of real numbers drawn from a unknown continuous univariate distribution (let's say is is one of beta, Cauchy, chi-square, exponential, F, gamma, Laplace, log-normal, normal, Pareto, Student's t, uniform and Weibull).. x <- c(15.771062,14.741310,9.081269,11.276436,11.534672,17.980860,13.550017,13.853336,11.262280,11.049087,14.752701,4.481159,11.680758,11.451909,10.001488,11.106817,7.999088,10.591574,8.141551,12.401899,11.215275,13.358770,8.388508,11.875838,3.137448,8.675275,17.381322,12.362328,10.987731,7.600881,14.360674,5.443649,16.024247,11.247233,9.549301,9.709091,13.642511,10.892652,11.760685,11.717966,11.373979,10.543105,10.230631,9.918293,10.565087,8.891209,10.021141,9.152660,10.384917,8.739189,5.554605,8.575793,12.016232,10.862214,4.938752,14.046626,5.279255,11.907347,8.621476,7.933702,10.799049,8.567466,9.914821,7.483575,11.098477,8.033768,10.954300,8.031797,14.288100,9.813787,5.883826,7.829455,9.462013,9.176897,10.153627,4.922607,6.818439,9.480758,8.166601,12.017158,13.279630,14.464876,13.319124,12.331335,3.194438,9.866487,11.337083,8.958164,8.241395,4.289313,5.508243,4.737891,7.577698,9.626720,16.558392,10.309173,11.740863,8.761573,7.099866,10.032640) .. is there some easy way in R to programmatically and automatically find the most likely distribution and the estimated distribution parameters?

Read the article

In R: How do I choose the first 4 Rows of a Data Frame

- by Moe

How Do I go about getting the first 4 rows of my Dataframe: Weight Response 1 Control 59 0.0 2 Treatment 90 0.8 3 Treatment 47 0.1 4 Treamment 106 0.1 5 Control 85 0.7 6 Treatment 73 0.6 7 Control 61 0.2 In 'R'?

Read the article

Why to use local pipes instead of sockets for programs communication inside one computer?

- by Ole Jak

Why to use local pipes instead of sockets for programs communication inside one computer? Has Any one FRESH stats on who is faster and how nmuch, what is more sequre or less?

Read the article

Histrogram matching - image processing - c/c++

- by Raj

Hello I have two histograms. int Hist1[10] = {1,4,3,5,2,5,4,6,3,2}; int Hist1[10] = {1,4,3,15,12,15,4,6,3,2}; Hist1's distribution is of type multi-modal; Hist2's distribution is of type uni-modal with single prominent peak. My questions are Is there any way that i could determine the type of distribution programmatically? How to quantify whether these two histograms are similar/dissimilar? Thanks

Read the article

In R, How to add Row for Information:

- by Moe

Hi, I'm trying to add a Row to my data.frame to spit out the average of the column. This is my data frame: Weight Response 1 Control 59 0.0 2 Treatment 90 0.8 3 Treatment 47 0.1 4 Treamment 106 0.1 5 Control 85 0.7 6 Treatment 73 0.6 7 Control 61 0.2 I'd like it to become: Weight Response 1 Control 59 0.0 2 Treatment 90 0.8 3 Treatment 47 0.1 4 Treamment 106 0.1 5 Control 85 0.7 6 Treatment 73 0.6 7 Control 61 0.2 8 AVERAGES 74 0.3 Thanks!

Read the article

Pie chart of *nix shell use [closed]

- by hayk.mart

I've encountered a situation where it would be very helpful to know the breakdown of shell use by percentage. For example, I'm looking for something like bash: X%, sh: Y%, csh, tcsh, zsh, ksh, dash, etc.. Obviously, I know there are several complications - multiple shells, the definition of "use", uncertainty and so forth, but I would like to see an informed answer derived from actual data and based on some stated metric, even if the result could be horribly wrong. Bonus if there is historical data demonstrating a shift in preferences.

Read the article

Geometric Mean: is there a built-in?

- by doug

i tried to find a built-in for geometric mean but couldn't. (Obviously a built-in isn't going to save me any time while working in the shell, nor do i suspect there's any difference in accuracy; for scripts i try to use built-ins as often as possible, where the (cumulative) performance gain is often noticeable. In case there isn't one (which i doubt is the case) here's mine. gm_mean = function(a){prod(a)^(1/length(a))}

Read the article

Programmatically determine the relative "popularities" of a list of items (books, songs, movies, etc

- by Horace Loeb

Given a list of (say) songs, what's the best way to determine their relative "popularity"? My first thought is to use Google Trends. This list of songs: Subterranean Homesick Blues Empire State of Mind California Gurls produces the following Google Trends report: (to find out what's popular now, I restricted the report to the last 30 days) Empire State of Mind is marginally more popular than California Gurls, and Subterranean Homesick Blues is far less popular than either. So this works pretty well, but what happens when your list is 100 or 1000 songs long? Google Trends only allows you to compare 5 terms at once, so absent a huge round-robin, what's the right approach? Another option is to just do a Google Search for each song and see which has the most results, but this doesn't really measure the same thing

Read the article

R question. Create new data set that meets all of 4 conditions.

- by Michael

Hello, I would like to create a new dataset where the following for conditions are all met. rowSums(is.na(UNCA[,11:23]))<12 rowSums(is.na(UNCA[,27:39]))<12 rowSums(is.na(UNCA[,40:52]))<12 rowSums(is.na(UNCA[,53:65]))<12 Thanks!

Read the article

Automating R Script

- by ETD

I have written an R script that pulls some data from a database, performs several operations on it and post the output to a new database. I would like this script to run every day at a specific time but I can not find any way to do this effectively. Can anyone recommend a resource I could look at to solve this issue? I am running this script on a Windows machine.

Read the article

Does anyone know of a Cocoa/Obj-C library that can be used to gather application usage data

- by Shackelford R

I would like to be able to gather info like how often certain windows are opened, what types of user data are accessed, how often menu items are clicked, etc. Does anyone know of a 3rd party (open source or commercial) Cocoa/Obj-C library or plugin that would allow me to gather this info?

Read the article

Determining smallest number of samples for 99% accuracy

- by test

I'm trying to compare 100,000 records on a local database (L) with 100,000 records on a remote database (R). Basically I want to know if an elment in L exusts in R. To determine that, I have to make a request against the R for each L, which takes a long time (I know, there should be a better way, there isn't, that's the API I've got). So I would like to test a small sample of L against R, and then infer with some level of confidence how many are present in the whole R. How many do I have to test to have a 99% confidence level?

Read the article

Does Intel Smart Response provide any statistics on the cache usage?

- by Tom Seddon

I've set up my Z68-based Core i7 PC with a 60GB SSD dedicated as a Smart Response cache drive. Is there any way I can get any statistics out of it? It would be nice to have some information on how much cache space is actually being used, maybe how much of it was actually accessed recently, and how many reads in general are coming from the SSD rather than from the mechanical disk. These statistics might help to quickly provide some evidence for or against the use of Smart Response, without my having to reinstall Windows on the SSD (etc.) to find out. The Windows ReadyBoost feature has some performance counters you can access via the Windows 7 perfmon tool, for example, which is the kind of thing I'm hoping is somehow available. Smart Response provides no perfmon counters, though, and the Intel Rapid Storage Utility tells you pretty much nothing except that Smart Response is switched on.

Read the article

How to check if a Statistics is auto-created in a SQL Server 2000 DB using T-SQL?

- by The Shaper

Hi all. A while back I had to come up with a way to clean up all indexes and user-created statistics from some tables in a SQL Server 2005 database. After a few attempts it worked, but now I gotta have it working in SQL Server 2000 databases as well. For SQL Server 2005, I used SELECT Name FROM sys.stats WHERE object_id = object_id(@tableName) AND auto_created = 0 to fetch Statistics that were user-created. However, SQL 2000 doesn't have a sys.stats table. I managed to fetch the indexes and statistics in a distinguishable way from the sysindexes table, but I just couldn't figure out what the sys.stats.auto_created match is for SQL 2000. Any pointers? BTW: T-SQL please.

Read the article

What statistics app should I use for my website?

- by Camran

I have my own server (with root access). I need statistics of users who visit my website etc etc... I have looked at an app called Webalyzer... Is this a good choice? I run apache2 on a Ubuntu 9 system... If you know of any good statistics apps for servers please let me know. And a follow-up question: All statistics are saved in log-files right? So how large would these log-files become then? Possibility to split them would be good, dont know if this is possible with Webalyzer though...

Read the article

Calculating the Size (in Bytes and MB) of a Oracle Coherence Cache

- by Ricardo Ferreira

The concept and usage of data grids are becoming very popular in this days since this type of technology are evolving very fast with some cool lead products like Oracle Coherence. Once for a while, developers need an programmatic way to calculate the total size of a specific cache that are residing in the data grid. In this post, I will show how to accomplish this using Oracle Coherence API. This example has been tested with 3.6, 3.7 and 3.7.1 versions of Oracle Coherence. To start the development of this example, you need to create a POJO ("Plain Old Java Object") that represents a data structure that will hold user data. This data structure will also create an internal fat so I call that should increase considerably the size of each instance in the heap memory. Create a Java class named "Person" as shown in the listing below. package com.oracle.coherence.domain; import java.io.Serializable; import java.util.ArrayList; import java.util.HashMap; import java.util.List; import java.util.Random; @SuppressWarnings("serial") public class Person implements Serializable { private String firstName; private String lastName; private List<Object> fat; private String email; public Person() { generateFat(); } public Person(String firstName, String lastName, String email) { setFirstName(firstName); setLastName(lastName); setEmail(email); generateFat(); } private void generateFat() { fat = new ArrayList<Object>(); Random random = new Random(); for (int i = 0; i < random.nextInt(18000); i++) { HashMap<Long, Double> internalFat = new HashMap<Long, Double>(); for (int j = 0; j < random.nextInt(10000); j++) { internalFat.put(random.nextLong(), random.nextDouble()); } fat.add(internalFat); } } public String getFirstName() { return firstName; } public void setFirstName(String firstName) { this.firstName = firstName; } public String getLastName() { return lastName; } public void setLastName(String lastName) { this.lastName = lastName; } public String getEmail() { return email; } public void setEmail(String email) { this.email = email; } } Now let's create a Java program that will start a data grid into Coherence and will create a cache named "People", that will hold people instances with sequential integer keys. Each person created in this program will trigger the execution of a custom constructor created in the People class that instantiates an internal fat (the random amount of data generated to increase the size of the object) for each person. Create a Java class named "CreatePeopleCacheAndPopulateWithData" as shown in the listing below. package com.oracle.coherence.demo; import com.oracle.coherence.domain.Person; import com.tangosol.net.CacheFactory; import com.tangosol.net.NamedCache; public class CreatePeopleCacheAndPopulateWithData { public static void main(String[] args) { // Asks Coherence for a new cache named "People"... NamedCache people = CacheFactory.getCache("People"); // Creates three people that will be putted into the data grid. Each person // generates an internal fat that should increase its size in terms of bytes... Person pessoa1 = new Person("Ricardo", "Ferreira", "[email protected]"); Person pessoa2 = new Person("Vitor", "Ferreira", "[email protected]"); Person pessoa3 = new Person("Vivian", "Ferreira", "[email protected]"); // Insert three people at the data grid... people.put(1, pessoa1); people.put(2, pessoa2); people.put(3, pessoa3); // Waits for 5 minutes until the user runs the Java program // that calculates the total size of the people cache... try { System.out.println("---> Waiting for 5 minutes for the cache size calculation..."); Thread.sleep(300000); } catch (InterruptedException ie) { ie.printStackTrace(); } } } Finally, let's create a Java program that, using the Coherence API and JMX, will calculate the total size of each cache that the data grid is currently managing. The approach used in this example was retrieve every cache that the data grid are currently managing, but if you are interested on an specific cache, the same approach can be used, you should only filter witch cache will be looked for. Create a Java class named "CalculateTheSizeOfPeopleCache" as shown in the listing below. package com.oracle.coherence.demo; import java.text.DecimalFormat; import java.util.Map; import java.util.Set; import java.util.TreeMap; import javax.management.MBeanServer; import javax.management.MBeanServerFactory; import javax.management.ObjectName; import com.tangosol.net.CacheFactory; public class CalculateTheSizeOfPeopleCache { @SuppressWarnings({ "unchecked", "rawtypes" }) private void run() throws Exception { // Enable JMX support in this Coherence data grid session... System.setProperty("tangosol.coherence.management", "all"); // Create a sample cache just to access the data grid... CacheFactory.getCache(MBeanServerFactory.class.getName()); // Gets the JMX server from Coherence data grid... MBeanServer jmxServer = getJMXServer(); // Creates a internal data structure that would maintain // the statistics from each cache in the data grid... Map cacheList = new TreeMap(); Set jmxObjectList = jmxServer.queryNames(new ObjectName("Coherence:type=Cache,*"), null); for (Object jmxObject : jmxObjectList) { ObjectName jmxObjectName = (ObjectName) jmxObject; String cacheName = jmxObjectName.getKeyProperty("name"); if (cacheName.equals(MBeanServerFactory.class.getName())) { continue; } else { cacheList.put(cacheName, new Statistics(cacheName)); } } // Updates the internal data structure with statistic data // retrieved from caches inside the in-memory data grid... Set<String> cacheNames = cacheList.keySet(); for (String cacheName : cacheNames) { Set resultSet = jmxServer.queryNames( new ObjectName("Coherence:type=Cache,name=" + cacheName + ",*"), null); for (Object resultSetRef : resultSet) { ObjectName objectName = (ObjectName) resultSetRef; if (objectName.getKeyProperty("tier").equals("back")) { int unit = (Integer) jmxServer.getAttribute(objectName, "Units"); int size = (Integer) jmxServer.getAttribute(objectName, "Size"); Statistics statistics = (Statistics) cacheList.get(cacheName); statistics.incrementUnit(unit); statistics.incrementSize(size); cacheList.put(cacheName, statistics); } } } // Finally... print the objects from the internal data // structure that represents the statistics from caches... cacheNames = cacheList.keySet(); for (String cacheName : cacheNames) { Statistics estatisticas = (Statistics) cacheList.get(cacheName); System.out.println(estatisticas); } } public MBeanServer getJMXServer() { MBeanServer jmxServer = null; for (Object jmxServerRef : MBeanServerFactory.findMBeanServer(null)) { jmxServer = (MBeanServer) jmxServerRef; if (jmxServer.getDefaultDomain().equals(DEFAULT_DOMAIN) || DEFAULT_DOMAIN.length() == 0) { break; } jmxServer = null; } if (jmxServer == null) { jmxServer = MBeanServerFactory.createMBeanServer(DEFAULT_DOMAIN); } return jmxServer; } private class Statistics { private long unit; private long size; private String cacheName; public Statistics(String cacheName) { this.cacheName = cacheName; } public void incrementUnit(long unit) { this.unit += unit; } public void incrementSize(long size) { this.size += size; } public long getUnit() { return unit; } public long getSize() { return size; } public double getUnitInMB() { return unit / (1024.0 * 1024.0); } public double getAverageSize() { return size == 0 ? 0 : unit / size; } public String toString() { StringBuffer sb = new StringBuffer(); sb.append("\nCache Statistics of '").append(cacheName).append("':\n"); sb.append(" - Total Entries of Cache -----> " + getSize()).append("\n"); sb.append(" - Used Memory (Bytes) --------> " + getUnit()).append("\n"); sb.append(" - Used Memory (MB) -----------> " + FORMAT.format(getUnitInMB())).append("\n"); sb.append(" - Object Average Size --------> " + FORMAT.format(getAverageSize())).append("\n"); return sb.toString(); } } public static void main(String[] args) throws Exception { new CalculateTheSizeOfPeopleCache().run(); } public static final DecimalFormat FORMAT = new DecimalFormat("###.###"); public static final String DEFAULT_DOMAIN = ""; public static final String DOMAIN_NAME = "Coherence"; } I've commented the overall example so, I don't think that you should get into trouble to understand it. Basically we are dealing with JMX. The first thing to do is enable JMX support for the Coherence client (ie, an JVM that will only retrieve values from the data grid and will not integrate the cluster) application. This can be done very easily using the runtime "tangosol.coherence.management" system property. Consult the Coherence documentation for JMX to understand the possible values that could be applied. The program creates an in memory data structure that holds a custom class created called "Statistics". This class represents the information that we are interested to see, which in this case are the size in bytes and in MB of the caches. An instance of this class is created for each cache that are currently managed by the data grid. Using JMX specific methods, we retrieve the information that are relevant for calculate the total size of the caches. To test this example, you should execute first the CreatePeopleCacheAndPopulateWithData.java program and after the CreatePeopleCacheAndPopulateWithData.java program. The results in the console should be something like this: 2012-06-23 13:29:31.188/4.970 Oracle Coherence 3.6.0.4 <Info> (thread=Main Thread, member=n/a): Loaded operational configuration from "jar:file:/E:/Oracle/Middleware/oepe_11gR1PS4/workspace/calcular-tamanho-cache-coherence/lib/coherence.jar!/tangosol-coherence.xml" 2012-06-23 13:29:31.219/5.001 Oracle Coherence 3.6.0.4 <Info> (thread=Main Thread, member=n/a): Loaded operational overrides from "jar:file:/E:/Oracle/Middleware/oepe_11gR1PS4/workspace/calcular-tamanho-cache-coherence/lib/coherence.jar!/tangosol-coherence-override-dev.xml" 2012-06-23 13:29:31.219/5.001 Oracle Coherence 3.6.0.4 <D5> (thread=Main Thread, member=n/a): Optional configuration override "/tangosol-coherence-override.xml" is not specified 2012-06-23 13:29:31.266/5.048 Oracle Coherence 3.6.0.4 <D5> (thread=Main Thread, member=n/a): Optional configuration override "/custom-mbeans.xml" is not specified Oracle Coherence Version 3.6.0.4 Build 19111 Grid Edition: Development mode Copyright (c) 2000, 2010, Oracle and/or its affiliates. All rights reserved. 2012-06-23 13:29:33.156/6.938 Oracle Coherence GE 3.6.0.4 <Info> (thread=Main Thread, member=n/a): Loaded Reporter configuration from "jar:file:/E:/Oracle/Middleware/oepe_11gR1PS4/workspace/calcular-tamanho-cache-coherence/lib/coherence.jar!/reports/report-group.xml" 2012-06-23 13:29:33.500/7.282 Oracle Coherence GE 3.6.0.4 <Info> (thread=Main Thread, member=n/a): Loaded cache configuration from "jar:file:/E:/Oracle/Middleware/oepe_11gR1PS4/workspace/calcular-tamanho-cache-coherence/lib/coherence.jar!/coherence-cache-config.xml" 2012-06-23 13:29:35.391/9.173 Oracle Coherence GE 3.6.0.4 <D4> (thread=Main Thread, member=n/a): TCMP bound to /192.168.177.133:8090 using SystemSocketProvider 2012-06-23 13:29:37.062/10.844 Oracle Coherence GE 3.6.0.4 <Info> (thread=Cluster, member=n/a): This Member(Id=2, Timestamp=2012-06-23 13:29:36.899, Address=192.168.177.133:8090, MachineId=55685, Location=process:244, Role=Oracle, Edition=Grid Edition, Mode=Development, CpuCount=2, SocketCount=2) joined cluster "cluster:0xC4DB" with senior Member(Id=1, Timestamp=2012-06-23 13:29:14.031, Address=192.168.177.133:8088, MachineId=55685, Location=process:1128, Role=CreatePeopleCacheAndPopulateWith, Edition=Grid Edition, Mode=Development, CpuCount=2, SocketCount=2) 2012-06-23 13:29:37.172/10.954 Oracle Coherence GE 3.6.0.4 <D5> (thread=Cluster, member=n/a): Member 1 joined Service Cluster with senior member 1 2012-06-23 13:29:37.188/10.970 Oracle Coherence GE 3.6.0.4 <D5> (thread=Cluster, member=n/a): Member 1 joined Service Management with senior member 1 2012-06-23 13:29:37.188/10.970 Oracle Coherence GE 3.6.0.4 <D5> (thread=Cluster, member=n/a): Member 1 joined Service DistributedCache with senior member 1 2012-06-23 13:29:37.188/10.970 Oracle Coherence GE 3.6.0.4 <Info> (thread=Main Thread, member=n/a): Started cluster Name=cluster:0xC4DB Group{Address=224.3.6.0, Port=36000, TTL=4} MasterMemberSet ( ThisMember=Member(Id=2, Timestamp=2012-06-23 13:29:36.899, Address=192.168.177.133:8090, MachineId=55685, Location=process:244, Role=Oracle) OldestMember=Member(Id=1, Timestamp=2012-06-23 13:29:14.031, Address=192.168.177.133:8088, MachineId=55685, Location=process:1128, Role=CreatePeopleCacheAndPopulateWith) ActualMemberSet=MemberSet(Size=2, BitSetCount=2 Member(Id=1, Timestamp=2012-06-23 13:29:14.031, Address=192.168.177.133:8088, MachineId=55685, Location=process:1128, Role=CreatePeopleCacheAndPopulateWith) Member(Id=2, Timestamp=2012-06-23 13:29:36.899, Address=192.168.177.133:8090, MachineId=55685, Location=process:244, Role=Oracle) ) RecycleMillis=1200000 RecycleSet=MemberSet(Size=0, BitSetCount=0 ) ) TcpRing{Connections=[1]} IpMonitor{AddressListSize=0} 2012-06-23 13:29:37.891/11.673 Oracle Coherence GE 3.6.0.4 <D5> (thread=Invocation:Management, member=2): Service Management joined the cluster with senior service member 1 2012-06-23 13:29:39.203/12.985 Oracle Coherence GE 3.6.0.4 <D5> (thread=DistributedCache, member=2): Service DistributedCache joined the cluster with senior service member 1 2012-06-23 13:29:39.297/13.079 Oracle Coherence GE 3.6.0.4 <D4> (thread=DistributedCache, member=2): Asking member 1 for 128 primary partitions Cache Statistics of 'People': - Total Entries of Cache -----> 3 - Used Memory (Bytes) --------> 883920 - Used Memory (MB) -----------> 0.843 - Object Average Size --------> 294640 I hope that this post could save you some time when calculate the total size of Coherence cache became a requirement for your high scalable system using data grids. See you!

Read the article

Where can I find statistics / figures on how long testing should / could take?

- by NoCarrier

I'm trying to convince management that testing/QA takes considerably longer than non-developers think. Some smaller shops don't have budgets for testers and phbs automatically assume the developer will spend a few minutes after every build "testing" and deliver a perfectly functional system. Can someone point me to some numbers? e.g. Testing should be XX% of your total man hour count , etc etc? Or perhaps some real world experience? My goal is to have some numbers that are grounded in real life so I can make time/effort allocation justifications for "proper" testing when preparing estimates and timelines for applications. Maybe not full blown 100% TDD, but pragmatically close to it. I apologize if I seem vague.

Read the article

Will using HTTPS hurt my site's SEO or other statistics?

- by yannbane

I've set up a WordPress blog. Since I have to log into it from many different locations/machines, I've also got an SSL certificate, and set up Apache to redirect HTTP to HTTPS. It all works, but I'm wondering whether that's an overkill. Since most people who go to my site don't have to log in, I'm starting to wonder whether HTTPS has some drawbacks. If so, should I look for a way to make HTTPS optional?

Read the article

User activity vs. System activity on the Index Usage Statistics report

- by Zachary G Jensen

I recently decided to crawl over the indexes on one of our most heavily used databases to see which were suboptimal. I generated the built-in Index Usage Statistics report from SSMS, and it's showing me a great deal of information that I'm unsure how to understand. I found an article at Carpe Datum about the report, but it doesn't tell me much more than I could assume from the column titles. In particular, the report differentiates between User activity and system activity, and I'm unsure what qualifies as each type of activity. I assume that any query that uses a given index increases the '# of user X' columns. But what increases the system columns? building statistics? Is there anything that depends on the user or role(s) of a user that's running the query?

Read the article

More CPU cores may not always lead to better performance – MAXDOP and query memory distribution in spotlight

- by sqlworkshops

More hardware normally delivers better performance, but there are exceptions where it can hinder performance. Understanding these exceptions and working around it is a major part of SQL Server performance tuning. When a memory allocating query executes in parallel, SQL Server distributes memory to each task that is executing part of the query in parallel. In our example the sort operator that executes in parallel divides the memory across all tasks assuming even distribution of rows. Common memory allocating queries are that perform Sort and do Hash Match operations like Hash Join or Hash Aggregation or Hash Union. In reality, how often are column values evenly distributed, think about an example; are employees working for your company distributed evenly across all the Zip codes or mainly concentrated in the headquarters? What happens when you sort result set based on Zip codes? Do all products in the catalog sell equally or are few products hot selling items? One of my customers tested the below example on a 24 core server with various MAXDOP settings and here are the results:MAXDOP 1: CPU time = 1185 ms, elapsed time = 1188 msMAXDOP 4: CPU time = 1981 ms, elapsed time = 1568 msMAXDOP 8: CPU time = 1918 ms, elapsed time = 1619 msMAXDOP 12: CPU time = 2367 ms, elapsed time = 2258 msMAXDOP 16: CPU time = 2540 ms, elapsed time = 2579 msMAXDOP 20: CPU time = 2470 ms, elapsed time = 2534 msMAXDOP 0: CPU time = 2809 ms, elapsed time = 2721 ms - all 24 cores.In the above test, when the data was evenly distributed, the elapsed time of parallel query was always lower than serial query. Why does the query get slower and slower with more CPU cores / higher MAXDOP? Maybe you can answer this question after reading the article; let me know: [email protected]. Well you get the point, let’s see an example. The best way to learn is to practice. To create the below tables and reproduce the behavior, join the mailing list by using this link: www.sqlworkshops.com/ml and I will send you the table creation script. Let’s update the Employees table with 49 out of 50 employees located in Zip code 2001. update Employees set Zip = EmployeeID / 400 + 1 where EmployeeID % 50 = 1 update Employees set Zip = 2001 where EmployeeID % 50 != 1 go update statistics Employees with fullscan go Let’s create the temporary table #FireDrill with all possible Zip codes. drop table #FireDrill go create table #FireDrill (Zip int primary key) insert into #FireDrill select distinct Zip from Employees update statistics #FireDrill with fullscan go Let’s execute the query serially with MAXDOP 1. --Example provided by www.sqlworkshops.com --Execute query with uneven Zip code distribution --First serially with MAXDOP 1 set statistics time on go declare @EmployeeID int, @EmployeeName varchar(48),@zip int select @EmployeeName = e.EmployeeName, @zip = e.Zip from Employees e inner join #FireDrill fd on (e.Zip = fd.Zip) order by e.Zip option (maxdop 1) goThe query took 1011 ms to complete. The execution plan shows the 77816 KB of memory was granted while the estimated rows were 799624. No Sort Warnings in SQL Server Profiler. Now let’s execute the query in parallel with MAXDOP 0. --Example provided by www.sqlworkshops.com --Execute query with uneven Zip code distribution --In parallel with MAXDOP 0 set statistics time on go declare @EmployeeID int, @EmployeeName varchar(48),@zip int select @EmployeeName = e.EmployeeName, @zip = e.Zip from Employees e inner join #FireDrill fd on (e.Zip = fd.Zip) order by e.Zip option (maxdop 0) go The query took 1912 ms to complete. The execution plan shows the 79360 KB of memory was granted while the estimated rows were 799624. The estimated number of rows between serial and parallel plan are the same. The parallel plan has slightly more memory granted due to additional overhead. Sort properties shows the rows are unevenly distributed over the 4 threads. Sort Warnings in SQL Server Profiler. Intermediate Summary: The reason for the higher duration with parallel plan was sort spill. This is due to uneven distribution of employees over Zip codes, especially concentration of 49 out of 50 employees in Zip code 2001. Now let’s update the Employees table and distribute employees evenly across all Zip codes. update Employees set Zip = EmployeeID / 400 + 1 go update statistics Employees with fullscan go Let’s execute the query serially with MAXDOP 1. --Example provided by www.sqlworkshops.com --Execute query with uneven Zip code distribution --Serially with MAXDOP 1 set statistics time on go declare @EmployeeID int, @EmployeeName varchar(48),@zip int select @EmployeeName = e.EmployeeName, @zip = e.Zip from Employees e inner join #FireDrill fd on (e.Zip = fd.Zip) order by e.Zip option (maxdop 1) go The query took 751 ms to complete. The execution plan shows the 77816 KB of memory was granted while the estimated rows were 784707. No Sort Warnings in SQL Server Profiler. Now let’s execute the query in parallel with MAXDOP 0. --Example provided by www.sqlworkshops.com --Execute query with uneven Zip code distribution --In parallel with MAXDOP 0 set statistics time on go declare @EmployeeID int, @EmployeeName varchar(48),@zip int select @EmployeeName = e.EmployeeName, @zip = e.Zip from Employees e inner join #FireDrill fd on (e.Zip = fd.Zip) order by e.Zip option (maxdop 0) go The query took 661 ms to complete. The execution plan shows the 79360 KB of memory was granted while the estimated rows were 784707. Sort properties shows the rows are evenly distributed over the 4 threads. No Sort Warnings in SQL Server Profiler. Intermediate Summary: When employees were distributed unevenly, concentrated on 1 Zip code, parallel sort spilled while serial sort performed well without spilling to tempdb. When the employees were distributed evenly across all Zip codes, parallel sort and serial sort did not spill to tempdb. This shows uneven data distribution may affect the performance of some parallel queries negatively. For detailed discussion of memory allocation, refer to webcasts available at www.sqlworkshops.com/webcasts. Some of you might conclude from the above execution times that parallel query is not faster even when there is no spill. Below you can see when we are joining limited amount of Zip codes, parallel query will be fasted since it can use Bitmap Filtering. Let’s update the Employees table with 49 out of 50 employees located in Zip code 2001. update Employees set Zip = EmployeeID / 400 + 1 where EmployeeID % 50 = 1 update Employees set Zip = 2001 where EmployeeID % 50 != 1 go update statistics Employees with fullscan go Let’s create the temporary table #FireDrill with limited Zip codes. drop table #FireDrill go create table #FireDrill (Zip int primary key) insert into #FireDrill select distinct Zip from Employees where Zip between 1800 and 2001 update statistics #FireDrill with fullscan go Let’s execute the query serially with MAXDOP 1. --Example provided by www.sqlworkshops.com --Execute query with uneven Zip code distribution --Serially with MAXDOP 1 set statistics time on go declare @EmployeeID int, @EmployeeName varchar(48),@zip int select @EmployeeName = e.EmployeeName, @zip = e.Zip from Employees e inner join #FireDrill fd on (e.Zip = fd.Zip) order by e.Zip option (maxdop 1) go The query took 989 ms to complete. The execution plan shows the 77816 KB of memory was granted while the estimated rows were 785594. No Sort Warnings in SQL Server Profiler. Now let’s execute the query in parallel with MAXDOP 0. --Example provided by www.sqlworkshops.com --Execute query with uneven Zip code distribution --In parallel with MAXDOP 0 set statistics time on go declare @EmployeeID int, @EmployeeName varchar(48),@zip int select @EmployeeName = e.EmployeeName, @zip = e.Zip from Employees e inner join #FireDrill fd on (e.Zip = fd.Zip) order by e.Zip option (maxdop 0) go The query took 1799 ms to complete. The execution plan shows the 79360 KB of memory was granted while the estimated rows were 785594. Sort Warnings in SQL Server Profiler. The estimated number of rows between serial and parallel plan are the same. The parallel plan has slightly more memory granted due to additional overhead. Intermediate Summary: The reason for the higher duration with parallel plan even with limited amount of Zip codes was sort spill. This is due to uneven distribution of employees over Zip codes, especially concentration of 49 out of 50 employees in Zip code 2001. Now let’s update the Employees table and distribute employees evenly across all Zip codes. update Employees set Zip = EmployeeID / 400 + 1 go update statistics Employees with fullscan go Let’s execute the query serially with MAXDOP 1. --Example provided by www.sqlworkshops.com --Execute query with uneven Zip code distribution --Serially with MAXDOP 1 set statistics time on go declare @EmployeeID int, @EmployeeName varchar(48),@zip int select @EmployeeName = e.EmployeeName, @zip = e.Zip from Employees e inner join #FireDrill fd on (e.Zip = fd.Zip) order by e.Zip option (maxdop 1) go The query took 250 ms to complete. The execution plan shows the 9016 KB of memory was granted while the estimated rows were 79973.8. No Sort Warnings in SQL Server Profiler. Now let’s execute the query in parallel with MAXDOP 0. --Example provided by www.sqlworkshops.com --Execute query with uneven Zip code distribution --In parallel with MAXDOP 0 set statistics time on go declare @EmployeeID int, @EmployeeName varchar(48),@zip int select @EmployeeName = e.EmployeeName, @zip = e.Zip from Employees e inner join #FireDrill fd on (e.Zip = fd.Zip) order by e.Zip option (maxdop 0) go The query took 85 ms to complete. The execution plan shows the 13152 KB of memory was granted while the estimated rows were 784707. No Sort Warnings in SQL Server Profiler. Here you see, parallel query is much faster than serial query since SQL Server is using Bitmap Filtering to eliminate rows before the hash join. Parallel queries are very good for performance, but in some cases it can hinder performance. If one identifies the reason for these hindrances, then it is possible to get the best out of parallelism. I covered many aspects of monitoring and tuning parallel queries in webcasts (www.sqlworkshops.com/webcasts) and articles (www.sqlworkshops.com/articles). I suggest you to watch the webcasts and read the articles to better understand how to identify and tune parallel query performance issues. Summary: One has to avoid sort spill over tempdb and the chances of spills are higher when a query executes in parallel with uneven data distribution. Parallel query brings its own advantage, reduced elapsed time and reduced work with Bitmap Filtering. So it is important to understand how to avoid spills over tempdb and when to execute a query in parallel. I explain these concepts with detailed examples in my webcasts (www.sqlworkshops.com/webcasts), I recommend you to watch them. The best way to learn is to practice. To create the above tables and reproduce the behavior, join the mailing list at www.sqlworkshops.com/ml and I will send you the relevant SQL Scripts. Register for the upcoming 3 Day Level 400 Microsoft SQL Server 2008 and SQL Server 2005 Performance Monitoring & Tuning Hands-on Workshop in London, United Kingdom during March 15-17, 2011, click here to register / Microsoft UK TechNet.These are hands-on workshops with a maximum of 12 participants and not lectures. For consulting engagements click here. Disclaimer and copyright information:This article refers to organizations and products that may be the trademarks or registered trademarks of their various owners. Copyright of this article belongs to R Meyyappan / www.sqlworkshops.com. You may freely use the ideas and concepts discussed in this article with acknowledgement (www.sqlworkshops.com), but you may not claim any of it as your own work. This article is for informational purposes only; you use any of the suggestions given here entirely at your own risk. Register for the upcoming 3 Day Level 400 Microsoft SQL Server 2008 and SQL Server 2005 Performance Monitoring & Tuning Hands-on Workshop in London, United Kingdom during March 15-17, 2011, click here to register / Microsoft UK TechNet.These are hands-on workshops with a maximum of 12 participants and not lectures. For consulting engagements click here. R Meyyappan [email protected] LinkedIn: http://at.linkedin.com/in/rmeyyappan

Search Results

Search found 1631 results on 66 pages for 'statistics'.

Page 16/66 | < Previous Page | 12 13 14 15 16 17 18 19 20 21 22 23 | Next Page >

- by order

- by grady

- by briandk

- by sdaas

- by Mark Harrison

- by knorv

- by Moe

- by Ole Jak

- by Raj

- by Moe

- by hayk.mart

- by doug

- by Horace Loeb

- by Michael

- by ETD

- by Shackelford R

- by test

- by Tom Seddon

- by The Shaper

- by Camran

- by Ricardo Ferreira

- by NoCarrier

- by yannbane

- by Zachary G Jensen

- by sqlworkshops

< Previous Page | 12 13 14 15 16 17 18 19 20 21 22 23 | Next Page >