Search Results

Search found 529 results on 22 pages for 'avg'.

Page 18/22 | < Previous Page | 14 15 16 17 18 19 20 21 22 | Next Page >

postfix revived and delivered have the same values (?)

- by thinkingbig

I have configured my first server (Debian with ISPConfig). Generally i want to send bulk e-mails to our users, i configure postfix and turn on postfix... but... After 1 hour of sending emails i have logs like this: Grand Totals messages 21886 received 21883 delivered 0 forwarded 0 deferred 234 bounced 0 rejected (0%) 0 reject warnings 0 held 0 discarded (0%) 30805k bytes received 31280k bytes delivered 3 senders 3 sending hosts/domains 12588 recipients 3 recipient hosts/domains Per-Hour Traffic Summary time received delivered deferred bounced rejected -------------------------------------------------------------------- 0000-0100 0 0 0 0 0 0100-0200 0 0 0 0 0 0200-0300 0 0 0 0 0 0300-0400 0 0 0 0 0 0400-0500 0 0 0 0 0 0500-0600 0 0 0 0 0 0600-0700 0 0 0 0 0 0700-0800 0 0 0 0 0 0800-0900 0 0 0 0 0 0900-1000 0 0 0 0 0 1000-1100 0 0 0 0 0 1100-1200 0 0 0 0 0 1200-1300 0 0 0 0 0 1300-1400 0 0 0 0 0 1400-1500 0 0 0 0 0 1500-1600 15311 15306 0 168 0 1600-1700 6575 6577 0 66 0 1700-1800 0 0 0 0 0 1800-1900 0 0 0 0 0 1900-2000 0 0 0 0 0 2000-2100 0 0 0 0 0 2100-2200 0 0 0 0 0 2200-2300 0 0 0 0 0 2300-2400 0 0 0 0 0 Host/Domain Summary: Message Delivery sent cnt bytes defers avg dly max dly host/domain 21521 30353k 0 3.4 m 15.5 m wp.pl 355 919k 0 54.9 s 13.0 m mysenderdomainexample.pl 7 8477 0 1.7 s 1.9 s prokonto.pl Host/Domain Summary: Messages Received msg cnt bytes host/domain 21879 30786k mysenderdomainexample.pl 5 16196 mx4.wp.pl 1 3200 mx3.wp.pl Senders by message count 21783 [email protected] 96 [email protected] 6 from=< **So, my question is: 1) Why i have recived and delivered have the same values (approx)? 2) How can I check if an email has been delivered? 3) How to change default "root" and "www-data" user (FROM / RETURN PATH) to another? I have changed this in script, but postfix ignore scripting values and send every mail from root (we have .php send cron's in /etc/crontab) 4) WHY APPROX 100 % MAILS RECIVED HAS BEEN ADRESED TO MY SENDER HOST? Host/Domain Summary: Messages Received Waiting for respond, Regards TB**

Read the article
Suggestions for cleaning up the mess after removing the "system tool" virus?

- by Ross

Hi! Last night I got infected with the "System Tool" virus. For those who don't know it disallows the user from executing any software, changes the desktop, stops all security software from running, and continually requests that you buy a Trojan security software. It took me a few hours but I finally managed to remove the software. To do this I went into my Ubuntu partition and searched out files that had been created around the time that I got infected and deleted the executable. Then I went back into my W7 partition and ran an MBAM full scan, an MSE full scan, an AVG bootable USB scan, and ran a ClamAV scan from my Ubuntu partition (Together these found 3 more infected executables). I also ran a Ccleaner full sweep and the registry cleaner just in case. I think I have found all of the problems but am still concerned that there might be a payload leftover from the virus that I didn't find. Do you have any suggestions of what else I can do to be sure. Just FYI I use W7 64 bit and MSE as my primary antivirus. I was using chrome when I got infected and it seems that it was due to a slightly out of date Java installation (MSE gave me a warning that the website had used a Java exploit and then my desktop changed to the classic "System Tools" desktop) Thank you very much for your help.

Read the article
Unix sort 10x slower with keys specified

- by KenFar

My data: It's a 71 MB file with 1.5 million rows. It has 6 fields, four of which are strings of avg. 15 characters, two are integers. Three of the fields are sometimes empty. All six fields combine to form a unique key - and that's what I need to sort on. Sort statement: sort -t ',' -k1,1 -k2,2 -k3,3 -k4,4 -k5,5 -k6,6 -o a_out.csv a_in.csv The problem: If I sort without keys, it takes 30 seconds. If I sort with keys, it takes 660 seconds. I need to sort with keys to keep this generic and useful for other files that have non-key fields as well. The 30 second timing is fine, but the 660 is a killer. I could theoretically move the temp directory to SSD, and/or split the file into 4 parts, sort them separately (in parallel) then merge the results, etc. But I'm hoping for something simpler since these results are so bad as-is. Any suggestions?

Read the article
One workstation gets slow access to the server, but others are fast

- by Mike Hanson

I've just setup a machine with Windows Server 2008. It hosts various services, like IIS, POP3, SMTP, Music for Squeezeboxes, VNC. All was working well for the first week or so. One day I needed to create a mapped drive on the server, so it could access files on my workstation. Windows indicated that Network Discovery was needed, so I turned it on with the "Home / Office" option (rather than "Public"). This may be coincidence, but since that time I've been having troubles accessing various services from my main workstation (running Windows 7/64): POP3 continued working correctly, but SMTP was delayed or failed entirely. (Telnet took 20 seconds to connect, but Outlook would never send messages.) VNC failed entirely. I reinstalled it on the server, and now it works but feels sluggish. The music web server was extremely delayed and usually failed. I tried reinstalling, and now it takes about 30 seconds to show the page name on the browser tab, and another 30 seconds to display any page contents. Other machines on the local network seem fine, as do machines connected via the Internet. I don't believe I changed anything on my own machine that would cause this. I considered the possibility that my anti-virus was involved, so I uninstalled AVG (commercial version), but that didn't help. I installed Norton 360 after that, and it didn't complain of viruses on my machine, and the delays remained. Because only my machine is affect, I'm tempted to blame it, except that reinstalling software on the server improved the situation, so there is almost certainly something going on with the server too. The firewall has all the necessary ports open, and it works fine for the other workstations (including external machines connected via the Internet), which indicates that it should be OK. Any ideas?

Read the article
what are these weird IP address connections in resource monitor?

- by bill

I decided to check out Resource Monitor (on the 'Performance' tab in Task Manager, Windows 7) and I noticed in the "Network" section that the 'System' image name kept making a bunch (~5 at a time) of connections to random IP addresses, it would show anywhere from 1-500 bytes/sec 'sent'. They would stay connected for 1-2 minutes. -All web browsers are closed So, first thing I did was run a trace from network-tools.com on some of these IP addresses. 8/10 were outside of US and did not resolve to any host name. Of the 10 IP addresses I traced, 2 were in US, 4 showed origins in China, and one each to Algeria, Russia, Pakistan, Korea. (!) So, the next thing I did was turn off my wireless card, watch the connections disappear, then turn the card back on, and within 30 seconds more random connections were created by System, with different IP addresses from the first time. The next thing I did was go open Task Manager, Show Processes From All Users, then I killed just about everything that wasn't (what appeared to be) a windows process. Turned on wi-fi, and again within 30 seconds, random IP addresses connect for ~ 1 min at a time, new ones coming and going. I occasionally use bit torrent on this machine, but there was definitely no process that seemed related to bt running after I went through task manager, and bt wasn't open to begin with. So, any ideas on what these connections might be for? I have been using Ad-Aware Free and AVG Free on this computer for a while now, always up to date..

Read the article
some HTTPS sites getting blocked on one machine in network

- by shadowfoxmi

I have a few computers connected to the internet via a router. I have been having some trouble with this one Windows 7 desktop. I can browse most of the sites without any trouble but some sites where the sign in page switches to a secure connection (https), the page does not load. It's not all of the sites though. I'm able to sign into gmail and a few other services that I know use https . The sites I'm having trouble with; yahoo's sign in page and the one that I have been using to test across different systems, http://iforgot.apple.com (which switchs to https) ;this particular site, i can access from other computers on the network and my phone. I only have windows firewall running and AVG. I even tried to stopping windows firewall but it did not help. Everything was fine last week. All I have installed in the past week is VOIP softwares namely skype, ooVoo and windows live messenger. I'm not sure how to find out what's being blocked and why and how to unblock it? Any suggestions would be greatly appreciated.

Read the article
Copy past speed very slow for a large number of files on Windows [closed]

- by Arno2501

I've run the following test I've created a folder containing 15'000 files of 400 bytes using this batch : @ECHO off SET times=15000 FOR /L %%i IN (1,1,%times%) DO ( fsutil file createnew filename%%i.txt 400 ) then I copy past it on my Windows Computer using this command : robocopy LargeNumberOfFiles\ LargeNumberOfFiles2\ After it has completed I can see that the transfer rate was 915810 Bytes/sec this is less than 1 MB/s. It took me several seconds to copy 7 MBytes Please note that this is very slow. I've tried the same with a folder with a single file of 50 Mbytes and the transfer rate is 1219512195 Bytes/sec. (yeah GB/s) instantaneous. Why copying large number of files take so much time - ressources on a windows filesystem ? Please note that I've tried to do the same on a linux system which runs on the same computer in a virtual machine (vmware player) with ext3 filesystem. I use the cp command and the copy is instantaneous ! Please also note the following : no antivirus I've tested that behaviour on multiple windows computers (always ntfs) i always get comparable results (transfer rate under 1MB/s avg 7-8 seconds to copy 7 MBytes) I've tested on multiple linux ext3 system the copy is always instantaneous for that amount (15000 files of 400 bytes) The question is about understanding what makes windows filesystem so slow to copy large number of files compared to a linux one for instance.

Read the article
Null-free "maps": Is a callback solution slower than tryGet()?

- by David Moles

In comments to "How to implement List, Set, and Map in null free design?", Steven Sudit and I got into a discussion about using a callback, with handlers for "found" and "not found" situations, vs. a tryGet() method, taking an out parameter and returning a boolean indicating whether the out parameter had been populated. Steven maintained that the callback approach was more complex and almost certain to be slower; I maintained that the complexity was no greater and the performance at worst the same. But code speaks louder than words, so I thought I'd implement both and see what I got. The original question was fairly theoretical with regard to language ("And for argument sake, let's say this language don't even have null") -- I've used Java here because that's what I've got handy. Java doesn't have out parameters, but it doesn't have first-class functions either, so style-wise, it should suck equally for both approaches. (Digression: As far as complexity goes: I like the callback design because it inherently forces the user of the API to handle both cases, whereas the tryGet() design requires callers to perform their own boilerplate conditional check, which they could forget or get wrong. But having now implemented both, I can see why the tryGet() design looks simpler, at least in the short term.) First, the callback example: class CallbackMap<K, V> { private final Map<K, V> backingMap; public CallbackMap(Map<K, V> backingMap) { this.backingMap = backingMap; } void lookup(K key, Callback<K, V> handler) { V val = backingMap.get(key); if (val == null) { handler.handleMissing(key); } else { handler.handleFound(key, val); } } } interface Callback<K, V> { void handleFound(K key, V value); void handleMissing(K key); } class CallbackExample { private final Map<String, String> map; private final List<String> found; private final List<String> missing; private Callback<String, String> handler; public CallbackExample(Map<String, String> map) { this.map = map; found = new ArrayList<String>(map.size()); missing = new ArrayList<String>(map.size()); handler = new Callback<String, String>() { public void handleFound(String key, String value) { found.add(key + ": " + value); } public void handleMissing(String key) { missing.add(key); } }; } void test() { CallbackMap<String, String> cbMap = new CallbackMap<String, String>(map); for (int i = 0, count = map.size(); i < count; i++) { String key = "key" + i; cbMap.lookup(key, handler); } System.out.println(found.size() + " found"); System.out.println(missing.size() + " missing"); } } Now, the tryGet() example -- as best I understand the pattern (and I might well be wrong): class TryGetMap<K, V> { private final Map<K, V> backingMap; public TryGetMap(Map<K, V> backingMap) { this.backingMap = backingMap; } boolean tryGet(K key, OutParameter<V> valueParam) { V val = backingMap.get(key); if (val == null) { return false; } valueParam.value = val; return true; } } class OutParameter<V> { V value; } class TryGetExample { private final Map<String, String> map; private final List<String> found; private final List<String> missing; public TryGetExample(Map<String, String> map) { this.map = map; found = new ArrayList<String>(map.size()); missing = new ArrayList<String>(map.size()); } void test() { TryGetMap<String, String> tgMap = new TryGetMap<String, String>(map); for (int i = 0, count = map.size(); i < count; i++) { String key = "key" + i; OutParameter<String> out = new OutParameter<String>(); if (tgMap.tryGet(key, out)) { found.add(key + ": " + out.value); } else { missing.add(key); } } System.out.println(found.size() + " found"); System.out.println(missing.size() + " missing"); } } And finally, the performance test code: public static void main(String[] args) { int size = 200000; Map<String, String> map = new HashMap<String, String>(); for (int i = 0; i < size; i++) { String val = (i % 5 == 0) ? null : "value" + i; map.put("key" + i, val); } long totalCallback = 0; long totalTryGet = 0; int iterations = 20; for (int i = 0; i < iterations; i++) { { TryGetExample tryGet = new TryGetExample(map); long tryGetStart = System.currentTimeMillis(); tryGet.test(); totalTryGet += (System.currentTimeMillis() - tryGetStart); } System.gc(); { CallbackExample callback = new CallbackExample(map); long callbackStart = System.currentTimeMillis(); callback.test(); totalCallback += (System.currentTimeMillis() - callbackStart); } System.gc(); } System.out.println("Avg. callback: " + (totalCallback / iterations)); System.out.println("Avg. tryGet(): " + (totalTryGet / iterations)); } On my first attempt, I got 50% worse performance for callback than for tryGet(), which really surprised me. But, on a hunch, I added some garbage collection, and the performance penalty vanished. This fits with my instinct, which is that we're basically talking about taking the same number of method calls, conditional checks, etc. and rearranging them. But then, I wrote the code, so I might well have written a suboptimal or subconsicously penalized tryGet() implementation. Thoughts?

Read the article
ProgrammingError when aggregating over an annotated & grouped Django ORM query

- by ento

I'm trying to construct a query to get the "average, maximum, minimum number of items purchased by a single user". The data source is this simple sales record table: class SalesRecord(models.Model): id = models.IntegerField(primary_key=True) user_id = models.IntegerField() product_code = models.CharField() price = models.IntegerField() created_at = models.DateTimeField() A new record is inserted into this table for every item purchased by a user. Here's my attempt at building the query: q = SalesRecord.objects.all() q = q.values('user_id').annotate( # group by user and count the # of records count=Count('id'), # (= # of items) ).order_by() result = q.aggregate(Max('count'), Min('count'), Avg('count')) When I try to execute the code, a ProgrammingError is raised at the last line: (1064, "You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'FROM (SELECT sales_records.user_id AS user_id, COUNT(sales_records.`' at line 1") Django's error screen shows that the SQL is SELECT FROM (SELECT `sales_records`.`player_id` AS `player_id`, COUNT(`sales_records`.`id`) AS `count` FROM `sales_records` WHERE (`sales_records`.`created_at` >= %s AND `sales_records`.`created_at` <= %s ) GROUP BY `sales_records`.`player_id` ORDER BY NULL) subquery It's not selecting anything! Can someone please show me the right way to do this? Hacking Django I've found that clearing the cache of selected fields in django.db.models.sql.BaseQuery.get_aggregation() seems to solve the problem. Though I'm not really sure this is a fix or a workaround. @@ -327,10 +327,13 @@ # Remove any aggregates marked for reduction from the subquery # and move them to the outer AggregateQuery. + self._aggregate_select_cache = None + self.aggregate_select_mask = None for alias, aggregate in self.aggregate_select.items(): if aggregate.is_summary: query.aggregate_select[alias] = aggregate - del obj.aggregate_select[alias] + if alias in obj.aggregate_select: + del obj.aggregate_select[alias] ... yields result: {'count__max': 267, 'count__avg': 26.2563, 'count__min': 1}

Read the article
Prevent full table scan for query with multiple where clauses

- by Dave Jarvis

A while ago I posted a message about optimizing a query in MySQL. I have since ported the data and query to PostgreSQL, but now PostgreSQL has the same problem. The solution in MySQL was to force the optimizer to not optimize using STRAIGHT_JOIN. PostgreSQL offers no such option. Here is the explain: Here is the query: SELECT avg(d.amount) AS amount, y.year FROM station s, station_district sd, year_ref y, month_ref m, daily d LEFT JOIN city c ON c.id = 10663 WHERE -- Find all the stations within a specific unit radius ... -- 6371.009 * SQRT( POW(RADIANS(c.latitude_decimal - s.latitude_decimal), 2) + (COS(RADIANS(c.latitude_decimal + s.latitude_decimal) / 2) * POW(RADIANS(c.longitude_decimal - s.longitude_decimal), 2)) ) <= 50 AND -- Ignore stations outside the given elevations -- s.elevation BETWEEN 0 AND 2000 AND sd.id = s.station_district_id AND -- Gather all known years for that station ... -- y.station_district_id = sd.id AND -- The data before 1900 is shaky; insufficient after 2009. -- y.year BETWEEN 1980 AND 2000 AND -- Filtered by all known months ... -- m.year_ref_id = y.id AND m.month = 12 AND -- Whittled down by category ... -- m.category_id = '001' AND -- Into the valid daily climate data. -- m.id = d.month_ref_id AND d.daily_flag_id <> 'M' GROUP BY y.year It appears as though PostgreSQL is looking at the DAILY table first, which is simply not the right way to go about this query as there are nearly 300 million rows. How do I force PostgreSQL to start at the CITY table? Thank you!

Read the article
Delphi: Why does IdHTTP.ConnectTimeout make requests slower?

- by K.Sandell

I discovered that when setting the ConnectTimeoout property for a TIdHTTP component, it makes the requests (GET and POST) become about 120ms slower? Why is this, and can I avoid/bypass this somehow? Env: D2010 with shipped Indy components, all updates installed for D2010. OS is WinXP (32bit) SP3 with most patches... My timing routine is: Procedure DoGet; Var Freq,T1,T2 : Int64; Cli : TIdHTTP; S : String; begin QueryPerformanceFrequency(Freq); Try QueryPerformanceCounter(T1); Cli := TIdHTTP.Create( NIL ); Cli.ConnectTimeout := 1000; // without this we get < 15ms!! S := Cli.Get('http://127.0.0.1/empty_page.php'); Finally FreeAndNil(Cli); QueryPerformanceCounter(T2); End; Memo1.Lines.Add('Time = '+FormatFloat('0.000',(T2-T1)/Freq) ); End; With the ConnectTimeout set in code I get avg. times of 130-140ms, without it's about 5-15ms ...

Read the article
MySQL table data transformation -- how can I dis-aggregate MySQL time data?

- by lighthouse65

We are coding for a MySQL data warehousing application that stores descriptive data (User ID, Work ID, Machine ID, Start and End Time columns in the first table below) associated with time and production quantity data (Output and Time columns in the first table below) upon which aggregate (SUM, COUNT, AVG) functions are applied. We now wish to dis-aggregate time data for another type of analysis. Our current data table design: +---------+---------+------------+---------------------+---------------------+--------+------+ | User ID | Work ID | Machine ID | Event Start Time | Event End Time | Output | Time | +---------+---------+------------+---------------------+---------------------+--------+------+ | 080025 | ABC123 | M01 | 2008-01-24 16:19:15 | 2008-01-24 16:34:45 | 2120 | 930 | +---------+---------+------------+---------------------+---------------------+--------+------+ Reprocessing dis-aggregation that we would like to do would be to transform table content based on a granularity of minutes, rather than the current production event ("Event Start Time" and "Event End Time") granularity. The resulting reprocessing of existing table rows would look like: +---------+---------+------------+---------------------+--------+ | User ID | Work ID | Machine ID | Production Minute | Output | +---------+---------+------------+---------------------+--------+ | 080025 | ABC123 | M01 | 2010-01-24 16:19 | 133 | | 080025 | ABC123 | M01 | 2010-01-24 16:20 | 133 | | 080025 | ABC123 | M01 | 2010-01-24 16:21 | 133 | | 080025 | ABC123 | M01 | 2010-01-24 16:22 | 133 | | 080025 | ABC123 | M01 | 2010-01-24 16:23 | 133 | | 080025 | ABC123 | M01 | 2010-01-24 16:24 | 133 | | 080025 | ABC123 | M01 | 2010-01-24 16:25 | 133 | | 080025 | ABC123 | M01 | 2010-01-24 16:26 | 133 | | 080025 | ABC123 | M01 | 2010-01-24 16:27 | 133 | | 080025 | ABC123 | M01 | 2010-01-24 16:28 | 133 | | 080025 | ABC123 | M01 | 2010-01-24 16:29 | 133 | | 080025 | ABC123 | M01 | 2010-01-24 16:30 | 133 | | 080025 | ABC123 | M01 | 2010-01-24 16:31 | 133 | | 080025 | ABC123 | M01 | 2010-01-24 16:22 | 133 | | 080025 | ABC123 | M01 | 2010-01-24 16:33 | 133 | | 080025 | ABC123 | M01 | 2010-01-24 16:34 | 133 | +---------+---------+------------+---------------------+--------+ So the reprocessing would take an existing row of data created at the granularity of production event and modify the granularity to minutes, eliminating redundant (Event End Time, Time) columns while doing so. It assumes a constant rate of production and divides output by the difference in minutes plus one to populate the new table's Output column. I know this can be done in code...but can it be done entirely in a MySQL insert statement (or otherwise entirely in MySQL)? I am thinking of a INSERT ... INTO construction but keep getting stuck. An additional complexity is that there are hundreds of machines to include in the operation so there will be multiple rows (one for each machine) for each minute of the day. Any ideas would be much appreciated. Thanks.

Read the article
Encoding h.264 with libavcodec/x264

- by Leviathan

I am attempting to encode video using libavcodec/libavformat. I'm trying to change the standard output-example.c from ffmpeg source. The AVI file is created on the disk, but the only sound is encoded. I tried adding a lot of options for x264 from here. All the other codecs works fine, mpeg2, mpeg4, mjpeg, xvid. In addition to specifying the parameters x264, I also set the codec to AVOutputFormat structure. That's all I've done. AVOutputFormat *pOutFormat; // in header file av_register_all(); AVCodec *codec = avcodec_find_encoder_by_name("libx264"); pOutFormat = guess_format("avi", NULL, NULL); pOutFormat->video_codec = codec->id; The debug output of my application: Output #0, mp4, to 'D:\1.avi': Stream #0.0: Video: libx264, yuv420p, 320x240, q=10-51, 500 kb/s, 90k tbn, 25 tbc Stream #0.1: Audio: aac, 44100 Hz, 1 channels, s16, 128 kb/s [libx264 @ 0x694010]using cpu capabilities: MMX2 SSE2Fast SSSE3 FastShuffle SSE4.2 [libx264 @ 0x694010]bitrate tolerance too small, using .01 [libx264 @ 0x694010]profile Main, level 2.0 [libx264 @ 0x694010]frame I:150 Avg QP:14.76 size: 2534 [libx264 @ 0x694010]mb I I16..4: 75.9% 0.0% 24.1% [libx264 @ 0x694010]final ratefactor: 17.57 [libx264 @ 0x694010]coded y,uvDC,uvAC intra: 42.7% 92.4% 47.4% [libx264 @ 0x694010]i16 v,h,dc,p: 11% 14% 2% 73% [libx264 @ 0x694010]i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 21% 18% 29% 5% 8% 10% 3% 3% 2% [libx264 @ 0x694010]kb/s:506.79

Read the article
Get percentiles of data-set with group by month

- by Cylindric

Hello, I have a SQL table with a whole load of records that look like this: | Date | Score | + -----------+-------+ | 01/01/2010 | 4 | | 02/01/2010 | 6 | | 03/01/2010 | 10 | ... | 16/03/2010 | 2 | I'm plotting this on a chart, so I get a nice line across the graph indicating score-over-time. Lovely. Now, what I need to do is include the average score on the chart, so we can see how that changes over time, so I can simply add this to the mix: SELECT YEAR(SCOREDATE) 'Year', MONTH(SCOREDATE) 'Month', MIN(SCORE) MinScore, AVG(SCORE) AverageScore, MAX(SCORE) MaxScore FROM SCORES GROUP BY YEAR(SCOREDATE), MONTH(SCOREDATE) ORDER BY YEAR(SCOREDATE), MONTH(SCOREDATE) That's no problem so far. The problem is, how can I easily calculate the percentiles at each time-period? I'm not sure that's the correct phrase. What I need in total is: A line on the chart for the score (easy) A line on the chart for the average (easy) A line on the chart showing the band that 95% of the scores occupy (stumped) It's the third one that I don't get. I need to calculate the 5% percentile figures, which I can do singly: SELECT MAX(SubQ.SCORE) FROM (SELECT TOP 45 PERCENT SCORE FROM SCORES WHERE YEAR(SCOREDATE) = 2010 AND MONTH(SCOREDATE) = 1 ORDER BY SCORE ASC) AS SubQ SELECT MIN(SubQ.SCORE) FROM (SELECT TOP 45 PERCENT SCORE FROM SCORES WHERE YEAR(SCOREDATE) = 2010 AND MONTH(SCOREDATE) = 1 ORDER BY SCORE DESC) AS SubQ But I can't work out how to get a table of all the months. | Date | Average | 45% | 55% | + -----------+---------+-----+-----+ | 01/01/2010 | 13 | 11 | 15 | | 02/01/2010 | 10 | 8 | 12 | | 03/01/2010 | 5 | 4 | 10 | ... | 16/03/2010 | 7 | 7 | 9 | At the moment I'm going to have to load this lot up into my app, and calculate the figures myself. Or run a larger number of individual queries and collate the results.

Read the article
How to setup Lucene search for a B2B web app?

- by Bill Paetzke

Given: 5000 databases (spread out over 5 servers) 1 database per client (so you can infer there are 1000 clients) 2 to 2000 users per client (let's say avg is 100 users per client) Clients (databases) come and go every day (let's assume most remain for at least one year) Let's stay agnostic of language or sql brand, since Lucene (and Solr) have a breadth of support The Question: How would you setup Lucene search so that each client can only search within its database? How would you setup the index(es)? Would you need to add a filter to all search queries? If a client cancelled, how would you delete their (part of the) index? (this may be trivial--not sure yet) Possible Solutions: Make an index for each client (database) Pro: Search is faster (than one-index-for-all method). Indices are relative to the size of the client's data. Con: I'm not sure what this entails, nor do I know if this is beyond Lucene's scope. Have a single, gigantic index with a database_name field. Always include database_name as a filter. Pro: Not sure. Maybe good for tech support or billing dept to search all databases for info. Con: Search is slower (than index-per-client method). Flawed security if query filter removed. For Example: Joel Spolsky said in Podcast #11 that his hosted web app product, FogBugz On-Demand, uses Lucene. He has thousands of on-demand clients. And each client gets their own database. His situation is quite similar to mine. Although, he didn't elaborate on the setup (particularly indices); hence, the need for this question. One last thing: I would also accept an answer that uses Solr (the extension of Lucene). Perhaps it's better suited for this problem. Not sure.

Read the article
mySQL Efficiency Issue - How to find the right balance of normalization...?

- by Foo

I'm fairly new to working with relational databases, but have read a few books and know the basics of good design. I'm facing a design decision, and I'm not sure how to continue. Here's a very over simplified version of what I'm building: People can rate photos 1-5, and I need to display the average votes on the picture while keeping track of the individual votes. For example, 12 people voted 1, 7 people voted 2, etc. etc. The normalization freak of me initially designed the table structure like this: Table pictures id* | picture | userID | Table ratings id* | pictureID | userID | rating With all the foreign key constraints and everything set as they shoudl be. Every time someone rates a picture, I just insert a new record into ratings and be done with it. To find the average rating of a picture, I'd just run something like this: SELECT AVG(rating) FROM ratings WHERE pictureID = '5' GROUP by pictureID Having it setup this way lets me run my fancy statistics to. I can easily find who rated a certain picture a 3, and what not. Now I'm thinking if there's a crapload of ratings (which is very possible in what I'm really designing), finding the average will became very expensive and painful. Using a non-normalized version would seem to be more efficient. e.g.: Table picture id | picture | userID | ratingOne | ratingTwo | ratingThree | ratingFour | ratingFive To calculate the average, I'd just have to select a single row. It seems so much more efficient, but so much more uglier. Can someone point me in the right direction of what to do? My initial research shows that I have to "find the right balance", but how do I go about finding that balance? Any articles or additional reading information would be appreciated as well. Thanks.

Read the article
MySQL table data transformation -- how can I dis-aggreate MySQL time data?

- by lighthouse65

We are coding for a MySQL data warehousing application that stores descriptive data (User ID, Work ID, Machine ID, Start and End Time columns in the first table below) associated with time and production quantity data (Output and Time columns in the first table below) upon which aggregate (SUM, COUNT, AVG) functions are applied. We now wish to dis-aggregate time data for another type of analysis. Our current data table design: +---------+---------+------------+---------------------+---------------------+--------+------+ | User ID | Work ID | Machine ID | Event Start Time | Event End Time | Output | Time | +---------+---------+------------+---------------------+---------------------+--------+------+ | 080025 | ABC123 | M01 | 2008-01-24 16:19:15 | 2008-01-24 16:34:45 | 2120 | 930 | +---------+---------+------------+---------------------+---------------------+--------+------+ Reprocessing dis-aggregation that we would like to do would be to transform table content based on a granularity of minutes, rather than the current production event ("Event Start Time" and "Event End Time") granularity. The resulting reprocessing of existing table rows would look like: +---------+---------+------------+---------------------+--------+ | User ID | Work ID | Machine ID | Production Minute | Output | +---------+---------+------------+---------------------+--------+ | 080025 | ABC123 | M01 | 2010-01-24 16:19 | 133 | | 080025 | ABC123 | M01 | 2010-01-24 16:20 | 133 | | 080025 | ABC123 | M01 | 2010-01-24 16:21 | 133 | | 080025 | ABC123 | M01 | 2010-01-24 16:22 | 133 | | 080025 | ABC123 | M01 | 2010-01-24 16:23 | 133 | | 080025 | ABC123 | M01 | 2010-01-24 16:24 | 133 | | 080025 | ABC123 | M01 | 2010-01-24 16:25 | 133 | | 080025 | ABC123 | M01 | 2010-01-24 16:26 | 133 | | 080025 | ABC123 | M01 | 2010-01-24 16:27 | 133 | | 080025 | ABC123 | M01 | 2010-01-24 16:28 | 133 | | 080025 | ABC123 | M01 | 2010-01-24 16:29 | 133 | | 080025 | ABC123 | M01 | 2010-01-24 16:30 | 133 | | 080025 | ABC123 | M01 | 2010-01-24 16:31 | 133 | | 080025 | ABC123 | M01 | 2010-01-24 16:22 | 133 | | 080025 | ABC123 | M01 | 2010-01-24 16:33 | 133 | | 080025 | ABC123 | M01 | 2010-01-24 16:34 | 133 | +---------+---------+------------+---------------------+--------+ So the reprocessing would take an existing row of data created at the granularity of production event and modify the granularity to minutes, eliminating redundant (Event End Time, Time) columns while doing so. It assumes a constant rate of production and divides output by the difference in minutes plus one to populate the new table's Output column. I know this can be done in code...but can it be done entirely in a MySQL insert statement (or otherwise entirely in MySQL)? I am thinking of a INSERT ... INTO construction but keep getting stuck. An additional complexity is that there are hundreds of machines to include in the operation so there will be multiple rows (one for each machine) for each minute of the day. Any ideas would be much appreciated. Thanks.

Read the article
named_scope + average is causing the table to be specified more then once in the sql query run on po

- by hadees

I have a named scopes like so... named_scope :gender, lambda { |gender| { :joins => {:survey_session => :profile }, :conditions => { :survey_sessions => { :profiles => { :gender => gender } } } } } and when I call it everything works fine. I also have this average method I call... Answer.average(:rating, :include => {:survey_session => :profile}, :group => "profiles.career") which also works fine if I call it like that. However if I were to call it like so... Answer.gender('m').average(:rating, :include => {:survey_session => :profile}, :group => "profiles.career") I get... ActiveRecord::StatementInvalid: PGError: ERROR: table name "profiles" specified more than once : SELECT avg("answers".rating) AS avg_rating, profiles.career AS profiles_career FROM "answers" LEFT OUTER JOIN "survey_sessions" survey_sessions_answers ON "survey_sessions_answers".id = "answers".survey_session_id LEFT OUTER JOIN "profiles" ON "profiles".id = "survey_sessions_answers".profile_id INNER JOIN "survey_sessions" ON "survey_sessions".id = "answers".survey_session_id INNER JOIN "profiles" ON "profiles".id = "survey_sessions".profile_id WHERE ("profiles"."gender" = E'm') GROUP BY profiles.career Which is a little hard to read but says I'm including the table profiles twice. If I were to just remove the include from average it works but it isn't really practical because average is actually being called inside a method which gets passed the scoped. So there is some times gender or average might get called with out each other and if either was missing the profile include it wouldn't work. So either I need to know how to fix this apparent bug in Rails or figure out a way to know what scopes were applied to a ActiveRecord::NamedScope::Scope object so that I could check to see if they have been applied and if not add the include for average.

Read the article
Virus on site but can't find where

- by Rob

WARNING! THIS IS ABOUT A VIRUS ON MY SITE. IT APPEARS IT HAS BEEN THERE FOR SOMETIME AND I'VE HAD NO PROBLEMS. BUT PLEASE BE CAREFUL. READ EVERYTHING I SAY AND SEE IF YOU CAN HELP ME WITHOUT VISITING THE LINK. AVG PICKS UP ON IT AND BLOCKS IT, MCAFEE DOES NOT. Sorry about the warning, obviously i'm not here to get anyone infected or anything like that. Basically I run the website sortitoutsi dot net. Ages ago I got a virus on my computer, they got hold of my FTP passwords and added some lines of javascript to the top of my site. I removed them and believe it was fixed. However i'm using the "Web Developer" extension for Firefox and chose to view all javascript on my page and find there are various links to horrible urls such as: gittigidiyor-com.excite.co.jp.webmasterworld-com.eastmusicdirect.ru:8080/aboutus.org/aboutus.org/google.com/skycn.com/torrents.ru.php and gittigidiyor-com.excite.co.jp.webmasterworld-com.eastmusicdirect.ru:8080/index.php?jl= These terms do not appear anywhere. In the source code, in any of the javascript or the css. I also can't see that there are any rogue images that I don't recognise either. So i've no idea where this javascript is coming from. Can anyone suggest how I can find references to these links and remove them? I can see them both in the Web Developer firefox extension and in the net tab using Firebug. Any help would be greatly appreciated

Read the article
MySQL Ratings From Two Tables

- by DirtyBirdNJ

I am using MySQL and PHP to build a data layer for a flash game. Retrieving lists of levels is pretty easy, but I've hit a roadblock in trying to fetch the level's average rating along with it's pointer information. Here is an example data set: levels Table: level_id | level_name 1 | Some Level 2 | Second Level 3 | Third Level ratings Table: rating_id | level_id | rating_value 1 | 1 | 3 2 | 1 | 4 3 | 1 | 1 4 | 2 | 3 5 | 2 | 4 6 | 2 | 1 7 | 3 | 3 8 | 3 | 4 9 | 3 | 1 I know this requires a join, but I cannot figure out how to get the average rating value based on the level_id when I request a list of levels. This is what I'm trying to do: SELECT levels.level_id, AVG(ratings.level_rating WHERE levels.level_id = ratings.level_id) FROM levels I know my SQL is flawed there, but I can't figure out how to get this concept across. The only thing I can get to work is returning a single average from the entire ratings table, which is not very useful. Ideal Output from the above conceptually valid but syntactically awry query would be: level_id | level_rating 1| 3.34 2| 1.00 3| 4.54 My main issue is I can't figure out how to use the level_id of each response row before the query has been returned. It's like I want to use a placeholder... or an alias... I really don't know and it's very frustrating. The solution I have in place now is an EPIC band-aid and will only cause me problems long term... please help!

Read the article
Which database I can used and relationship in it ??

- by mimo-hamad

My projece make me confused which I didn't find clear things that make me understand the required database and the relationships in it So, would a super one help me to solve it ?!! ;D this is required: 1) Model the data stored in the database (Identify the entities, roles, relationships, constraints, etc.) 2) Write the Oracle commands to create the database, find appropriate data, and populate the database 3) Write five different queries on your database, using the SELECT/FROM/WHERE construct provided in SQL. Your five queries should illustrate several different aspects of database querying, such as: a. Queries over more than one relation (by listing more than one relation in the FROM clause) b. Queries involving aggregate functions, such as SUM, COUNT, and AVG c. Queries involving complicated selects and joins d. Queries involving GROUP BY, HAVING or other similar functions. e. Queries that require the use of the DISTINCT keyword. And this the condition that we need to determine it to solve the required Q's above : 5) It is desired to develop an Internet membership club to buy products at special prices online. To join, new members must be referred by another existing member of the club. The system will keep the following information for each member: The member ID, referring member, birth date, member name, address, phone, mobile, credit card type, number and expiration date. The items are always shipped to the member's address noted in the membership application. The shipping fees will differ for each order.For each item to be requested, the member will select an item from a long list of possible items. For each item in the database, we store an item ID, an item name, description, and list price. The list price will be different from the actual sale price. The available quantity and the back-ordered quantity (the back-ordered quantity is the quantity on-order by the club from its suppliers) is also noted

Read the article
Export large amount of data from Oracle 10G to SQL Server 2005

- by uniball

Dear all, I need to export 100 million data rows (avg row length ~ 100 bytes) from Oracle 10G database table into SQL server (over WAN/VLAN with 6MBits/sec capacity) on a regular basis. So far, these are the options that I have tried and a quick summary. Has anyone tried this before? Are there other better options? Which option would be the best in terms of performance and reliability? The time taken has been calculated using tests on smaller amounts of data and then extrapolating it to estimate the time required. Using data import wizard on the SQL server or SSIS packages to import the data. It will take around 150 hours to complete the task. Using Oracle batch job to spool data into a comma-delimited flat-file. Then using SSIS package to FTP this file to the SQL server and then load directly from the flat-file. The issue here is the size of the flat-file which is expected to run in GBs. Although this option is drastically different, I am even considering the option of using Linked Server to query the Oracle data directly at run-time to avoid bringing in data. Performance is a big problem and I have limited control over the Oracle database in terms of creating table indexes. Regards, Uniball

Read the article
Eliminate full table scan due to BETWEEN (and GROUP BY)

- by Dave Jarvis

Description According to the explain command, there is a range that is causing a query to perform a full table scan (160k rows). How do I keep the range condition and reduce the scanning? I expect the culprit to be: Y.YEAR BETWEEN 1900 AND 2009 AND Code Here is the code that has the range condition (the STATION_DISTRICT is likely superfluous). SELECT COUNT(1) as MEASUREMENTS, AVG(D.AMOUNT) as AMOUNT, Y.YEAR as YEAR, MAKEDATE(Y.YEAR,1) as AMOUNT_DATE FROM CITY C, STATION S, STATION_DISTRICT SD, YEAR_REF Y FORCE INDEX(YEAR_IDX), MONTH_REF M, DAILY D WHERE -- For a specific city ... -- C.ID = 10663 AND -- Find all the stations within a specific unit radius ... -- 6371.009 * SQRT( POW(RADIANS(C.LATITUDE_DECIMAL - S.LATITUDE_DECIMAL), 2) + (COS(RADIANS(C.LATITUDE_DECIMAL + S.LATITUDE_DECIMAL) / 2) * POW(RADIANS(C.LONGITUDE_DECIMAL - S.LONGITUDE_DECIMAL), 2)) ) <= 50 AND -- Get the station district identification for the matching station. -- S.STATION_DISTRICT_ID = SD.ID AND -- Gather all known years for that station ... -- Y.STATION_DISTRICT_ID = SD.ID AND -- The data before 1900 is shaky; insufficient after 2009. -- Y.YEAR BETWEEN 1900 AND 2009 AND -- Filtered by all known months ... -- M.YEAR_REF_ID = Y.ID AND -- Whittled down by category ... -- M.CATEGORY_ID = '003' AND -- Into the valid daily climate data. -- M.ID = D.MONTH_REF_ID AND D.DAILY_FLAG_ID <> 'M' GROUP BY Y.YEAR Update The SQL is performing a full table scan, which results in MySQL performing a "copy to tmp table", as shown here: +----+-------------+-------+--------+-----------------------------------+--------------+---------+-------------------------------+--------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+--------+-----------------------------------+--------------+---------+-------------------------------+--------+-------------+ | 1 | SIMPLE | C | const | PRIMARY | PRIMARY | 4 | const | 1 | | | 1 | SIMPLE | Y | range | YEAR_IDX | YEAR_IDX | 4 | NULL | 160422 | Using where | | 1 | SIMPLE | SD | eq_ref | PRIMARY | PRIMARY | 4 | climate.Y.STATION_DISTRICT_ID | 1 | Using index | | 1 | SIMPLE | S | eq_ref | PRIMARY | PRIMARY | 4 | climate.SD.ID | 1 | Using where | | 1 | SIMPLE | M | ref | PRIMARY,YEAR_REF_IDX,CATEGORY_IDX | YEAR_REF_IDX | 8 | climate.Y.ID | 54 | Using where | | 1 | SIMPLE | D | ref | INDEX | INDEX | 8 | climate.M.ID | 11 | Using where | +----+-------------+-------+--------+-----------------------------------+--------------+---------+-------------------------------+--------+-------------+ Related http://dev.mysql.com/doc/refman/5.0/en/how-to-avoid-table-scan.html http://dev.mysql.com/doc/refman/5.0/en/where-optimizations.html http://stackoverflow.com/questions/557425/optimize-sql-that-uses-between-clause Thank you!

Read the article
How to query multiple tables with multiple selects MySQL

- by brybam

I'm trying to write a php function that gets all the basic food data(this part works fine in my code) Then grabs all the ingredients related to the ids of the selected items from the query before. Anyone have have any idea why my second array keeps coming back as false? I should be getting 1 array with the list of foods (this works) Then, the second one should be an array with all the ingredients for all the foods that were previously selected...then in my code im planning to work with the array and sort them with the proper foods based on the ids. function getFood($start, $limit) { $one = mysql_query("SELECT a.id, a.name, a.type, AVG(b.r) AS fra, COUNT(b.id) as tvotes FROM `foods` a LEFT JOIN `foods_ratings` b ON a.id = b.id GROUP BY a.id ORDER BY fra DESC, tvotes DESC LIMIT $start, $limit;"); $row = mysql_fetch_array($one); $qry = ""; foreach ($row as &$value) { $fid = $value['id']; $qry = $qry . "SELECT ing, amount FROM foods_ing WHERE fid='$fid';"; } $two = mysql_query($qry); return array ($one, $two); }

Read the article
Average over a timeframe with missing data

- by BHare

Assuming a table such as: UID Name Datetime Users 4 Room 4 2012-08-03 14:00:00 3 2 Room 2 2012-08-03 14:00:00 3 3 Room 3 2012-08-03 14:00:00 1 1 Room 1 2012-08-03 14:00:00 2 3 Room 3 2012-08-03 14:15:00 1 2 Room 2 2012-08-03 14:15:00 4 1 Room 1 2012-08-03 14:15:00 3 1 Room 1 2012-08-03 14:30:00 6 1 Room 1 2012-08-03 14:45:00 3 2 Room 2 2012-08-03 14:45:00 7 3 Room 3 2012-08-03 14:45:00 8 4 Room 4 2012-08-03 14:45:00 4 I wanted to get the average user count of each room (1,2,3,4) from the time 2PM to 3PM. The problem is that sometimes the room may not "check in" at the 15 minute interval time, so the assumption has to be made that the previous last known user count is still valid. For example the check-in's for 2012-08-03 14:15:00 room 4 never checked in, so it must be assumed that room 4 had 3 users at 2012-08-03 14:15:00 because that is what it had at 2012-08-03 14:00:00 This follows on through so that the average user count I am looking for is as follows: Room 1: (2 + 3 + 6 + 3) / 4 = 3.5 Room 2: (3 + 4 + 4 + 7) / 4 = 4.5 Room 3: (1 + 1 + 1 + 8) / 4 = 2.75 Room 4: (3 + 3 + 3 + 4) / 4 = 3.25 where # is the assumed number based on the previous known check-in. I am wondering if it's possible to so this with SQL alone? if not I am curious of a ingenious PHP solution that isn't just bruteforce math, as such as my quick inaccurate pseudo code: foreach ($rooms_id_array as $room_id) { $SQL = "SELECT * FROM `table` WHERE (`UID` == $room_id && `Datetime` >= 2012-08-03 14:00:00 && `Datetime` <= 2012-08-03 15:00:00)"; $result = query($SQL); if ( count($result) < 4 ) { // go through each date and find what is missing, and then go to previous date and use that instead } else { foreach ($result) $sum += $result; $avg = $sum / 4; } }

Read the article

< Previous Page | 14 15 16 17 18 19 20 21 22 | Next Page >