Search Results

Search found 8262 results on 331 pages for 'optimization algorithm'.

Page 152/331 | < Previous Page | 148 149 150 151 152 153 154 155 156 157 158 159  | Next Page >

  • What's the difference between !col and col=false in MySQL?

    - by Mask
    The two statements have totally different performance: mysql> explain select * from jobs where createIndexed=false; +----+-------------+-------+------+----------------------+----------------------+---------+-------+------+-------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+------+----------------------+----------------------+---------+-------+------+-------+ | 1 | SIMPLE | jobs | ref | i_jobs_createIndexed | i_jobs_createIndexed | 1 | const | 1 | | +----+-------------+-------+------+----------------------+----------------------+---------+-------+------+-------+ 1 row in set (0.01 sec) mysql> explain select * from jobs where !createIndexed; +----+-------------+-------+------+---------------+------+---------+------+-------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+------+---------------+------+---------+------+-------+-------------+ | 1 | SIMPLE | jobs | ALL | NULL | NULL | NULL | NULL | 17996 | Using where | +----+-------------+-------+------+---------------+------+---------+------+-------+-------------+ Column definition and related index for aiding analysis: createIndexed tinyint(1) NOT NULL DEFAULT 0, create index i_jobs_createIndexed on jobs(createIndexed);

    Read the article

  • Is there an offline version of Smush.it available?

    - by Jonathon Watney
    Sometimes I use Smush.it via the YSlow Firefox plugin to non-destructively reduce the file size of JPG images. Is there an offline version available that runs on Windows? And if not is there an alternative? The reason I'd like an offline version is that I'd like to optimize images before I deploy them. Currently Smush.it accepts only public facing URLs for images or a web page (via YSlow) and can't access my internal network. That means I have to deploy, optimize, replace images and deploy again. I'd really like to deploy the optimized images on the first deploy. Update: Here's a very similar question.

    Read the article

  • Benchmarking a particular method in Objective-C

    - by Jasconius
    I have a critical method in an Objective-C application that I need to optimize as much as possible. I first need to take some easy benchmarks on this one single method so I can compare my progress as I optimize. What is the easiest way to track the execution time of a given method in, say, milliseconds, and print that to console.

    Read the article

  • TSQL -- Make it better

    - by user319353
    Hi: -- Very Narrow (all IDs are passed in) IF(@EmpID IS NOT NULL AND @DeptID IS NOT NULL AND @CityID IS NOT NULL) BEGIN SELECT e.EmpName ,d.DeptName ,c.CityName FROM Employee e WITH (NOLOCK) JOIN Department d WITH (NOLOCK) ON e.deptid = d.deptid JOIN City c WITH (NOLOCK) ON e.CityID = c.CityID WHERE e.EmpID = @EmpID END -- Just 2 IDs passed in ELSE IF(@DeptID IS NOT NULL AND @CityID IS NOT NULL) BEGIN SELECT e.EmpName ,d.DeptName ,NULL AS [CityName] FROM Employee e WITH (NOLOCK) JOIN Department d WITH (NOLOCK) ON e.deptid = d.deptid JOIN City c WITH (NOLOCK) ON e.CityID = c.CityID WHERE d.deptID = @DeptID END -- Very Broad (just 1 ID passed in) ELSE IF(@CityID IS NOT NULL) BEGIN SELECT e.EmpName ,NULL AS [DeptName] ,NULL AS [CityName] FROM Employee e WITH (NOLOCK) JOIN Department d WITH (NOLOCK) ON e.deptid = d.deptid JOIN City c WITH (NOLOCK) ON e.CityID = c.CityID WHERE c.CityID = @CityID END -- None (Nothing passed in) ELSE BEGIN SELECT NULL AS [EmpName] ,NULL AS [DeptName] ,NULL AS [CityName] END Question: Is there any better way (OR specifically can I do anything without IF...ELSE condition?

    Read the article

  • Why is shrink_to_fit non-binding?

    - by Roger Pate
    The C++0x FCD states in 23.3.6.2 vector capacity: void shrink_to_fit(); Remarks: shrink_to_fit is a non-binding request to reduce capacity() to size(). [Note: The request is non-binding to allow latitude for implementation-specific optimizations. —end note] Why is it non-binding, and what optimizations are intended to be allowed?

    Read the article

  • Does visibility affect DOM manipulation performance?

    - by Chetan Sastry
    IE7/Windows XP I have a third party component in my page that does a lot of DOM manipulation to adjust itself each time the browser window is resized. Unfortunately I have little control of what it does internally and I have optimized everything else (such as callbacks and event handlers) as much as I can. I can't take the component off the flow by setting display:none because it fails measuring itself if I do so. In general, does setting visibility of the container to invisible during the resize help improve DOM rendering performance?

    Read the article

  • very quickly getting total size of folder

    - by freakazo
    I want to quickly find the total size of any folder using python. def GetFolderSize(path): TotalSize = 0 for item in os.walk(path): for file in item[2]: try: TotalSize = TotalSize + getsize(join(item[0], file)) except: print("error with file: " + join(item[0], file)) return TotalSize That's the simple script I wrote to get the total size of the folder, it took around 60 seconds (+-5 seconds). By using multiprocessing I got it down to 23 seconds on a quad core machine. Using the Windows file explorer it takes only ~3 seconds (Right click- properties to see for yourself). So is there a faster way of finding the total size of a folder close to the speed that windows can do it? Windows 7, python 2.6 (Did searches but most of the time people used a very similar method to my own) Thanks in advance.

    Read the article

  • Optimizing Oracle query

    - by Omnipresent
    SELECT MAX(verification_id) FROM VERIFICATION_TABLE WHERE head = 687422 AND mbr = 23102 AND RTRIM(LTRIM(lname)) = '.iq bzw' AND TO_CHAR(dob,'MM/DD/YYYY')= '08/10/2004' AND system_code = 'M'; This query is taking 153 seconds to run. there are millions of rows in VERIFICATION_TABLE. I think query is taking long because of the functions in where clause. However, I need to do ltrim rtrim on the columns and also date has to be matched in MM/DD/YYYY format. How can I optimize this query?

    Read the article

  • performance issue in a select query from a single table

    - by daedlus
    Hi , I have a table as below dbo.UserLogs ------------------------------------- Id | UserId |Date | Name| P1 | Dirty ------------------------------------- There can be several records per userId[even in millions] I have clustered index on Date column and query this table very frequently in time ranges. The column 'Dirty' is non-nullable and can take either 0 or 1 only so I have no indexes on 'Dirty' I have several millions of records in this table and in one particular case in my application i need to query this table to get all UserId that have at least one record that is marked dirty. I tried this query - select distinct(UserId) from UserLogs where Dirty=1 I have 10 million records in total and this takes like 10min to run and i want this to run much faster than this. [i am able to query this table on date column in less than a minute.] Any comments/suggestion are welcome. my env 64bit,sybase15.0.3,Linux

    Read the article

  • Automatically find compiler options for fastest exe on given machine?

    - by dehmann
    Is there a method to automatically find the best compiler options (on a given machine), which result in the fastest possible executable? Naturally, I use g++ -O3, but there are additional flags that may make the code run faster, e.g. -ffast-math and others, some of which are hardware-dependent. Does anyone know some code I can put in my configure.ac file (GNU autotools), so that the flags will be added to the Makefile automatically by the ./configure command? In addition to automatically determining the best flags, I would be interested in some useful compiler flags that are good to use as a default for most optimized executables.

    Read the article

  • How to optimize this mysql query - explain output included

    - by Sandeepan Nath
    This is the query (a search query basically, based on tags):- select SUM(DISTINCT(ttagrels.id_tag in (2105,2120,2151,2026,2046) )) as key_1_total_matches, td.*, u.* from Tutors_Tag_Relations AS ttagrels Join Tutor_Details AS td ON td.id_tutor = ttagrels.id_tutor JOIN Users as u on u.id_user = td.id_user where (ttagrels.id_tag in (2105,2120,2151,2026,2046)) group by td.id_tutor HAVING key_1_total_matches = 1 And following is the database dump needed to execute this query:- CREATE TABLE IF NOT EXISTS `Users` ( `id_user` int(10) unsigned NOT NULL auto_increment, `id_group` int(11) NOT NULL default '0', PRIMARY KEY (`id_user`), KEY `Users_FKIndex1` (`id_group`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=730 ; INSERT INTO `Users` (`id_user`, `id_group`) VALUES (303, 1); CREATE TABLE IF NOT EXISTS `Tutor_Details` ( `id_tutor` int(10) unsigned NOT NULL auto_increment, `id_user` int(10) NOT NULL default '0', PRIMARY KEY (`id_tutor`), KEY `Users_FKIndex1` (`id_user`), KEY `id_user` (`id_user`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=58 ; INSERT INTO `Tutor_Details` (`id_tutor`, `id_user`) VALUES (26, 303); CREATE TABLE IF NOT EXISTS `Tags` ( `id_tag` int(10) unsigned NOT NULL auto_increment, `tag` varchar(255) default NULL, PRIMARY KEY (`id_tag`), UNIQUE KEY `tag` (`tag`), KEY `id_tag` (`id_tag`), KEY `tag_2` (`tag`), KEY `tag_3` (`tag`), KEY `tag_4` (`tag`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=2957 ; INSERT INTO `Tags` (`id_tag`, `tag`) VALUES (2026, 'Brendan.\nIn'), (2046, 'Brendan.'), (2105, 'Brendan'), (2120, 'Brendan''s'), (2151, 'Brendan)'); CREATE TABLE IF NOT EXISTS `Tutors_Tag_Relations` ( `id_tag` int(10) unsigned NOT NULL default '0', `id_tutor` int(10) unsigned default NULL, `tutor_field` varchar(255) default NULL, `cdate` timestamp NOT NULL default CURRENT_TIMESTAMP, `udate` timestamp NULL default NULL, KEY `Tutors_Tag_Relations` (`id_tag`), KEY `id_tutor` (`id_tutor`), KEY `id_tag` (`id_tag`), KEY `id_tutor_2` (`id_tutor`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1; INSERT INTO `Tutors_Tag_Relations` (`id_tag`, `id_tutor`, `tutor_field`, `cdate`, `udate`) VALUES (2105, 26, 'firstname', '2010-06-17 17:08:45', NULL); ALTER TABLE `Tutors_Tag_Relations` ADD CONSTRAINT `Tutors_Tag_Relations_ibfk_2` FOREIGN KEY (`id_tutor`) REFERENCES `Tutor_Details` (`id_tutor`) ON DELETE NO ACTION ON UPDATE NO ACTION, ADD CONSTRAINT `Tutors_Tag_Relations_ibfk_1` FOREIGN KEY (`id_tag`) REFERENCES `Tags` (`id_tag`) ON DELETE NO ACTION ON UPDATE NO ACTION; What the query does? This query actually searches tutors which contain "Brendan"(as their name or biography or something). The id_tags 2105,2120,2151,2026,2046 are nothing but the tags which are LIKE "%Brendan%". My question is :- 1.In the explain of this query, the reference column shows NULL for ttagrels, but there are possible keys (Tutors_Tag_Relations,id_tutor,id_tag,id_tutor_2). So, why is no key being taken. How to make the query take references. Is it possible at all? 2. The other two tables td and u are using references. Any indexing needed in those? I think not. Check the explain query output here http://www.test.examvillage.com/explain.png

    Read the article

  • How can I encode four unsigned bytes (0-255) to a float and back again using HLSL?

    - by Statement
    Hello! I am facing a task where one of my hlsl shaders require multiple texture lookups per pixel. My 2d textures are fixed to 256*256, so two bytes should be sufficient to address any given texel given this constraint. My idea is then to put two xy-coordinates in each float, giving me eight xy-coordinates in pixel space when packed in a Vector4 format image. These eight coordinates are then used to sample another texture(s). The reason for doing this is to save graphics memory and an attempt to optimize processing time, since then I don't require multiple texture lookups. By the way: Does anyone know if encoding/decoding 16 bytes from/to 4 floats using 1 sampling is slower than 4 samplings with unencoded data?

    Read the article

  • Is using iframes to improve page performance an acceptable approach?

    - by Denis Hoctor
    Hi all, I have a complex page that has several user controls like galleries, maps, ads etc. I've tried optimising them by ensuring full separation of html/css/js, placing js at the bottom of the page and trying to ensure I have well written code in all 3 but alas I still have a slow page. It's not really noticeable to a modern browser but can see the stats and IE6/7. So I'm now looking to do what we've done previously for Adtech flash crap - an iframe. Apart from the SEO impact which I'm not worried about in the case of these controls, what do people think of this as an approach? PROS and CONS please. Thanks, Denis

    Read the article

  • Speed of CSS

    - by Ólafur Waage
    This is just a question to help me understand CSS rendering better. Lets say we have a million lines of this. <div class="first"> <div class="second"> <span class="third">Hello World</span> </div> </div> Which would be the fastest way to change the font of Hello World to red? .third { color: red; } div.third { color: red; } div.second div.third { color: red; } div.first div.second div.third { color: red; } Also, what if there was at tag in the middle that had a unique id of "foo". Which one of the CSS methods above would be the fastest. I know why these methods are used etc, im just trying to grasp better the rendering technique of the browsers and i have no idea how to make a test that times it. UPDATE: Nice answer Gumbo. From the looks of it then it would be quicker in a regular site to do the full definition of a tag. Since it finds the parents and narrows the search for every parent found. That could be bad in the sense you'd have a pretty large CSS file though.

    Read the article

  • How do we greatly optimize our MySQL database (or replace it) when using joins?

    - by jkaz
    Hi there, This is the first time I'm approaching an extremely high-volume situation. This is an ad server based on MySQL. However, the query that is used incorporates a lot of JOINs and is generally just slow. (This is Rails ActiveRecord, btw) sel = Ads.find(:all, :select = '*', :joins = "JOIN campaigns ON ads.campaign_id = campaigns.id JOIN users ON campaigns.user_id = users.id LEFT JOIN countries ON countries.campaign_id = campaigns.id LEFT JOIN keywords ON keywords.campaign_id = campaigns.id", :conditions = [flashstr + "keywords.word = ? AND ads.format = ? AND campaigns.cenabled = 1 AND (countries.country IS NULL OR countries.country = ?) AND ads.enabled = 1 AND campaigns.dailyenabled = 1 AND users.uenabled = 1", kw, format, viewer['country'][0]], :order = order, :limit = limit) My questions: Is there an alternative database like MySQL that has JOIN support, but is much faster? (I know there's Postgre, still evaluating it.) Otherwise, would firing up a MySQL instance, loading a local database into memory and re-loading that every 5 minutes help? Otherwise, is there any way I could switch this entire operation to Redis or Cassandra, and somehow change the JOIN behavior to match the (non-JOIN-able) nature of NoSQL? Thank you!

    Read the article

  • How to open DataSet in Visual Studio 2008 faster?

    - by Ekkapop
    When I open DataSet in Visual Studio 2008 to design or modify it, it always take a very long time (more than five minutes) before I can continue to do my job. While I'm waiting I can't do anything on Visual Studio, moreover CPU and memory usage is growth dramatically. I want to know, Is it has anyway to reduce this waiting time? Hardware - Desktop CPU: Intel Q6600 Memory: 4 GB HDD: 320 GB 7200 rpm OS: Windows XP 32 bit with Service Pack 3

    Read the article

  • OpenMP timer doesn't work on inline assembly code?

    - by Brett
    I'm trying to compare some code samples for speed, and I decided to use the OpenMP timer since I'll eventually be multi threading the code. The timer works great on two of my four code snippets, but not on the other two start=omp_get_wtime(); /*code here*/ finish = omp_get_wtime() - start_time; The four code here sections are serial code, xmmintrin.h code, and two inline assembly codes. The serial and xmminstrin.h code are able to be timed, but the inline assembly codes returns -1.#IND00 for a time. I can't seem to figure out why this is? Thanks for any help or suggestions!

    Read the article

  • How to improve my LDAP schema?

    - by asmaier
    Hello, I have a OpenLDAP Database and it holds some project objects that look like dn: cn=Proj1,ou=Project,ou=ua,dc=org cn: Proj1 objectClass: top objectClass: posixGroup member: 001ag member: 002ag System: ABEL System: PCx Budget: ABEL:1000000:0.3 Budget: PCx:300000:0.3 One can see that the Budget attribute is a ":"-separated string, where the first part holds the name of the system the budget is for, the second part holds some budget (which may change every month) and the last entry is a conversion factor for the budget of that system. Seeing this, I thought this is bad database design, since attribute values should always be atomic. But how can I improve that in LDAP, so that I can do a direct ldapsearch or a direct ldapmodify of the budget of System "ABEL" instead of writing a script, that will have to parse and split the ":"-separated string?

    Read the article

  • Most optimized way to calculate modulus in C

    - by hasanatkazmi
    I have minimize cost of calculating modulus in C. say I have a number x and n is the number which will divide x when n == 65536 (which happens to be 2^16): mod = x % n (11 assembly instructions as produced by GCC) or mod = x & 0xffff which is equal to mod = x & 65535 (4 assembly instructions) so, GCC doesn't optimize it to this extent. In my case n is not x^(int) but is largest prime less than 2^16 which is 65521 as I showed for n == 2^16, bit-wise operations can optimize the computation. What bit-wise operations can I preform when n == 65521 to calculate modulus.

    Read the article

  • Mysql query help - Alter this mysql query to get these results?

    - by sandeepan-nath
    Please execute the following queries first to set up so that you can help me:- CREATE TABLE IF NOT EXISTS `Tutor_Details` ( `id_tutor` int(10) NOT NULL auto_increment, `firstname` varchar(100) NOT NULL default '', `surname` varchar(155) NOT NULL default '', PRIMARY KEY (`id_tutor`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=41 ; INSERT INTO `Tutor_Details` (`id_tutor`,`firstname`, `surname`) VALUES (1, 'Sandeepan', 'Nath'), (2, 'Bob', 'Cratchit'); CREATE TABLE IF NOT EXISTS `Classes` ( `id_class` int(10) unsigned NOT NULL auto_increment, `id_tutor` int(10) unsigned NOT NULL default '0', `class_name` varchar(255) default NULL, PRIMARY KEY (`id_class`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=229 ; INSERT INTO `Classes` (`id_class`,`class_name`, `id_tutor`) VALUES (1, 'My Class', 1), (2, 'Sandeepan Class', 2); CREATE TABLE IF NOT EXISTS `Tags` ( `id_tag` int(10) unsigned NOT NULL auto_increment, `tag` varchar(255) default NULL, PRIMARY KEY (`id_tag`), UNIQUE KEY `tag` (`tag`), KEY `id_tag` (`id_tag`), KEY `tag_2` (`tag`), KEY `tag_3` (`tag`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=18 ; INSERT INTO `Tags` (`id_tag`, `tag`) VALUES (1, 'Bob'), (6, 'Class'), (2, 'Cratchit'), (4, 'Nath'), (3, 'Sandeepan'), (5, 'My'); CREATE TABLE IF NOT EXISTS `Tutors_Tag_Relations` ( `id_tag` int(10) unsigned NOT NULL default '0', `id_tutor` int(10) default NULL, KEY `Tutors_Tag_Relations` (`id_tag`), KEY `id_tutor` (`id_tutor`), KEY `id_tag` (`id_tag`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1; INSERT INTO `Tutors_Tag_Relations` (`id_tag`, `id_tutor`) VALUES (3, 1), (4, 1), (1, 2), (2, 2); CREATE TABLE IF NOT EXISTS `Class_Tag_Relations` ( `id_tag` int(10) unsigned NOT NULL default '0', `id_class` int(10) default NULL, `id_tutor` int(10) NOT NULL, KEY `Class_Tag_Relations` (`id_tag`), KEY `id_class` (`id_class`), KEY `id_tag` (`id_tag`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1; INSERT INTO `Class_Tag_Relations` (`id_tag`, `id_class`, `id_tutor`) VALUES (5, 1, 1), (6, 1, 1), (3, 2, 2), (6, 2, 2); In the present system data which I have given , tutor named "Sandeepan Nath" has has created class named "My Class" and tutor named "Bob Cratchit" has created class named "Sandeepan Class". Requirement - To execute a single query with limit on the results to show search results as per AND logic on the search keywords like this:- If "Sandeepan Class" is searched , Tutor Sandeepan Nath's record from Tutor Details table is returned(because "Sandeepan" is the firstname of Sandeepan Nath and Class is present in class name of Sandeepan's class) If "Class" is searched Both the tutors from the Tutor_details table are fetched because Class is present in the name of the class created by both the tutors. Following is what I have so far achieved (PHP Mysql):- <?php $searchTerm1 = "Sandeepan"; $searchTerm2 = "Class"; mysql_select_db("test"); $sql = "SELECT td.* FROM Tutor_Details AS td LEFT JOIN Tutors_Tag_Relations AS ttagrels ON td.id_tutor = ttagrels.id_tutor LEFT JOIN Classes AS wc ON td.id_tutor = wc.id_tutor LEFT JOIN Class_Tag_Relations AS wtagrels ON td.id_tutor = wtagrels.id_tutor LEFT JOIN Tags as t1 on ((t1.id_tag = ttagrels.id_tag) OR (t1.id_tag = wtagrels.id_tag)) LEFT JOIN Tags as t2 on ((t2.id_tag = ttagrels.id_tag) OR (t2.id_tag = wtagrels.id_tag)) where t1.tag LIKE '%".$searchTerm1."%' AND t2.tag LIKE '%".$searchTerm2."%' GROUP BY td.id_tutor LIMIT 10 "; $result = mysql_query($sql); echo $sql; if($result) { while($rec = mysql_fetch_object($result)) $recs[] = $rec; //$rec = mysql_fetch_object($result); echo "<br><br>"; if(is_array($recs)) { foreach($recs as $each) { print_r($each); echo "<br>"; } } } ?> But the results are :- If "Sandeepan Nath" is searched, it does not return any tutor (instead of only Sandeepan's row) If "Sandeepan Class" is searched, it returns Sandeepan's row (instead of Both tutors ) If "Bob Class" is searched, it correctly returns Bob's row If "Bob Cratchit" is searched, it does not return any tutor (instead of only

    Read the article

  • How do you optimize stunicholls' "Professional dropdown #2" with jquery?

    - by geff_chang
    Link to menu: Professional dropdown #2 I was wondering if these posts Suckerfish meets jQuery or Son of Suckerfish dropdowns in jQuery could optimize the menu above. I need the menu to be optimized for IE6, because when I use the menu as it is, the menu hangs after I click on a menu item that loads a page with heavy processing. It takes too long for the menu to be enabled again. Any ideas?

    Read the article

  • Why better isolation level means better performance in MS SQL Server

    - by Oleg Zhylin
    When measuring performance on my query I came up with a dependency between isolation level and elapsed time that was surprising to me READUNCOMMITTED - 409024 READCOMMITTED - 368021 REPEATABLEREAD - 358019 SERIALIZABLE - 348019 Left column is table hint, and the right column is elapsed time in microseconds (sys.dm_exec_query_stats.total_elapsed_time). Why better isolation level gives better performance? This is a development machine and no concurrency whatsoever happens. I would expect READUNCOMMITTED to be the fasted due to less locking overhead.

    Read the article

  • How to measure the time HTTP requests spend sitting in the accept-queue?

    - by David Jones
    I am using Apache2 on Ubuntu 9.10, and I am trying to tune my configuration for a web application to reduce latency of responses to HTTP requests. During a moderately heavy load on my small server, there are 24 apache2 processes handling requests. Additional requests get queued. Using "netstat", I see 24 connections are ESTABLISHED and 125 connections are TIME_WAIT. I am trying to figure out if that is considered a reasonable backlog. Most requests get serviced in a fraction of a second, so I am assuming requests move through the accept-queue fairly quickly, probably within 1 or 2 seconds, but I would like to be more certain. Can anyone recommend an easy way to measure the time an HTTP request sits in the accept-queue? The suggestions I have come across so far seem to start the clock after the apache2 worker accepts the connection. I'm trying to quantify the accept-queue delay before that. thanks in advance, David Jones

    Read the article

  • Converting python collaborative filtering code to use Map Reduce

    - by Neil Kodner
    Using Python, I'm computing cosine similarity across items. given event data that represents a purchase (user,item), I have a list of all items 'bought' by my users. Given this input data (user,item) X,1 X,2 Y,1 Y,2 Z,2 Z,3 I build a python dictionary {1: ['X','Y'], 2 : ['X','Y','Z'], 3 : ['Z']} From that dictionary, I generate a bought/not bought matrix, also another dictionary(bnb). {1 : [1,1,0], 2 : [1,1,1], 3 : [0,0,1]} From there, I'm computing similarity between (1,2) by calculating cosine between (1,1,0) and (1,1,1), yielding 0.816496 I'm doing this by: items=[1,2,3] for item in items: for sub in items: if sub >= item: #as to not calculate similarity on the inverse sim = coSim( bnb[item], bnb[sub] ) I think the brute force approach is killing me and it only runs slower as the data gets larger. Using my trusty laptop, this calculation runs for hours when dealing with 8500 users and 3500 items. I'm trying to compute similarity for all items in my dict and it's taking longer than I'd like it to. I think this is a good candidate for MapReduce but I'm having trouble 'thinking' in terms of key/value pairs. Alternatively, is the issue with my approach and not necessarily a candidate for Map Reduce?

    Read the article

< Previous Page | 148 149 150 151 152 153 154 155 156 157 158 159  | Next Page >