Search Results

Search found 941 results on 38 pages for 'fastest'.

Page 13/38 | < Previous Page | 9 10 11 12 13 14 15 16 17 18 19 20  | Next Page >

  • Lucene: Fastest way to return the document occurance of a phrase?

    - by dont say the kid's name
    Hi Guys, I am trying to use Lucene (actually PyLucene!) to find out how many documents contain my exact phrase. My code currently looks like this... but it runs rather slow. Does anyone know a faster way to return document counts? phraseList = ["some phrase 1", "some phrase 2"] #etc, a list of phrases... countsearcher = IndexSearcher(SimpleFSDirectory(File(STORE_DIR)), True) analyzer = StandardAnalyzer(Version.LUCENE_CURRENT) for phrase in phraseList: query = QueryParser(Version.LUCENE_CURRENT, "contents", analyzer).parse("\"" + phrase + "\"") scoreDocs = countsearcher.search(query, 200).scoreDocs print "count is: " + str(len(scoreDocs))

    Read the article

  • What is the fastest (possibly unsafe) way to read a byte[]?

    - by Aidiakapi
    I'm working on a server project in C#, and after a TCP message is received, it is parsed, and stored in a byte[] of exact size. (Not a buffer of fixed length, but a byte[] of an absolute length in which all data is stored.) Now for reading this byte[] I'll be creating some wrapper functions (also for compatibility), these are the signatures of all functions I need: public byte ReadByte(); public sbyte ReadSByte(); public short ReadShort(); public ushort ReadUShort(); public int ReadInt(); public uint ReadUInt(); public float ReadFloat(); public double ReadDouble(); public string ReadChars(int length); public string ReadString(); The string is a \0 terminated string, and is probably encoded in ASCII or UTF-8, but I cannot tell that for sure, since I'm not writing the client. The data exists of: byte[] _data; int _offset; Now I can write all those functions manually, like this: public byte ReadByte() { return _data[_offset++]; } public sbyte ReadSByte() { byte r = _data[_offset++]; if (r >= 128) return (sbyte)(r - 256); else return (sbyte)r; } public short ReadShort() { byte b1 = _data[_offset++]; byte b2 = _data[_offset++]; if (b1 >= 128) return (short)(b1 * 256 + b2 - 65536); else return (short)(b1 * 256 + b2); } public short ReadUShort() { byte b1 = _data[_offset++]; return (short)(b1 * 256 + _data[_offset++]); } But I wonder if there's a faster way, not excluding the use of unsafe code, since this seems a little bit too much work for simple processing.

    Read the article

  • What is the fastest way to scrape HTML webpage in Android?

    - by kunjaan
    I need to extract information from an unstructured web page in Android. The information I want is embedded in a table that doesn't have an id. <table> <tr><td>Description</td><td></td><td>I want this field next to the description cell</td></tr> </table> Should I use Pattern Matching? Use BufferedReader to extract the information? Or are there faster way to get that information?

    Read the article

  • Fastest way to clamp a real (fixed/floating point) value?

    - by Niklas
    Hi, Is there a more efficient way to clamp real numbers than using if statements or ternary operators? I want to do this both for doubles and for a 32-bit fixpoint implementation (16.16). I'm not asking for code that can handle both cases; they will be handled in separate functions. Obviously, I can do something like: double clampedA; double a = calculate(); clampedA = a > MY_MAX ? MY_MAX : a; clampedA = a < MY_MIN ? MY_MIN : a; or double a = calculate(); double clampedA = a; if(clampedA > MY_MAX) clampedA = MY_MAX; else if(clampedA < MY_MIN) clampedA = MY_MIN; The fixpoint version would use functions/macros for comparisons. This is done in a performance-critical part of the code, so I'm looking for an as efficient way to do it as possible (which I suspect would involve bit-manipulation) EDIT: It has to be standard/portable C, platform-specific functionality is not of any interest here. Also, MY_MIN and MY_MAX are the same type as the value I want clamped (doubles in the examples above).

    Read the article

  • Fastest/One-liner way to list attr_accessors in Ruby?

    - by viatropos
    What's the shortest, one-liner way to list all methods defined with attr_accessor? I would like to make it so, if I have a class MyBaseClass, anything that extends that, I can get the attr_accessor's defined in the subclasses. Something like this: class MyBaseClass < Hash def attributes # ?? end end class SubClass < MyBaseClass attr_accessor :id, :title, :body end puts SubClass.new.attributes.inspect #=> [id, title, body] What about to display just the attr_reader and attr_writer definitions?

    Read the article

  • Fastest way to find the closest point to a given point in 3D, in Python.

    - by Saebin
    So lets say I have 10,000 points in A and 10,000 points in B and want to find out the closest point in A for every B point. Currently, I simply loop through every point in B and A to find which one is closest in distance. ie. B = [(.5, 1, 1), (1, .1, 1), (1, 1, .2)] A = [(1, 1, .3), (1, 0, 1), (.4, 1, 1)] C = {} for bp in B: closestDist = -1 for ap in A: dist = sum(((bp[0]-ap[0])**2, (bp[1]-ap[1])**2, (bp[2]-ap[2])**2)) if(closestDist > dist or closestDist == -1): C[bp] = ap closestDist = dist print C However, I am sure there is a faster way to do this... any ideas?

    Read the article

  • Fastest way to store/retrieve a dictionary - SQL, text file...?

    - by AP257
    Hi all, This is a really really super dumb question, so I apologise, but I'd be grateful for some advice. I've got a text file of words and word frequencies. It's very large - theoretically we're talking millions of rows. I just want to retrieve values from the file, and do it as quickly and efficiently as possible (for a web app, in Django). My question is: what is the best way to store and retrieve the values? Should import them into SQL? Or keep the file and use grep? Or put them into a JSON dictionary...? Or some other way? Sorry for the dumb question, would be very grateful for advice!

    Read the article

  • Fastest algorithm to check if a number is pandigital?

    - by medopal
    Pandigital number is a number that contains the digits 1..number length. For example 123, 4312 and 967412385. I have solved many Project Euler problems, but the Pandigital problems always exceed the one minute rule. This is my pandigital function: private boolean isPandigital(int n){ Set<Character> set= new TreeSet<Character>(); String string = n+""; for (char c:string.toCharArray()){ if (c=='0') return false; set.add(c); } return set.size()==string.length(); } Create your own function and test it with this method int pans=0; for (int i=123456789;i<=123987654;i++){ if (isPandigital(i)){ pans++; } } Using this loop, you should get 720 pandigital numbers. My average time was 500 millisecond. I'm using Java, but the question is open to any language.

    Read the article

  • What's the fastest lookup algorithm for a pair data structure (i.e, a map)?

    - by truncheon
    In the following example a std::map structure is filled with 26 values from A - Z (for key) and 0 – 26 for value. The time taken (on my system) to lookup the last entry (10000000 times) is roughly 250 ms for the vector, and 125 ms for the map. (I compiled using release mode, with O3 option turned on for g++ 4.4) But if for some odd reason I wanted better performance than the std::map, what data structures and functions would I need to consider using? I apologize if the answer seems obvious to you, but I haven't had much experience in the performance critical aspects of C++ programming. UPDATE: This example is rather trivial and hides the true complexity of what I'm trying to achieve. My real world project is a simple scripting language that uses a parser, data tree, and interpreter (instead of a VM stack system). I need to use some kind of data structure (perhaps map) to store the variables names created by script programmers. These are likely to be pretty randomly named, so I need a lookup method that can quickly find a particular key within a (probably) fairly large list of names. #include <ctime> #include <map> #include <vector> #include <iostream> struct mystruct { char key; int value; mystruct(char k = 0, int v = 0) : key(k), value(v) { } }; int find(const std::vector<mystruct>& ref, char key) { for (std::vector<mystruct>::const_iterator i = ref.begin(); i != ref.end(); ++i) if (i->key == key) return i->value; return -1; } int main() { std::map<char, int> mymap; std::vector<mystruct> myvec; for (int i = 'a'; i < 'a' + 26; ++i) { mymap[i] = i - 'a'; myvec.push_back(mystruct(i, i - 'a')); } int pre = clock(); for (int i = 0; i < 10000000; ++i) { find(myvec, 'z'); } std::cout << "linear scan: milli " << clock() - pre << "\n"; pre = clock(); for (int i = 0; i < 10000000; ++i) { mymap['z']; } std::cout << "map scan: milli " << clock() - pre << "\n"; return 0; }

    Read the article

  • What is the fastest / best Base64 en/decoder for Java ?

    - by mP
    Just found the MIG Base 64 utility but its over 6 years old since its last release. It would appear to be quicker than the Apache commons equivalent but I have yet to confirm by writing up an actual test. Has anyone verified its correctness which is always a worry. If someone takes a look at the methods, please note i a referring to the non fast methods which make assumptions trading possible correctness for pure speed.

    Read the article

  • Get info from multiple files, match it and then display to end user, what is fastest?

    - by Patrick
    Hi, I need to build a website where we display data that is refreshed every 5minutes in a text file with a | separator. I currently use Java to do this. What I do now: I grab the textfile for every request through the website and process it and then display the data to the end user, this works fine, since Java can go through like 5000 lines of data fast, and when I filter it it is still extremely fast. However now the management wants the following: They added 3 textfiles with the | separator to it, and now want me to also read those files and match the information on certain fields, and if there is a match also display that information to the end user. I think soon enough, although Java is fast, I will run into trouble when 10 people want that information and I have to run through 4 total files matching the information. What can I do to make this process super fast? My Creative solutions so far: -Leave it this way, since Java is fast and end users can wait (probably less then 1second) -Have a background process that dumps new data into a MySQL database every 5minutes, since databases are extremely good at getting the same data from multiple tables. Thank you!

    Read the article

  • Fastest way to copy a set (100+) of related SQLAlchemy objects and change attribute on each one

    - by rebus
    I am developing an app that keeps track of items going in and out of factory. For example, lets say you have 3 kinds of plastic coming in, they are mixed in various ratios and then sent out as a new product. So to keep track of this I've created following database structure: This is very simplified overview of my SQLAlchemy models: IN <- RATIO <- OUT <- REPORT ITEMS -> REPORT IN are products coming in, RATIO is various information on measurements, and OUT is a final product. REPORT is basically a header model which has a lot of REPORT ITEMS attached to it, which in turn relate it to OUT products. This would all work perfectly, but IN and RATION values can change. These changes ultimately change the OUT product which would mean the REPORT values would change. So in order to change an attribute on IN object for example I should copy that object with that attribute changed. I would think this is basically a question about database normalization, because i didn't want to duplicate all the IN, RATIO and OUT information by writing it in REPORT ITEMS table for example, but I've came across this problem (well not really a problem but rather a feature I'd like for a user to have). When the attribute on IN object is changed I want related objects (RATIO and OUT) automatically copied and related to a new IN object. So I was thinking something like: Take an existing instance of model IN that needs to change (call it old_in) Create a new one out of it with some attributes changed (call it new_in) Collect all the RATIO objects that are related to old_in Copy each RATIO and relate them to a new_in Collect all the OUT objects that are related to old RATIO Copy each OUT and relate them to a new RATIO Few questions pop to mind when i look at this problem: Should i just duplicate the data, does all this copying even make sense? If it does, should i rather do it in plain SQL? If no what would be the best approach to do it with Python and SQLAlchemy? Any general answer would suffice really, at least a pointer in right direction. I really want to free then end user for hassle of having create new ratios and out products.

    Read the article

  • C#, Fastest (Best?) Method of Identifying Duplicate Files in an Array of Directories

    - by Nate Greenwood
    I want to recurse several directories and find duplicate files between the n number of directories. My knee-jerk idea at this is to have a global hashtable or some other data structure to hold each file I find; then check each subsequent file to determine if it's in the "master" list of files. Obviously, I don't think this would be very efficient and the "there's got to be a better way!" keeps ringing in my brain. Any advice on a better way to handle this situation would be appreciated.

    Read the article

  • Which of these methods provides for the fastest page loading?

    - by chromedude
    I am building a database in MySQL that will be accessed by PHP scripts. I have a table that is the activity stream. This includes everything that goes on on the website (following of many different things, liking, upvoting etc.). From this activity stream I am going to run an algorithm for each user depending on their activity and display relevant activity. Should I create another table that stores the activity for each user once the algorithm has been run on the activity or should I run the algorithm on the activity table every time the user accesses the site? UPDATE:(this is what is above except rephrased hopefully in an easier to understand way) I have a database table called activity. This table creates a new row every time an action is performed by a user on the website. Every time a user logs in I am going to run an algorithm on the new rows (since the users last login) in the table (activity) that apply to them. For example if the user is following a user who upvoted a post in the activity stream that post will be displayed when the user logs in. I want the ability for the user to be able to access previous content applying to them. Would it be easiest to create another table that saved the rows that have already been run over with the algorithm except attached to individual users names? (a row can apply to multiple different users)

    Read the article

  • What is the fastest (to access) struct-like object in Python?

    - by DNS
    I'm optimizing some code whose main bottleneck is running through and accessing a very large list of struct-like objects. Currently I'm using namedtuples, for readability. But some quick benchmarking using 'timeit' shows that this is really the wrong way to go where performance is a factor: Named tuple with a, b, c: >>> timeit("z = a.c", "from __main__ import a") 0.38655471766332994 Class using __slots__, with a, b, c: >>> timeit("z = b.c", "from __main__ import b") 0.14527461047146062 Dictionary with keys a, b, c: >>> timeit("z = c['c']", "from __main__ import c") 0.11588272541098377 Tuple with three values, using a constant key: >>> timeit("z = d[2]", "from __main__ import d") 0.11106188992948773 List with three values, using a constant key: >>> timeit("z = e[2]", "from __main__ import e") 0.086038238242508669 Tuple with three values, using a local key: >>> timeit("z = d[key]", "from __main__ import d, key") 0.11187358437882722 List with three values, using a local key: >>> timeit("z = e[key]", "from __main__ import e, key") 0.088604143037173344 First of all, is there anything about these little timeit tests that would render them invalid? I ran each several times, to make sure no random system event had thrown them off, and the results were almost identical. It would appear that dictionaries offer the best balance between performance and readability, with classes coming in second. This is unfortunate, since, for my purposes, I also need the object to be sequence-like; hence my choice of namedtuple. Lists are substantially faster, but constant keys are unmaintainable; I'd have to create a bunch of index-constants, i.e. KEY_1 = 1, KEY_2 = 2, etc. which is also not ideal. Am I stuck with these choices, or is there an alternative that I've missed?

    Read the article

  • What's the fastest lookup algorithm for a key, pair data structure (i.e, a map)?

    - by truncheon
    In the following example a std::map structure is filled with 26 values from A - Z (for key) and 0 – 26 for value. The time taken (on my system) to lookup the last entry (10000000 times) is roughly 250 ms for the vector, and 125 ms for the map. (I compiled using release mode, with O3 option turned on for g++ 4.4) But if for some odd reason I wanted better performance than the std::map, what data structures and functions would I need to consider using? I apologize if the answer seems obvious to you, but I haven't had much experience in the performance critical aspects of C++ programming. #include <ctime> #include <map> #include <vector> #include <iostream> struct mystruct { char key; int value; mystruct(char k = 0, int v = 0) : key(k), value(v) { } }; int find(const std::vector<mystruct>& ref, char key) { for (std::vector<mystruct>::const_iterator i = ref.begin(); i != ref.end(); ++i) if (i->key == key) return i->value; return -1; } int main() { std::map<char, int> mymap; std::vector<mystruct> myvec; for (int i = 'a'; i < 'a' + 26; ++i) { mymap[i] = i - 'a'; myvec.push_back(mystruct(i, i - 'a')); } int pre = clock(); for (int i = 0; i < 10000000; ++i) { find(myvec, 'z'); } std::cout << "linear scan: milli " << clock() - pre << "\n"; pre = clock(); for (int i = 0; i < 10000000; ++i) { mymap['z']; } std::cout << "map scan: milli " << clock() - pre << "\n"; return 0; }

    Read the article

  • What's the fastest way to search a very long list of words for a match in actionscript 3?

    - by Nuthman
    So I have a list of words (the entire English dictionary). For a word matching game, when a player moves a piece I need to check the entire dictionary to see if the the word that the player made exists in the dictionary. I need to do this as quickly as possible. simply iterating through the dictionary is way too slow. What is the quickest algorithm in AS3 to search a long list like this for a match, and what datatype should I use? (ie array, object, Dictionary etc)

    Read the article

  • fastest public web app framework for quick DB apps?

    - by Steve Eisner
    I'd like to pick up a new tech for my toolbox - something for rapid prototyping of web apps. Brief requirements: public access (not hosted on my machine) - like Google's appengine, etc no tricky configuration necessary to build a simple web app host DB access (small storage provided) including some kind of SQLish query language easy front end HTML templating ability to access as a JSON service C# or Java,PHP or Python - or a fun new language to learn is OK free! An example app, very simple: render an AJAXy editable (add/delete/edit/drag) list of rich-data list items via some template language, so I can quickly mock up a UI for a client. ie. I can do most of the work client-side, but need convenient back end to handle the permanent storage. (In fact I suppose it doesn't even need HTML templating if I can directly access a DB via AJAX calls.) I realize this is a bit vague but am wondering if anyone has recommendations. A Rails host might be best for this (but probably not free) or maybe App Engine, or some other choice I'm not aware of? I've been doing everything with heavyweight servers (ASP.NET etc) for so long that I'm just not up on the latest... Thanks - I'll follow up on comments if this isn't clear enough :)

    Read the article

< Previous Page | 9 10 11 12 13 14 15 16 17 18 19 20  | Next Page >