Search Results

Search found 9017 results on 361 pages for 'efficient storage'.

Page 319/361 | < Previous Page | 315 316 317 318 319 320 321 322 323 324 325 326  | Next Page >

  • C++ - passing references to boost::shared_ptr

    - by abigagli
    If I have a function that needs to work with a shared_ptr, wouldn't it be more efficient to pass it a reference to it (so to avoid copying the shared_ptr object)? What are the possible bad side effects? I envision two possible cases: 1) inside the function a copy is made of the argument, like in ClassA::take_copy_of_sp(boost::shared_ptr<foo> &sp) { ... m_sp_member=sp; //This will copy the object, incrementing refcount ... } 2) inside the function the argument is only used, like in Class::only_work_with_sp(boost::shared_ptr<foo> &sp) //Again, no copy here { ... sp->do_something(); ... } I can't see in both cases a good reason to pass the boost::shared_ptr by value instead of by reference. Passing by value would only "temporarily" increment the reference count due to the copying, and then decrement it when exiting the function scope. Am I overlooking something? Andrea. EDIT: Just to clarify, after reading several answers : I perfectly agree on the premature-optimization concerns, and I alwasy try to first-profile-then-work-on-the-hotspots. My question was more from a purely technical code-point-of-view, if you know what I mean.

    Read the article

  • Alternative or succesor to GDBM

    - by Anon Guy
    We a have a GDBM key-value database as the backend to a load-balanced web-facing application that is in implemented in C++. The data served by the application has grown very large, so our admins have moved the GDBM files from "local" storage (on the webservers, or very close by) to a large, shared, remote, NFS-mounted filesystem. This has affected performance. Our performance tests (in a test environment) show page load times jumping from hundreds of milliseconds (for local disk) to several seconds (over NFS, local network), and sometimes getting as high as 30 seconds. I believe a large part of the problem is that the application makes lots of random reads from the GDBM files, and that these are slow over NFS, and this will be even worse in production (where the front-end and back-end have even more network hardware between them) and as our database gets even bigger. While this is not a critical application, I would like to improve performance, and have some resources available, including the application developer time and Unix admins. My main constraint is time only have the resources for a few weeks. As I see it, my options are: Improve NFS performance by tuning parameters. My instinct is we wont get much out of this, but I have been wrong before, and I don't really know very much about NFS tuning. Move to a different key-value database, such as memcachedb or Tokyo Cabinet. Replace NFS with some other protocol (iSCSI has been mentioned, but i am not familiar with it). How should I approach this problem?

    Read the article

  • Java-Eclipse-Spring 3.1 - the fastest way to get familiar with this set

    - by Leron
    I, know almost all of you at some point of your life as a programmer get to the point where you know (more or less) different technologies/languages/IDEs and a times come when you want to get things together and start using them once - more efficient and second - more closely to the real life situation where in fact just knowing Java, or some experience with Eclipse doesn't mean nothing, and what makes you a programmer worth something is the ability to work with the combination of 2 or more combinations. Having this in mind here is my question - what do you think is the optimal way of getting into Java+Eclipse+Spring3.1 world. I've read, and I've read a lot. I started writing real code but almost every step is discovering the wheel again and again, wondering how to do thing you know are some what trivial, but you've missed that one article where this topic was discussed and so on. I don't mind for paying for a good tutorial like for example, after a bit of research I decided that instead of losing a lot of time getting the different parts together I'd rather pay for the videos in http://knpuniversity.com/screencast/starting-in-symfony2-tutorial and save myself a lot of time (I hope) and get as fast as possible to writing a real code instead of wondering what do what and so on. But I find it much more difficult to find such sources of info especially when you want something more specific as me and that's the reason to ask this question. I know a lot of you go through the hard way, and I won't give up if I have to do the same, but to be honest I really hope to get post with good tutorials on the subject (paid or not) because in my situation time is literally money. Thanks Leron

    Read the article

  • How good is the memory mapped Circular Buffer on Wikipedia?

    - by abroun
    I'm trying to implement a circular buffer in C, and have come across this example on Wikipedia. It looks as if it would provide a really nice interface for anyone reading from the buffer, as reads which wrap around from the end to the beginning of the buffer are handled automatically. So all reads are contiguous. However, I'm a bit unsure about using it straight away as I don't really have much experience with memory mapping or virtual memory and I'm not sure that I fully understand what it's doing. What I think I understand is that it's mapping a shared memory file the size of the buffer into memory twice. Then, whenever data is written into the buffer it appears in memory in 2 places at once. This allows all reads to be contiguous. What would be really great is if someone with more experience of POSIX memory mapping could have a quick look at the code and tell me if the underlying mechanism used is really that efficient. Am I right in thinking for example that the file in /dev/shm used for the shared memory always stays in RAM or could it get written to the hard drive (performance hit) at some point? Are there any gotchas I should be aware of? As it stands, I'm probably going to use a simpler method for my current project, but it'd be good to understand this to have it in my toolbox for the future. Thanks in advance for your time.

    Read the article

  • Oracle - UPSERT with update not executed for unmodified values

    - by Buthrakaur
    I'm using following update or insert Oracle statement at the moment: BEGIN UPDATE DSMS SET SURNAME = :SURNAME, FIRSTNAME = :FIRSTNAME, VALID = :VALID WHERE DSM = :DSM; IF (SQL%ROWCOUNT = 0) THEN INSERT INTO DSMS (DSM, SURNAME, FIRSTNAME, VALID) VALUES (:DSM, :SURNAME, :FIRSTNAME, :VALID); END IF; END; This runs fine except that the update statement performs dummy update if the data is same as the parameter values provided. I would not mind the dummy update in normal situation, but there's a replication/synchronization system build over this table using triggers on tables to capture updated records and executing this statement frequently for many records simply means that I'd cause huge traffic in triggers and the sync system. Is there any simple method how to reformulate this code that the update statement wouldn't update record if not necessary without using following IF-EXISTS check code which I find not sleek enough and maybe also not most efficient for this task? DECLARE CNT NUMBER; BEGIN SELECT COUNT(1) INTO CNT FROM DSMS WHERE DSM = :DSM; IF SQL%FOUND THEN UPDATE DSMS SET SURNAME = :SURNAME, FIRSTNAME = :FIRSTNAME, VALID = :VALID WHERE DSM = :DSM AND (SURNAME != :SURNAME OR FIRSTNAME != :FIRSTNAME OR VALID != :VALID); ELSE INSERT INTO DSMS (DSM, SURNAME, FIRSTNAME, VALID) VALUES (:DSM, :SURNAME, :FIRSTNAME, :VALID); END IF; END;

    Read the article

  • Need Help finding an appropriate task asignment algoritm for a collage project involving coordinatin

    - by Trif Mircea
    Hello. I am a long time lurker here and have found over time many answers regarding jquery and web development topics so I decided to ask a question of my own. This time I have to create a c++ project for collage which should help manage the workflow of a company providing all kinds of services through in the field teams. The ideas I have so far are: client-server application; the server is a dispatcher where all the orders from clients get and the clients are mobile devices (PDAs) each team in the field having one a order from a client is a task. Each task is made up of a series of subtasks. You have a database with estimations on how long a task should take to complete you also know what tasks or subtasks each team on the field can perform based on what kind of specialists made up the team (not going to complicate the problem by adding needed materials, it is considered that if a member of a team can perform a subtask he has the stuff needed) Now knowing these factors, what would a good task assignment algorithm be? The criteria is: how many tasks can a team do, how many tasks they have in the queue, it could also be location, how far away are they from the place but I don't think I can implement that.. It needs to be efficient and also to adapt quickly is the human dispatcher manually assigns a task. Any help or leads would be really appreciated. Also I'm not 100% sure in the idea so if you have another way you would go about creating such an application please share, even if it just a quick outline. I have to write a theoretical part too so even if the ideas are far more complex that what i outlined that would be ok ; I'd write those and implement what I can.

    Read the article

  • how to synchronize database table and directory with php

    - by twmulloy
    hello, I have a directory with files and a database table with what should be the same files. I would like to be able to synchronize the database table with the directory. What would be the most efficient way to do this? or would I realistically only be able to do this in a brute manner? Here's my approach: 1. retrieve all of the files in the directory as array 2. retrieve all of the filenames in the database table as array 3. loop through the file values in the directory array and use in_array() on the database table array to verify the filename is in that array, and if not then start building an array to insert the missing filenames. run db query to add each missing file row to database table 4. loop through directory array and use in_array() on the directory array and anything not found in the directory array will just be deleted from the table. Is there a better way to go about this? or something better for this in php than in_array()?

    Read the article

  • Two radically different queries against 4 mil records execute in the same time - one uses brute force.

    - by IanC
    I'm using SQL Server 2008. I have a table with over 3 million records, which is related to another table with a million records. I have spent a few days experimenting with different ways of querying these tables. I have it down to two radically different queries, both of which take 6s to execute on my laptop. The first query uses a brute force method of evaluating possibly likely matches, and removes incorrect matches via aggregate summation calculations. The second gets all possibly likely matches, then removes incorrect matches via an EXCEPT query that uses two dedicated indexes to find the low and high mismatches. Logically, one would expect the brute force to be slow and the indexes one to be fast. Not so. And I have experimented heavily with indexes until I got the best speed. Further, the brute force query doesn't require as many indexes, which means that technically it would yield better overall system performance. Below are the two execution plans. If you can't see them, please let me know and I'll re-post then in landscape orientation / mail them to you. Brute-force query: Index-based exception query: My question is, based on the execution plans, which one look more efficient? I realize that thing may change as my data grows.

    Read the article

  • Returning c_str from a function

    - by user199421
    This is from a small library that I found online: const char* GetHandStateBrief(const PostFlopState* state) { static std::ostringstream out; ... rest of the function ... return out.str().c_str() Now in my code I am doing this: const char *d = GetHandStateBrief(&post); std::cout<< d << std::endl; Now, at first d contained garbage. I then realized that the c string I am getting from the function is destroyed when the function returns because std::ostringstream is allocated on the stack. So I added: return strdup( out.str().c_str()); And now I can get the text I need from the function. I have two questions: 1) Am I understanding this correctly? 2) I later noticed that the ostringstream was was allocated with static storage. Doesn't that mean that the object is supposed to stay in memory until the program terminates? and if so , then why can't I access the string?

    Read the article

  • SQL Databases and table design/organization

    - by John McMullen
    (NOOB disclaimer) I'm working on a system (a type of map), that is accessed mostly via 3 fields: ID (auto incremented), X coordinate, and Y coordinate. As it is right now, i have all data on the map, stored in 1 table. Whenever the map display is loaded it simply queries the database for contents in x and y, and the DB gives the data (other fields in the same entry). If an item on the map is doing something, it has a flag saying its doing something, and then has an ID of the action in another table holding that type of 'actions'. Essentially, for all map data, its stored in 1 table. All actions of a certain type are stored in their own table. I'm a noob, and i'm wondering what the most effective/efficient structure for such a design? (a map that has items, and each item has stats/actions). I'm using PHP atm, using standard SQL queries to get my data. Should i split up the tables so that there are only x number of entries on a table? (coord range limits)? Should it just keep growing and growing? There's a lot of queries to the table... so just tryin to see what is best :/

    Read the article

  • What exactly is a reentrant function?

    - by eSKay
    Most of the times, the definition of reentrance is quoted from Wikipedia: A computer program or routine is described as reentrant if it can be safely called again before its previous invocation has been completed (i.e it can be safely executed concurrently). To be reentrant, a computer program or routine: Must hold no static (or global) non-constant data. Must not return the address to static (or global) non-constant data. Must work only on the data provided to it by the caller. Must not rely on locks to singleton resources. Must not modify its own code (unless executing in its own unique thread storage) Must not call non-reentrant computer programs or routines. How is safely defined? If a program can be safely executed concurrently, does it always mean that it is reentrant? What exactly is the common thread between the six points mentioned that I should keep in mind while checking my code for reentrant capabilities? Also, Are all recursive functions reentrant? Are all thread-safe functions reentrant? Are all recursive and thread-safe functions reentrant? While writing this question, one thing comes to mind: Are the terms like reentrance and thread safety absolute at all i.e. do they have fixed concrete definations? For, if they are not, this question is not very meaningful. Thanks!

    Read the article

  • Finding the index of a list item in jQuery

    - by Tim Piele
    I have an unordered list of eight items. On page load the first five <li> have default thumbnail images in them and the 6th one has a 1px by 1px placeholder image with the ID of $('#last'). When a user inserts a new image it replaces the 'src' of $('#last') with their new image. It's not the most efficient way but it works. <ul> <li><img src="img1.png" /></li> <li><img src="img2.png" /></li> <li><img src="img3.png" /></li> <li><img src="img4.png" /></li> <li><img src="img5.png" /></li> <li><img src="1px.png" id="last"/></li> <li></li> <li></li> </ul> When the user adds a new image the ID of $('#last') is removed and I use each() to find the next empty <li> and insert the 1px by 1px image in it, with an ID of $('#last') so it is ready for the next image upload. At this point I need to get the index() of the <li> that now has the 1px by 1px image in it, whose ID is $('#last'), so that I can store the index in the session, so when a user comes back to the page the $('#last') ID is still set and ready to accept another image. How do I get the index of the <li> with that image in it, since it was set after page load? Is there a way to use delegate() or on() to get it? i.e. how do I get the index of an element that was set after page load?

    Read the article

  • scanf segfaults and various other anomalies inside while loop

    - by Shadow
    while(1){ //Command prompt char *command; printf("%s>",current_working_directory); scanf("%s",command);<--seg faults after input has been received. printf("\ncommand:%s\n",command); } I am getting a few different errors and they don't really seem reproducible(except for the segfault at this point .<). This code worked fine about 10 minutes ago, then it infinite looped the printf command and now it seg faults on the line mentioned above. The only thing I changed was scanf("%s",command); to what it currently is. If I change the command variable to be an array it works, obviously this is because the storage is set aside for it. 1) I got prosecuted about telling someone that they needed to malloc a pointer* (But that usually seems to solve the problem such as making it an array) 2) the command I am entering is "magic" 5 characters so there shouldn't be any crazy stack overflow. 3) I am running on mac OSX 10.6 with newest version of xCode(non-OS4) and standard gcc 4) this is how I compile the program: gcc --std=c99 -W sfs.c Just trying to figure out what is going on. Being this is for a school project I am never going to have to see again, I will just code some noob work around that would make my boss cry :) But for afterwards I would love to figure out why this is happening and not just make some fix for it, and if there is some fix for it why that fix works.

    Read the article

  • Bulletin board - Database optimisation

    - by andrew
    This question is a follow on from this Question The project and problem The project I am currently working on is a bulletin board for a large non-profit organisation. The bulletin board will be used to allow inter-office communication within the organisation. I am building the application and have been having trouble extracting the results that I need from my database because I don't think it is properly normalized and because of limitations in my knowledge of relational database theory and mysql. I would appreciate input into the design of the board in general and in particular, ways that the database structure can be improved to facilitate efficient queries and help me develop this application and future application faster Business Logic The bulletin board will be used in the following way Posting bulletins and responses to bulletins Employees or 'users' in offices around the country will be able to post messages to the bulletin board.Bulletins must be posted to a location and categorised- i'll call these "bulletins". Users will be able to post any number of replies to any one bulletin and users will be able to reply to their own bulletin - i'll call these 'replies'. Rating bulletins and replies Users will be able to either 'like' or 'dislike' a bulletin or a reply and the total number of likes or dislikes will be shown for each bulletin or reply. Viewing the bulletin board and responses Bulletins can be displayed chronologically. Users can sort bulletins chronologically or chronologically by the latest reply to that bulletin(let me know if you need more explanation) When a particular bulletin is selected, replies to that bulletin will be displayed chronologically @PerformanceDBA - edited 10:34 est 28/12/10I have begun implementing the data model. I assume that the 6th data model is the physical model because it contains the associative tables. I am going to post any questions that I have below. I will put up a database dump once I am done. I will then put up a list of all the queries that I need to run on the database and begin writing them. I hope you had a good Christmas. I'm in Canada and there's snow! Implementation of Physical model

    Read the article

  • How to catch YouTube embed code and turn into URL

    - by Jonathan Vanasco
    I need to strip YouTube embed codes down to their URL only. This is the exact opposite of all but one question on StackOverflow. Most people want to turn the URL into an embed code. This question addresses the usage patttern I want, but is tied to a specific embed code's regex ( Strip YouTube Embed Code Down to URL Only ) I'm not familiar with how YouTube has offered embeds over the years - or how the sizes differ. According to their current site, there are 2 possible embed templates and a variety of options. If that's it, I can handle a regex myself -- but I was hoping someone had more knowledge they could share, so I could write a proper regex pattern that matches them all and not run into endless edge-cases. The full use case scenario : user enters content in web based wysiwig editor backend cleans out youtube & other embed codes; reformats approved embeds into an internal format as the text is all converted to markdown. on display, appropriate current template/code display for youtube or other 3rd party site is generated At a previous company, our tech-team devised a plan where YouTube videos were embedded by listing the URL only. That worked great , but it was in a CMS where everyone was trained. I'm trying to create a similar storage, but for user-generated-content.

    Read the article

  • help me reason about F# threads

    - by Kevin Cantu
    In goofing around with some F# (via MonoDevelop), I have written a routine which lists files in a directory with one thread: let rec loop (path:string) = Array.append ( path |> Directory.GetFiles ) ( path |> Directory.GetDirectories |> Array.map loop |> Array.concat ) And then an asynchronous version of it: let rec loopPar (path:string) = Array.append ( path |> Directory.GetFiles ) ( let paths = path |> Directory.GetDirectories if paths <> [||] then [| for p in paths -> async { return (loopPar p) } |] |> Async.Parallel |> Async.RunSynchronously |> Array.concat else [||] ) On small directories, the asynchronous version works fine. On bigger directories (e.g. many thousands of directories and files), the asynchronous version seems to hang. What am I missing? I know that creating thousands of threads is never going to be the most efficient solution -- I only have 8 CPUs -- but I am baffled that for larger directories the asynchronous function just doesn't respond (even after a half hour). It doesn't visibly fail, though, which baffles me. Is there a thread pool which is exhausted? How do these threads actually work?

    Read the article

  • Database contents setting themselves to 0

    - by Luis Armando
    I have a Database that contains 4 tables, however I'm using 1 of them which is separated from the others. In this table I have 4 fields which are varchar and the rest are ints (11 other fields), when the users fill up the DB everything gets saved correctly, however it has happened 3 times so far that the database values for the int's reset to 0 without any apparent reason. At first, I thought, it was because those fields (where the numbers should go) were varchars not ints. However since I changed it, it happened again. I've already double checked my code and I have nothing that even updates or inserts a 0 value. Also I'm using codeigniter and active records which protect against SQL injections AND have XSS filtering enabled, could anyone point out something I might be missing or a reason for this to be happening? Also, I'm pretty sure about the answer of this but, is there ANY way to recover some data?? Other than having to ask everyone to fill in everything again.. =/ ** EDIT ** The Storage Engine is MyISAM and Collation is latin1_swedish_ci, Pack Keys are default, for all intents and purposes it's a normal DB

    Read the article

  • What is the minimal licensable source code?

    - by Hernán Eche
    Let's suppose I want to "protect" this code about being used without attribution, patenting it, or through any open source licence... #include<stdio.h> int main (void) { int version=2; printf("\r\n.Hello world, ver:(%d).", version); return 0; } It's a little obvious or just a language definition example.. When a source stop being "trivial, banal, commonplace, obvious", and start to be something that you may claim "rights"? Perhaps it depends on who read it, something that could be great geniality for someone that have never programmed, could be just obvious for an expert. It's easy when watching two sources there are 10000 same lines of code, that's a theft.. but that's not always so obvious. How to measure amount of "ownness", it's about creativity? line numbers? complexity? I can't imagine objetive answers for that, only some patches. For example perhaps the complexity, It's not fair to replace "years of engeneering" with "copy and paste". But is there any objetive index for objetive determination of this subject? (In a funny way I imagine this criterion: If the licence is longer than the code, then there is no owner, just to punish not caring storage space and world resources =P)

    Read the article

  • Windows 7 versus Windows XP multithreading - Delphi app not acting right

    - by Robert Oschler
    I'm having a problem with a Delphi Pro 6 application that I wrote on my Windows XP machine when it runs on Windows 7. I don't have Windows 7 to test yet and I'm trying to see if Windows 7 might be the source of the trouble. Is there a fundamental difference between the way Windows 7 handles threads compared to Windows XP? I am seeing things happen out of sequence in my error logs on Windows 7 and it's causing problems. For example, objects that should have been initialized are uninitialized when running on Windows 7, yet those objects are initialized on Windows XP by the time they are needed. Some questions: 1) Are there any core differences that could cause threads/processes to behave differently between the two operating system versions? 2) I know this next question may seem absurd, but does Windows 7 attempt to split/fork threads that aren't split/forked on Windows XP? 3) And lastly, are there any known issues with FPU handling that can cause XP programs trouble when run on Windows 7 due to operational differences in wait state handling or register storage, or perhaps something like Exception mask settings, etc? 4) Any 32-bit versus 64-bit issues that could be creating trouble here? -- roschler

    Read the article

  • Create new or update existing entity at one go with JPA

    - by Alex R
    A have a JPA entity that has timestamp field and is distinguished by a complex identifier field. What I need is to update timestamp in an entity that has already been stored, otherwise create and store new entity with the current timestamp. As it turns out the task is not as simple as it seems from the first sight. The problem is that in concurrent environment I get nasty "Unique index or primary key violation" exception. Here's my code: // Load existing entity, if any. Entity e = entityManager.find(Entity.class, id); if (e == null) { // Could not find entity with the specified id in the database, so create new one. e = entityManager.merge(new Entity(id)); } // Set current time... e.setTimestamp(new Date()); // ...and finally save entity. entityManager.flush(); Please note that in this example entity identifier is not generated on insert, it is known in advance. When two or more of threads run this block of code in parallel, they may simultaneously get null from entityManager.find(Entity.class, id) method call, so they will attempt to save two or more entities at the same time, with the same identifier resulting in error. I think that there are few solutions to the problem. Sure I could synchronize this code block with a global lock to prevent concurrent access to the database, but would it be the most efficient way? Some databases support very handy MERGE statement that updates existing or creates new row if none exists. But I doubt that OpenJPA (JPA implementation of my choice) supports it. Event if JPA does not support SQL MERGE, I can always fall back to plain old JDBC and do whatever I want with the database. But I don't want to leave comfortable API and mess with hairy JDBC+SQL combination. There is a magic trick to fix it using standard JPA API only, but I don't know it yet. Please help.

    Read the article

  • C#. How to pass message from unsafe callback to managed code?

    - by maxima120
    Is there a simple example of how to pass messages from unsafe callback to managed code? I have a proprietary dll which receives some messages packed in structs and all is coming to a callback function. The example of usage is as follows but it calls unsafe code too. I want to pass the messages into my application which is all managed code. *P.S. I have no experience in interop or unsafe code. I used to develop in C++ 8 yrs ago but remember very little from that nightmarish times :) P.P.S. The application is loaded as hell, the original devs claim it processes 2mil messages per sec.. I need a most efficient solution.* static unsafe int OnCoreCallback(IntPtr pSys, IntPtr pMsg) { // Alias structure pointers to the pointers passed in. CoreSystem* pCoreSys = (CoreSystem*)pSys; CoreMessage* pCoreMsg = (CoreMessage*)pMsg; // message handler function. if (pCoreMsg->MessageType == Core.MSG_STATUS) OnCoreStatus(pCoreSys, pCoreMsg); // Continue running return (int)Core.CALLBACKRETURN_CONTINUE; } Thank you.

    Read the article

  • Which MySQL Fork/Version to Pick??

    - by Drew
    As most of you know, Sun acquired MySQL (and later Oracle acquired Sun), and during these acquisitions, there were a lot of FUD in MySQL community which resulted in creation of various forks. Today we have MySQL from MySQL, Percona (XtraDB) MySQL, OurDelta MySQL, MariaDB, Drizzle to name a few. Which brings us to the source of the problem. We are in the process of upgrading our databases (hardware/software) and I would like to know which one of the forks should I go with. Each has their own set of pros/cons. We are currently using MySQL 5.0.x from MySQL/Linux on an 8-core machine. Our new hardware is a monster with 32 cores and 32GB of memory connecting to a fast NetApp Storage via FC. I would like to stick with MySQL from MySQL but I have heard horror stories on how badly MySQL 5.1 performs on many cores. I have also heard that MySQL 5.4 performs better on multi-core machines but that's still not production ready. In addition, I have also heard a lot of good things about Percona builds. This is what I know so far: MySQL 5.1 from MySQL: Reliable choice, but doesn't scale well on a big machine Percona: Scales well, good backing company. I don't have much experience with it MariaDB: Don't know much about it besides that it was founded by Original MySQL developers (including Monty) OurDelta: Don't know much Drizzle: Mostly optimized for cloud computing I would like to know what's the general notion about this problem. Which build/version should I go with? How are you guys picking your builds/versions? Thanks!

    Read the article

  • Are there any libraries for parsing "number expressions" like 1,2-9,33- in Java

    - by mihi
    Hi, I don't think it is hard, just tedious to write: Some small free (as in beer) library where I can put in a String like 1,2-9,33- and it can tell me whether a given number matches that expression. Just like most programs have in their print range dialogs. Special functions for matching odd or even numbers only, or matching every number that is 2 mod 5 (or something like that) would be nice, but not needed. The only operation I have to perform on this list is whether the range contains a given (nonnegative) integer value; more operations like max/min value (if they exist) or an iterator would be nice, of course. What would be needed that it does not occupy lots of RAM if anyone enters 1-10000000 but the only number I will ever query is 12345 :-) (To implement it, I would parse a list into several (min/max/value/mod) pairs, like 1,10,0,1 for 1-10 or 11,33,1,2 for 1-33odd, or 12,62,2,10 for 12-62/10 (i. e. 12, 22, 32, ..., 62) and then check each number for all the intervals. Open intervals by using Integer.MaxValue etc. If there are no libs, any ideas to do it better/more efficient?)

    Read the article

  • N-gram split function for string similarity comparison

    - by Michael
    As part of excersise to better understand F# which I am currently learning , I wrote function to split given string into n-grams. 1) I would like to receive feedback about my function : can this be written simpler or in more efficient way? 2) My overall goal is to write function that returns string similarity (on 0.0 .. 1.0 scale) based on n-gram similarity; Does this approach works well for short strings comparisons , or can this method reliably be used to compare large strings (like articles for example). 3) I am aware of the fact that n-gram comparisons ignore context of two strings. What method would you suggest to accomplish my goal? //s:string - target string to split into n-grams //n:int - n-gram size to split string into let ngram_split (s:string, n:int) = let ngram_count = s.Length - (s.Length % n) let ngram_list = List.init ngram_count (fun i -> if( i + n >= s.Length ) then s.Substring(i,s.Length - i) + String.init ((i + n) - s.Length) (fun i -> "#") else s.Substring(i,n) ) let ngram_array_unique = ngram_list |> Seq.ofList |> Seq.distinct |> Array.ofSeq //produce tuples of ngrams (ngram string,how much occurrences in original string) Seq.init ngram_array_unique.Length (fun i -> (ngram_array_unique.[i], ngram_list |> List.filter(fun item -> item = ngram_array_unique.[i]) |> List.length) )

    Read the article

  • Parsing multiple files at a time in Perl

    - by sfactor
    I have a large data set (around 90GB) to work with. There are data files (tab delimited) for each hour of each day and I need to perform operations in the entire data set. For example, get the share of OSes which are given in one of the columns. I tried merging all the files into one huge file and performing the simple count operation but it was simply too huge for the server memory. So, I guess I need to perform the operation each file at a time and then add up in the end. I am new to perl and am especially naive about the performance issues. How do I do such operations in a case like this. As an example two columns of the file are. ID OS 1 Windows 2 Linux 3 Windows 4 Windows Lets do something simple, counting the share of the OSes in the data set. So, each .txt file has millions of these lines and there are many such files. What would be the most efficient way to operate on the entire files.

    Read the article

< Previous Page | 315 316 317 318 319 320 321 322 323 324 325 326  | Next Page >