Search Results

Search found 18246 results on 730 pages for 'language processing'.

Page 145/730 | < Previous Page | 141 142 143 144 145 146 147 148 149 150 151 152  | Next Page >

  • Non-Relational DBMS Design Resources

    - by Matt Luongo
    Hey guys, As a personal project, I'm looking to build a rudimentary DBMS. I've read the relevant sections in Elmasri & Navathe (5ed), but could use a more focused text. The rub is that I want to play with novel non-relational data models. While a lot of E&N was great- indexing implementation details in particular- the more advanced DBMS implementation was only targeted to a relational model. I could also use something a bit more practical and detail-oriented, with real-world recommendations. I'd like to defer staring at DBMS source for a while if I can. Any ideas?

    Read the article

  • Extract all text from a HTML page without losing context

    - by grmbl
    For a translation program I am trying to get a 95% accurate text from a HTML file in order to translate the sentences and links. For example: <div><a href="stack">Overflow</a> <span>Texts <b>go</b> here</span></div> Should give me 2 results to translate: Overflow Texts <b>go</b> here Any suggestions or commercial packages available for this problem?

    Read the article

  • Where do you start your design - code, UI or workflow?

    - by Mmarquee
    Hi I was discussing this at work, and was wondering where people start their designs? We tend to start with designing code to solve the problem presented to us, but that is probably all of us are (or were) programmers. I was wondering where other people and organisations start their design. Do they start with solving the problem as a coding problem, sit down and design what UI to use, or map out the data or workflow? Thanks

    Read the article

  • How to choose the right web application framework?

    - by thenextwebguy
    http://en.wikipedia.org/wiki/Comparison_of_web_application_frameworks Since we are ambitiously aiming to be big, scalability is important, and so are globalization features. Since we are starting out without funding, price/performance and cost of licences/hardware is important. We definitely want to bring AJAX well present in the web interface. But apart from these, there's no further criteria I can come up with. I'm most experienced with C#/ASP.net, PHP and Java, in that order, but don't turn down other languages (Ruby, Python, Scala, etc.). How can we determine from the jungle of frameworks the one that suits best our goal? What other questions should we be asking ourselves? Reference material: articles, book recommendations, websites, etc.?

    Read the article

  • Need some help understanding this problem about maximizing graph connectivity

    - by Legend
    I was wondering if someone could help me understand this problem. I prepared a small diagram because it is much easier to explain it visually. Problem I am trying to solve: 1. Constructing the dependency graph Given the connectivity of the graph and a metric that determines how well a node depends on the other, order the dependencies. For instance, I could put in a few rules saying that node 3 depends on node 4 node 2 depends on node 3 node 3 depends on node 5 But because the final rule is not "valuable" (again based on the same metric), I will not add the rule to my system. 2. Execute the request order Once I built a dependency graph, execute the list in an order that maximizes the final connectivity. I am not sure if this is a really a problem but I somehow have a feeling that there might exist more than one order in which case, it is required to choose the best order. First and foremost, I am wondering if I constructed the problem correctly and if I should be aware of any corner cases. Secondly, is there a closely related algorithm that I can look at? Currently, I am thinking of something like Feedback Arc Set or the Secretary Problem but I am a little confused at the moment. Any suggestions? PS: I am a little confused about the problem myself so please don't flame on me for that. If any clarifications are needed, I will try to update the question.

    Read the article

  • Minimizing the number of boxes that cover a given set of intervals

    - by fortran
    Hi, this is a question for the algorithms gurus out there :-) Let S be a set of intervals of the natural numbers that might overlap and b a box size. Assume that for each interval, the range is strictly less than b. I want to find the minimum set of intervals of size b (let's call it M) so all the intervals in S are contained in the intervals of M. Trivial example: S = {[1..4], [2..7], [3..5], [8..15], [9..13]} b = 10 M = {[1..10], [8..18]} // so ([1..4], [2..7], [3..5]) are inside [1..10] and ([8..15], [9..13]) are inside [8..18] I think a greedy algorithm might not work always, so if anybody knows of a solution of this problem (or a similar one that can be converted into), that would be great. Thanks!

    Read the article

  • Random access gzip stream

    - by jkff
    I'd like to be able to do random access into a gzipped file. I can afford to do some preprocessing on it (say, build some kind of index), provided that the result of the preprocessing is much smaller than the file itself. Any advice? My thoughts were: Hack on an existing gzip implementation and serialize its decompressor state every, say, 1 megabyte of compressed data. Then to do random access, deserialize the decompressor state and read from the megabyte boundary. This seems hard, especially since I'm working with Java and I couldn't find a pure-java gzip implementation :( Re-compress the file in chunks of 1Mb and do same as above. This has the disadvantage of doubling the required disk space. Write a simple parser of the gzip format that doesn't do any decompressing and only detects and indexes block boundaries (if there even are any blocks: I haven't yet read the gzip format description)

    Read the article

  • What exactly are administrative redexes after CPS conversion?

    - by eljenso
    In the context of Scheme and CPS conversion, I'm having a little trouble deciding what administrative redexes (lambdas) exactly are: all the lambda expressions that are introduced by the CPS conversion only the lambda expressions that are introduced by the CPS conversion but you wouldn't have written if you did the conversion "by hand" or through a smarter CPS-converter If possible, a good reference would be welcome.

    Read the article

  • What programming screencasts/podcast resources do you know?

    - by Ricky AH
    Just as the title says, if you know any resource, answer here. Personally I'm more0interested in screencasts more than podcasts, because english is not my mother tonge, so visual clues help a lot: NetBeans TV Screencasts DimeCasts.NET Apple Developer Connection (iTunes) --- Suggested by the community .NET Rocks dnrTV Channel9 MSDN Events and WebCasts Software Engineering DeepFries RailCasts Learnivore! HanselMinutes ThinkCode

    Read the article

  • Most hazardous performance bottleneck misconceptions

    - by David Murdoch
    The guys who wrote Bespin (cloud-based canvas-based code editor [and more]) recently spoke about how they re-factored and optimize a portion of the Bespin code because of a misconception that JavaScript was slow. It turned out that when all was said and done, their optimization produced no significant improvements. I'm sure many of us go out of our way to write "optimized" code based on misconceptions similar to that of the Bespin team. What are some common performance bottleneck misconceptions developers commonly subscribe to?

    Read the article

  • Humor in code

    - by pfranza
    When you are writing code or naming products, which sources of cultural references are you most likely to draw from? Which reference sources do you think are more likely to be universally understood? For example when findbugs sees that you've implemented equals() without overriding hashCode() it suggest that you implement it by returning 42 (a reference from HHGTTG) Or why we have big endian vs little endian encoding, referencing Gulliver's Travels Not that we should act unprofessionally with our code, but if you going to tell a person that they could only (watch/read/...) one (book/movie/show/...) which one would allow them to 'get' the most jokes?

    Read the article

  • How to identify ideas and concepts in a given text

    - by Nick
    I'm working on a project at the moment where it would be really useful to be able to detect when a certain topic/idea is mentioned in a body of text. For instance, if the text contained: Maybe if you tell me a little more about who Mr Balzac is, that would help. It would also be useful if I could have a description of his appearance, or even better a photograph? It'd be great to be able to detect that the person has asked for a photograph of Mr Balzac. I could take a really naïve approach and just look for the word "photo" or "photograph", but this would obviously be no good if they wrote something like: Please, never send me a photo of Mr Balzac. Does anyone know where to start with this? Is it even possible? I've looked into things like nltk, but I've yet to find an example of someone doing something similar and am still not entirely sure what this kind of analysis is called. Any help that can get me off the ground would be great. Thanks!

    Read the article

  • Literature and Tutorials for Writing a Ray Tracer

    - by grrussel
    I am interested in finding recommendations on books on writing a raytracer, simple and clear implementations of ray tracing that can be seen on the web, and online resources on introductory raytracing. Ideally, the approach would be incremental and tutorial in style, and explain both the programming techniques and underyling mathematics, starting from the basics.

    Read the article

  • Drawing Directed Acyclic Graphs: Minimizing edge crossing?

    - by Robert Fraser
    Laying out the verticies in a DAG in a tree form (i.e. verticies with no in-edges on top, verticies dependent only on those on the next level, etc.) is rather simple without graph drawing algorithms such as Efficient Sugimiya. However, is there a simple algorithm to do this that minimizes edge crossing? (For some graphs, it may be impossible to completely eliminate edge crossing.) A picture says a thousand words, so is there an algorithm that would suggest: instead of: EDIT: As the picture suggests, a vertex's inputs are always on top and outputs are always below, which is another barrier to just pasting in an existing layout algorithm.

    Read the article

  • What is the most stupid coded solution you have read/improved/witnessed?

    - by Rigo Vides
    And for stupid I mean Illogical, non-effective, complex(the bad way), ugly code style. I will start: We had a requirement there when we needed to hide certain objects given the press of a button. So this framework we were using at the time provided a way to tag objects and retrieve all the objects with a certain tag in a complete iterable collection. So I presented the most logically solution given these conditions to my partner: Me: you know, tag all the objects we needed to hide with the same tag, then call the function to get them all, iterate trough them and make them hidden. Partner: I don't know, that is hardcoding for me... Me: So what do you suggest? 20 mins later... Partner: I don't know... let's put a tag to all the objects to be hidden like this, 1, 2, 3, 4, 5, 6, 7 (and so for each object to be hidden), Then we make a for from 1 to n (where n was the number of objects to hide) and we hide them all there!

    Read the article

  • Is it possible to do A/B testing by page rather than by individual?

    - by mojones
    Lets say I have a simple ecommerce site that sells 100 different t-shirt designs. I want to do some a/b testing to optimise my sales. Let's say I want to test two different "buy" buttons. Normally, I would use AB testing to randomly assign each visitor to see button A or button B (and try to ensure that that the user experience is consistent by storing that assignment in session, cookies etc). Would it be possible to take a different approach and instead, randomly assign each of my 100 designs to use button A or B, and measure the conversion rate as (number of sales of design n) / (pageviews of design n) This approach would seem to have some advantages; I would not have to worry about keeping the user experience consistent - a given page (e.g. www.example.com/viewdesign?id=6) would always return the same html. If I were to test different prices, it would be far less distressing to the user to see different prices for different designs than different prices for the same design on different computers. I also wonder whether it might be better for SEO - my suspicion is that Google would "prefer" that it always sees the same html when crawling a page. Obviously this approach would only be suitable for a limited number of sites; I was just wondering if anyone has tried it?

    Read the article

  • Last words of a ??? programmer

    - by Peter
    What will the last words of some kind of programmer be? Like: LW of a Perl programmer: I don't have to write documentation. The source is formatted so well, I can read it anytime later... or Im just going to write a regular expression to find this, then I'm done...

    Read the article

  • What's the standard algorithm for syncing two lists of objects?

    - by Oliver Giesen
    I'm pretty sure this must be in some kind of text book (or more likely in all of them) but I seem to be using the wrong keywords to search for it... :( A common task I'm facing while programming is that I am dealing with lists of objects from different sources which I need to keep in sync somehow. Typically there's some sort of "master list" e.g. returned by some external API and then a list of objects I create myself each of which corresponds to an object in the master list. Sometimes the nature of the external API will not allow me to do a live sync: For instance the external list might not implement notifications about items being added or removed or it might notify me but not give me a reference to the actual item that was added or removed. Furthermore, refreshing the external list might return a completely new set of instances even though they still represent the same information so simply storing references to the external objects might also not always be feasible. Another characteristic of the problem is that both lists cannot be sorted in any meaningful way. You should also assume that initializing new objects in the "slave list" is expensive, i.e. simply clearing and rebuilding it from scratch is not an option. So how would I typically tackle this? What's the name of the algorithm I should google for? In the past I have implemented this in various ways (see below for an example) but it always felt like there should be a cleaner and more efficient way. Here's an example approach: Iterate over the master list Look up each item in the "slave list" Add items that do not yet exist Somehow keep track of items that already exist in both lists (e.g. by tagging them or keeping yet another list) When done iterate once more over the slave list Remove all objects that have not been tagged (see 4.) Update Thanks for all your responses so far! I will need some time to look at the links. Maybe one more thing worthy of note: In many of the situations where I needed this the implementation of the "master list" is completely hidden from me. In the most extreme cases the only access I might have to the master list might be a COM-interface that exposes nothing but GetFirst-, GetNext-style methods. I'm mentioning this because of the suggestions to either sort the list or to subclass it both of which is unfortunately not practical in these cases unless I copy the elements into a list of my own and I don't think that would be very efficient. I also might not have made it clear enough that the elements in the two lists are of different types, i.e. not assignment-compatible: Especially, the elements in the master list might be available as interface references only.

    Read the article

  • Avoiding problem of overwriting files which are in use

    - by zaf
    For example on a high traffic web server. To reduce problems when switching a file I usually rename the old file out and then rename in the new file. I was told some time ago that renaming a file does not change the 'inode data' so that processes reading the file can keep doing so without glitches. And, of course, rather than copying in the new file it is faster and safer to rename a temp copy. Is this still best practice and if not what do you do?

    Read the article

  • What kinds of problems are most likely to occur? (question rewritten)

    - by ChrisC
    If I wrote 1) a C# SQL db app (a simple program consisting of a gui over some forms with logic for interfacing with the sql db) 2) for home use, that doesn't do any network communication 3) that uses a simple, reliable, and appropriate sql db 4) whose gui is properly separated from the logic 5) that has complete and dependable input data validation 6) that has been completely tested so that 100% of logic bugs were eliminated ... and then if the program was installed and run by random users on their random Windows computers Q1) What types of technical (non-procedural) problems and support situations are most likely to occur, and how likely are they? Q2) Are there more/other things I could do in the first place to prevent those problems and also minimize the amount of user support required? I know some answers will apply to my specific platforms (C#, SQL, Windows, etc) and some won't. Please be as specific as is possible. Mitch Wheat gave me some very valuable advice below, but I'm now offering the bounty because I am hoping to get a better picture of the kinds of things that I'm most reasonably likely to encounter. Thanks.

    Read the article

< Previous Page | 141 142 143 144 145 146 147 148 149 150 151 152  | Next Page >