Search Results

Search found 34840 results on 1394 pages for 'sublime text 3'.

Page 265/1394 | < Previous Page | 261 262 263 264 265 266 267 268 269 270 271 272  | Next Page >

  • What algorithms can I use to detect if articles or posts are duplicates?

    - by michael
    I'm trying to detect if an article or forum post is a duplicate entry within the database. I've given this some thought, coming to the conclusion that someone who duplicate content will do so using one of the three (in descending difficult to detect): simple copy paste the whole text copy and paste parts of text merging it with their own copy an article from an external site and masquerade as their own Prepping Text For Analysis Basically any anomalies; the goal is to make the text as "pure" as possible. For more accurate results, the text is "standardized" by: Stripping duplicate white spaces and trimming leading and trailing. Newlines are standardized to \n. HTML tags are removed. Using a RegEx called Daring Fireball URLs are stripped. I use BB code in my application so that goes to. (ä)ccented and foreign (besides Enlgish) are converted to their non foreign form. I store information about each article in (1) statistics table and in (2) keywords table. (1) Statistics Table The following statistics are stored about the textual content (much like this post) text length letter count word count sentence count average words per sentence automated readability index gunning fog score For European languages Coleman-Liau and Automated Readability Index should be used as they do not use syllable counting, so should produce a reasonably accurate score. (2) Keywords Table The keywords are generated by excluding a huge list of stop words (common words), e.g., 'the', 'a', 'of', 'to', etc, etc. Sample Data text_length, 3963 letter_count, 3052 word_count, 684 sentence_count, 33 word_per_sentence, 21 gunning_fog, 11.5 auto_read_index, 9.9 keyword 1, killed keyword 2, officers keyword 3, police It should be noted that once an article gets updated all of the above statistics are regenerated and could be completely different values. How could I use the above information to detect if an article that's being published for the first time, is already existing within the database? I'm aware anything I'll design will not be perfect, the biggest risk being (1) Content that is not a duplicate will be flagged as duplicate (2) The system allows the duplicate content through. So the algorithm should generate a risk assessment number from 0 being no duplicate risk 5 being possible duplicate and 10 being duplicate. Anything above 5 then there's a good possibility that the content is duplicate. In this case the content could be flagged and linked to the article's that are possible duplicates and a human could decide whether to delete or allow. As I said before I'm storing keywords for the whole article, however I wonder if I could do the same on paragraph basis; this would also mean further separating my data in the DB but it would also make it easier for detecting (2) in my initial post. I'm thinking weighted average between the statistics, but in what order and what would be the consequences...

    Read the article

  • iFrame ads from site to site

    - by user28327
    How I can make the iFrame ads from site A to site B not like this: <iframe src="http://www.site.com"></iframe> Example: site A <script type="text/javascript"><!-- google_ad_client = "ca-pub-3434343507"; /* site */ google_ad_slot = "343435270"; google_ad_width = 728; google_ad_height = 90; //--> </script> <script type="text/javascript" src="http://pagead2.googlesyndication.com/pagead/show_ads.js"> </script> site B --iFrame-- <script type="text/javascript"><!-- google_ad_client = "ca-pub-3434343507"; /* site */ google_ad_slot = "343435270"; google_ad_width = 728; google_ad_height = 90; //--> </script> <script type="text/javascript" src="http://pagead2.googlesyndication.com/pagead/show_ads.js"> </script> --iFrame--

    Read the article

  • Cleaning a dataset of song data - what sort of problem is this?

    - by Rob Lourens
    I have a set of data about songs. Each entry is a line of text which includes the artist name, song title, and some extra text. Some entries are only "extra text". My goal is to resolve as many of these as possible to songs on Spotify using their web API. My strategy so far has been to search for the entry via the API - if there are no results, apply a transformation such as "remove all text between ( )" and search again. I have a list of heuristics and I've had reasonable success with this but as the code gets more and more convoluted I keep thinking there must be a more generic and consistent way. I don't know where to look - any suggestions for what to try, topics to study, buzzwords to google?

    Read the article

  • Execute command from unity lens

    - by Vishnu V
    I am trying to create a unity lens, how to execute a command when we select an entry from unity-lens in the following code results.append(url, icon, category, mime-type, text, comment, drag and drop url) i tried to set file://, but it opens the file with text editor (if it is not readable with text editor it do nothing) Please help Thank you Vishnu V

    Read the article

  • How does Google+ embed link works?

    - by Drake
    I am trying to understand how does embed link works in Google+ Stream. I manage a site and usually every post contains some text and at least an image. But when I try to embed the link to a certain article on my blog the embed feature does not show the article text and images, but instead it shows the header image of my site plus some other header text content. Do you know how should I tag or code the article part so that Google+ parser recognize the right thing I want to show?

    Read the article

  • Tools for game script / storyboard

    - by Pietro Polsinelli
    I am searching for a tool that will help in writing a game script. By "script" I mean the text core of a storyboard - without the drawing drafts, which may or may not be there (yet). What I'm thinking of will let write a piece of text of the script, define a simplified workflow from that step, and then define the text of next steps, and so on. Searching online, I found Inform http://inform7.com/ ("A Design System for Interactive Fiction Based on Natural Language") which in theory is exactly what I am searching for, but trying to use it it has this model of a space (a dungeon, a library) where you are picking up objects and exploring them. In my case I am designing more a Sims like game, the flow is entirely different. Considering non specific software, mind mapping tools miss the linearity of the process. What I am writing is a directed graph - simply a work-flow, but the way I want to design it is more text based than work-flow based. SO what I'm doing now is using a text editor, which I'll transform directly in code. Any suggestions?

    Read the article

  • how do I setup Apache's Content-Encoding Header?

    - by Nick
    When attempting to validate my site with the W3C validator, it returns the error, "Don't know how to decode Content-Encoding 'none'". Firebug confirms that my server is sending the header, "Content-Encoding: none". But I can't find any directive in apache2.conf or in my vhost that sets the Content-Encoding header. Where does the directive go, and what should it be set to? UPDATE: On further examination it seems something is wrong with mod_deflate (gzip). It's zipping my css files just fine, but is not zipping the html generated by my php scripts. I have: AddOutputFilterByType DEFLATE text/html text/plain text/xml text/css And the pages are showing a mime type of: "text/html". But content encoding is "none" and they aren't zipping. Perhaps these issues are related?

    Read the article

  • Long string insertion with sed

    - by Luis Varca
    I am trying to use this expression to insert the contents of one text file into another after a give string. This is a simple bash script: TEXT=`cat file1.txt` sed -i "/teststring/a \ $TEXT" file2.txt This returns an error, "sed: -e expression #1, char 37: unknown command: `M'" The issue is in the fact that the contents of file1.txt are actually a private certificate so it's a large amount of text and unusual characters which seems to be causing an issue. If I replace $TEXT with a simple ASCII value it works but when it reads the large content of file1.txt it fails with that error. Is there some way to carry out this action? Is my syntax off with sed or my quote placement wrong?

    Read the article

  • Creating Rectangle-based buttons with OnClick events

    - by Djentleman
    As the title implies, I want a Button class with an OnClick event handler. It should fire off connected events when it is clicked. This is as far as I've made it: public class Button { public event EventHandler OnClick; public Rectangle Rec { get; set; } public string Text { get; set; } public Button(Rectangle rec, string text) { this.Rec = rec; this.Text = text; } } I have no clue what I'm doing with regards to events. I know how to use them but creating them myself is another matter entirely. I've also made buttons without using events that work on a case-by-case basis. So basically, I want to be able to attach methods to the OnClick EventHandler that will fire when the Button is clicked (i.e., the mouse intersects Rec and the left mouse button is clicked).

    Read the article

  • Dojo and Separate JavaScript File

    - by Bunch
    For a project I needed to use the ArcGIS API for some mapping. To use this you need to use Dojo but in this case all it really comes down to is adding some require lines and a addOnLoad on your web page. At first everything was working great, the maps rendered and the various layers would populate as needed. Once it was working I started moving the various javascript functions into their own files to keep everything nice and neat. Then the problems started, mainly the map would not show up any more. So that was a pretty big problem. Luckily the fix was pretty simple, just move the dojo.addOnLoad line into it’s own script tag. If I had the dojo.addOnLoad in the same script block as the various require lines it would not work as expected. Works: <script type="text/javascript" language="javascript" src="javascript/test.js" />     <script type="text/javascript">       dojo.require("esri.map");       dojo.require("esri.tasks.locator");       dojo.require("esri.tasks.query");       dojo.require("esri.tasks.geometry");  </script>  <script type="text/javascript">      dojo.addOnLoad(init);  </script> Does not work: <script type="text/javascript" language="javascript" src="javascript/test.js" /> <script type="text/javascript">       dojo.require("esri.map");       dojo.require("esri.tasks.locator");       dojo.require("esri.tasks.query");       dojo.require("esri.tasks.geometry");       dojo.addOnLoad(init); </script> Technorati Tags: JavaScript,Dojo

    Read the article

  • Meta package / quick reference for command line string manipulation tools?

    - by Dylan McCall
    The latest version of the Scribes text editor lets us select some text, hit Alt+X, and then run an arbitrary command. For example, I can run the sort command and the selected text is replaced appropriately. This is quite useful but I am also not very well-versed in awk and the like. Is there something I can grab that will provide more of these commands like sort? Maybe a package with a whole bunch of handy, task-specific string manipulation commands?

    Read the article

  • How to change Content-Transfer-Encoding for email attachments?

    - by user1837811
    I have signed up for http://eudyptula-challenge.org/ challenge and this accepts email attachments only in simple text format. The files without extension (and also .zip file) are transferred with base64 encoding. I want to send email attachment in simple text. This is how my makefile is transferred :- Content-Type: text/plain; charset=UTF-8; name="Makefile" Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename="Makefile" b2JqLW0gKz0gaGVsbG8yLm8KCmFsbDoKCW1ha2UgLUMgL2xpYi9tb2R1bGVzLyQoc2hlbGwg dW5hbWUgLXIpL2J1aWxkIE09JChQV0QpIG1vZHVsZXMKCmNsZWFuOgoJbWFrZSAtQyAvbGli L21vZHVsZXMvJChzaGVsbCB1bmFtZSAtcikvYnVpbGQgTT0kKFBXRCkgY2xlYW4KCg== How to configure Thunderbird to send a attachment files in as simple text?? Or I should any other email server or tool to send email attachments in simple text??

    Read the article

  • Inserting new 'numbered item' on Word 2010

    - by MarceloRamires
    I have a very simple problem in Word 2010. I have a document with a Table of Contents, and I have the following items: 1. Title 1  [some text]   1.1 Title 1.1    [some text] I simply want to add an item 1.2. If I go at the end of Title 1.1 and press enter, an item 1.2 appears below it, but the text regarding item 1.1 stays below it all. I somehow used to be able to do it on word 2007, but I can't remember what I used to do, and before struggling in it for too long, I remembered SuperUser. Can someone answer this and maybe additionally link me to a tutorial on this ? Every one I find talks about having a text already numbered and adding a TOC in the beginning. I want to build the text all over the TOC.

    Read the article

  • Nginx + PHP-FPM on Centos 6.5 gives me 502 Bad Gateway (fpm error: unable to read what child say: Bad file descriptor)

    - by Latheesan Kanes
    I am setting up a standard LEMP stack. My current setup is giving me the following error: 502 Bad Gateway This is what is currently installed on my server: Here's the configurations I've created/updated so far, can some one take a look at the following and see where the error might be? I've already checked my logs, there's nothing in there (http://i.imgur.com/iRq3ksb.png). And I saw the following in /var/log/php-fpm/error.log file. sidenote: both the nginx and php-fpm has been configured to run under a local account called www-data and the following folders exits on the server nginx.conf global nginx configuration user www-data; worker_processes 6; worker_rlimit_nofile 100000; error_log /var/log/nginx/error.log crit; pid /var/run/nginx.pid; events { worker_connections 2048; use epoll; multi_accept on; } http { include /etc/nginx/mime.types; default_type application/octet-stream; # cache informations about FDs, frequently accessed files can boost performance open_file_cache max=200000 inactive=20s; open_file_cache_valid 30s; open_file_cache_min_uses 2; open_file_cache_errors on; # to boost IO on HDD we can disable access logs access_log off; # copies data between one FD and other from within the kernel # faster then read() + write() sendfile on; # send headers in one peace, its better then sending them one by one tcp_nopush on; # don't buffer data sent, good for small data bursts in real time tcp_nodelay on; # server will close connection after this time keepalive_timeout 60; # number of requests client can make over keep-alive -- for testing keepalive_requests 100000; # allow the server to close connection on non responding client, this will free up memory reset_timedout_connection on; # request timed out -- default 60 client_body_timeout 60; # if client stop responding, free up memory -- default 60 send_timeout 60; # reduce the data that needs to be sent over network gzip on; gzip_min_length 10240; gzip_proxied expired no-cache no-store private auth; gzip_types text/plain text/css text/xml text/javascript application/x-javascript application/xml; gzip_disable "MSIE [1-6]\."; # Load vHosts include /etc/nginx/conf.d/*.conf; } conf.d/www.domain.com.conf my vhost entry ## Nginx php-fpm Upstream upstream wwwdomaincom { server unix:/var/run/php-fcgi-www-data.sock; } ## Global Config client_max_body_size 10M; server_names_hash_bucket_size 64; ## Web Server Config server { ## Server Info listen 80; server_name domain.com *.domain.com; root /home/www-data/public_html; index index.html index.php; ## Error log error_log /home/www-data/logs/nginx-errors.log; ## DocumentRoot setup location / { try_files $uri $uri/ @handler; expires 30d; } ## These locations would be hidden by .htaccess normally #location /app/ { deny all; } ## Disable .htaccess and other hidden files location /. { return 404; } ## Magento uses a common front handler location @handler { rewrite / /index.php; } ## Forward paths like /js/index.php/x.js to relevant handler location ~ .php/ { rewrite ^(.*.php)/ $1 last; } ## Execute PHP scripts location ~ \.php$ { try_files $uri =404; expires off; fastcgi_read_timeout 900; fastcgi_pass wwwdomaincom; fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name; include fastcgi_params; } ## GZip Compression gzip on; gzip_comp_level 8; gzip_min_length 1000; gzip_proxied any; gzip_types text/plain application/xml text/css text/js application/x-javascript; } /etc/php-fpm.d/www-data.conf my php-fpm pool config ## Nginx php-fpm Upstream upstream wwwdomaincom { server unix:/var/run/php-fcgi-www-data.sock; } ## Global Config client_max_body_size 10M; server_names_hash_bucket_size 64; ## Web Server Config server { ## Server Info listen 80; server_name domain.com *.domain.com; root /home/www-data/public_html; index index.html index.php; ## Error log error_log /home/www-data/logs/nginx-errors.log; ## DocumentRoot setup location / { try_files $uri $uri/ @handler; expires 30d; } ## These locations would be hidden by .htaccess normally #location /app/ { deny all; } ## Disable .htaccess and other hidden files location /. { return 404; } ## Magento uses a common front handler location @handler { rewrite / /index.php; } ## Forward paths like /js/index.php/x.js to relevant handler location ~ .php/ { rewrite ^(.*.php)/ $1 last; } ## Execute PHP scripts location ~ \.php$ { try_files $uri =404; expires off; fastcgi_read_timeout 900; fastcgi_pass wwwdomaincom; fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name; include fastcgi_params; } ## GZip Compression gzip on; gzip_comp_level 8; gzip_min_length 1000; gzip_proxied any; gzip_types text/plain application/xml text/css text/js application/x-javascript; } I've got a file in /home/www-data/public_html/index.php with the code <?php phpinfo(); ?> (file uploaded as user www-data).

    Read the article

  • How do I stop color changes when quitting vi from a terminal emulator?

    - by Michael Warhol
    I have a problem with colors when using vi under Ubuntu 12.04. I'm connecting to my Ubuntu server from a PC, using PowerTerm terminal emulation software. I have PowerTerm set up to display black text on a grey background. When I connect to the Ubuntu box, the screen is fine. When I open a file with vi, the screen is fine. The text is black on a gray background, which is normal for my PowerTerm setup. However, if the file is less than a full screen long, the remainder of the screen is a black background. When I quit vi, the entire background turns black, and the text becomes white. I have to do a Terminal Reset to restore my normal text and background colors. What I want is for there to be no change at all when I use vi. The text should be black and the background grey. I have another server loaded with RedHat 9, and that acts normally; colors don’t change when using vi. Here is my .vimrc file: set compatible syntax off let g:loaded_matchparen=1 set nocp set noincsearch set nohlsearch set noshowmatch set bg=dark I've tried set bg=dark and set bg=light. It makes no difference. Is there some other set command that would clear this up for me, or some TERM setting (my TERM is set to linux)?

    Read the article

  • How to write regex in asp.net MVC 3 razor

    - by anirudha
    here is a small trick to write regex and exceptional code in MVC3. first trick is use @@ instead of @ it’s work don’ worry output goes @ not @@. second trick is that you can use <text></text> tag in MVC to use some code who give error because viewengine does not accept them. like var pattern = new RegExp(/^(("[\w-\s]+")|([\w-]+(?:\.[\w-]+)*)|("[\w-\s]+")([\w-]+(?:\.[\w-]+)*))(@((?:[\w-]+\.)*\w[\w-]{0,66})\.([a-z]{2,6}(?:\.[a-z]{2})?)$)|(@\[?((25[0-5]\.|2[0-4][0-9]\.|1[0-9]{2}\.|[0-9]{1,2}\.))((25[0-5]|2[0-4][0-9]|1[0-9]{2}|[0-9]{1,2})\.){2}(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[0-9]{1,2})\]?$)/i);   this regex not work because it’s use @ so replace them to @@ e var pattern = new RegExp(/^(("[\w-\s]+")|([\w-]+(?:\.[\w-]+)*)|("[\w-\s]+")([\w-]+(?:\.[\w-]+)*))(@@((?:[\w-]+\.)*\w[\w-]{0,66})\.([a-z]{2,6}(?:\.[a-z]{2})?)$)|(@@\[?((25[0-5]\.|2[0-4][0-9]\.|1[0-9]{2}\.|[0-9]{1,2}\.))((25[0-5]|2[0-4][0-9]|1[0-9]{2}|[0-9]{1,2})\.){2}(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[0-9]{1,2})\]?$)/i); now it will work. second way is that use text tag in MVC to make them work like <text> function isValidEmailAddress(emailAddress) { var pattern = new RegExp(/^(("[\w-\s]+")|([\w-]+(?:\.[\w-]+)*)|("[\w-\s]+")([\w-]+(?:\.[\w-]+)*))(@((?:[\w-]+\.)*\w[\w-]{0,66})\.([a-z]{2,6}(?:\.[a-z]{2})?)$)|(@\[?((25[0-5]\.|2[0-4][0-9]\.|1[0-9]{2}\.|[0-9]{1,2}\.))((25[0-5]|2[0-4][0-9]|1[0-9]{2}|[0-9]{1,2})\.){2}(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[0-9]{1,2})\]?$)/i); return pattern.test(emailAddress); } </text>

    Read the article

  • PDF form created in Libre Office - trouble with form fields and font sizing

    - by soawesomejohn
    I am trying to create a PDF Form using LibreOffice. I can create the form elements and export as PDF. However, the form fields are giving me problems. The text in these fields always centers on the bottom, and often the text you input is cut off at the bottom. I found that if I make the fields larger, the text no longer cuts off, but the field is exceptionally large with lots of space above the text. I have made an odt (source) and a pdf (export) file to show what I'm running into. I tried a number of different fonts and sizes, but to make things easier, I made the field names all "field1" so that once you fill out one entry, all fields show as filled in. http://ytnoc.net/files/sampleapp.odt http://ytnoc.net/files/sampleapp.pdf My main question is how do I make form fields that don't cut off the text without having to make the fields way oversized? Made with LibreOffice 3.3.0

    Read the article

  • Prevent Truncation of Dynamically Generated Results in SQL Server Management Studio

    While working with the Results to Text option in SSMS, you may come across a situation where the output from dynamically generated data is truncated. In this article I will guide you on how to fix this issue and print all the text for the Results to Text option. "SQL Backup Pro 7 improves on an already wonderful product" - Don KolendaHave you tried version 7 yet? Get faster, smaller, fully verified backups. Download a free trial of SQL Backup Pro 7.

    Read the article

  • Creating different margins on the first page of a word template

    - by Paul
    I have a letterhead template and I need the first page left margin to be larger than subsequent pages. I've seen the option of placing a text box or image box in the header to push the text but this ends up throwing off the tabs and bullet list indentation markers. I thought of setting up the first page using two columns and pushing the text to start on the second column but I can't seem to find a way to get the text to switch back to 1 column on the second page when it is created from text overflowing. Does anyone know how something like this is possible? Thanks in advance, Paul

    Read the article

  • What snapshot software do you recommend? [closed]

    - by Charbel
    Possible Duplicate: What screenshot tools are available? Hi, is there a snapshot software for Ubuntu? Something like SnagIt? The idea is many times I have to take the snapshot and edit the image to crop to my region of interest, SnagIt does that automatically and very nicely. Another feature is to take a snapshot of text (that can't be otherwise copied - like text in a photo) and then parse it into actual text document using an OCR technology. Thanks

    Read the article

  • How to use different input language for different active (application) windows?

    - by anvo
    I'm working under 12.04 and suppose I have a Firefox windows active (or in foreground) with English as input language and I need to type a document in other language using some text editor. With the text editor in foreground (or active) and the input language set to a non-English one, when I bring Firefox in foreground (or making it active) the input language remains set to the non-English and the language flag does not switch to English (as it would be expected, since I do not alter the language during the whole Firefox session). Because of this, I have to make extra moves and change the input language manually every time I switch from the text editor to Firefox and back to text editor. This was not happening with 10.04, and each application windows had the corresponding input language set to its default or previous session every time I was bringing it to the foreground! How will I make 12.04 to behave the same way?

    Read the article

  • Useful WatiN Extension Methods

    - by Steve Wilkes
    I've been doing a fair amount of UI testing using WatiN recently – here’s some extension methods I've found useful. This checks if a WatiN TextField is actually a hidden field. WatiN makes no distinction between text and hidden inputs, so this can come in handy if you render an input sometimes as hidden and sometimes as a visible text field. Note that this doesn't check if an input is visible (I've got another extension method for that in a moment), it checks if it’s hidden. public static bool IsHiddenField(this TextField textField) { if (textField == null || !textField.Exists) { return false; } var textFieldType = textField.GetAttributeValue("type"); return (textFieldType != null) && textFieldType.ToLowerInvariant() == "hidden"; } The next method quickly sets the value of a text field to a given string. By default WatiN types the text you give it into a text field one character at a time which can be necessary if you have behaviour you want to test which is triggered by individual key presses, but which most of time is just painfully slow; this method dumps the text in in one go. Note that if it's not a hidden field then it gives it focus first; this helps trigger validation once the value has been set and focus moves elsewhere. public static void SetText(this TextField textField, string value) { if ((textField == null) || !textField.Exists) { return; } if (!textField.IsHiddenField()) { textField.Focus(); } textField.Value = value; } Finally, here's a method which checks if an Element is currently visible. It does so by walking up the DOM and checking for a Style.Display of 'none' on any element between the one on which the method is invoked, and any of its ancestors. public static bool IsElementVisible(this Element element) { if ((element == null) || !element.Exists) { return false; } while ((element != null) && element.Exists) { if (element.Style.Display.ToLowerInvariant().Contains("none")) { return false; } element = element.Parent; } return true; } Hope they come in handy

    Read the article

  • Why is ComboBoxText giving me a "no attribute" error?

    - by boywithaxe
    I'm trying to add a text Combo Box into my app. I've created in and populated the list but when I I try to print out the active text I get an error. Here's the part of the code in question: def on_netif_changed(self, widget): netif = widget.gtk_combo_box_text_get_active_text() print netif And the error I get: Traceback (most recent call last): File "/home/boywithaxe/Developer/Quickly/broadcast/broadcast/BroadcastWindow.py", line 44, in on_netif_changed netif = widget.gtk_combo_box_text_get_active_text() AttributeError: 'ComboBoxText' object has no attribute 'gtk_combo_box_text_get_active_text' I'm a bit at a loss here, I've no problem betting text from text boxes, but this seems a completely different issue. I tried RTFMing but came up short. I would appreciate any suggestions.

    Read the article

  • How to quickly search through a very large list of strings / records on a database

    - by Giorgio
    I have the following problem: I have a database containing more than 2 million records. Each record has a string field X and I want to display a list of records for which field X contains a certain string. Each record is about 500 bytes in size. To make it more concrete: in the GUI of my application I have a text field where I can enter a string. Above the text field I have a table displaying the (first N, e.g. 100) records that match the string in the text field. When I type or delete one character in the text field, the table content must be updated on the fly. I wonder if there is an efficient way of doing this using appropriate index structures and / or caching. As explained above, I only want to display the first N items that match the query. Therefore, for N small enough, it should not be a big issue loading the matching items from the database. Besides, caching items in main memory can make retrieval faster. I think the main problem is how to find the matching items quickly, given the pattern string. Can I rely on some DBMS facilities, or do I have to build some in-memory index myself? Any ideas? EDIT I have run a first experiment. I have split the records into different text files (at most 200 records per file) and put the files in different directories (I used the content of one data field to determine the directory tree). I end up with about 50000 files in about 40000 directories. I have then run Lucene to index the files. Searching for a string with the Lucene demo program is pretty fast. Splitting and indexing took a few minutes: this is totally acceptable for me because it is a static data set that I want to query. The next step is to integrate Lucene in the main program and use the hits returned by Lucene to load the relevant records into main memory.

    Read the article

< Previous Page | 261 262 263 264 265 266 267 268 269 270 271 272  | Next Page >