Search Results

Search found 47210 results on 1889 pages for 'text input'.

Page 250/1889 | < Previous Page | 246 247 248 249 250 251 252 253 254 255 256 257  | Next Page >

  • Are there any well known algorithms to detect the presence of names?

    - by Rhubarb
    For example, given a string: "Bob went fishing with his friend Jim Smith." Bob and Jim Smith are both names, but bob and smith are both words. Weren't for them being uppercase, there would be less indication of this outside of our knowledge of the sentence. Without doing grammar analysis, are there any well known algorithms for detecting the presence of names, at least Western names?

    Read the article

  • Convert array to CSV/TSV-formated string in Python.

    - by dreeves
    Python provides csv.DictWriter for outputting CSV to a file. What is the simplest way to output CSV to a string or to stdout? For example, given a 2D array like this: [["a b c", "1,2,3"], ["i \"comma-heart\" you", "i \",heart\" u, too"]] return the following string: "a b c, \"1, 2, 3\"\n\"i \"\"comma-heart\"\" you\", \"i \"\",heart\"\" u, too\"" which when printed would look like this: a b c, "1,2,3" "i ""heart"" you", "i "",heart"" u, too" (I'm taking csv.DictWriter's word for it that that is in fact the canonical way to output that array as CSV. Excel does parse it correctly that way, though Mathematica does not. From a quick look at the wikipedia page on CSV it seems Mathematica is wrong.) One way would be to write to a temp file with csv.DictWriter and read it back with csv.DictReader. What's a better way? TSV instead of CSV It also occurs to me that I'm not wedded to CSV. TSV would make a lot of the headaches with delimiters and quotes go away: just replace tabs with spaces in the entries of the 2D array and then just intersperse tabs and newlines and you're done. Let's include solutions for both TSV and CSV in the answers to make this as useful as possible for future searchers.

    Read the article

  • tfidf, am I understanding it right?

    - by alskndalsnd
    Hey everyone, I am interested in doing some document clustering, and right now I am considering using TF-IDF for this. If I am not wrong, TFIDF is particularly used for evaluating the relevance of a document given a query. If I do not have a particular query, how can I apply tfidf to clustering?

    Read the article

  • Sphinx search distributed index tuning

    - by Andriy Bohdan
    I'm deciding how to split 3 large sphinx indexes between 3 servers. Each of the 3 indexes is searched separately. What's more effective: to host each index on separate machine Example machine1 - index1 machine2 - index2 machine3 - index3 or to split each index into 3 parts and host each part of the same index on separate machine. Example machine1 - index1_chunk1, index2_chunk1, index3_chunk1 machine2 - index1_chunk2, index2_chunk2, index3_chunk2 machine3 - index1_chunk3, index2_chunk3, index3_chunk3 ?

    Read the article

  • Unset core.editor in Msysgit

    - by mathee
    I set my editor per an SO entry: http://stackoverflow.com/questions/780425/how-do-i-setup-diffmerge-with-msysgit-gitk. I'm wondering how to undo this because I want to switch back to the default editing program.

    Read the article

  • TextMate - must-have Bundles and Plugins for web dev

    - by dscher
    Just curious what experienced Textmate users can't live without in the program. I just ran the trial and bought the program so I'm trying to get a sense of how others might setup their development environment for web development. Also, based on the fact that I just bought the program, I am going to guess that TM2 will come out next week. Yes, that's right, next week. Unfortunately, because of my luck, it will not be a free upgrade...upgrades will cost more.

    Read the article

  • Textually diffing JSON

    - by Richard Levasseur
    As part of my release processes, I have to compare some JSON configuration data used by my application. As a first attempt, I just pretty-printed the JSON and diff'ed them (using kdiff3 or just diff). As that data has grown, however, kdiff3 confuses different parts in the output, making additions look like giant modifies, odd deletions, etc. It makes it really hard to figure out what is different. I've tried other diff tools, too (meld, kompare, diff, a few others), but they all have the same problem. Despite my best efforts, I can't seem to format the JSON in a way that the diff tools can understand. Example data: [ { "name": "date", "type": "date", "nullable": true, "state": "enabled" }, { "name": "owner", "type": "string", "nullable": false, "state": "enabled", } ...lots more... ] The above probably wouldn't cause the problem (the problem occurs when there begin to be hundreds of lines), but thats the gist of what is being compared. Thats just a sample; the full objects are 4-5 attributes, and some attributes have 4-5 attributes in them. The attribute names are pretty uniform, but their values pretty varied. In general, it seems like all the diff tools confuse the closing "}" with the next objects closing "}". I can't seem to break them of this habit. I've tried adding whitespace, changing indentation, and adding some "BEGIN" and "END" strings before and after the respective objects, but the tool still get confused.

    Read the article

  • Best way to handle SQL Server fulltext index updates

    - by tlianza
    Hi all, I have a fulltext index which doesn't need to be immediately up-to-date, I'd like to spare myself the I/O (when I do bulk updates, I see a ton of I/O related to the index) and do the index updates during low usage times (nightly, perhaps even weekly). It seems there are two ways to go about this: Turn off change tracking (SET CHANGE_TRACKING OFF) and add a timestamp field to the indexed table, so that you can run alter fulltext index on <table> start INCREMENTAL population, or Enable change tracking, but set it to MANUAL, so that you can run alter fulltext index on <table> start UPDATE population when you need it updated. Is there a preferred method? I couldn't tell from this overview if there was a performance benefit one way or the other. Tom

    Read the article

  • Drag drop open file in Macvim split window?

    - by Jon
    Hello. I like to use the split window feature in Vim. However I cannot seem to drag drop new files into the different sections. Doing so will just open a new tab. I don't like using tabs as I still need to flick between them and not much different to using separate windows. Is there anyway I can change this behaviour? It works fine on Windows gVim and Im using the same vimrc file.

    Read the article

  • How Do I Convert Pipe Delimited to Comma Delimited with Escaping

    - by Russ Bradberry
    Hi, I am fairly new to scala and I have the need to convert a string that is pipe delimited to one that is comma delimited, with the values wrapped in quotes and any quotes escaped by "\" in c# i would probably do this like this string st = "\"" + oldStr.Replace("\"", "\\\\\"").Replace("|", "\",\"") + "\"" I haven't validated that actually works but that is the basic idea behind what I am trying to do. Is there a way to do this easily in scala?

    Read the article

  • Ruby File IO question; Maintain file read position between script executions

    - by macek
    I have two files a.txt and b.txt (henceforth a and b). My script iterates through a, does some operation, and potentially inserts a line to b. In the event the script stops, I need it to pick up where it left off. In the example below: foo was copied to b bar was copied to b zim was not copied to b (did not pass some criteria) gaz was copied to b Script stops (for whatever reason) When script starts again, how to open a and start on line "dib"? a.txt foo bar zim gaz // <= last successful copy dib // <= I want to start here on next script execution gir b.txt foo bar gaz // <= note omission of "zim" above gaz

    Read the article

  • messy css indentation in vim

    - by hasen j
    When editing an html file in vim, the indentation for css inside style tags is messy. For instance, this is how it would indent this sample css code without any manual intervention to fix the indentation on my part: div.class { color: white; backgroung-color: black; } Why is this happening? how can I fix it?

    Read the article

  • Perl: parsing string enclosed by double quotes

    - by sfactor
    I need to parse tab/space delimited files that have a lot of columns in Perl. The values are such that the there are large strings enclosed within double quotes. These strings can have any characters such as tabs and spaces or anything else. When I try to parse them with the split function it splits these strings as well. Now how can I make perl understand that the strings within the " " are a single column entry? A simple example is, 12 345546.67677 "Hello World!!!" -567.55656 0.5465767 "Hello_Again; "

    Read the article

  • How to populate data from .txt file into Excel in VBA?

    - by swei
    I'm trying to create something to read data from a .txt file, then populate data into .xls, but after open the .txt file, how do I get the data out? Basically I'm trying to get the the third column of the lines dated '04/06/2010'. After I open the .txt file, when I use ActiveSheet.Cells(row, col), the ActiveSheet is not pointing to .txt file. My .txt file is like this (space delimited): 04/05/10 23 29226 04/05/10 24 26942 04/06/10 1 23166 04/06/10 2 22072 04/06/10 3 21583 04/06/10 4 21390 Here is the code I have: Dim BidDate As Date BidDate = '4/6/2010' Workbooks.OpenText Filename:=ForecastFile, StartRow:=1, DataType:=xlDelimited, Space:=True If Err.Number = 1004 Then MsgBox ("The forecast file " & ForecastFile & " was not found.") Exit Sub End If On Error GoTo 0 Dim row As Integer, col As Integer row = 1 col = 1 cell_value = activeSheet.Cells(row, col) MsgBox ("the cell_value=" & cell_value) Do While (cell_value <> BidDate) And (cell_value <> "") row = row + 1 cell_value = activeSheet.Cells(row, col) ' MsgBox ("the value is " & cell_value) Loop If cell_value = "" Then MsgBox ("A load forecast for " & BidDate & " was not found in your current load forecast file titled '" + ForecastFile + ". " + "Make sure you have a load forecast for the current bid date and then open this spreadsheet again.") ActiveWindow.Close Exit Sub End If Can anyone point out where it goes wrong here?

    Read the article

  • Speech.Recognition GrammarBuilder/Choices Tree Structure

    - by user2210179
    In playing around with C#'s Speech Recognition, I've stumbled across a road block in the creation of an effective GrammerBuilder with Choices (more specifically, Choices of Choices). IE considering the following logical commands. One solution would to "hard code" every combination of Speech lines and add them to a GrammarBuilder (ie "SET LEFT COLOR RED" and "SET RIGHT CLEAR", however, this would quickly max out the limit of 1024, especially when dealing with number combinations. Another solution would to Append all 'columns' as "Choices" (and filter out incorrect paths upon 'recognition', however this seems like it's processor heavy and unnecessary. The middle ground, seems like the best path - with Choices of Choices - like a tree structure on a GrammarBuilder - however I'm not sure how to proceed. Any suggestions?

    Read the article

  • Importing a large delimited file to a MySQL table

    - by Tom
    I have this large (and oddly formatted txt file) from the USDA's website. It is the NUT_DATA.txt file. But the problem is that it is almost 27mb! I was successful in importing the a few other smaller files, but my method was using file_get_contents which it makes sense why an error would be thrown if I try to snag 27+ mb of RAM. So how can I import this massive file to my MySQL DB without running into a timeout and RAM issue? I've tried just getting one line at a time from the file, but this ran into timeout issue. Using PHP 5.2.0. Here is the old script (the fields in the DB are just numbers because I could not figure out what number represented what nutrient, I found this data very poorly document. Sorry about the ugliness of the code): <? $file = "NUT_DATA.txt"; $data = split("\n", file_get_contents($file)); // split each line $link = mysql_connect("localhost", "username", "password"); mysql_select_db("database", $link); for($i = 0, $e = sizeof($data); $i < $e; $i++) { $sql = "INSERT INTO `USDA` (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17) VALUES("; $row = split("\^", trim($data[$i])); // split each line by carrot for ($j = 0, $k = sizeof($row); $j < $k; $j++) { $val = trim($row[$j], '~'); $val = (empty($val)) ? 0 : $val; $sql .= ((empty($val)) ? 0 : $val) . ','; // this gets rid of those tildas and replaces empty strings with 0s } $sql = rtrim($sql, ',') . ");"; mysql_query($sql) or die(mysql_error()); // query the db } echo "Finished inserting data into database.\n"; mysql_close($link); ?>

    Read the article

  • Get highest frequency terms from Lucene index

    - by Julia
    Hello! i need to extract terms with highest frequencies from several lucene indexes, to use them for some semantic analysis. So, I want to get maybe top 30 most occuring terms(still did not decide on threshold, i will analyze results) and their per-index counts. I am aware that I might lose some precision because of potentionally dropped duplicates, but for now, lets say i am ok with that. So for the proposed solutions, (needless to say maybe) speed is not important, since I would do static analysis, I would put accent on simplicity of implementation because im not so skilled with Lucene (not the programming guru too :/ ) and cant wrap my mind around many concepts of it.. I can not find any code samples from something similar, so all concrete advices (code, pseudocode, links to code samples...) I will apretiate very much!!! Thank you!

    Read the article

< Previous Page | 246 247 248 249 250 251 252 253 254 255 256 257  | Next Page >