Search Results

Search found 4071 results on 163 pages for 'preg split'.

  • Splitting string on probable English word boundaries

    - by Sean
    I recently used Adobe Acrobat Pro's OCR feature to process a Japanese kanji dictionary. The overall quality of the output is generally quite a bit better than I'd hoped, but word boundaries in the English portions of the text have often been lost. For example, here's one line from my file: softening;weakening(ofthemarket)8 CHANGE [transform] oneselfINTO,takethe form of; disguise oneself I could go around and insert the missing word boundaries everywhere, but this would be adding to what is already a substantial task. I'm hoping that there might exist software which can analyze text like this, where some of the words run together, and split the text on probable word boundaries. Is there such a package? I'm using Emacs, so it'd be extra-sweet if the package in question were already an Emacs package or could be readily integrated into Emacs, so that I could simply put my cursor on a line like the above and repeatedly invoke some command that splits the line on word boundaries in decreasing order of probable correctness.
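
One common approach to this kind of problem is dictionary-based segmentation: try every split point and keep a split whose pieces are all known words. The sketch below is a minimal illustration under stated assumptions, not a reference to any existing package; `WORDS` is a hypothetical toy vocabulary, and a real run would load a large word list (ideally with frequencies, so that candidate splits can be ranked by probability).

```python
from functools import lru_cache

# Hypothetical toy vocabulary; a real run would load a large word list.
WORDS = {"of", "the", "market", "take", "form", "oneself", "into"}


def segment(text, vocab=WORDS):
    """Split text into the fewest vocabulary words, or return None."""

    @lru_cache(maxsize=None)
    def best(i):
        # Best segmentation of text[i:], as a tuple of words.
        if i == len(text):
            return ()
        candidates = []
        for j in range(i + 1, len(text) + 1):
            if text[i:j].lower() in vocab:
                rest = best(j)
                if rest is not None:
                    candidates.append((text[i:j],) + rest)
        return min(candidates, key=len) if candidates else None

    result = best(0)
    return list(result) if result is not None else None


print(segment("ofthemarket"))
print(segment("takethe"))
```

Preferring the fewest words is only a stand-in for a proper probability model; it does not by itself give the "decreasing order of probable correctness" asked for above.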

  • Why doesn't this Perl array sort work?

    - by Luke
    Why won't the array sort? CODE my @data = ('PJ RER Apts to Share|PROVIDENCE', 'PJ RER Apts to Share|JOHNSTON', 'PJ RER Apts to Share|JOHNSTON', 'PJ RER Apts to Share|JOHNSTON', 'PJ RER Condo|WEST WARWICK', 'PJ RER Condo|WARWICK'); foreach my $line (@data) { $count = @data; chomp($line); @fields = split(/\|/,$line); if ($fields[0] eq "PJ RER Apts to Share"){ @city = "\u\L$fields[1]"; @city_sort = sort (@city); print "@city_sort","\n"; } } print "$count","\n"; OUTPUT Providence Johnston Johnston Johnston 6
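
One reading of the output above is that the sort runs inside the loop on an array holding a single city, so each city is printed in input order. A minimal Python sketch of the alternative (collect the matching cities first, then sort once after the loop), reusing the data from the question; the variable names are illustrative only:

```python
data = [
    "PJ RER Apts to Share|PROVIDENCE",
    "PJ RER Apts to Share|JOHNSTON",
    "PJ RER Apts to Share|JOHNSTON",
    "PJ RER Apts to Share|JOHNSTON",
    "PJ RER Condo|WEST WARWICK",
    "PJ RER Condo|WARWICK",
]

cities = []
for line in data:
    category, city = line.split("|")
    if category == "PJ RER Apts to Share":
        cities.append(city.title())  # "PROVIDENCE" -> "Providence"

cities.sort()  # sort once, after every matching row has been collected
print(cities)
```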

  • Transforming a string to a valid PDO_MYSQL DSN

    - by Alix Axel
    What is the most concise way to transform a string in the following format: mysql:[/[/]][user[:pass]@]host[:port]/db[/] into a usable PDO connection/instance (using the PDO_MYSQL DSN)? Some possible examples: $conn = new PDO('mysql:host=host;dbname=db'); $conn = new PDO('mysql:host=host;port=3307;dbname=db'); $conn = new PDO('mysql:host=host;port=3307;dbname=db', 'user'); $conn = new PDO('mysql:host=host;port=3307;dbname=db', 'user', 'pass'); I've been trying some regular expressions (preg_[match|split|replace]), but they either don't work or are too complex. My gut tells me this is not the way to go, but nothing else comes to mind. Any suggestions?
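
The format above is regular enough that a single pattern with named groups can cover it. The following is a sketch in Python rather than PHP, written only to show the shape of such a pattern; `DSN_RE` and `to_pdo_args` are hypothetical names, and the character classes are assumptions about what may appear in each field.

```python
import re

# mysql:[/[/]][user[:pass]@]host[:port]/db[/]
DSN_RE = re.compile(
    r"^mysql:/{0,2}"
    r"(?:(?P<user>[^:@/]+)(?::(?P<password>[^@/]*))?@)?"
    r"(?P<host>[^:/]+)(?::(?P<port>\d+))?"
    r"/(?P<db>[^/]+)/?$"
)


def to_pdo_args(url):
    """Return (dsn, user, password) in the shape the PDO constructor expects."""
    m = DSN_RE.match(url)
    if m is None:
        raise ValueError("unrecognised connection string: " + url)
    parts = ["host=" + m.group("host")]
    if m.group("port"):
        parts.append("port=" + m.group("port"))
    parts.append("dbname=" + m.group("db"))
    return "mysql:" + ";".join(parts), m.group("user"), m.group("password")


print(to_pdo_args("mysql://user:pass@host:3307/db/"))
print(to_pdo_args("mysql:host/db"))
```

The same pattern should carry over to preg_match with named groups, though that port is untested here.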

  • Garbage collecting at ColdFusion CFC

    - by Sergii
    Hello. I have a CFC as a singleton object in the Application scope. One of its methods is used for massive data processing and periodically causes "Java heap space" errors. EDIT: All variables inside the method are VAR-scoped, so they should not be kept in the object scope when the invocation ends. This may be a bit of a dumb question for Java people, but I'd like to know how the Java garbage collector cleans up the memory used by CFC methods: only when the whole request ends, or right after each method/function invocation? The second option is interesting because it would allow me to split my large method into a few smaller ones, as one possible optimization.

  • How to check whether each value in an array is numeric, alphabetic or alphanumeric (Perl)

    - by dexter
    I have an array whose values are user input, for example: aa df rrr5 4323 54 hjy 10 gj @fgf %d. I want to check whether each value in the array is numeric, alphabetic (a-zA-Z) or alphanumeric, and save the values in separate arrays accordingly. So far I have: my @num; my @char; my @alphanum; my $str =<>; my @temp = split(" ",$str); foreach (@temp) { print "input : $_ \n"; if ($_ =~/^(\d+\.?\d*|\.\d+)$/) { push(@num,$_); } } This works for numbers; I want to do the same check for alphabetic and alphanumeric values. Note: examples of alphanumeric values are fr43 6t$ $eed5 *jh
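
The same classification can be written as a chain of anchored patterns, tested from most to least specific. The sketch below is in Python rather than Perl and makes one assumption that differs from the examples in the question: "alphanumeric" is taken in the strict sense (letters and digits only), so tokens containing symbols such as $ or * land in a separate "other" bucket. The function and bucket names are illustrative.

```python
import re


def classify(tokens):
    """Sort tokens into numeric, alphabetic, alphanumeric and other buckets."""
    groups = {"numeric": [], "alphabetic": [], "alphanumeric": [], "other": []}
    for tok in tokens:
        if re.fullmatch(r"\d+\.?\d*|\.\d+", tok):   # same number pattern as the question
            groups["numeric"].append(tok)
        elif re.fullmatch(r"[A-Za-z]+", tok):        # letters only
            groups["alphabetic"].append(tok)
        elif re.fullmatch(r"[A-Za-z0-9]+", tok):     # letters and digits mixed
            groups["alphanumeric"].append(tok)
        else:                                        # anything containing symbols
            groups["other"].append(tok)
    return groups


result = classify("aa df rrr5 4323 54 hjy 10 gj @fgf %d".split())
print(result)
```

The order of the tests matters: a purely numeric token also matches the alphanumeric pattern, so the narrower patterns have to come first.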

  • Is there an optimal config/format for a TIFF when using Tesseract or other OCR?

    - by Zando
    I'm having a bizarre problem with Tesseract. I have a name, "Janice" that is in a 200x40 pixel tiff, that Tesseract interprets as a blank. I'm running hundreds of names through Tesseract and they are processed fine. What I'm actually doing, though, is breaking up a larger TIFF into smaller tiffs of one word each. In the larger TIFF, tesseract recognizes "Janice". What could cause it to hiccup in a TIFF that solely contains that word (and there's enough space around the word to not truncate any of the pixels)? I'm using ImageMagick to split the big TIFF, are there options I should set when reconstituting the new TIFF files?

  • python appengine form-posted utf8 file issue

    - by khany
    Hi, I am trying to form-post a SQL file that consists of many INSERTs, e.g. INSERT INTO `TABLE` VALUES ('abcdé', 2759); I then use re.search to parse it and extract the fields to put into my own datastore. The problem is that, although the file contains accented characters (note that the e is an é), once uploaded the accent is lost, and the code either errors or stores a bytestring representation of the character. Here's what I am currently using (and I have tried loads of alternatives): form = cgi.FieldStorage() uFile = form['sql'] uSql = uFile.file.read() lineX = uSql.split("\n") # to get each line and so on. Has anyone got a robust way of making this work? Remember that I am on App Engine, so access to some libraries is restricted/forbidden.
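
A likely cause is that the uploaded content is handled as raw bytes and never decoded. The sketch below (Python 3 syntax, standard library only) decodes the bytes explicitly before parsing; it assumes the file is UTF-8 encoded and that every row has the two-column shape shown in the question. `parse_inserts` and `ROW_RE` are hypothetical names.

```python
import re

# Matches rows shaped like: VALUES ('text', 123);
ROW_RE = re.compile(r"VALUES \('([^']*)', (\d+)\);")


def parse_inserts(raw_bytes):
    """Decode uploaded bytes as UTF-8, then extract (text, number) pairs."""
    text = raw_bytes.decode("utf-8")  # assumption: the upload is UTF-8
    return [(m.group(1), int(m.group(2))) for m in ROW_RE.finditer(text)]


upload = "INSERT INTO `TABLE` VALUES ('abcdé', 2759);\n".encode("utf-8")
rows = parse_inserts(upload)
print(rows)
```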

  • jQuery plugin options: required, optional, inaccessible

    - by Trevor Hartman
    I'm curious how to specify options to a jQuery plugin in a way that some are required, some are optionally overridden, and some can't be touched. I started off with the usual: jQuery.fn.plugin = function (options){ var defaults = { username: "", posts:10, api: "http://myapi.com" } var settings = jQuery.extend({}, defaults, options); } Let's say I want username to be required, posts to be optional (defaulting to 10), and api to be unchangeable by users of the plugin, even if they try. Ideally, they'd all still be in the same data structure instead of being split into separate objects. Ideas?

  • Opencv: Converting hue image to RGB image

    - by jhaip
    I am trying to show the hue component of the image from my webcam. I have split the image apart into its hue component, but I can't figure out how to show the hue component as pure colors. For example, if one pixel of the image was B=189 G=60 R=60, then in HSV, H=0. I don't want the drawn image to be the gray values of hue but the RGB equivalent of the hue, i.e. H=0 becomes B=0 G=0 R=255. IplImage *image, *imageHSV, *imageHue; image = cvQueryFrame(capture); //image from webcam imageHSV = cvCreateImage( cvGetSize(image), IPL_DEPTH_8U, 3 ); imageHue = cvCreateImage( cvGetSize(image), IPL_DEPTH_8U, 1 ); cvCvtColor( image, imageHSV, CV_BGR2HSV ); cvSplit( imageHSV, imageHue, 0, 0, 0 ); I have a feeling there is a simple solution, so any help is appreciated.
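
The mapping being asked for is "hue with saturation and value forced to maximum". The sketch below shows that mapping for a single pixel using Python's standard colorsys module instead of OpenCV; it assumes the 8-bit OpenCV convention in which hue is stored as 0..179 (degrees divided by two). `hue_to_bgr` is a hypothetical helper name.

```python
import colorsys


def hue_to_bgr(hue):
    """Map an 8-bit OpenCV hue (0..179) to a fully saturated (B, G, R) tuple."""
    r, g, b = colorsys.hsv_to_rgb((hue % 180) / 180.0, 1.0, 1.0)
    return tuple(int(round(c * 255)) for c in (b, g, r))


print(hue_to_bgr(0))   # pure red in B, G, R order
print(hue_to_bgr(90))  # cyan in B, G, R order
```

Within OpenCV itself, an equivalent route would probably be to fill the S and V planes with 255, merge them with the hue plane, and convert back with CV_HSV2BGR.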

  • /regexp?/ on HTML, but not in form

    - by takeshin
    I need to do some regex replacement on HTML input, but I need to exclude some parts from filtering by other regexp. (e.g. remove all <a> tags with specific href="example.com…, except the ones that are inside the <form> tag) Is there any smart regex technique for this? Or do I have to find all forms using $regex1, then split the input to the smaller chunks, excluding the matched text blocks, and then run the $regex2 on all the chunks?
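
The split-then-filter idea described above can be done in one pass when the splitting pattern has a capturing group, because the matched blocks are then kept in the result at odd positions. A minimal Python sketch under stated assumptions: forms are not nested, the pattern names `FORM_RE` and `LINK_RE` are hypothetical, and the link pattern is only an example of a "specific href". Regex handling of HTML is fragile in general, so an HTML parser may be the sturdier choice.

```python
import re

# The capturing group makes re.split keep each <form>...</form> block.
FORM_RE = re.compile(r"(<form\b.*?</form>)", re.IGNORECASE | re.DOTALL)
LINK_RE = re.compile(
    r'<a\b[^>]*href="[^"]*example\.com[^"]*"[^>]*>.*?</a>',
    re.IGNORECASE | re.DOTALL,
)


def strip_links_outside_forms(html):
    """Remove matching <a> tags everywhere except inside <form> blocks."""
    chunks = FORM_RE.split(html)
    # Odd-indexed chunks are the captured forms and are left untouched.
    return "".join(
        chunk if i % 2 else LINK_RE.sub("", chunk)
        for i, chunk in enumerate(chunks)
    )


page = (
    '<a href="http://example.com/x">A</a>'
    '<form><a href="http://example.com/y">B</a></form>'
)
print(strip_links_outside_forms(page))
```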

  • Converting a JSP to a SharePoint webpart

    - by Kelly French
    We have a large number of Java based servlets/portlets running in a BEA portal that we want to convert into SharePoint 2007 webparts. Many of the portlets use user preferences but the implementations are split between preferences being handled by the portlet directly and stored in a separate database from the portal. Others are using the BEA WebLogic API for user preferences. Three questions: Has anyone gotten a Java Servlet/JSP (compiled against JRE 1.4.2 and running on Tomcat 4.1) to run as a SharePoint 2007 webpart? How large of an effort was it in general (as in, was it measured in days/weeks/months)? Would it be easier to rewrite the portlet as native webparts at least as far as user preferences are concerned?

  • try...else...except syntax error

    - by iform
    I can't understand this... Cannot get this code to run and I've no idea why it is a syntax error. try: newT.read() #existingArtist = newT['Exif.Image.Artist'].value #existingKeywords = newT['Xmp.dc.subject'].value except KeyError: print "KeyError" else: #Program will NOT remove existing values newT.read() if existingArtist != "" : newT['Exif.Image.Artist'] = artistString print existingKeywords keywords = os.path.normpath(relativePath).split(os.sep) print keywords newT['Xmp.dc.subject'] = existingKeywords + keywords newT.write() except: print "Cannot write tags to ",filePath Syntax error occurs on the last "except:". Again...I have no idea why python is throwing a syntax error (spent ~3hrs on this problem).
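
In Python, the else clause of a try statement has to come after every except clause, so an except that follows else is rejected by the parser. One way to keep both handlers is to nest a second try inside the else branch. The sketch below (Python 3 syntax) uses a plain dict as a stand-in for the metadata object, which is an assumption; `update_keywords` is a hypothetical name.

```python
def update_keywords(tags, new_keywords):
    """Append keywords to an existing tag; tags is a dict standing in for the metadata object."""
    try:
        existing = tags["Xmp.dc.subject"]
    except KeyError:
        print("KeyError")
        return False
    else:
        # Runs only if the lookup succeeded; the write gets its own try block.
        try:
            tags["Xmp.dc.subject"] = existing + new_keywords
        except TypeError:
            print("Cannot write tags")
            return False
    return True


tags = {"Xmp.dc.subject": ["holiday"]}
print(update_keywords(tags, ["2010", "beach"]), tags)
```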

  • Scrapy Could not find spider Error

    - by Nacari
    I have been trying to get a simple spider to run with scrapy, but keep getting the error: Could not find spider for domain:stackexchange.com when I run the code with the expression scrapy-ctl.py crawl stackexchange.com. The spider is as follows: from scrapy.spider import BaseSpider from __future__ import absolute_import class StackExchangeSpider(BaseSpider): domain_name = "stackexchange.com" start_urls = [ "http://www.stackexchange.com/", ] def parse(self, response): filename = response.url.split("/")[-2] open(filename, 'wb').write(response.body) SPIDER = StackExchangeSpider() Another person posted almost exactly the same problem months ago but did not say how they fixed it: http://stackoverflow.com/questions/1806990/scrapy-spider-is-not-working I have been following the tutorial exactly at http://doc.scrapy.org/intro/tutorial.html and cannot figure out why it is not working.

  • Tokenize problem in Java with separator ". "

    - by user112976
    I need to split a text using the separator ". ". For example I want this string : Washington is the U.S Capital. Barack is living there. To be cut into two parts: Washington is the U.S Capital. Barack is living there. Here is my code : // Initialize the tokenizer StringTokenizer tokenizer = new StringTokenizer("Washington is the U.S Capital. Barack is living there.", ". "); while (tokenizer.hasMoreTokens()) { System.out.println(tokenizer.nextToken()); } And the output is unfortunately : Washington is the U S Capital Barack is living there Can someone explain what's going on?
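
The output above is what happens when ". " is treated as a set of delimiter characters ('.' and ' ') rather than as one two-character separator. The sketch below reproduces both behaviours in Python for comparison; it is an illustration of the difference, not Java code. In Java, String.split takes a regular expression, so a pattern such as "\\. " would be the closer analogue of the second call.

```python
import re

text = "Washington is the U.S Capital. Barack is living there."

# Each character is its own delimiter (the behaviour seen in the question).
per_char = [tok for tok in re.split(r"[. ]", text) if tok]

# The whole two-character string ". " is the delimiter.
by_string = text.split(". ")

print(per_char)
print(by_string)
```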

  • MX Records - go to two servers?

    - by Jim Beam
    Right now I have a single mail server for IMAP. Let's say I want to introduce Exchange but not all users will be on it. Some users will be on my "legacy" IMAP, others on the "new" Exchange. Is it possible to "split up" your users (from the same e-mail domain) on two services like this? What would the MX records look like? My guess is that this isn't possible, but thought I'd ask. By the way, I realize that Exchange can offer IMAP and all that, but my question is more about splitting users across services and the MX records. The actual protocols above are only examples.

  • How can you toggle between two sets of values per data series in flot?

    - by Jedidja
    flot has built-in support for multiple data series (sample code) and also dual-axis (sample code). Assuming multiple data series (water, electricity, etc.) that each have an amount (usage) and a dollar value (charge for that usage), what would be the best way to use flot to display either the amount or dollar values for all the data series, while still supporting toggling display for each individual series? The idea is to send down all the data in one GET request and then let the client take care of everything else in JavaScript. Ideally we could use triplets somehow {date, amount, charge}, and then possibly split that into two arrays for flot.

  • How to write stored procedures to separate files with mysqldump?

    - by Jader Dias
    The mysqldump option --tab=path writes the creation script of each table to a separate file. But I can't find the stored procedures, except in the screen dump. I need to have the stored procedures in separate files as well. The current solution I am working on is to split the screen dump programmatically. Is there an easier way? The code I am using so far is: mysqldump -p$PASSWORD --routines --skip-dump-date --no-create-info --no-data --skip-opt $DATABASE > $BACKUP_PATH/$DATABASE.sql mysqldump -p$PASSWORD --tab=$BACKUP_PATH --skip-dump-date --no-data --skip-opt $DATABASE

  • .toggle(true) throw null in $(document).ready(function())

    - by James123
    I am toggling row siblings. I call .toggle(true) when the document is ready, but I think the row siblings are not available before this function is called. $(document).ready(function() { $('tr[@class^=RegText]').hide().children('td'); list_Visible_Ids = []; var idsString, idsArray; idsString = $('#myVisibleRows').val(); idsArray = idsString.split(','); $.each(idsArray, function() { if (this != "") { $(this).siblings('.RegText').toggle(true); list_Visible_Ids[this] = 1; } }); How can I resolve this? Why are the siblings not available when the document is ready?

  • MongoDB Schema Design - Real-time Chat

    - by Nick
    I'm starting a project which I think will be particularly suited to MongoDB due to the speed and scalability it affords. The module I'm currently interested in deals with real-time chat. If I were to do this in a traditional RDBMS I'd split it out into: Channel (A channel has many users) User (A user has one channel but many messages) Message (A message has a user) For the purpose of this use case, I'd like to assume that there will typically be 5 channels active at one time, each handling at most 5 messages per second. Specific queries that need to be fast: Fetch new messages (based on a bookmark, time stamp maybe, or an incrementing counter?) Post a message to a channel Verify that a user can post in a channel Bearing in mind that the document limit with MongoDB is 4mb, how would you go about designing the schema? What would yours look like? Are there any gotchas I should watch out for?

  • Problems with XAML WPF 4.0 Editor in VS2010

    - by RTPeat
    Wondering if anybody else has found some very odd behaviour with the XAML/WPF 4 editor in VS2010. This only occurs if the project is using .NET 4. Whenever I tried to open a XAML document for editing, the window would appear to open for a split second and then vanish, but VS2010 would still list the window as open. The fault was eventually traced to having the "Reuse current document window, if saved" option under "Documents" in the "Environment" options checked. Once this was unchecked XAML 4 files opened as expected. As I said, this only appears to occur on projects targeted at .NET Framework 4 - those targeted at 3.5 worked without a problem, and the "Reuse current document window, if saved" appears to work fine on other files.

  • Handling newline character in input between Windows and Linux

    - by Fazal
    I think this is a standard problem which may have been asked before, but I could not find an exact answer, so I am posting the issue. Our server is running on a Linux box. We access the server through a browser on a Windows box to enter data into a field (a text area) which is supposed to contain multiple lines, which the user enters by pressing the Enter key after each line: Abc Def GHI When this input field is read on the Linux machine, we want to split the data on the newline character. I have three questions on this. Does the incoming data contain "\r\n" or "\n"? If the incoming data does contain "\r\n", the Linux line.separator property (VM property) would not work for me, as it would say "\n" and therefore may leave "\r" in the data. If "\r" is left in the data and I open the file on a Windows machine, will this be treated as a newline character? Finally, can anyone tell me the standard way to deal with this issue?
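
A platform-independent way to handle this is to avoid the local line separator altogether and split on any of the three common endings. Browsers generally submit text area line breaks as "\r\n" regardless of the client operating system, but the sketch below does not rely on that. It is written in Python for illustration; `split_lines` is a hypothetical name, and the same alternation pattern should work with a regex-based split in Java.

```python
import re


def split_lines(text):
    """Split on CRLF, lone CR, or lone LF, so no stray '\\r' survives."""
    return re.split(r"\r\n|\r|\n", text)


print(split_lines("Abc\r\nDef\nGHI"))
```

The order of the alternatives matters: "\r\n" is listed first so that a Windows line ending is consumed as one separator rather than two.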

  • SQL Server Table Partitioning, what is happening behind the scenes?

    - by user404463
    I'm working with table partitioning on an extremely large fact table in a warehouse. I have executed the script a few different ways, with and without non-clustered indexes. With the indexes in place, it appears to dramatically expand the log file; without the non-clustered indexes, it does not expand the log file as much but takes more time to run due to the rebuilding of the indexes. What I am looking for is any links or information about what is happening behind the scenes, specifically to the log file, when you split a table partition.

  • How do you sort files numerically?

    - by Zachary Young
    Hello all, First off, I'm posting this because when I was looking for a solution to the problem below, I could not find one on stackoverflow. So, I'm hoping to add a little bit to the knowledge base here. I need to process some files in a directory and need the files to be sorted numerically. I found some examples on sorting--specifically with using the lambda pattern--at wiki.python.org, and I put this together: #!env/python import re tiffFiles = """ayurveda_1.tif ayurveda_11.tif ayurveda_13.tif ayurveda_2.tif ayurveda_20.tif ayurveda_22.tif""".split('\n') numPattern = re.compile('_(\d{1,2})\.', re.IGNORECASE) tiffFiles.sort(cmp, key=lambda tFile: int(numPattern.search(tFile).group(1))) print tiffFiles I'm still rather new to Python and would like to ask the community if there are any improvements that can be made to this: shortening the code up (removing lambda), performance, style/readability? Thank you, Zachary
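
One possible tidy-up of the snippet above, written for Python 3: the lambda becomes a small named key function, the explicit cmp argument is dropped (sort no longer accepts it in Python 3), and the digit pattern is relaxed from one or two digits to any number of digits, which is an assumption about the file names. `numeric_key` is a hypothetical name.

```python
import re

NUM_PATTERN = re.compile(r"_(\d+)\.")


def numeric_key(name):
    """Return the number between the underscore and the extension as an int."""
    return int(NUM_PATTERN.search(name).group(1))


tiff_files = [
    "ayurveda_1.tif", "ayurveda_11.tif", "ayurveda_13.tif",
    "ayurveda_2.tif", "ayurveda_20.tif", "ayurveda_22.tif",
]
tiff_files.sort(key=numeric_key)
print(tiff_files)
```

A file name without the underscore-number pattern would raise AttributeError here, so real directory listings may need a filter or a fallback key.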

  • understanding this regex

    - by DarthVader
    I'm trying to understand what the following does: ^([^=]+)(?:(?:\\=)(.+))?$ Any ideas? This is being used here. Obviously it's a command line parser, but I'm trying to understand the syntax so I can actually run the program. This is from commandline-jmxclient; they have no documentation on setting JMX properties, but there is such an option in their source code, so I just want to understand how I can invoke that method. Matcher m = Client.CMD_LINE_ARGS_PATTERN.matcher(command); if ((m == null) || (!m.matches())) { throw new ParseException("Failed parse of " + command, 0); } this.cmd = m.group(1); if ((m.group(2) != null) && (m.group(2).length() > 0)) this.args = m.group(2).split(","); else this.args = null;
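
Read piece by piece, the pattern says: group 1 is one or more characters that are not '=', optionally followed by a literal '=' and group 2, which is everything after it. In other words, the input has the shape name or name=arg1,arg2,... The Python sketch below mirrors the Java snippet to show what ends up in each group; the escaped \= is written as a plain '=' (equivalent here), and the command strings in the usage lines are made up for illustration.

```python
import re

CMD_LINE_ARGS = re.compile(r"^([^=]+)(?:=(.+))?$")


def parse_command(command):
    """Return (cmd, args) where args is a list of strings or None."""
    m = CMD_LINE_ARGS.match(command)
    if m is None:
        raise ValueError("Failed parse of " + command)
    cmd = m.group(1)
    args = m.group(2).split(",") if m.group(2) else None
    return cmd, args


print(parse_command("someOperation=a,b"))  # hypothetical command string
print(parse_command("someAttribute"))      # no '=' means no arguments
```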

  • Characters in string changed after downloading HTML from the internet.

    - by Callum Rogers
    Using the following code, I can download the HTML of a file from the internet: WebClient wc = new WebClient(); // .... string downloadedFile = wc.DownloadString("http://www.myurl.com/"); However, sometimes the file contains "interesting" characters, and these get changed: é to Ã©, ← to â† and フシギダネ to ãƒ•ã‚·ã‚®ãƒ€ãƒ. I think it may be something to do with different Unicode types or something, as each character gets changed into 2 new ones, perhaps each character being split in half, but I have very little knowledge in this area. What do you think is wrong?
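
The symptom described (one character turning into two or three) is typical of UTF-8 bytes being decoded with a single-byte encoding such as Latin-1. The Python sketch below reproduces the effect and its reversal; it illustrates the mechanism only and is not .NET code. In .NET, setting the WebClient's Encoding property to UTF-8 before calling DownloadString is the usual counterpart, assuming the page really is UTF-8.

```python
original = "é"

# UTF-8 encodes é as two bytes; reading them as Latin-1 yields two characters.
garbled = original.encode("utf-8").decode("latin-1")

# Reversing the wrong step recovers the original text.
repaired = garbled.encode("latin-1").decode("utf-8")

print(garbled, repaired)
```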
