Search Results

Search found 641 results on 26 pages for 'handwriting recognition'.


OpenCV: Shift/Align face image relative to reference Image (Image Registration)

- by Abhischek

I am new to OpenCV2 and working on an emotion recognition project, and I would like to align a facial image relative to a reference facial image. I would like to get the image translation working before moving on to rotation. My current idea is to run a search within a limited range on both x and y coordinates and use the sum of squared differences as the error metric to select the optimal x/y parameters to align the image. I'm using the OpenCV face_cascade function to detect the faces; all images are resized to a fixed size (128x128).

Question: Which parameters of the Mat image do I need to modify to shift the image in a positive/negative direction on both the x and y axes? I believe setImageROI is no longer supported by Mat datatypes? I have the ROIs for both faces available, but I am unsure how to use them.

    void alignImage(vector<Rect> faceROIstore, vector<Mat> faceIMGstore)
    {
        Mat refimg = faceIMGstore[1];    // reference image
        Mat dispimg = faceIMGstore[52];  // "displaced" version of reference image
        //Rect refROI = faceROIstore[1];   // Bounding box for face in reference image
        //Rect dispROI = faceROIstore[52]; // Bounding box for face in displaced image
        Mat aligned;
        matchTemplate(dispimg, refimg, aligned, CV_TM_SQDIFF_NORMED);
        imshow("Aligned image", aligned);
    }

The idea for this approach is based on the Image Alignment Tutorial by Richard Szeliski. Working on Windows with OpenCV 2.4. Any suggestions are much appreciated.
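
The search described above (try every x/y offset in a limited range and keep the one with the lowest squared-difference error) can be sketched without any library. This is only an illustration in plain Python on small nested lists, not OpenCV code; the names `ssd_at_shift` and `best_shift` are made up for this sketch, and it uses the mean rather than the sum of squared differences so that offsets with a smaller overlap are not favoured.

```python
def ssd_at_shift(ref, img, dx, dy):
    """Mean squared difference between ref and img displaced by (dx, dy).

    Only pixels where the two images overlap are compared.
    """
    h, w = len(ref), len(ref[0])
    total, count = 0, 0
    for y in range(h):
        for x in range(w):
            sy, sx = y + dy, x + dx  # where this reference pixel would sit in img
            if 0 <= sy < h and 0 <= sx < w:
                diff = ref[y][x] - img[sy][sx]
                total += diff * diff
                count += 1
    return total / count if count else float("inf")


def best_shift(ref, img, max_shift):
    """Try every (dx, dy) in [-max_shift, max_shift] and return the best one."""
    candidates = [(dx, dy)
                  for dy in range(-max_shift, max_shift + 1)
                  for dx in range(-max_shift, max_shift + 1)]
    return min(candidates, key=lambda s: ssd_at_shift(ref, img, s[0], s[1]))
```

The returned pair says how far the content of `img` is displaced relative to `ref`; shifting `img` by the negated pair lines the two up. In OpenCV, one common way to apply such a shift is `warpAffine` with a 2x3 translation matrix.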

Writing a program which uses voice recognition... where should I start?

- by Katsideswide

Hello! I'm a design student currently dabbling with Arduino code (based on C/C++) and Flash AS3. What I want to do is write a program with a voice control input. So, the program prompts the user to spell a word. The user spells out the word. The program recognizes whether this is right, adds one to a score if it's correct, and corrects the user if it's wrong. So I'm picturing a big list of words, each with an audio file of the word being read out, with the voice recognition part checking to see if the reply matches the input. Ideally I'd like to be able to interface this with an Arduino microcontroller so that a physical output with a motor could be triggered in response as well. Thing is, I'm not sure if I can make this program in Flash, in Processing (associated with Arduino), or if I need another CS3 program-making program. I guess I need to download a good voice recognition program, but how can I interface this with anything else? Also, I'm on a Mac (not sure if this makes a difference). I apologize for my cluelessness; any hints would be great! -Susan

iPhone SDK 3.2 UIGestureRecognizer interfering with UIView animations?

- by Brian Cooley

Are there known issues with gesture recognizers and the UIView class methods for animation? I am having problems with a sequence of animations on a UIImageView started from a UIGestureRecognizer callback. If the sequence of animations is started from a standard callback like TouchUpInside, the animation works fine. If it is started via the UILongPressGestureRecognizer, then the first animation jumps to the end and the second animation immediately begins.

Here's a sample that illustrates my problem. In the .xib for the project, I have a UIImageView that is connected to the viewToMove IBOutlet. I also have a UIButton connected to the startButton IBOutlet, and I have connected its TouchUpInside action to the startButtonClicked IBAction. The TouchUpInside action works as I want it to, but the longPressGestureRecognizer skips to the end of the first animation after about half a second. When I NSLog the second animation (animateTo200) I can see that it is called twice when a long press starts the animation but only once when the button's TouchUpInside action starts the animation.

    - (void)viewDidLoad {
        [super viewDidLoad];
        UILongPressGestureRecognizer *longPressRecognizer = [[UILongPressGestureRecognizer alloc] initWithTarget:self action:@selector(startButtonClicked)];
        NSArray *recognizerArray = [[NSArray alloc] initWithObjects:longPressRecognizer, nil];
        [startButton setGestureRecognizers:recognizerArray];
        [longPressRecognizer release];
        [recognizerArray release];
    }

    -(IBAction)startButtonClicked {
        if (viewToMove.center.x < 150) {
            [self animateTo200:@"Right to left" finished:nil context:nil];
        } else {
            [self animateTo100:@"Right to left" finished:nil context:nil];
        }
    }

    -(void)animateTo100:(NSString *)animationID finished:(NSNumber *)finished context:(void *)context {
        [UIView beginAnimations:@"Right to left" context:nil];
        [UIView setAnimationDuration:4];
        [UIView setAnimationDelegate:self];
        [UIView setAnimationDidStopSelector:@selector(animateTo200:finished:context:)];
        viewToMove.center = CGPointMake(100.0, 100.0);
        [UIView commitAnimations];
    }

    -(void)animateTo200:(NSString *)animationID finished:(NSNumber *)finished context:(void *)context {
        [UIView beginAnimations:@"Left to right" context:nil];
        [UIView setAnimationDuration:4];
        viewToMove.center = CGPointMake(200.0, 200.0);
        [UIView commitAnimations];
    }

Intercepting/Hijacking iPhone Touch Events for MKMapView

- by Shawn

Is there a bug in the 3.0 SDK that disables real-time zooming and intercepting the zoom-in gesture for the MKMapView? I have some really simple code so I can detect tap events, but there are two problems:

1) the zoom-in gesture is always interpreted as a zoom-out
2) none of the zoom gestures update the map's view in real time.

In hitTest, if I return the "map" view, the MKMapView functionality works great, but I don't get the opportunity to intercept the events. Any ideas?

MyMapView.h:

    @interface MyMapView : MKMapView
    {
        UIView *map;
    }

MyMapView.m:

    - (id)initWithFrame:(CGRect)frame
    {
        if (![super initWithFrame:frame])
            return nil;
        self.multipleTouchEnabled = true;
        return self;
    }

    - (UIView *)hitTest:(CGPoint)point withEvent:(UIEvent *)event
    {
        NSLog(@"Hit Test");
        map = [super hitTest:point withEvent:event];
        return self;
    }

    - (void)touchesCancelled:(NSSet *)touches withEvent:(UIEvent *)event
    {
        NSLog(@"%s", __FUNCTION__);
        [map touchesCancelled:touches withEvent:event];
    }

    - (void)touchesBegan:(NSSet *)touches withEvent:(UIEvent*)event
    {
        NSLog(@"%s", __FUNCTION__);
        [map touchesBegan:touches withEvent:event];
    }

    - (void)touchesMoved:(NSSet*)touches withEvent:(UIEvent*)event
    {
        NSLog(@"%s, %x", __FUNCTION__, mViewTouched);
        [map touchesMoved:touches withEvent:event];
    }

    - (void)touchesEnded:(NSSet*)touches withEvent:(UIEvent*)event
    {
        NSLog(@"%s, %x", __FUNCTION__, mViewTouched);
        [map touchesEnded:touches withEvent:event];
    }

Recognize objects in image

- by DoomStone

Hello, I am in the process of doing a school project where we have a robot driving on the ground in between Flamingo plates. We need to create an algorithm that can identify the locations of these plates, so we can create paths around them (we are using A Star for that). So far we have worked with the AForge library and have created the following class. The only problem is that when it creates the rectangles, it does not take into account that the plates are not always parallel to the camera border, and in that case it will just create a rectangle that covers the whole plate. So we need some way to find the rotation of the object, or another way to identify this. I have created an image that might help explain the problem: http://img683.imageshack.us/img683/9835/imagerectangle.png

Any help on how I can do this would be greatly appreciated. Any other information or ideas are always welcome.

    public class PasteMap
    {
        private Bitmap image;
        private Bitmap processedImage;
        private Rectangle[] rectangels;

        public void initialize(Bitmap image)
        {
            this.image = image;
        }

        public void process()
        {
            processedImage = image;
            processedImage = applyFilters(processedImage);
            processedImage = filterWhite(processedImage);
            rectangels = extractRectangles(processedImage);
            //rectangels = filterRectangles(rectangels);
            processedImage = drawRectangelsToImage(processedImage, rectangels);
        }

        public Bitmap getProcessedImage
        {
            get { return processedImage; }
        }

        public Rectangle[] getRectangles
        {
            get { return rectangels; }
        }

        private Bitmap applyFilters(Bitmap image)
        {
            image = new ContrastCorrection(2).Apply(image);
            image = new GaussianBlur(10, 10).Apply(image);
            return image;
        }

        private Bitmap filterWhite(Bitmap image)
        {
            Bitmap test = new Bitmap(image.Width, image.Height);
            for (int width = 0; width < image.Width; width++)
            {
                for (int height = 0; height < image.Height; height++)
                {
                    if (image.GetPixel(width, height).R > 200 &&
                        image.GetPixel(width, height).G > 200 &&
                        image.GetPixel(width, height).B > 200)
                    {
                        test.SetPixel(width, height, Color.White);
                    }
                    else
                        test.SetPixel(width, height, Color.Black);
                }
            }
            return test;
        }

        private Rectangle[] extractRectangles(Bitmap image)
        {
            BlobCounter bc = new BlobCounter();
            bc.FilterBlobs = true;
            bc.MinWidth = 5;
            bc.MinHeight = 5;
            // process binary image
            bc.ProcessImage( image );
            Blob[] blobs = bc.GetObjects(image, false);
            // process blobs
            List<Rectangle> rects = new List<Rectangle>();
            foreach (Blob blob in blobs)
            {
                if (blob.Area > 1000)
                {
                    rects.Add(blob.Rectangle);
                }
            }
            return rects.ToArray();
        }

        private Rectangle[] filterRectangles(Rectangle[] rects)
        {
            List<Rectangle> Rectangles = new List<Rectangle>();
            foreach (Rectangle rect in rects)
            {
                if (rect.Width > 75 && rect.Height > 75)
                    Rectangles.Add(rect);
            }
            return Rectangles.ToArray();
        }

        private Bitmap drawRectangelsToImage(Bitmap image, Rectangle[] rects)
        {
            BitmapData data = image.LockBits(new Rectangle(0, 0, image.Width, image.Height), ImageLockMode.ReadWrite, PixelFormat.Format24bppRgb);
            foreach (Rectangle rect in rects)
                Drawing.FillRectangle(data, rect, Color.Red);
            image.UnlockBits(data);
            return image;
        }
    }
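
One generic way to estimate the rotation the post asks about is to compute the blob's principal axis from its second-order central moments. The sketch below is plain Python rather than AForge code, and it assumes the white pixels of one blob are already available as a list of (x, y) coordinates; `blob_orientation` is a hypothetical helper name.

```python
import math


def blob_orientation(pixels):
    """Angle in radians of the blob's major axis relative to the x axis.

    pixels is a list of (x, y) coordinates belonging to a single blob.
    """
    n = len(pixels)
    cx = sum(x for x, _ in pixels) / n  # centroid
    cy = sum(y for _, y in pixels) / n
    # Second-order central moments of the pixel cloud.
    mu20 = sum((x - cx) ** 2 for x, _ in pixels)
    mu02 = sum((y - cy) ** 2 for _, y in pixels)
    mu11 = sum((x - cx) * (y - cy) for x, y in pixels)
    return 0.5 * math.atan2(2 * mu11, mu20 - mu02)
```

Because image coordinates usually have y pointing down, a positive angle here appears clockwise on screen. For a nearly square plate the two axes are almost equally long, so the estimate gets unstable; fitting the four corner points would likely be more robust in that case.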

LoadDictation with SAPI

- by Naveen

I am able to create alternate dictation grammars using the dictation resource kit or the directions given here. I am not able to load the new dictation topic with C++. I am trying to modify the simpledict sample provided with the SAPI 5.1 SDK. The following doesn't work.

    std::wstring stemp = s2ws("grammar:dictation#Genre");
    LPCWSTR mygrammar = stemp.c_str();
    hr = m_cpDictationGrammar->LoadDictation(mygrammar, SPLO_STATIC);

OpenCV: Traincascade fails "Assertion failed _img.cols == winSize.width"

- by Josef St.

Does anybody have an idea what "OpenCV Error: Assertion failed _img.cols == winSize.width" means? I'm not familiar with the new implementation of the Haar training (traincascade), nor could I find any documentation in the wiki. Thanks, Josef

Android: Voice Recording and saving audio

- by user1320912

I am working on an application that will record the voice of the user, save the file on the SD card, and then allow the user to listen to the audio again. I am able to let the user record their voice using the RecognizerIntent, but I can't figure out how to save the audio file and allow the user to hear the audio. I would appreciate it if someone could help me out. My code is below:

    // Setting up the onClickListener for Audio Button
    attachVoice = (Button) findViewById(R.id.AttachVoice_questionandanswer);
    attachVoice.setOnClickListener(new OnClickListener() {
        public void onClick(View v) {
            Intent voiceIntent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
            voiceIntent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL, RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
            voiceIntent.putExtra(RecognizerIntent.EXTRA_PROMPT, "Please Speak");
            startActivityForResult(voiceIntent, VOICE_REQUEST);
        }
    });

    protected void onActivityResult(int requestCode, int resultCode, Intent data) {
        if (requestCode == VOICE_REQUEST && resultCode == RESULT_OK) {
        }
    }

How to use Speech 2 Text in Microsoft Surface

- by Roflcoptr

I'd like to use speech-to-text in my Microsoft Surface application. I saw that it is possible, but I don't really know where to start. Is there any framework/library available, or a code snippet, or a tutorial? I don't even know exactly what I should google for ;)

===EDIT===

I read that it is necessary to use a grammar to recognize words. So if I want to process free text, is there a predefined grammar for the English language? Or would it be better not to use speech-to-text and just use audio files instead?

Multivariate Decision Tree learner

- by Snej

Many univariate decision tree learner implementations (C4.5 etc.) exist, but does anyone know of multivariate decision tree learner algorithms?

How to add words to an already loaded grammar using System.Speech and SAPI 5.3

- by Kim Major

Given the following code,

    Choices choices = new Choices();
    choices.Add(new GrammarBuilder(new SemanticResultValue("product", "<product/>")));
    GrammarBuilder builder = new GrammarBuilder();
    builder.Append(new SemanticResultKey("options", choices.ToGrammarBuilder()));
    Grammar grammar = new Grammar(builder) { Name = Constants.GrammarNameLanguage };
    grammar.Priority = priority;
    _recognition.LoadGrammar(grammar);

how can I add additional words to the loaded grammar? I know this can be achieved both in native code and using the SpeechLib interop, but I prefer to use the managed library.

Update: What I want to achieve is not having to load an entire grammar repeatedly because of individual changes. For small grammars I got good results by calling

    _recognition.RequestRecognizerUpdate()

and then unloading the old grammar and loading a rebuilt grammar in the event:

    void Recognition_RecognizerUpdateReached(object sender, RecognizerUpdateReachedEventArgs e)

For large grammars this becomes too expensive.

Using android gesture on top of menu buttons

- by chriacua

What I want is to have an options menu where the user can choose to navigate the menu by either:

1) touching a button and then pressing down on the trackball to select it, or
2) drawing predefined gestures from Gestures Builder

As it stands now, I have created my buttons with OnClickListener and the gestures with GestureOverlayView. Then I start a new Activity depending on whether the user pressed a button or executed a gesture. However, when I attempt to draw a gesture, it is not picked up. Only pressing the buttons is recognized. The following is my code:

    public class Menu extends Activity implements OnClickListener, OnGesturePerformedListener {

        @Override
        public void onCreate(Bundle savedInstanceState) {
            super.onCreate(savedInstanceState);
            setContentView(R.layout.main);

            //create TextToSpeech
            myTTS = new TextToSpeech(this, this);
            myTTS.setLanguage(Locale.US);

            //create Gestures
            mLibrary = GestureLibraries.fromRawResource(this, R.raw.gestures);
            if (!mLibrary.load()) {
                finish();
            }

            // Set up click listeners for all the buttons.
            View playButton = findViewById(R.id.play_button);
            playButton.setOnClickListener(this);
            View instructionsButton = findViewById(R.id.instructions_button);
            instructionsButton.setOnClickListener(this);
            View modeButton = findViewById(R.id.mode_button);
            modeButton.setOnClickListener(this);
            View statsButton = findViewById(R.id.stats_button);
            statsButton.setOnClickListener(this);
            View exitButton = findViewById(R.id.exit_button);
            exitButton.setOnClickListener(this);

            GestureOverlayView gestures = (GestureOverlayView) findViewById(R.id.gestures);
            gestures.addOnGesturePerformedListener(this);
        }

        public void onGesturePerformed(GestureOverlayView overlay, Gesture gesture) {
            ArrayList<Prediction> predictions = mLibrary.recognize(gesture);
            // We want at least one prediction
            if (predictions.size() > 0) {
                Prediction prediction = predictions.get(0);
                // We want at least some confidence in the result
                if (prediction.score > 1.0) {
                    // Show the gesture
                    Toast.makeText(this, prediction.name, Toast.LENGTH_SHORT).show();
                    //User drew symbol for PLAY
                    if (prediction.name.equals("Play")) {
                        myTTS.shutdown();
                        //connect to game
                    // User drew symbol for INSTRUCTIONS
                    } else if (prediction.name.equals("Instructions")) {
                        myTTS.shutdown();
                        startActivity(new Intent(this, Instructions.class));
                    // User drew symbol for MODE
                    } else if (prediction.name.equals("Mode")) {
                        myTTS.shutdown();
                        startActivity(new Intent(this, Mode.class));
                    // User drew symbol to QUIT
                    } else {
                        finish();
                    }
                }
            }
        }

        @Override
        public void onClick(View v) {
            switch (v.getId()) {
            case R.id.instructions_button:
                startActivity(new Intent(this, Instructions.class));
                break;
            case R.id.mode_button:
                startActivity(new Intent(this, Mode.class));
                break;
            case R.id.exit_button:
                finish();
                break;
            }
        }
    }

Any suggestions would be greatly appreciated!

Advice on using OCR on an image of a blackboard

- by ro

I'm trying to get an image of a blackboard readable by OCR. Naturally, most OCR software doesn't like dirty images. What image processing should I try to put the image through to clean the image up?

Finding a picture in a picture with java?

- by tarrasch

What I want to do is analyse input from the screen in the form of pictures. I want to be able to identify a part of an image in a bigger image and get its coordinates within the bigger picture. Example: (image missing) would have to be located in (image missing). The result would be the upper right corner of the part within the big picture and the lower left corner of the part within the big picture. As you can see, the white part of the picture is irrelevant; what I basically need is just the green frame. Is there a library that can do something like this for me? Runtime is not really an issue. What I want to do with this is generate a few random pixel coordinates and read the color in the big picture at those positions, to recognize the green box quickly later. And how much would it decrease performance if the white box in the middle were transparent? The question seems to have been asked several times on SO without a single answer. I found a solution at http://werner.yellowcouch.org/Papers/subimg/index.html . Unfortunately it's in C++ and I do not understand a thing. It would be nice to have a Java implementation on SO.
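
A brute-force version of the search described above fits in a few lines. This is a minimal sketch in plain Python on nested lists of pixel values, not a Java library and not the method from the linked paper; `find_subimage` and its `ignore` parameter are names invented here, with `ignore` standing in for the irrelevant white (or transparent) area.

```python
def find_subimage(large, small, ignore=None):
    """Locate small inside large by exact comparison.

    Pixels in small equal to ignore match anything. Returns the top-left and
    bottom-right (x, y) corners of the first match, or None if there is none.
    """
    sh, sw = len(small), len(small[0])
    for y in range(len(large) - sh + 1):
        for x in range(len(large[0]) - sw + 1):
            if all(small[r][c] == ignore or small[r][c] == large[y + r][x + c]
                   for r in range(sh) for c in range(sw)):
                return (x, y), (x + sw - 1, y + sh - 1)
    return None
```

Ignored pixels cost one extra comparison each, so a transparent centre should not slow the search much. Exact comparison only works when the screen pixels are unaltered; scaled or compressed input would need a tolerance or a different technique.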

Android 2.1 fling gesture captured on textview but still a contextmenu opens

- by hermo

The following problem seems unique to 2.1 and happens both on an emulator and on a Nexus. The same example works fine on other platforms I've tested (1.5, 1.6 and 2.0 emulators). I've created a gestureListener as described in this post. The difference is that I've added the listener on a TextView which also has a context menu registered, i.e. something like the following:

    onCreate(...) {
        ...
        // Layout contains a large TextView on which I want to add a context menu
        tv = findViewById(R.id.text_view);
        tv.registerForContextMenu(this);
        // create the gestureListener according above mentioned post.
        gestureListener = ...
        // set the listener on the text-view
        tv.setOnTouchListener(gestureListener);
        ...
    }

When testing it, the correct gesture is recognized all right, but every other time it also causes the context menu to be opened. As the same example works on non-2.1 platforms, I've got a feeling it is not my code that is the problem... Thankful for any suggestions.

Complex gestures on the iPhone.

- by Tejaswi Yerukalapudi

Is there a high-level library that handles complex gestures like detecting triangles / loops / circles? Is it even possible to build such a library with what Apple already has? Thanks, Teja

Recognize Dates In A String

- by Tim Scott

I want a class something like this:

    public interface IDateRecognizer
    {
        DateTime[] Recognize(string s);
    }

The dates might exist anywhere in the string and might be in any format. For now, I could limit it to U.S. culture formats. The dates would not be delimited in any way. They might have arbitrary amounts of whitespace between parts of the date. The ideas I have are:

- ANTLR
- Regex
- Hand rolled

I have never used ANTLR, so I would be learning from scratch. I wonder if there are libraries or code samples out there that do something similar that could jump start me. Is ANTLR too heavy for such a narrow use? I have used Regex a lot before, but I hate it for all the reasons that most people hate it. I could certainly hand roll it, but I'd rather not re-solve a solved problem. Suggestions?

UPDATE: Here is an example. Given this input:

    This is a date 11/3/63. Here is another one: November 03, 1963; and another one Nov 03, 63 and some more (11/03/1963). The dates could be in any U.S. format. They might have dashes like 11-2-1963 or weird extra whitespace inside like this: Nov 3, 1963, and even maybe the comma is missing like [Nov 3 63] but that's an edge case.

The output should be an array of seven DateTimes. Each date would be the same: 11/03/1963 00:00:00.
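
To give a feel for the regex option listed above, here is a rough Python sketch (Python rather than C#, so the `IDateRecognizer` interface is not implemented). It covers only the two shapes in the example, numeric month/day/year and month name followed by day and year. The function name `recognize_dates` and the two-digit-year pivot of 30 are assumptions made for this sketch.

```python
import re
from datetime import datetime

MONTH_NAMES = ["january", "february", "march", "april", "may", "june", "july",
               "august", "september", "october", "november", "december"]
MONTHS = {name[:3]: number for number, name in enumerate(MONTH_NAMES, start=1)}
# Full names and three-letter abbreviations, longest first.
MONTH_ALT = "|".join(sorted({form for name in MONTH_NAMES for form in (name, name[:3])},
                            key=len, reverse=True))

# Numeric form: 11/3/63 or 11-03-1963.
NUMERIC = r"(?P<m>\d{1,2})\s*[/-]\s*(?P<d>\d{1,2})\s*[/-]\s*(?P<y>\d{4}|\d{2})"
# Named form: November 03, 1963 or Nov 3 63 (comma optional, whitespace flexible).
NAMED = r"(?P<mon>" + MONTH_ALT + r")\.?\s+(?P<d2>\d{1,2})\s*,?\s*(?P<y2>\d{4}|\d{2})"
DATE_RE = re.compile(r"\b(?:" + NUMERIC + "|" + NAMED + r")\b", re.IGNORECASE)

PIVOT = 30  # assumption: two-digit years >= 30 mean 19xx, the rest 20xx


def recognize_dates(text):
    """Return every recognizable date in text, in order of appearance."""
    found = []
    for match in DATE_RE.finditer(text):
        g = match.groupdict()
        if g["m"] is not None:
            month, day, year = int(g["m"]), int(g["d"]), int(g["y"])
        else:
            month = MONTHS[g["mon"][:3].lower()]
            day, year = int(g["d2"]), int(g["y2"])
        if year < 100:
            year += 1900 if year >= PIVOT else 2000
        try:
            found.append(datetime(year, month, day))
        except ValueError:
            pass  # looked like a date but is not one, e.g. 13/45/1963
    return found
```

Note that the sample's `11-2-1963` parses as November 2 rather than November 3, so a literal run over that input yields six identical dates and one that differs by a day.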

how to apply Discrete wavelet transform on image

- by abuasis

I am implementing an Android application that will verify signature images, and I decided to go with the discrete wavelet transform method (symmlet-8). The method requires applying the discrete wavelet transform, separating the image using low-pass and high-pass filters, and retrieving the wavelet transform coefficients. The equations use notation that I can't understand, so I can't do the math easily; I also don't know how to apply the low-pass and high-pass filters to my x and y points. Is there any tutorial that shows how to apply the discrete wavelet transform to an image and breaks it down in numbers? Thanks a lot in advance.
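
As a numeric illustration of the low-pass/high-pass split, here is one decomposition level of the Haar wavelet, the simplest wavelet, written in plain Python. This is deliberately not symmlet-8: that filter has more taps, but the row-then-column structure is the same. The sketch assumes an image with even width and height, and uses the averaging form (divide by 2); the orthonormal form divides by the square root of 2 instead. The function names are made up for this example.

```python
def haar_1d(values):
    """Split an even-length sequence into low-pass and high-pass halves."""
    pairs = list(zip(values[0::2], values[1::2]))
    low = [(a + b) / 2 for a, b in pairs]   # local averages
    high = [(a - b) / 2 for a, b in pairs]  # local differences
    return low, high


def transpose(matrix):
    return [list(column) for column in zip(*matrix)]


def filter_columns(block):
    """Apply haar_1d down every column of block."""
    lows, highs = [], []
    for column in transpose(block):
        low, high = haar_1d(column)
        lows.append(low)
        highs.append(high)
    return transpose(lows), transpose(highs)


def haar_2d(image):
    """One level of the 2D transform: returns the sub-bands LL, LH, HL, HH."""
    row_low, row_high = [], []
    for row in image:  # filter along each row first
        low, high = haar_1d(row)
        row_low.append(low)
        row_high.append(high)
    ll, lh = filter_columns(row_low)   # then down the columns of each half
    hl, hh = filter_columns(row_high)
    return ll, lh, hl, hh
```

For the 2x2 image `[[1, 3], [5, 7]]` the row pass gives averages 2 and 6 and differences -1 and -1; the column pass then gives LL = 4 (overall average), LH = -2, HL = -1 and HH = 0. Naming of the two mixed bands varies between texts.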

recognizing computer in masm

- by oneat

How can I retrieve the MAC address of the network card(s), and what else can I retrieve to identify a specific computer, and how?

Find image position inside a larger image

- by Matthew

Basically I want to find the pixel location of a small image inside a large image. I have searched for something similar to this but have had no luck.

Improving the efficiency of Kinect for Windows DTWGestureRecognition Application

- by Ray

Currently I am using the DTWGestureRecognition open source tool for Kinect SDK v1.5. I have recorded a few gestures and use them to navigate through Windows 7. I also have implemented voice control for simple things such as opening PowerPoint, Chrome, etc. My main issue is that the application uses quite a bit of my CPU power which causes it to become slow. During gestures and voice commands, the CPU usage sometimes spikes to 80-90%, which causes the application to be unresponsive for a few seconds. I am running it on a 64 bit Windows 7 machine with an i5 processor and 8 GB of RAM. I was wondering if anyone with any experience using this tool or Kinect in general has made it more efficient and less performance hogging. Right now I removed sections which display the RGB video and the Depth video but even doing that did not make a big impact. Any help is appreciated, thanks!

Question SpeechSynthesizer.SetOutputToAudioStream audio format problem

- by Chris Kugler

Hi, I'm currently working on an application which requires transmission of speech encoded to a specific audio format.

    System.Speech.AudioFormat.SpeechAudioFormatInfo synthFormat =
        new System.Speech.AudioFormat.SpeechAudioFormatInfo(
            System.Speech.AudioFormat.EncodingFormat.Pcm, 8000, 16, 1, 16000, 2, null);

This states that the audio is in PCM format, 8000 samples per second, 16 bits per sample, mono, 16000 average bytes per second, block alignment of 2. When I attempt to execute the following code, nothing is written to my MemoryStream instance; however, when I change from 8000 samples per second up to 11025, the audio data is written successfully.

    SpeechSynthesizer synthesizer = new SpeechSynthesizer();
    waveStream = new MemoryStream();
    PromptBuilder pbuilder = new PromptBuilder();
    PromptStyle pStyle = new PromptStyle();
    pStyle.Emphasis = PromptEmphasis.None;
    pStyle.Rate = PromptRate.Fast;
    pStyle.Volume = PromptVolume.ExtraLoud;
    pbuilder.StartStyle(pStyle);
    pbuilder.StartParagraph();
    pbuilder.StartVoice(VoiceGender.Male, VoiceAge.Teen, 2);
    pbuilder.StartSentence();
    pbuilder.AppendText("This is some text.");
    pbuilder.EndSentence();
    pbuilder.EndVoice();
    pbuilder.EndParagraph();
    pbuilder.EndStyle();
    synthesizer.SetOutputToAudioStream(waveStream, synthFormat);
    synthesizer.Speak(pbuilder);
    synthesizer.SetOutputToNull();

There are no exceptions or errors recorded when using a sample rate of 8000, and I couldn't find anything useful in the documentation regarding SetOutputToAudioStream and why it succeeds at 11025 samples per second and not 8000. I have a workaround involving a wav file that I generated and converted to the correct sample rate using some sound editing tools, but I would like to generate the audio from within the application if I can. One particular point of interest was that the SpeechRecognitionEngine accepts that audio format and successfully recognized the speech in my synthesized wave file...

Update: Recently discovered that this audio format succeeds for certain installed voices, but fails for others. It fails specifically for LH Michael and LH Michelle, and failure varies for certain voice settings defined in the PromptBuilder.

How to engineer features for machine learning

- by Ivo Danihelka

Do you have any advice or reading on how to engineer features for a machine learning task? Good input features are important even for a neural network. The chosen features will affect the needed number of hidden neurons and the needed number of training examples. The following is an example problem, but I'm interested in feature engineering in general. A motivating example: What would be a good input when looking at a puzzle (e.g., 15-puzzle or Sokoban)? Would it be possible to recognize which of two states is closer to the goal?
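
For the 15-puzzle in particular, one well-known hand-crafted feature is the summed Manhattan distance of every tile from its goal position. The sketch below assumes a state is a flat list of 16 numbers read row by row, with 0 for the blank and the goal being 1 to 15 in order with the blank last; that encoding is a choice made for this example.

```python
def manhattan_distance(state, size=4):
    """Sum of each tile's row and column distance from its goal cell.

    state is a flat, row-by-row list; 0 marks the blank and is not counted.
    """
    total = 0
    for index, tile in enumerate(state):
        if tile == 0:
            continue
        goal = tile - 1  # tile k belongs at flat index k - 1
        total += abs(index // size - goal // size) + abs(index % size - goal % size)
    return total
```

A lower value suggests a state closer to the goal, although it is only a lower bound on the number of moves actually needed, so a learner would probably want it alongside other features rather than as the only input.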

What should I do to uninitialize .NET SpeechRecognizer?

- by manuel

The speech recognizer object is declared as SpeechRecognizer^ sre = new SpeechRecognizer(); But I can't close the application properly when I want to. I tried using delete sre; but it didn't make any difference.

Calculating probability that a string has been randomized? - Python

- by RadiantHex

Hi folks, this is related to a question I asked earlier. I have a list of manually created strings such as:

    lucy87
    gordan_king
    fancy_unicorn77
    joplucky_kanga90
    base_belong_to_narwhals

and a list of randomized strings:

    johnkdf
    pancake90kgjd
    fancy_jagookfk
    manhattanljg

What gives away that the strings in the last set are randomized is sequences such as 'kjg', 'jgf', 'lkd', ... Is there any clever way I could separate strings that contain these apparently randomized sequences from the crowd? I guess that this relies a lot on the fact that certain characters are more likely to be placed next to others (e.g. 'co', 'ka', 'ja', ...). Any ideas on this one? Kylotan mentioned Reverend, but I am not sure if it can be used for such a purpose. Help would be much appreciated!
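
The observation that some characters are more likely to follow others can be turned into a score with a character bigram model. This is a minimal sketch, assuming a list of known-good strings is available for training; the function names and the add-one smoothing are choices made for this example, not something from the post.

```python
import math
from collections import Counter


def train_bigrams(samples):
    """Count how often each character is followed by each other character."""
    pair_counts, first_counts, alphabet = Counter(), Counter(), set()
    for s in samples:
        alphabet.update(s)
        for a, b in zip(s, s[1:]):
            pair_counts[a, b] += 1
            first_counts[a] += 1
    return pair_counts, first_counts, len(alphabet)


def average_log_prob(s, model):
    """Mean log-probability of the character transitions in s (higher = more natural)."""
    pair_counts, first_counts, vocab = model
    pairs = list(zip(s, s[1:]))
    if not pairs:
        return 0.0
    total = sum(math.log((pair_counts[a, b] + 1) / (first_counts[a] + vocab))
                for a, b in pairs)  # add-one smoothing for unseen pairs
    return total / len(pairs)
```

Trained on the five manual strings alone, the model already scores 'lucy' well above 'kjgf', but a usable model would need a much larger corpus of real names. Since the random part is often only a short tail, scoring a sliding window and taking the worst window may separate the two groups better than averaging over the whole string.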

