speech recognition - Page 27

How to get the default audio format of a TTS Engine

- by Itslava

In Microsoft TTS 5.1 or newer, the documentation for the SpVoice.AudioOutputStream property says: The AudioOutputStream property gets and sets the current audio stream object used by the voice. Setting the voice's AudioOutputStream property may cause its audio output format to be automatically changed to match the text-to-speech (TTS) engine's preferred audio output format. If the voice's AllowAudioOutputFormatChangesOnNextSet property is True, the format change takes place; if False, the format remains unchanged. In order to set the AudioOutputStream property of a voice to a specific format, its AllowAudioOutputFormatChangesOnNextSet should be False. This means an engine always has a preferred audio output format. So how can I get it? I have not found any interface that exposes that attribute.

Read the article

Change Titlewindow close button

- by Cameigons

I'm working with Flex 3.4 SDK. I need to change the default close button image of a TitleWindow. So what I'm doing is defining a CSS selector, like this: TitleWindow{ close-button-skin: Embed('assets/close.png'); border-color: #FFFFFF; corner-radius: 10; closeButtonDisabledSkin: ClassReference(null); closeButtonDownSkin: ClassReference(null); closeButtonOverSkin: ClassReference(null); closeButtonUpSkin: ClassReference(null); } The problem is that the resulting image is squeezed beyond recognition, probably because the image dimensions are 55x10 pixels (much wider than the default close button's roughly square dimensions) and Flex forces it to fit that size. Would anyone know how to go about fixing that?

Read the article

Alternative to Microsoft Agent / Fix for color issue?

- by Rob P.

I've got an app that does text-to-speech, but I wanted to show an animated face/character to go with it. I found a tutorial on Microsoft Agent and implemented it in my VB.NET app. The problem is with the transparency color. Unless I run the application in compatibility mode with 256 colors, the characters appear with a purplish-pink background instead of a transparent one. But when the app runs in 256 colors, the rest of it looks awfully out of place. First: is there something that works similarly to MS Agent that would be more appropriate? Second: if I stick with MS Agent, can I get the transparent color to work correctly without limiting myself to 256 colors?

Read the article

added TextToSpeech to my activity and now my onDestroy is not called any more, bug?

- by hermo

I added TextToSpeech to my app, following the guidelines in this post: http://android-developers.blogspot.com/2009/09/introduction-to-text-to-speech-in.html and now my onDestroy is no longer called when the back button is pressed. I filed a bug report regarding this: http://code.google.com/p/android/issues/detail?id=7674 I figured I should also ask here whether anyone else has seen this and found a solution. It seems that the intent causes the problem, i.e. the following: Intent checkIntent = new Intent(); checkIntent.setAction(TextToSpeech.Engine.ACTION_CHECK_TTS_DATA); startActivityForResult(checkIntent, MY_DATA_CHECK_CODE); If I skip this intent and just go ahead and create a TTS instance, it works fine. Any clues as to what is wrong with this intent?

Read the article

How to get the contents of the wav file into array so as to cut the required segment and convert it

- by kaushik

How can I get the contents of a wav file into an array, so that I can cut out the required segment and convert it back to wav format, using Python? My problem is similar to ROMAN's problem, which I saw earlier in a post on this site. Basically, I want to combine parts of different wav files into one wav file. Is there any approach other than reading the contents into an array, cutting out the part, combining, and converting back? Please suggest one. Edited: I prefer unpacking the contents of the wav file into an array and editing it by cutting the required segment of sound from the wav file, as I am working on speech processing and I guess this way it would be easier to enhance the sound quality later. Can anyone suggest a way to do this? Please help. Thanks in advance.
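One possible way to do what the question describes, using only Python's standard `wave` and `array` modules. This is a minimal sketch that assumes uncompressed 16-bit PCM input and that all files being combined share the same sample rate and channel count; the helper names `read_segment` and `write_wav` are made up for illustration.

```python
import array
import sys
import wave


def read_segment(wav_file, start_s, end_s):
    """Read the frames between start_s and end_s (seconds) as 16-bit samples."""
    with wave.open(wav_file, "rb") as w:
        params = w.getparams()
        if params.sampwidth != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        first = int(start_s * params.framerate)
        count = int((end_s - start_s) * params.framerate)
        w.setpos(first)                 # position is counted in frames
        raw = w.readframes(count)
    samples = array.array("h")          # signed 16-bit integers
    samples.frombytes(raw)
    if sys.byteorder == "big":          # WAV sample data is little-endian
        samples.byteswap()
    return params, samples


def write_wav(wav_file, params, samples):
    """Write samples using the channel count, sample width and rate in params."""
    out = array.array("h", samples)     # copy so the caller's array is untouched
    if sys.byteorder == "big":
        out.byteswap()
    with wave.open(wav_file, "wb") as w:
        w.setnchannels(params.nchannels)
        w.setsampwidth(params.sampwidth)
        w.setframerate(params.framerate)
        w.writeframes(out.tobytes())
```

With two segments `a` and `b` read this way (from hypothetical files such as `one.wav` and `two.wav`), `write_wav("joined.wav", params, a + b)` would concatenate them. For stereo files the array holds interleaved left/right samples, so any per-sample processing has to account for that.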

Read the article

Embed font in a mac bundle

- by RW

I have a program I am writing. I want to use a fancy font. Can I just embed my font into my bundle and use it from there? My code... NSMutableAttributedString *recOf; recOf = [[NSMutableAttributedString alloc] initWithString:@"In Recognition of"]; length = [recOf length]; [recOf addAttribute:NSFontAttributeName value:[NSFont fontWithName:@"Edwardian Script ITC" size:50] range:NSMakeRange(0, length)]; [[NSColor blackColor] set]; p.x = (bounds.size.width/2)- (([recOf size].width)/2); p.y = (bounds.size.height/1.7); [recOf drawAtPoint:p]; [recOf release];

Read the article

Playing audio from a wav file in iPhone SpeakHere example

- by Mo

I'm working with the iPhone SpeakHere example, and I would like to be able to play audio from either the mic (as in the example) or from a wav file. I have working code to play from a particular wav file, which looks like this: NSString *path = [[NSBundle mainBundle] pathForResource:@"basketBall" ofType:@"wav"]; AVAudioPlayer* theAudio=[[AVAudioPlayer alloc] initWithContentsOfURL:[NSURL fileURLWithPath:path] error:NULL]; theAudio.delegate = self; [theAudio play]; So I'm fine with actually getting the wav to play in the application (I can hook it up to a button, etc.) but I would like it to also behave the same way pushing the "Play" button does after recorded speech, in that it should be connected to the same visualization (which I have modified quite a bit, but essentially shows the current volume, among other things). Thanks for your help!

Read the article

Filter user input (paragraph) for links + smileys

- by Alec Smart

Hello, I am looking for some sort of existing filter that can sanitize user input to avoid XSS. I can probably use htmlspecialchars for that. But at the same time I want to be able to parse all links (it should match a.com, www.a.com and http://www.a.com, and if the link is http://www.aaaaaaaaaaaaaaaaaaaaaaaaaa.com it should display it as aaa..a.com), e-mails and smileys. I am wondering what the best way to go about it is. I am currently using a PHP function with some regex, but the regex often fails (because the link recognition is incorrect, etc.). I want something very similar to the parser used in Google Chat (even a.com works). Thank you for your time.

Read the article

Key word extraction in Python

- by oliland

I'm building a website in Django that needs to extract key words from short (Twitter-like) messages. I've looked at packages like topia.textextract and nltk, but both seem to be overkill for what I need to do. All I need to do is filter out words like "and", "or", "not" while keeping nouns and verbs that aren't conjunctions or other parts of speech. Are there any "simpler" packages out there that can do this? EDIT: This needs to be done in near real-time on a production website, so using a keyword extraction service seems out of the question, based on their response times and request throttling.
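If a plain stop-word filter turns out to be enough, it can be written in a few lines without any package. The sketch below is only that: the `STOP_WORDS` set is a tiny illustrative list (a real one would be much longer), and it does no part-of-speech tagging, so it cannot tell nouns and verbs from other words that happen not to be in the list.

```python
import re

# Tiny illustrative stop-word list; a production list would be much longer.
STOP_WORDS = frozenset({
    "a", "an", "and", "are", "but", "in", "is", "not", "of", "on", "or",
    "the", "to",
})


def keywords(message):
    """Return the non-stop words of a short message, in order, without repeats."""
    words = re.findall(r"[a-z0-9']+", message.lower())
    seen = set()
    result = []
    for word in words:
        if word in STOP_WORDS or word in seen:
            continue
        seen.add(word)
        result.append(word)
    return result
```

Because this is a set lookup per word, it should be cheap enough to run inline on each request for short messages.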

Read the article

Linux, how to capture screen, and simulate mouse movements.

- by 2di

Hi all, I need to capture the screen (like Print Screen) in a way that gives me access to the pixel color data, so I can do some image recognition. After that I will need to generate mouse events on the screen, such as a left click and drag and drop (moving the mouse while the button is pressed, then releasing it). Once that's done, the image will be deleted. Note: I need to capture the whole screen, everything the user can see, and I need to simulate clicks outside the window of my program (if it makes any difference). Spec: Linux (Ubuntu). Language: C++. Performance is not very important; the "print screen" function will be executed once every ~10 sec. The process can run for up to 24 hours, so the method needs to be stable and free of memory leaks (as usual :) I was able to do this on Windows with Win GDI and some Windows events, but I've no idea how to do it on Linux. Thanks a lot

Read the article

Best programming novel to take on holiday

- by Ed Guiness

I am about to enjoy a two-week break in Spain, where I expect to have lots of time for relaxing and reading. I normally read a lot of non-fiction, so I'm looking for novel suggestions. If there is another Cryptonomicon out there I'd love to hear about it! UPDATE: In the end I took four books including Quicksilver. Quicksilver was fantastic and I look forward to continuing the series. I was disappointed with Gen X (Coupland) and Pattern Recognition (Gibson). Upon arrival I also found The Monsters Of Gramercy Park (Leigh), which was enjoyable though sad. Thanks for all the recommendations, I'm sure to return to this list when I have more free time.

Read the article

Praat scripting

- by Binaryrespawn

Hi all, I am trying to write a Praat script to do preprocessing on hundreds of speech samples. I need to extract speech features from each sample and feed these as inputs into a feed-forward neural network. I have already constructed the network using MATLAB. However, learning to script in Praat is proving to be quite a challenge given my time constraints. Some of my samples are 0.01 to 0.03 seconds in length, and I was looking at standardising the duration of all samples using Pitch Synchronous OverLap-Add (PSOLA). However, this will be very tedious if I have to do it by hand for every sample. Is there any script that can read in all of my files and perform the operations in batch mode? Any guidance will surely be appreciated. Regards.

Read the article

Show Alertdialog and use vibrator

- by user1007522

I have a class that implements RecognitionListener like this: public class listener implements RecognitionListener I wanted to show an AlertDialog and use the vibrator, but this isn't possible because I need to provide a context, which I don't have. My AlertDialog code was like this: new AlertDialog.Builder(this) .setTitle("dd") .setMessage("aa") .setNeutralButton("Ok", new DialogInterface.OnClickListener() { public void onClick(DialogInterface dialog, int which) { } }) .show(); But AlertDialog.Builder(this) wants a context, and I have the same problem with my vibrator code: v = (Vibrator) getSystemService(Context.VIBRATOR_SERVICE); The getSystemService method isn't available. My code that starts the class: sr = SpeechRecognizer.createSpeechRecognizer(this); sr.setRecognitionListener(new listener()); Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH); intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,RecognizerIntent.LANGUAGE_MODEL_FREE_FORM); intent.putExtra(RecognizerIntent.EXTRA_CALLING_PACKAGE,"voice.recognition.test"); intent.putExtra(RecognizerIntent.EXTRA_MAX_RESULTS,5); sr.startListening(intent); What's the best way to solve this?

Read the article

How can I tag words when creating grammar rules to convert voice to text using xml ?

- by jhone

Hi friends, I am doing a project in C# that converts voice to text. I use the Speech SDK for this. I want to tag words according to their category using a grammar file written in XML, and display the result in a text box. E.g. if the word is "eat", it should be displayed as "eat/verb". Following is the XML I have written, but the part I tagged isn't displayed in the text box; only the converted word is there. <rule id="Verbs"> <item>eat<tag>$._value="/verb";</tag></item> </rule>

Read the article

audio power on AudioQueue

- by Tomoyuki

Hi everyone. I'm creating an application that uses speech recognition. To check the audio power coming in through the microphone, I wrote a method as follows. -(void)checkPower(AudioqueRef)queue{ UInt32 expectedSize= sizeof(AudioQueueLevelMeterState); AudioQueueGetProperty(queue, kAudioQueueProperty_CurrentLevelMeter, audioLevels, expectedSize); NSLog(@"average:%f peak:%f",audioLevels.mAveragePower,audioLevels.mPeakPower); } I found that mAveragePower was sometimes larger than mPeakPower, and that when mAveragePower was 1.0 (in other words, when the average power is at its maximum), mPeakPower was lower than 1.0. I think this result should generally be impossible. Please let me know if you have any information about sound power in Core Audio. Thanks.

Read the article

Bitmap manipulation in C++ on Windows

- by Oliver

Hi, I have myself a handle to a bitmap, in C++, on Windows: HBITMAP hBitmap; On this image I want to do some image recognition, pattern analysis, that sort of thing. In my studies at university I have done this in Matlab, where it is quite easy to get at the individual pixels based on their position, but I have no idea how to do this in C++ under Windows. I haven't really been able to understand what I have read so far. I have seen some references to a nice-looking Bitmap class that lets you setPixel() and getPixel() and that sort of thing, but I think this is for .NET. How should I go about turning my HBITMAP into something I can play with easily? I need to be able to get at the RGBA information. Are there libraries that allow me to work with the data without having to learn about DCs and BitBlt and that sort of thing?

Read the article

How do I implement .net plugins without using AppDomains?

- by Abtin Forouzandeh

Problem statement: Implement a plug-in system that allows the associated assemblies to be overwritten (avoid file locking). In .NET, specific assemblies may not be unloaded; only entire AppDomains may be unloaded. I'm posting this because when I was trying to solve the problem, every solution made reference to using multiple AppDomains. Multiple AppDomains are very hard to implement correctly, even when architected at the start of a project. Also, AppDomains didn't work for me because I needed to transfer a Type across domains as a setting for Speech Server workflow's InvokeWorkflow activity. Unfortunately, sending a type across domains causes the assembly to be injected into the local AppDomain. Also, this is relevant to IIS. IIS has a Shadow Copy setting that allows an executing assembly to be overwritten while it's loaded into memory. The problem is that (at least under XP; I didn't test on production 2003 servers) when you programmatically load an assembly, the shadow copy doesn't work (because you are loading the DLL, not IIS).

Read the article

Distance between hyperplanes

- by michael dillard

I'm trying to teach myself some machine learning, and have been using the MNIST database (http://yann.lecun.com/exdb/mnist/) to do so. The author of that site wrote a paper in '98 on all different kinds of handwriting recognition techniques, available at http://yann.lecun.com/exdb/publis/pdf/lecun-98.pdf. The 10th method mentioned is a "Tangent Distance Classifier". The idea is that if you place each image in an (NxM)-dimensional vector space, you can compute the distance between two images as the distance between the hyperplanes formed by each, where the hyperplane is given by taking the point and rotating the image, rescaling the image, translating the image, etc. I can't figure out enough to fill in the missing details. I understand that most of these are indeed linear operators, so how does one use that fact to create the hyperplane? And once we have a hyperplane, how do we take its distance from other hyperplanes?
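For orientation, the usual formulation of tangent distance can be written down as follows. This is a sketch of the standard textbook construction, not taken from the excerpt; the symbols $s$, $\alpha$, $\beta$, $T_x$, $T_y$ and $k$ are notation introduced here. Each image $x \in \mathbb{R}^{NM}$ generates a curved surface of transformed images $s(x, \alpha)$, where $\alpha \in \mathbb{R}^k$ holds the transformation parameters (rotation angle, scale, shifts, ...) and $s(x, 0) = x$. That surface is replaced by its tangent plane at $x$, a $k$-dimensional affine subspace rather than a hyperplane in the strict sense:

```latex
% Tangent vectors: derivatives of the transformed image with respect to
% each parameter, evaluated at the identity (in practice, finite differences).
T_x = \left. \frac{\partial s(x, \alpha)}{\partial \alpha} \right|_{\alpha = 0}
      \in \mathbb{R}^{NM \times k}

% Tangent plane of x:
\{\, x + T_x \alpha \;:\; \alpha \in \mathbb{R}^k \,\}

% Two-sided tangent distance between images x and y:
D(x, y) = \min_{\alpha,\, \beta \in \mathbb{R}^k}
          \left\| \, x + T_x \alpha - y - T_y \beta \, \right\|_2

% Equivalent linear least-squares problem:
A = \begin{bmatrix} T_x & -T_y \end{bmatrix}, \qquad
\theta^\star = \arg\min_{\theta} \left\| A \theta - (y - x) \right\|_2, \qquad
D(x, y) = \left\| A \theta^\star - (y - x) \right\|_2
```

So the "hyperplane" comes from linearizing the transformations around the image itself, and the distance between two such planes is the residual of an ordinary least-squares solve.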

Read the article

Storing SQL queries in Table in sql server

- by Rohit

We have multiple jobs in our system. These jobs are listed in a grid. We have 3 different user types (usertypeid 1, 2, 3). The listing is different for each user type, and a user can filter the listing by selecting a view from a dropdown. ViewName in the table below is the view to be displayed. To achieve this, a fellow developer has created the following table structure and stored SQL fragments in its SQLExpression column. In my opinion, the query should not be stored in the database. What are the pros and cons of this approach, and what are the available alternatives?

JobListingViewID  ViewName    SQLExpression                 UserTypeID
3                 All Jobs    1 = 1                         3
4                 Error Jobs  JobStatusID IN ( 2 )          1
5                 Error Jobs  JobStatusID IN ( 2 )          2
6                 Error Jobs  JobStatusID IN ( 2 )          3
7                 Speech      JobStatusID IN ( 1, 3, 8 )    1
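One alternative that is sometimes suggested for this kind of design is to store only data (status IDs and allowed user types) and build the SQL in code with bound parameters. The sketch below illustrates the idea with Python's built-in `sqlite3` rather than SQL Server; the `Job` table, its `JobID` column and the `list_jobs` helper are hypothetical, and the view definitions simply mirror the rows of the table above.

```python
import sqlite3

# View name -> (status IDs to include, or None for no filter; allowed user types).
# Mirrors the rows in the question; in practice this could live in plain data tables.
VIEWS = {
    "All Jobs": (None, {3}),
    "Error Jobs": ((2,), {1, 2, 3}),
    "Speech": ((1, 3, 8), {1}),
}


def list_jobs(conn, view_name, user_type_id):
    """Return (JobID, JobStatusID) rows for a named view, using bound parameters."""
    statuses, allowed = VIEWS[view_name]          # unknown view -> KeyError
    if user_type_id not in allowed:
        raise PermissionError(f"user type {user_type_id} cannot use {view_name!r}")
    if statuses is None:
        sql = "SELECT JobID, JobStatusID FROM Job ORDER BY JobID"
        return conn.execute(sql).fetchall()
    placeholders = ", ".join("?" for _ in statuses)  # only "?" marks are interpolated
    sql = (
        "SELECT JobID, JobStatusID FROM Job "
        f"WHERE JobStatusID IN ({placeholders}) ORDER BY JobID"
    )
    return conn.execute(sql, statuses).fetchall()
```

The trade-off is that every new kind of filter needs a code change, whereas stored SQL fragments can express anything but have to be trusted and cannot be parameterized or checked ahead of time.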

Read the article

diff implementation in Java

- by Frór

Hi, I'm looking for a diff implementation in Java. I've seen that Python has its own SequenceMatcher (in difflib), which is exactly what I need... in Java. Is there a port? Or is there any other class/library that does the same thing in Java? If not, where can I find the source code of difflib (if free as in speech) so I can write my own implementation of SequenceMatcher in Java? Unfortunately, Apache Commons Lang doesn't help me much. Thanks!
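For reference, this is the Python behaviour the question wants to reproduce, shown with the standard library's `difflib.SequenceMatcher`. The wrapper name `describe_changes` is made up for this sketch; `get_opcodes()` and `ratio()` are the real `difflib` methods a Java port would need to mirror.

```python
import difflib


def describe_changes(a, b):
    """List the non-equal edit operations that turn sequence a into sequence b."""
    matcher = difflib.SequenceMatcher(None, a, b)
    return [
        (tag, a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def similarity(a, b):
    """Similarity in [0, 1]: 2 * matches / (len(a) + len(b))."""
    return difflib.SequenceMatcher(None, a, b).ratio()
```

The opcode tags are `replace`, `delete`, `insert` and `equal`, each with index ranges into both sequences, which is the information a diff view is typically built from.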

Read the article

Build a decision tree for classification of large amount data,using python?

- by kaushik

Hi, I am working on speech synthesis. I have a large number of pronunciations for each phone (i.e. alphabet) and need to classify them, according to a few features such as segment size (int) and the alphabet itself (string), into a smaller set suitable for a particular context. For this purpose I have decided to use a decision tree for classification. The data to be parsed is in S-expression format, e.g. ((question)(LEFTNODE)(RIGHTNODE)). I have an idea of how to build a decision tree for normal built-in types such as list; I am looking for suggestions on an implementation for S-expressions. Kindly help. Thanks in advance. Note: this question may look similar to my previous post; sorry if multiple posts are not allowed. I had already edited that one many times, so I thought of writing a new question instead of editing again.
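One way to bridge the gap described here is to parse the S-expression text into ordinary nested Python lists, after which the usual list-based tree code applies. The following is a minimal sketch of such a parser; it assumes atoms contain no whitespace, parentheses or quoted strings, and the function name `parse_sexpr` is made up for illustration.

```python
import re

# A token is an opening paren, a closing paren, or a run of other non-space characters.
TOKEN = re.compile(r"\(|\)|[^\s()]+")


def parse_sexpr(text):
    """Parse one S-expression into nested lists of string atoms."""
    tokens = TOKEN.findall(text)
    pos = 0

    def read():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            items = []
            while True:
                if pos >= len(tokens):
                    raise ValueError("missing closing parenthesis")
                if tokens[pos] == ")":
                    pos += 1
                    return items
                items.append(read())
        if tok == ")":
            raise ValueError("unexpected closing parenthesis")
        return tok

    tree = read()
    if pos != len(tokens):
        raise ValueError("trailing tokens after expression")
    return tree
```

For the example in the question, each node comes back as a three-element list `[question, left, right]`. The parser is recursive, so extremely deep trees could hit Python's recursion limit.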

Read the article

How do I construct a 3D model of a room from 2 stereo cameras? What is the determining factor to an

- by yasumi

Currently, I have extracted depth points to construct a 3D model from 2 stereo cameras. The methods I have used are the OpenCV graphCut method and software from http://sourceforge.net/projects/reconststereo/. However, the generated 3D models are not very accurate, which leads me to ask: 1) What is the problem with the pixel-based method? 2) Should I change my pixel-based method to a feature-based or object-recognition-based method? Is there a best method? 3) Are there any other ways to do such a reconstruction? Additionally, the depth extracted comes from only 2 images. What if I turn the camera 360 degrees to obtain a video? I am looking forward to suggestions on how to combine this depth information. Thank you very much :)

Read the article

Any experience with the Deliverance system ?

- by e-satis

My new boss went to a speech where Deliverance, a kind of proxy that lets you apply a skin to any HTML output on the fly, was presented. He decided to use it right after that, no matter how young it is. More here: http://www.openplans.org/projects/deliverance/introduction In theory, the system sounds great when you want a newbie to tweak your Plone theme without having to teach him all the complex mechanisms behind the Zope products, and to apply the same theme to a Drupal web site in one go. But I don't believe in theory, and would like to know if anybody has tried this out in the real world :-)

Read the article

Receiving Text From Another Application

- by Garry

Hi, I'm building some home automation software with Cocoa/Objective-C. The main application will have a minimal GUI and will most likely be represented by a status bar icon only. I'm using proprietary speech-to-text software (MacSpeech Dictate) that takes my voice command and converts it to plain text. I then need to send this plain text to my app for parsing. Is there a way to send a string to a Cocoa application? Could AppleScript achieve this? How would I make the NSString string in my app "available" to receive the passed string? For reasons that are beyond the scope of this question - it is not possible to dictate the command directly into my app. Many thanks in advance,

Read the article

measuring uncertainty in MATLAB's svmclassify

- by Mark

I'm doing contextual object recognition and I need a prior for my observations, e.g. this space was labeled "dog"; what's the probability that it was labeled correctly? Do you know if MATLAB's svmclassify has an argument to return this level of certainty with its classification? If not, MATLAB's SVM has the following structure: SVM = SupportVectors: [11x124 single] Alpha: [11x1 double] Bias: 0.0915 KernelFunction: @linear_kernel KernelFunctionArgs: {} GroupNames: {11x1 cell} SupportVectorIndices: [11x1 double] ScaleData: [1x1 struct] FigureHandles: [] Can you think of any ways to compute a good measure of uncertainty from these? (Which support vector to use?) Papers/articles explaining uncertainty in SVMs are welcome. More in-depth explanations of MATLAB's SVM are also welcome. If you can't do it this way, can you think of any other libraries with SVMs that have this measure of uncertainty?
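A common way to get a confidence value out of an SVM is to compute the signed decision value and squash it with a sigmoid (Platt scaling). The sketch below shows the two pieces in plain Python under several assumptions that are not confirmed by the excerpt: that the stored Alpha values already carry the class sign, that the kernel is linear, that the input has been scaled the same way as the training data (the ScaleData field), and that the sigmoid parameters `a` and `b` have been fitted separately on held-out data. All function and parameter names are made up for illustration.

```python
import math


def decision_value(support_vectors, alpha, bias, x):
    """f(x) = sum_i alpha_i * <s_i, x> + bias, for a linear kernel."""
    total = 0.0
    for sv, a_i in zip(support_vectors, alpha):
        total += a_i * sum(s * xi for s, xi in zip(sv, x))
    return total + bias


def platt_probability(f, a, b):
    """Platt's sigmoid: P(positive class | f) = 1 / (1 + exp(a * f + b))."""
    return 1.0 / (1.0 + math.exp(a * f + b))
```

All support vectors are used together; the magnitude of `f` grows with distance from the separating boundary, and the sigmoid turns that into a value between 0 and 1. With a negative `a`, larger positive decision values map to probabilities closer to 1.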

Search Results

Search found 916 results on 37 pages for 'speech recognition'.
