Search Results

Search found 6107 results on 245 pages for 'reserved words'.

  • Edit Distance in Python

    - by Alice
    I'm programming a spellcheck program in Python. I have a list of valid words (the dictionary) and I need to output a list of words from this dictionary that have an edit distance of 2 from a given invalid word. I know I need to start by generating a list of words with an edit distance of 1 from the invalid word (and then run that again on all the generated words). I have three methods, inserts(...), deletions(...) and changes(...), that should each output a list of words with an edit distance of 1: inserts outputs all valid words with one more letter than the given word, deletions outputs all valid words with one fewer letter, and changes outputs all valid words with one different letter. I've checked a bunch of places but I can't seem to find an algorithm that describes this process. All the ideas I've come up with involve looping through the dictionary list multiple times, which would be extremely time consuming. If anyone could offer some insight, I'd be extremely grateful. Thanks!
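
    One way to avoid scanning the dictionary repeatedly is to generate every string one edit away from the invalid word and intersect the result with a set built from the dictionary. The sketch below is only an illustration: the names edits1 and suggestions are hypothetical, the alphabet is assumed to be lowercase ASCII, and the three operations from the question (inserts, deletions, changes) are folded into one helper.

```python
import string


def edits1(word, alphabet=string.ascii_lowercase):
    """Return every string exactly one insert, deletion, or change away from word."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletions = {a + b[1:] for a, b in splits if b}
    changes = {a + c + b[1:] for a, b in splits if b for c in alphabet if c != b[0]}
    inserts = {a + c + b for a, b in splits for c in alphabet}
    return deletions | changes | inserts


def suggestions(word, dictionary):
    """Return (words at distance 1, words at distance 2), both sorted."""
    valid = set(dictionary)  # set membership is O(1), so the list is never rescanned
    one = edits1(word)
    two = {e2 for e1 in one for e2 in edits1(e1)}
    exactly_one = (one - {word}) & valid
    # Applying two edits can also land on distance 0 or 1, so subtract those.
    exactly_two = (two - one - {word}) & valid
    return sorted(exactly_one), sorted(exactly_two)


if __name__ == "__main__":
    print(suggestions("caat", ["cat", "cart", "chart", "dog", "at"]))
```

    The cost depends on the length of the word and the alphabet size rather than on the number of dictionary entries, which is usually the better trade for short misspellings.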

    Read the article

  • difference between 2 pieces Python code

    - by draw
    Hello, I'm doing the following exercise:

        # B. front_x
        # Given a list of strings, return a list with the strings
        # in sorted order, except group all the strings that begin with 'x' first.
        # e.g. ['mix', 'xyz', 'apple', 'xanadu', 'aardvark'] yields
        # ['xanadu', 'xyz', 'aardvark', 'apple', 'mix']
        # Hint: this can be done by making 2 lists and sorting each of them
        # before combining them.

    Sample solution:

        def front_x(words):
            listX = []
            listO = []
            for w in words:
                if w.startswith('x'):
                    listX.append(w)
                else:
                    listO.append(w)
            listX.sort()
            listO.sort()
            return listX + listO

    My solution:

        def front_x(words):
            listX = []
            for w in words:
                if w.startswith('x'):
                    listX.append(w)
                    words.remove(w)
            listX.sort()
            words.sort()
            return listX + words

    When I tested my solution, the result was a little weird. Here is the source code with my solution: http://dl.dropbox.com/u/559353/list1.py. You might want to try it out.
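
    A likely cause of the odd result is that the second version removes items from words while the for loop is still iterating over it, so the element that slides into the freed position is skipped. The sketch below reproduces the effect on a small input; the names front_x_mutating and front_x_copy are made up for the comparison.

```python
def front_x_mutating(words):
    """The question's version: removes items from words while looping over it."""
    listX = []
    for w in words:
        if w.startswith('x'):
            listX.append(w)
            words.remove(w)  # shifts later items left; the loop then skips one
    listX.sort()
    words.sort()
    return listX + words


def front_x_copy(words):
    """Builds two new lists instead of mutating the input."""
    with_x = sorted(w for w in words if w.startswith('x'))
    rest = sorted(w for w in words if not w.startswith('x'))
    return with_x + rest


if __name__ == "__main__":
    print(front_x_mutating(['xaa', 'xzz', 'bbb']))  # ['xaa', 'bbb', 'xzz']
    print(front_x_copy(['xaa', 'xzz', 'bbb']))      # ['xaa', 'xzz', 'bbb']
```

    In the first call 'xzz' is never examined, because removing 'xaa' moves it to index 0 after the loop has already passed that index. Iterating over a copy (for w in words[:]) would also avoid the skip, although the caller's list would still be modified.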

    Read the article

  • Creating a music catalog in C# and extracting first 30 seconds as soon as the first words are sung

    - by Rad
    I already read a question: Separation of singing voice from music. I don’t need such complex audio processing. I only need a mechanism that detects whether a voice/vocal is present while the music is playing (or not playing). I need to extract the first 30 seconds from the moment a vocalist starts singing along with full band music. See question 2 below.

    I want to create a music catalog using ASP.NET MVC 2 with Silverlight clients and C#/.NET 4.0 that would serve as the storefront. On the back end I would also like to create a desktop WPF/Windows application that builds the music catalog from existing music files, most of which carry metadata (ID3v1, ID3v2.3, ID3v2.4, iTunes MP4, WMA, Vorbis Comments, APE tags, etc.). I would possibly like to create a web service that lets catalog contributors upload a zipped album and triggers metadata extraction and the extraction of music segments as described below. I would be happy if I achieve no. 1 below.

    Let's say I have thousands of songs in mp3 (or other formats) grouped in subfolders using some classification (genre, artists, albums, composers or other groupings). I want to create tables in a DB that organize the songs so they can be searched by different criteria (year, length, the classification above, song title, description, etc.), as the iTunes Store allows its customers to do. I want to extract metadata from various formats (I will try to get songs in mp3 format, but there may be other popular formats) and allow the music catalog manager to add missing data from either the desktop or the web application. He or other contributors can upload zipped music via an HTML or Silverlight upload or via WPF. Can anybody suggest open source libraries, articles, or code snippets that can do this automatically using .NET and possibly a SQL Server DB?

    My main questions are these. This is an audio processing challenge. I want to extract 2 segments of music (questions 1 and 2):

      1. How do I extract a music segment starting 1-2 seconds before a vocalist starts singing and lasting up to 30 seconds from that point?
      2. Much more challenging: how do I find repeating segments? (The name of a song usually appears in its refrain, and songs are usually recognized by these refrains.)

    How would I go about creating a list of songs that go great together, like what Genius in iTunes does? Are there any characteristics of music that can be used to match songs? The goal is for people to quickly scan and recognize songs, i.e. associate the melody and words with a title/album, so they can make intelligent decisions such as buying a song or creating a list of songs with a similar mood. Thanks, Rad

    Read the article

  • Big Data – Learning Basics of Big Data in 21 Days – Bookmark

    - by Pinal Dave
    Earlier this month I had a great time writing the Basics of Big Data series. The series received a great response and lots of good comments, and I am going to follow it up with a further in-depth series in the near future. Here is the consolidated blog post where you can find all 21 days of blog posts together. Bookmark this page for future reference.

      - Big Data – Beginning Big Data – Day 1 of 21
      - Big Data – What is Big Data – 3 Vs of Big Data – Volume, Velocity and Variety – Day 2 of 21
      - Big Data – Evolution of Big Data – Day 3 of 21
      - Big Data – Basics of Big Data Architecture – Day 4 of 21
      - Big Data – Buzz Words: What is NoSQL – Day 5 of 21
      - Big Data – Buzz Words: What is Hadoop – Day 6 of 21
      - Big Data – Buzz Words: What is MapReduce – Day 7 of 21
      - Big Data – Buzz Words: What is HDFS – Day 8 of 21
      - Big Data – Buzz Words: Importance of Relational Database in Big Data World – Day 9 of 21
      - Big Data – Buzz Words: What is NewSQL – Day 10 of 21
      - Big Data – Role of Cloud Computing in Big Data – Day 11 of 21
      - Big Data – Operational Databases Supporting Big Data – RDBMS and NoSQL – Day 12 of 21
      - Big Data – Operational Databases Supporting Big Data – Key-Value Pair Databases and Document Databases – Day 13 of 21
      - Big Data – Operational Databases Supporting Big Data – Columnar, Graph and Spatial Database – Day 14 of 21
      - Big Data – Data Mining with Hive – What is Hive? – What is HiveQL (HQL)? – Day 15 of 21
      - Big Data – Interacting with Hadoop – What is PIG? – What is PIG Latin? – Day 16 of 21
      - Big Data – Interacting with Hadoop – What is Sqoop? – What is Zookeeper? – Day 17 of 21
      - Big Data – Basics of Big Data Analytics – Day 18 of 21
      - Big Data – How to become a Data Scientist and Learn Data Science? – Day 19 of 21
      - Big Data – Various Learning Resources – How to Start with Big Data? – Day 20 of 21
      - Big Data – Final Wrap and What Next – Day 21 of 21

    Reference: Pinal Dave (http://blog.sqlauthority.com)
    Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

    Read the article

  • Software for a online collaborative bi/tri lingual dictionary [closed]

    - by user537488
    I am looking for software that I can host on a popular, general-purpose shared web hosting service (online software like WordPress, MediaWiki, Drupal, etc.) and that can do the following:

      - allow users to create accounts
      - allow users or anons to add words to the dictionary (English will be the base language, alongside other languages)
      - provide an easy way to import all the words from an English dictionary
      - let users write the other-language equivalent of each English word
      - give every word its own address and page, e.g. www.namesomething.com/word/en/software will contain the word "software" and the other-language word for it
      - search quickly and find near matches
      - list related words: if the user is looking at "software", other words starting with "s", like "softcopy", should appear alphabetically on that page
      - let anyone comment on a word, on a separate page rather than the main page, similar to the talk page in a wiki
      - let anyone contribute
      - have a clean interface, unlike wikis (MediaWiki and all the others), for words only

    I tried MediaWiki and other wiki software, but they are overloaded and unclean. I am looking for an interface similar to oed.com but clean and minimal, as we are not going to have that much information: just words in English and their other-language equivalents. This is for a language that is not yet on the Internet. It should be collaborative.

    Read the article

  • Stylecop 4.7.39.0 has been released

    - by TATWORTH
    StyleCop 4.7.39.0 has been released at http://stylecop.codeplex.com/releases/view/79972

    The release notes follow:

      - Allow case sensitivity in the deprecated words and recognised words list
      - Styling fixes.
      - Fix for documentation spelling checks inside nested xml nodes.
      - Look for CustomDictionary.xml files in the folder of the cs file.
      - Update the TabIndex in the spelling tab.
      - Updating default deprecated words and their alternatives.
      - Add support for specifying dictionary folders in the settings.StyleCop file.
      - Rename StyleCopViolationError to StyleCopHighlightingError and all associated types.
      - Fix the Bulb Item for spelling mistakes to replace matching words correctly.
      - Fix the spelling parser for strings beginning with $$
      - THREADING FIX: Make StyleCop execute analysis in process and not create 2 threads. Use Countdown Event when we move to .NET 4.
      - Use the naming service for the Culture specified for the project. Pass the actual violation through to ReSharper.
      - Ensure Registry access code works for VS2008 addins.
      - Rollback Registry changes to ensure VS2008 plugin loads correctly.
      - Adding support for preferred alternative words for spelling. Adding deprecated word support into Settings.StyleCop file. Spelling is only checked if Office 2010 is installed. Allow editing of deprecated words and their alternatives in the Settings editor.
      - Adding new resource strings
      - Adding BulbItem and Quick fixes for spelling errors.
      - Moving StringExtensions to common area.
      - Styling fixes.
      - Report all spelling errors found on a line.
      - Start of 4.7.39.0 dev.

    Read the article

  • How to optimize Core Data query for full text search

    - by dk
    Can I optimize a Core Data query when searching for matching words in a text? (This question also pertains to the wisdom of custom SQL versus Core Data on an iPhone.)

    I'm working on a new (iPhone) app that is a handheld reference tool for a scientific database. The main interface is a standard searchable table view, and I want as-you-type response as the user types new words. Word matches must be prefixes of words in the text. The text is composed of 100,000s of words.

    In my prototype I coded SQL directly. I created a separate "words" table containing every word in the text fields of the main entity. I indexed words and performed searches along the lines of

        SELECT id, * FROM textTable
        JOIN (SELECT DISTINCT textTableId FROM words
              WHERE word BETWEEN 'foo' AND 'fooz')
        ON id=textTableId
        LIMIT 50

    This runs very fast. Using an IN would probably work just as well, i.e.

        SELECT * FROM textTable
        WHERE id IN (SELECT textTableId FROM words
                     WHERE word BETWEEN 'foo' AND 'fooz')
        LIMIT 50

    The LIMIT is crucial and allows me to display results quickly. I notify the user that there are too many to display if the limit is reached. This is kludgy.

    I've spent the last several days pondering the advantages of moving to Core Data, but I worry about the lack of control over the schema, indexing, and querying for an important query. Theoretically an NSPredicate of textField MATCHES '.*\bfoo.*' would just work, but I'm sure it will be slow. This sort of text search seems so common that I wonder what the usual approach is. Would you create a words entity as I did above and use a predicate of "word BEGINSWITH 'foo'"? Will that work as fast as my prototype? Will Core Data automatically create the right indexes? I can't find any explicit means of advising the persistent store about indexes.

    I see some nice advantages of Core Data in my iPhone app. The faulting and other memory considerations allow for efficient database retrievals for table view queries without setting arbitrary limits. The object graph management allows me to easily traverse entities without writing lots of SQL. Migration features will be nice in the future. On the other hand, in a limited-resource environment (iPhone) I worry that an automatically generated database will be bloated with metadata, unnecessary inverse relationships, inefficient attribute datatypes, etc. Should I dive in or proceed with caution?
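
    The prototype's prefix lookup can be tried outside the app with Python's built-in sqlite3 module. This is a rough sketch under assumptions: the body column, the index name, and the helper names are invented for the demo, and the upper bound is computed by bumping the last character of the prefix (fine for plain ASCII prefixes) instead of appending 'z' as in the BETWEEN version.

```python
import sqlite3


def prefix_upper_bound(prefix):
    """Smallest string that sorts after every string starting with prefix."""
    return prefix[:-1] + chr(ord(prefix[-1]) + 1)


def search(conn, prefix, limit=50):
    """Return up to limit rows whose text contains a word starting with prefix."""
    if not prefix:
        raise ValueError("prefix must not be empty")
    query = """
        SELECT id, body FROM textTable
        WHERE id IN (SELECT textTableId FROM words
                     WHERE word >= ? AND word < ?)
        ORDER BY id
        LIMIT ?
    """
    return conn.execute(query, (prefix, prefix_upper_bound(prefix), limit)).fetchall()


def build_demo():
    """Create a tiny in-memory database shaped like the prototype's two tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE textTable (id INTEGER PRIMARY KEY, body TEXT)")
    conn.execute("CREATE TABLE words (word TEXT, textTableId INTEGER)")
    conn.execute("CREATE INDEX words_word ON words(word)")  # lets the range scan use the index
    docs = {1: "food chain", 2: "football field", 3: "bar stool"}
    for doc_id, body in docs.items():
        conn.execute("INSERT INTO textTable VALUES (?, ?)", (doc_id, body))
        conn.executemany("INSERT INTO words VALUES (?, ?)",
                         [(w, doc_id) for w in body.split()])
    return conn


if __name__ == "__main__":
    print(search(build_demo(), "foo"))
```

    A half-open range (word >= 'foo' AND word < 'fop') catches every word with the prefix, whereas BETWEEN 'foo' AND 'fooz' misses words that sort after 'fooz', such as 'foozle'.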

    Read the article

  • How we can perform Action on Sequence UIButtons?

    - by Prince Shazad
    As my screenshot shows, I am working on a word-matching game. In this game I assign my words to different UIButtons in a specific sequence at different locations (my red arrow shows this sequence), and to the rest of the UIButtons I assign a random character (A-Z). When I click any UIButton, its title is appended to the UILabel in front of "Current Section:". I compare this UILabel's text to the text of the UILabels below it, which are in front of the timer. When it matches any of those UILabels, that label is deleted. I have implemented all of this already.

    My problem is the case shown by the black lines. Suppose the player finds the first word, "DOG". He clicks the first two UIButtons in sequence but does not press the third one in the sequence (as shown by the black line). When the player presses any UIButton that is not in the sequence, I want to remove the previous text (which is "DO") from the UILabel, so that the UILabel's text is now only "G".

    Here is my code to get the UIButton titles and assign them to the UILabel:

        - (void)aMethod:(id)sender
        {
            UIButton *button = (UIButton *)sender;
            NSString *get = (NSString *)[[button titleLabel] text];
            NSString *origText = mainlabel.text;
            mainlabel.text = [origText stringByAppendingString:get];

            if ([mainlabel.text length] == 3)
            {
                if ([mainlabel.text isEqualToString:a]) {
                    lbl.text = @"Right";
                    [btn1 removeFromSuperview];
                    score = score + 10;
                    lblscore.text = [NSString stringWithFormat:@"%d", score];
                    words = words - 1;
                    lblwords.text = [NSString stringWithFormat:@"%d", words];
                    mainlabel.text = @"";
                    a = @"tbbb";
                }
                else if ([mainlabel.text isEqualToString:c]) {
                    lbl.text = @"Right";
                    [btn2 removeFromSuperview];
                    score = score + 10;
                    lblscore.text = [NSString stringWithFormat:@"%d", score];
                    words = words - 1;
                    lblwords.text = [NSString stringWithFormat:@"%d", words];
                    mainlabel.text = @"";
                    c = @"yyyy";
                }
                else if ([mainlabel.text isEqualToString:d]) {
                    lbl.text = @"Right";
                    [btn3 removeFromSuperview];
                    score = score + 10;
                    lblscore.text = [NSString stringWithFormat:@"%d", score];
                    words = words - 1;
                    lblwords.text = [NSString stringWithFormat:@"%d", words];
                    mainlabel.text = @"";
                    d = @"yyyy";
                }
                else {
                    lbl.text = @"Wrong";
                    mainlabel.text = @"";
                }
            }
        }

    Thanks in advance

    Read the article

  • LINQ and ordering of the result set

    - by vik20000in
    After filtering and retrieving records, most of the time (if not always) we have to sort them in a certain order. The sort order is very important for displaying records or for major calculations. In LINQ, the orderby keyword is used for sorting data. With it we can decide the ordering of the result set that the query retrieves. Below is a simple example of the orderby keyword in LINQ.

        string[] words = { "cherry", "apple", "blueberry" };
        var sortedWords =
            from word in words
            orderby word
            select word;

    Here we are ordering the retrieved data by string ordering. If required, the ordering can also be based on any property of the individual items, such as the length of the string.

        var sortedWords =
            from word in words
            orderby word.Length
            select word;

    You can also make the order descending or ascending by adding the corresponding keyword after the sort key.

        var sortedWords =
            from word in words
            orderby word descending
            select word;

    But the best part of ordering in LINQ is that, instead of just passing a field, you can also pass the OrderBy method an instance of any class that implements the IComparer<T> interface. The interface holds a method, Compare, that has to be implemented. In that method we can write any logic whatsoever for the comparison. In the example below we compare strings while ignoring case.

        string[] words = { "aPPLE", "AbAcUs", "bRaNcH", "BlUeBeRrY", "cHeRry"};
        var sortedWords = words.OrderBy(a => a, new CaseInsensitiveComparer());

        public class CaseInsensitiveComparer : IComparer<string>
        {
            public int Compare(string x, string y)
            {
                return string.Compare(x, y, StringComparison.OrdinalIgnoreCase);
            }
        }

    Often while sorting we want to provide more than one sort key, so that the data is sorted on more than one condition. This can be achieved by listing the additional sort keys separated by commas.

        var sortedWords =
            from word in words
            orderby word, word.Length
            select word;

    We can also use the Reverse() method to reverse the full order of the result set. Note that it has to be applied to the whole query; calling it on word inside the select would reverse the characters of each string instead.

        var sortedWords =
            (from word in words
             select word).Reverse();

    Vikram

    Read the article

  • Announcing the New Windows Azure Web Sites Shared Scaling Tier

    - by Clint Edmonson
    Windows Azure Web Sites has added a new pricing tier that will solve the #1 blocker for the web development community. The shared tier now supports custom domain names mapped to shared-instance web sites. This post will outline the plan changes and elaborate on how the new pricing model makes Windows Azure Web Sites an even richer option for web development shops of all sizes.

                      Free                Shared                           Reserved
        # of Sites    10                  100                              100
        Egress        165MB/Day           5GB/Month Included               5GB/Month Included
        Storage       1GB                 1GB                              10GB
        Throttling    CPU/Memory/Egress   CPU/Memory                       Unlimited
        Price         Free                $.02/hr per site, per instance   $.08/hr per core

    Setting the Stage

    In June, we released the first public preview of Windows Azure Web Sites, which gave web developers a great platform on which to get web sites running using their web development framework of choice. PHP, Node.js, classic ASP, and ASP.NET developers can all utilize the Windows Azure platform to create and launch their web sites. Likewise, these developers have a series of data storage options using Windows Azure SQL Databases, MySQL, or Windows Azure Storage. The Windows Azure Web Sites free offer enabled startups to get their site up and running on Windows Azure with a minimal investment, and with multiple deployment and continuous integration features such as Git, Team Foundation Services, FTP, and Web Deploy.

    The response to the Windows Azure Web Sites offer has been overwhelmingly positive. Since the addition of the service on June 12th, tens of thousands of web sites have been deployed to Windows Azure and the volume of adoption is increasing every week.

    Preview Feedback

    In spite of the growth and success of the product, the community has had questions about features lacking in the free preview offer. The main question web developers asked regarding Windows Azure Web Sites relates to the free offer’s lack of support for domain name mapping. During the preview launch period, customer feedback made it obvious that the lack of domain name mapping support was an area of concern. We’re happy to announce that this #1 request has been delivered as a feature of the new shared plan.

    New Shared Tier Portal Features

    The “Scale” tab in the portal shows the new tiers (Free, Shared, and Reserved) and gives the user the ability to quickly move any of their free web sites into the shared tier. With a single mouse-click, the user can move their site into the shared tier. Once a site has been moved into the shared tier, a new Manage Domains button appears in the bottom action bar of the Windows Azure Portal, giving site owners the ability to manage the domain names for a shared site. This button brings up the domain-management dialog, which can be used to enter a specific domain name that will be mapped to the Windows Azure Web Site.

    Shared Tier Benefits

    Startups and large web agencies will both benefit from this plan change. Here are a few examples of scenarios which fit the new pricing model:

      - Startups no longer have to select the reserved plan to map domain names to their sites. Instead, they can use the free option to develop their sites and choose on a site-by-site basis which sites they elect to move into the shared plan, paying only for the sites that are finished and ready to be domain-mapped.
      - Agencies who manage dozens of sites will realize a lower cost of ownership over the long term by moving their sites into reserved mode. Once multi-site companies reach a certain price point in the shared tier, it is much more cost-effective to move sites to a reserved tier.

    Long-term, it’s easy to see how the new shared pricing tier makes Windows Azure Web Sites a great choice for both startups and agency customers, as it enables rapid growth and upgrades while keeping the cost to a minimum. Large agencies will be able to have all of their sites in their own instances, and startups will have the capability to scale up to multiple shared instances for minimal cost and eventually move to reserved instances without continually incurring additional costs. Customers can feel confident they have the power of the Microsoft Windows Azure brand and our world-class support, at prices competitive in the market. Plus, in addition to realizing the cost savings, they’ll have the whole family of Windows Azure features available.

    Continuous Deployment from GitHub and CodePlex

    Along with this new announcement are two other exciting new features. I’m proud to announce that web developers can now publish their web sites directly from CodePlex or GitHub.com repositories. Once connections are established between these services and your web sites, Windows Azure will automatically be notified every time a check-in occurs. This will then trigger Windows Azure to pull the source and compile/deploy the new version of your app to your web site automatically. Walk-through videos on how to perform these functions:

      - Publishing to an Azure Web Site from CodePlex
      - Publishing to an Azure Web Site from GitHub.com

    These changes, as well as the enhancements to the reserved plan model, make Windows Azure Web Sites a truly competitive hosting option. It’s never been easier or cheaper for a web developer to get up and running. Check out the free Windows Azure web site offering and see for yourself. Stay tuned to my twitter feed for Windows Azure announcements, updates, and links: @clinted

    Read the article

  • How can I estimate the entropy of a password?

    - by Wug
    Having read various resources about password strength, I'm trying to create an algorithm that provides a rough estimate of how much entropy a password has, and I want it to be as comprehensive as possible. At this point I only have pseudocode, but the algorithm covers the following:

      - password length
      - repeated characters
      - patterns (logical)
      - different character spaces (LC, UC, Numeric, Special, Extended)
      - dictionary attacks

    It does NOT cover the following, and SHOULD cover it WELL (though not perfectly):

      - ordering (passwords can be strictly ordered by output of this algorithm)
      - patterns (spatial)

    Can anyone provide some insight on what this algorithm might be weak to? Specifically, can anyone think of situations where feeding a password to the algorithm would OVERESTIMATE its strength? Underestimations are less of an issue.

    The algorithm:

        // the password to test
        password = ?
        length = length(password)

        // unique character counts from password (duplicates discarded)
        uqlca = number of unique lowercase alphabetic characters in password
        uquca = number of unique uppercase alphabetic characters
        uqd   = number of unique digits
        uqsp  = number of unique special characters (anything with a key on the keyboard)
        uqxc  = number of unique special special characters (alt codes, extended-ascii stuff)

        // algorithm parameters, total sizes of alphabet spaces
        Nlca = total possible number of lowercase letters (26)
        Nuca = total uppercase letters (26)
        Nd   = total digits (10)
        Nsp  = total special characters (32 or something)
        Nxc  = total extended ascii characters that don't fit into other categories (idk, 50?)

        // algorithm parameters, pw strength growth rates as percentages (per character)
        flca = entropy growth factor for lowercase letters (.25 is probably a good value)
        fuca = EGF for uppercase letters (.4 is probably good)
        fd   = EGF for digits (.4 is probably good)
        fsp  = EGF for special chars (.5 is probably good)
        fxc  = EGF for extended ascii chars (.75 is probably good)

        // repetition factors. few unique letters == low factor, many unique == high
        rflca = (1 - (1 - flca) ^ uqlca)
        rfuca = (1 - (1 - fuca) ^ uquca)
        rfd   = (1 - (1 - fd  ) ^ uqd  )
        rfsp  = (1 - (1 - fsp ) ^ uqsp )
        rfxc  = (1 - (1 - fxc ) ^ uqxc )

        // digit strengths
        strength = ( rflca * Nlca + rfuca * Nuca + rfd * Nd + rfsp * Nsp + rfxc * Nxc ) ^ length
        entropybits = log_base_2(strength)

    A few inputs and their desired and actual entropy_bits outputs:

        INPUT             DESIRED         ACTUAL
        aaa               very pathetic   8.1
        aaaaaaaaa         pathetic        24.7
        abcdefghi         weak            31.2
        H0ley$Mol3y_      strong          72.2
        s^fU¬5ü;y34G<     wtf             88.9
        [a^36]*           pathetic        97.2
        [a^20]A[a^15]*    strong          146.8
        xkcd1**           medium          79.3
        xkcd2**           wtf             160.5

        *  these 2 passwords use shortened notation, where [a^N] expands to N a's.
        ** xkcd1 = "Tr0ub4dor&3", xkcd2 = "correct horse battery staple"

    The algorithm does realize (correctly) that increasing the alphabet size (even by one character) vastly strengthens long passwords, as shown by the difference in entropy_bits between the 6th and 7th passwords, which both consist of 36 a's, except that the second's 21st a is capitalized. However, it does not account for the fact that a password of 36 a's is not a good idea: it's easily broken with a weak password cracker (and anyone who watches you type it will see it), and the algorithm doesn't reflect that. It does, however, reflect the fact that xkcd1 is a weak password compared to xkcd2, despite having greater complexity density (is this even a thing?).

    How can I improve this algorithm?

    Addendum 1

    Dictionary attacks and pattern-based attacks seem to be the big thing, so I'll take a stab at addressing those.

    I could perform a comprehensive search through the password for words from a word list and replace words with tokens unique to the words they represent. Word tokens would then be treated as characters with their own weight system, and would add their own weights to the password. I'd need a few new algorithm parameters (I'll call them lw, Nw ~= 2^11, fw ~= .5, and rfw) and I'd factor the weight into the password as I would any of the other weights.

    This word search could be specially modified to match both lowercase and uppercase letters as well as common character substitutions, like that of E with 3. If I didn't add extra weight to such matched words, the algorithm would underestimate their strength by a bit or two per word, which is OK. Otherwise, a general rule would be to give the word a bonus bit for each non-perfect character match.

    I could then perform simple pattern checks, such as searches for runs of repeated characters and derivative tests (take the difference between each character), which would identify patterns such as 'aaaaa' and '12345', and replace each detected pattern with a pattern token unique to the pattern and length. The algorithmic parameters (specifically, entropy per pattern) could be generated on the fly based on the pattern.

    At this point, I'd take the length of the password. Each word token and pattern token would count as one character; each token would replace the characters it symbolically represents.

    I made up some sort of pattern notation, but it includes the pattern length l, the pattern order o, and the base element b. This information could be used to compute some arbitrary weight for each pattern. I'd do something better in actual code.

    Modified example:

        Password:          1234kitty$$$$$herpderp
        Tokenized:         1 2 3 4 k i t t y $ $ $ $ $ h e r p d e r p
        Words Filtered:    1 2 3 4 @W5783 $ $ $ $ $ @W9001 @W9002
        Patterns Filtered: @P[l=4,o=1,b='1'] @W5783 @P[l=5,o=0,b='$'] @W9001 @W9002
        Breakdown:         3 small, unique words and 2 patterns
        Entropy:           about 45 bits, as per modified algorithm

        Password:          correcthorsebatterystaple
        Tokenized:         c o r r e c t h o r s e b a t t e r y s t a p l e
        Words Filtered:    @W6783 @W7923 @W1535 @W2285
        Breakdown:         4 small, unique words and no patterns
        Entropy:           43 bits, as per modified algorithm

    The exact semantics of how entropy is calculated from patterns is up for discussion. I was thinking something like:

        entropy(b) * l * (o + 1) // o will be either zero or one

    The modified algorithm would find flaws with and reduce the strength of each password in the original table, with the exception of s^fU¬5ü;y34G<, which contains no words or patterns.
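
    The base pseudocode (before the addendum) translates fairly directly into Python, which makes it easier to probe for overestimates. The sketch below is one reading of it, under assumptions: the class sizes and growth factors are the "probably good" values from the pseudocode, "special" is taken to mean ASCII punctuation plus the space character, and anything else non-alphanumeric counts as "extended". The names char_class and entropy_bits are invented here.

```python
import math
import string

# Alphabet sizes (N*) and growth factors (f*) taken from the pseudocode's suggested values.
SIZES = {"lower": 26, "upper": 26, "digit": 10, "special": 32, "extended": 50}
GROWTH = {"lower": 0.25, "upper": 0.4, "digit": 0.4, "special": 0.5, "extended": 0.75}


def char_class(ch):
    """Assign a single character to one of the five classes (assumed boundaries)."""
    if ch in string.ascii_lowercase:
        return "lower"
    if ch in string.ascii_uppercase:
        return "upper"
    if ch in string.digits:
        return "digit"
    if ch in string.punctuation or ch == " ":
        return "special"
    return "extended"


def entropy_bits(password):
    """Estimate entropy as length * log2(effective alphabet size)."""
    unique = {}
    for ch in set(password):  # duplicates discarded, as in the pseudocode
        cls = char_class(ch)
        unique[cls] = unique.get(cls, 0) + 1
    # Repetition factor per class: 1 - (1 - f) ^ unique_count, scaled by class size.
    alphabet = sum((1 - (1 - GROWTH[c]) ** n) * SIZES[c] for c, n in unique.items())
    if alphabet <= 0:
        return 0.0  # empty password
    # log2(alphabet ^ length) computed without building the huge power.
    return len(password) * math.log2(alphabet)


if __name__ == "__main__":
    for pw in ["aaa", "a" * 36, "a" * 20 + "A" + "a" * 15]:
        print(len(pw), round(entropy_bits(pw), 1))
```

    With these parameters the sketch gives about 8.1 bits for "aaa", 97.2 for 36 a's, and 146.8 once the 21st a is capitalized, matching those three rows of the table. That last jump is the kind of overestimate the question asks about: one changed character adds roughly 50 bits.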


  • To have efficient many-to-many relation in Java

    - by Masi
    How can you build an efficient many-to-many relation from fileID to words and from word to fileIDs in Java without database tools like Postgres? I have the following classes. The relation from fileID to words is cheap, but the reverse is not, since I need three for-loops for it. My solution is apparently not efficient. Another option may be to create an extra class that has the word as an ID together with an ArrayList of fileIDs.

    Reply to JacobM's answer

    The relevant part of MyFile's constructors is:

        /**
         * Synopsis of data in wordToWordCountInFile.txt:
         * fileID|wordID|wordCount
         *
         * Synopsis of the data in the file wordToWordID.txt:
         * word|wordID
         **/

        /**
         * Getting words by getting first wordIDs from wordToWordCountInFile.txt and then words in wordToWordID.txt.
         */
        InputStream in2 = new FileInputStream("/home/dev/wordToWordCountInFile.txt");
        BufferedReader fi2 = new BufferedReader(new InputStreamReader(in2));
        ArrayList<Integer> wordIDs = new ArrayList<Integer>();
        String line = null;
        while ((line = fi2.readLine()) != null) {
            if ((new Integer(line.split("|")[0]) == currentFileID)) {
                wordIDs.add(new Integer(line.split("|")[6]));
            }
        }
        in2.close();

        // Getting now the words by wordIDs.
        InputStream in3 = new FileInputStream("/home/dev/wordToWordID.txt");
        BufferedReader fi3 = new BufferedReader(new InputStreamReader(in3));
        line = null;
        while ((line = fi3.readLine()) != null) {
            for (Integer wordID : wordIDs) {
                if (wordID == (new Integer(line.split("|")[1]))) {
                    this.words.add(new Word(new String(line.split("|")[0]), fileID));
                    break;
                }
            }
        }
        in3.close();
        this.words.addAll(words);

    The constructor of Word is at the paste.
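
One common way to make both directions cheap is to keep two in-memory maps, one per direction, and fill them in a single pass over the data. The sketch below shows the idea in Python rather than Java; the function name `build_indexes` and the sample rows are hypothetical, and the only thing taken from the question is the two line formats (`word|wordID` and `fileID|wordID|wordCount`).

```python
from collections import defaultdict


def build_indexes(word_id_lines, count_lines):
    """Build fileID -> words and word -> fileIDs maps in one pass each.

    word_id_lines: iterable of "word|wordID" lines.
    count_lines:   iterable of "fileID|wordID|wordCount" lines.
    """
    # First resolve wordID -> word once, instead of rescanning per file.
    id_to_word = {}
    for line in word_id_lines:
        word, word_id = line.strip().split("|")
        id_to_word[int(word_id)] = word

    file_to_words = defaultdict(set)
    word_to_files = defaultdict(set)
    for line in count_lines:
        file_id, word_id, _count = line.strip().split("|")
        word = id_to_word[int(word_id)]
        # Record the relation in both directions at the same time.
        file_to_words[int(file_id)].add(word)
        word_to_files[word].add(int(file_id))
    return file_to_words, word_to_files


# Made-up sample data in the two formats described in the question.
words = ["apple|1", "pear|2"]
counts = ["10|1|3", "10|2|1", "11|1|7"]
file_to_words, word_to_files = build_indexes(words, counts)
print(sorted(word_to_files["apple"]))  # [10, 11]
```

In Java the same shape would presumably be two `HashMap`s whose values are collections, which trades extra memory for lookups that no longer need nested loops.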


  • char[] and char* compatibility?

    - by Aerovistae
    In essence, will this code work? And before you say "Run it and see!", I just realized my cygwin didn't come with gcc and it's currently 40 minutes away from completing reinstallation. That being said:

        char* words[1000];
        for(int i = 0; i<1000; i++)
            words[i] = NULL;

        char buffer[ 1024 ];
        //omit code that places "ADD splash\0" into the buffer
        if(strncmp (buffer, "ADD ", 4){
            char* temp = buffer + 4;
            printf("Adding: %s", temp);
            int i = 0;
            while(words[i] != NULL) i++;
            words[i] = temp;
        }

    I'm mostly uncertain about the line char* temp = buffer + 4, and also whether I can assign words[i] in the manner that I am. Am I going to get type errors when I eventually try to compile this in 40 minutes? Also-- if this works, why don't I need to use malloc() on each element of words[]? Why can I say words[i] = temp, instead of needing to allocate memory for words[i] the length of temp?


  • Java ArrayList remove dupes without sets

    - by Kieran
    I'm having problems removing duplicates from an ArrayList. It's for an assignment for college. Here's the code I have already:

        public int numberOfDiffWords() {
            ArrayList<String> list = new ArrayList<>();
            for(int i=0; i<words.size()-1; i++) {
                for(int j=i+1; j<words.size(); j++) {
                    if(words.get(i).equals(words.get(j))) {
                        // do nothing
                    } else {
                        list.add(words.get(i));
                    }
                }
            }
            return list.size();
        }

    The problem is in the numberOfDiffWords() method. The populate list method is working correctly, as my instructor has given me a sample string (containing 4465 words) to analyse - printing words.size() gives the correct result. I want to return the size of the new ArrayList with all duplicates removed. words is an ArrayList class attribute.

    UPDATE: I should have mentioned I'm only allowed to use dynamic indexed-based storage for this part of the assignment, which means no hash-based storage.
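
The nested loop above appends words.get(i) once for every later word that differs from it, so the list grows far beyond the number of distinct words. Two list-only alternatives are sketched below in Python (the function names are made up for illustration, and the Java translation is left as an assumption): the first checks the result list before appending, the second sorts a copy so that duplicates become adjacent.

```python
def number_of_diff_words(words):
    """Count distinct words using only an index-based list (no sets)."""
    distinct = []
    for word in words:
        # Linear scan of the result list; append only unseen words.
        if word not in distinct:
            distinct.append(word)
    return len(distinct)


def number_of_diff_words_sorted(words):
    """Same count, but sort a copy first so duplicates sit next to each other."""
    ordered = sorted(words)
    count = 0
    for i, word in enumerate(ordered):
        # A word is new if it is the first one or differs from its neighbour.
        if i == 0 or word != ordered[i - 1]:
            count += 1
    return count


sample = ["the", "cat", "the", "hat", "cat"]
print(number_of_diff_words(sample))         # 3
print(number_of_diff_words_sorted(sample))  # 3
```

The first version does quadratic work in the worst case; the sorted version is usually faster on long inputs, at the cost of losing the original word order.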


  • How to handle an incoming email and then send some words back using google-app-engine

    - by zjm1126
    I read the doc for sending mail:

        from google.appengine.api import mail

        mail.send_mail(sender="[email protected]",
                       to="Albert Johnson <[email protected]>",
                       subject="Your account has been approved",
                       body="""
        Dear Albert:

        Your example.com account has been approved. You can now visit
        http://www.example.com/ and sign in using your Google Account to
        access new features.

        Please let us know if you have any questions.

        The example.com Team
        """)

    I know how to send an email using GAE, but how do I check for an incoming email and then do something with it? Thanks.


  • How to transform phrases and words into MD5 hash?

    - by brilliant
    Can anyone please explain to me how to transform a phrase like "I want to buy some milk" into MD5? I read the Wikipedia article on MD5, but the explanation given there is beyond my comprehension:

    "MD5 processes a variable-length message into a fixed-length output of 128 bits. The input message is broken up into chunks of 512-bit blocks (sixteen 32-bit little endian integers)"

    "Sixteen 32-bit little endian integers" is already hard for me. I checked the article on little endian and didn't understand a bit. However, the examples of some phrases and their MD5 hashes are very nice:

        MD5("The quick brown fox jumps over the lazy dog") = 9e107d9d372bb6826bd81d3542a419d6
        MD5("The quick brown fox jumps over the lazy dog.") = e4d909c290d0fb1ca068ffaddf22cbd0

    Can anyone please explain to me how this MD5 algorithm works on a very simple example? Also, perhaps you know some software or code that would transform phrases into their MD5 hashes. If so, please let me know.
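
For the "software or code" part of the question, most languages ship an MD5 routine, so the algorithm does not have to be implemented by hand. A minimal sketch using Python's standard `hashlib` module is shown below; the helper name `md5_hex` and the choice of UTF-8 encoding are assumptions made for this illustration.

```python
import hashlib


def md5_hex(text):
    """Return the MD5 digest of text as 32 hexadecimal characters."""
    # MD5 works on bytes, so the phrase is encoded first (UTF-8 assumed here).
    return hashlib.md5(text.encode("utf-8")).hexdigest()


print(md5_hex("The quick brown fox jumps over the lazy dog"))
# 9e107d9d372bb6826bd81d3542a419d6
print(md5_hex("I want to buy some milk"))
```

The first call reproduces the hash quoted above for the same sentence, which is a quick way to confirm that the encoding matches the one used in the Wikipedia examples.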


  • Pulling out two separate words from a string using reg expressions?

    - by Marvin
    I need to improve on a regular expression I'm using. Currently, here it is:

        ^[a-zA-Z\s/-]+

    I'm using it to pull out medication names from a variety of formulation strings, for example:

        SULFAMETHOXAZOLE-TRIMETHOPRIM 200-40 MG/5ML PO SUSP
        AMOX TR/POTASSIUM CLAVULANATE 125 mg-31.25 mg ORAL TABLET, CHEWABLE
        AMOXICILLIN TRIHYDRATE 125 mg ORAL TABLET, CHEWABLE
        AMOX TR/POTASSIUM CLAVULANATE 125 mg-31.25 mg ORAL TABLET, CHEWABLE
        Amoxicillin 1000 MG / Clavulanate 62.5 MG Extended Release Tablet

    The resulting matches on these examples are:

        SULFAMETHOXAZOLE-TRIMETHOPRIM
        AMOX TR/POTASSIUM CLAVULANATE
        AMOXICILLIN TRIHYDRATE
        AMOX TR/POTASSIUM CLAVULANATE
        Amoxicillin

    The first four are what I want, but on the fifth, I really need "Amoxicillin / Clavulanate". How would I pull out patterns like "Amoxicillin / Clavulanate" (in the fifth row) while skipping patterns like "MG/5ML" (in the first row)?
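
One possible approach, sketched here in Python, is to split each string on a slash that has whitespace on both sides and then apply the original expression to every piece. This relies on an assumption drawn only from the five sample rows: a slash between two separately dosed ingredients is surrounded by spaces, while "MG/5ML" and "TR/POTASSIUM" are not. The function name `medication_name` is made up for the example.

```python
import re

# The original expression, applied to each segment separately.
NAME = re.compile(r"^[a-zA-Z\s/-]+")
# Assumed ingredient separator: a slash with whitespace on both sides.
SEPARATOR = re.compile(r"\s+/\s+")


def medication_name(formulation):
    """Join the leading name of every ' / '-separated segment."""
    parts = []
    for segment in SEPARATOR.split(formulation):
        match = NAME.match(segment)
        if match and match.group().strip():
            parts.append(match.group().strip())
    return " / ".join(parts)


print(medication_name(
    "Amoxicillin 1000 MG / Clavulanate 62.5 MG Extended Release Tablet"))
# Amoxicillin / Clavulanate
print(medication_name("SULFAMETHOXAZOLE-TRIMETHOPRIM 200-40 MG/5ML PO SUSP"))
# SULFAMETHOXAZOLE-TRIMETHOPRIM
```

Strings that put spaces around a slash for some other reason would be split too, so the separator rule should be checked against the real data before relying on it.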


  • Is it possible to make an OCR in Python to check words...

    - by Shady
    in opened applications? I want to automate Firefox on some web page, and I don't have a way to know whether the page has already loaded completely or is still loading. I was thinking about using OCR to check the status bar. Is that difficult? For example, when the word DONE appears in the status bar, the program continues to the next command.


  • Fastest way to put contents of Set<String> to a single String with words separated by a whitespace?

    - by Lars Andren
    I have a few Set<String>s and want to transform each of these into a single String where each element of the original Set is separated by a whitespace " ". A naive first approach is doing it like this:

        Set<String> set_1;
        Set<String> set_2;

        StringBuilder builder = new StringBuilder();
        for (String str : set_1) {
            builder.append(str).append(" ");
        }
        this.string_1 = builder.toString();

        builder = new StringBuilder();
        for (String str : set_2) {
            builder.append(str).append(" ");
        }
        this.string_2 = builder.toString();

    Can anyone think of a faster, prettier or more efficient way to do this?
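
The repeated builder loop can usually be replaced by a single join call, which also avoids the trailing space that the naive loop leaves at the end. The idea is shown below in Python as a rough sketch (the helper name `join_words` is made up); on Java 8 or later, `String.join(" ", set_1)` should play the same role.

```python
def join_words(words):
    """Join the elements with single spaces and no trailing separator."""
    return " ".join(words)


# A set has no guaranteed iteration order, so sort first if order matters.
set_1 = {"alpha", "beta", "gamma"}
print(join_words(sorted(set_1)))  # alpha beta gamma
```

Whether this is measurably faster than the explicit loop is not something the sketch demonstrates; the main gain is less repeated code.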


  • A PHP Library / Class to Count Words in Various Languages?

    - by Michael Robinson
    Some time in the near future I will need to implement a cross-language word count, or if that is not possible, a cross-language character count. I'd love it if I just had to look at English, but I need to consider every language here, Chinese, Korean, English, Arabic, Hindi, and so on. I would like to know if Stack Overflow has any leads on where to start looking for an existing product / method to do this in PHP, as I am a good lazy programmer* *http://blogoscoped.com/archive/2005-08-24-n14.html

