Search Results

Search found 100611 results on 4025 pages for 'code search'.

Page 51/4025 | < Previous Page | 47 48 49 50 51 52 53 54 55 56 57 58 | Next Page >

Binary Search Tree in Java

- by John R

I want to make a generic BST, that can be made up of any data type, but i'm not sure how I could add things to the tree, if my BST is generic. All of my needed code is below. I want my BST made up of Locations, and sorted by the x variable. Any help is appreciated. Major thanks for looking. public void add(E element) { if (root == null) root = element; if (element < root) add(element, root.leftChild); if (element > root) add(element, root.rightChild); else System.out.println("Element Already Exists"); } private void add(E element, E currLoc) { if (currLoc == null) currLoc = element; if (element < root) add(element, currLoc.leftChild); if (element > root) add(element, currLoc.rightChild); else System.out.println("Element Already Exists); } Other Code public class BinaryNode<E> { E BinaryNode; BinaryNode nextBinaryNode; BinaryNode prevBinaryNode; public BinaryNode() { BinaryNode = null; nextBinaryNode = null; prevBinaryNode = null; } } public class Location<AnyType> extends BinaryNode { String name; int x,y; public Location() { name = null; x = 0; y = 0; } public Location(String newName, int xCord, int yCord) { name = newName; x = xCord; y = yCord; } public int equals(Location otherScene) { return name.compareToIgnoreCase(otherScene.name); } }

Read the article
Faceted search with Solr on Windows

- by Dr.NETjes

With over 10 million hits a day, funda.nl is probably the largest ASP.NET website which uses Solr on a Windows platform. While all our data (i.e. real estate properties) is stored in SQL Server, we're using Solr 1.4.1 to return the faceted search results as fast as we can.And yes, Solr is very fast. We did do some heavy stress testing on our Solr service, which allowed us to do over 1,000 req/sec on a single 64-bits Solr instance; and that's including converting search-url's to Solr http-queries and deserializing Solr's result-XML back to .NET objects! Let me tell you about faceted search and how to integrate Solr in a .NET/Windows environment. I'll bet it's easier than you think :-) What is faceted search? Faceted search is the clustering of search results into categories, allowing users to drill into search results. By showing the number of hits for each facet category, users can easily see how many results match that category. If you're still a bit confused, this example from CNET explains it all: The SQL solution for faceted search Our ("pre-Solr") solution for faceted search was done by adding a lot of redundant columns to our SQL tables and doing a COUNT(...) for each of those columns: So if a user was searching for real estate properties in the city 'Amsterdam', our facet-query would be something like: SELECT COUNT(hasGarden), COUNT(has2Bathrooms), COUNT(has3Bathrooms), COUNT(etc...) FROM Houses WHERE city = 'Amsterdam' While this solution worked fine for a couple of years, it wasn't very easy for developers to add new facets. And also, performing COUNT's on all matched rows only performs well if you have a limited amount of rows in a table (i.e. less than a million). Enter Solr "Solr is an open source enterprise search server based on the Lucene Java search library, with XML/HTTP and JSON APIs, hit highlighting, faceted search, caching, replication, and a web administration interface." (quoted from Wikipedia's page on Solr) Solr isn't a database, it's more like a big index. Every time you upload data to Solr, it will analyze the data and create an inverted index from it (like the index-pages of a book). This way Solr can lookup data very quickly. To explain the inner workings of Solr is beyond the scope of this post, but if you want to learn more, please visit the Solr Wiki pages. Getting faceted search results from Solr is very easy; first let me show you how to send a http-query to Solr: http://localhost:8983/solr/select?q=city:Amsterdam This will return an XML document containing the search results (in this example only three houses in the city of Amsterdam): <response> <result name="response" numFound="3" start="0"> <doc> <long name="id">3203</long> <str name="city">Amsterdam</str> <str name="steet">Keizersgracht</str> <int name="numberOfBathrooms">2</int> </doc> <doc> <long name="id">3205</long> <str name="city">Amsterdam</str> <str name="steet">Vondelstraat</str> <int name="numberOfBathrooms">3</int> </doc> <doc> <long name="id">4293</long> <str name="city">Amsterdam</str> <str name="steet">Wibautstraat</str> <int name="numberOfBathrooms">2</int> </doc> </result> </response> By adding a facet-querypart for the field "numberOfBathrooms", Solr will return the facets for this particular field. We will see that there's one house in Amsterdam with three bathrooms and two houses with two bathrooms. http://localhost:8983/solr/select?q=city:Amsterdam&facet=true&facet.field=numberOfBathrooms The complete XML response from Solr now looks like: <response> <result name="response" numFound="3" start="0"> <doc> <long name="id">3203</long> <str name="city">Amsterdam</str> <str name="steet">Keizersgracht</str> <int name="numberOfBathrooms">2</int> </doc> <doc> <long name="id">3205</long> <str name="city">Amsterdam</str> <str name="steet">Vondelstraat</str> <int name="numberOfBathrooms">3</int> </doc> <doc> <long name="id">4293</long> <str name="city">Amsterdam</str> <str name="steet">Wibautstraat</str> <int name="numberOfBathrooms">2</int> </doc> </result> <lst name="facet_fields"> <lst name="numberOfBathrooms"> <int name="2">2</int> <int name="3">1</int> </lst> </lst> </response> Trying Solr for yourself To run Solr on your local machine and experiment with it, you should read the Solr tutorial. This tutorial really only takes 1 hour, in which you install Solr, upload sample data and get some query results. And yes, it works on Windows without a problem. Note that in the Solr tutorial, you're using Jetty as a Java Servlet Container (that's why you must start it using "java -jar start.jar"). In our environment we prefer to use Apache Tomcat to host Solr, which installs like a Windows service and works more like .NET developers expect. See the SolrTomcat page.Some best practices for running Solr on Windows: Use the 64-bits version of Tomcat. In our tests, this doubled the req/sec we were able to handle!Use a .NET XmlReader to convert Solr's XML output-stream to .NET objects. Don't use XPath; it won't scale well.Use filter queries ("fq" parameter) instead of the normal "q" parameter where possible. Filter queries are cached by Solr and will speed up Solr's response time (see FilterQueryGuidance)In my next post I’ll talk about how to keep Solr's indexed data in sync with the data in your SQL tables. Timestamps / rowversions will help you out here!

Read the article
Google tweets – Now search twitter archives using Google

- by samsudeen

Google has launched a Twitter archive service which allows you to search tweets in real time as well as on its huge public archive (remember Twitter crossed 10 billionth tweet last month). The search results are displayed as tweets with twitter logo. To explore the twitter search go to Google.com homepage and select “Show options” on the search results page, then select “Updates.”. The search is similar to the Google search with options to dig through the tweets by timeframe. You can explore results by zooming through a particular time range or date. In addition to the time chart, it also displays the relative volume of an activity on Twitter about the topic. as you can see there is a spike about GSLV launch after 3 PM today.There is also a short cut link “Now” on the left corner which displays the latest results on the topics searched.The tweets also gets refreshed automatically. Considering the huge volume of activity (50 million messages per day) on twitter, the archive is going to more and bigger. By providing such feature Google has once again proved it is way ahead of others in search Related Posts:None FoundJoin us on Facebook to read all our stories right inside your Facebook news feed.

Read the article
Google tweets – Now search twitter archives using Google

- by samsudeen

Google has launched a Twitter archive service which allows you to search tweets in real time as well as on its huge public archive (remember Twitter crossed 10 billionth tweet last month). The search results are displayed as tweets with twitter logo. To explore the twitter search go to Google.com homepage and select “Show options” on the search results page, then select “Updates.”. The search is similar to the Google search with options to dig through the tweets by timeframe. You can explore results by zooming through a particular time range or date. In addition to the time chart, it also displays the relative volume of an activity on Twitter about the topic. as you can see there is a spike about GSLV launch after 3 PM today.There is also a short cut link “Now” on the left corner which displays the latest results on the topics searched.The tweets also gets refreshed automatically. Considering the huge volume of activity (50 million messages per day) on twitter, the archive is going to more and bigger. By providing such feature Google has once again proved it is way ahead of others in search Related Posts:None FoundJoin us on Facebook to read all our stories right inside your Facebook news feed.

Read the article
Finding if a Binary Tree is a Binary Search Tree

- by dharam

Today I had an interview where I was asked to write a program which takes a Binary Tree and returns true if it is also a Binary Search Tree otherwise false. My Approach1: Perform an inroder traversal and store the elements in O(n) time. Now scan through the array/list of elements and check if element at ith index is greater than element at (i+1)th index. If such a condition is encountered, return false and break out of the loop. (This takes O(n) time). At the end return true. But this gentleman wanted me to provide an efficient solution. I tried but I was unsuccessfult, because to find if it is a BST I have to check each node. Moreover he was pointing me to think over recusrion. My Approach 2: A BT is a BST if for any node N N-left is < N and N-right N , and the INorder successor of left node of N is less than N and the inorder successor of right node of N is greater than N and the left and right subtrees are BSTs. But this is going to be complicated and running time doesn't seem to be good. Please help if you know any optimal solution. Thanks.

Read the article
Using the Static Code Analysis feature of Visual Studio (Premium/Ultimate) to find memory leakage problems

- by terje

Memory for managed code is handled by the garbage collector, but if you use any kind of unmanaged code, like native resources of any kind, open files, streams and window handles, your application may leak memory if these are not properly handled. To handle such resources the classes that own these in your application should implement the IDisposable interface, and preferably implement it according to the pattern described for that interface. When you suspect a memory leak, the immediate impulse would be to start up a memory profiler and start digging into that. However, before you follow that impulse, do a Static Code Analysis run with a ruleset tuned to finding possible memory leaks in your code. If you get any warnings from this, fix them before you go on with the profiling. How to use a ruleset In Visual Studio 2010 (Premium and Ultimate editions) you can define your own rulesets containing a list of Static Code Analysis checks. I have defined the memory checks as shown in the lists below as ruleset files, which can be downloaded – see bottom of this post. When you get them, you can easily attach them to every project in your solution using the Solution Properties dialog. Right click the solution, and choose Properties at the bottom, or use the Analyze menu and choose “Configure Code Analysis for Solution”: In this dialog you can now choose the Memorycheck ruleset for every project you want to investigate. Pressing Apply or Ok opens every project file and changes the projects code analysis ruleset to the one we have specified here. How to define your own ruleset (skip this if you just download my predefined rulesets) If you want to define the ruleset yourself, open the properties on any project, choose Code Analysis tab near the bottom, choose any ruleset in the drop box and press Open Clear out all the rules by selecting “Source Rule Sets” in the Group By box, and unselect the box Change the Group By box to ID, and select the checks you want to include from the lists below. Note that you can change the action for each check to either warning, error or none, none being the same as unchecking the check. Now go to the properties window and set a new name and description for your ruleset. Then save (File/Save as) the ruleset using the new name as its name, and use it for your projects as detailed above. It can also be wise to add the ruleset to your solution as a solution item. That way it’s there if you want to enable Code Analysis in some of your TFS builds. Running the code analysis In Visual Studio 2010 you can either do your code analysis project by project using the context menu in the solution explorer and choose “Run Code Analysis”, you can define a new solution configuration, call it for example Debug (Code Analysis), in for each project here enable the Enable Code Analysis on Build In Visual Studio Dev-11 it is all much simpler, just go to the Solution root in the Solution explorer, right click and choose “Run code analysis on solution”. The ruleset checks The following list is the essential and critical memory checks. CheckID Message Can be ignored ? Link to description with fix suggestions CA1001 Types that own disposable fields should be disposable No http://msdn.microsoft.com/en-us/library/ms182172.aspx CA1049 Types that own native resources should be disposable Only if the pointers assumed to point to unmanaged resources point to something else http://msdn.microsoft.com/en-us/library/ms182173.aspx CA1063 Implement IDisposable correctly No http://msdn.microsoft.com/en-us/library/ms244737.aspx CA2000 Dispose objects before losing scope No http://msdn.microsoft.com/en-us/library/ms182289.aspx CA2115 1 Call GC.KeepAlive when using native resources See description http://msdn.microsoft.com/en-us/library/ms182300.aspx CA2213 Disposable fields should be disposed If you are not responsible for release, of if Dispose occurs at deeper level http://msdn.microsoft.com/en-us/library/ms182328.aspx CA2215 Dispose methods should call base class dispose Only if call to base happens at deeper calling level http://msdn.microsoft.com/en-us/library/ms182330.aspx CA2216 Disposable types should declare a finalizer Only if type does not implement IDisposable for the purpose of releasing unmanaged resources http://msdn.microsoft.com/en-us/library/ms182329.aspx CA2220 Finalizers should call base class finalizers No http://msdn.microsoft.com/en-us/library/ms182341.aspx Notes: 1) Does not result in memory leak, but may cause the application to crash The list below is a set of optional checks that may be enabled for your ruleset, because the issues these points too often happen as a result of attempting to fix up the warnings from the first set. ID Message Type of fault Can be ignored ? Link to description with fix suggestions CA1060 Move P/invokes to NativeMethods class Security No http://msdn.microsoft.com/en-us/library/ms182161.aspx CA1816 Call GC.SuppressFinalize correctly Performance Sometimes, see description http://msdn.microsoft.com/en-us/library/ms182269.aspx CA1821 Remove empty finalizers Performance No http://msdn.microsoft.com/en-us/library/bb264476.aspx CA2004 Remove calls to GC.KeepAlive Performance and maintainability Only if not technically correct to convert to SafeHandle http://msdn.microsoft.com/en-us/library/ms182293.aspx CA2006 Use SafeHandle to encapsulate native resources Security No http://msdn.microsoft.com/en-us/library/ms182294.aspx CA2202 Do not dispose of objects multiple times Exception (System.ObjectDisposedException) No http://msdn.microsoft.com/en-us/library/ms182334.aspx CA2205 Use managed equivalents of Win32 API Maintainability and complexity Only if the replace doesn’t provide needed functionality http://msdn.microsoft.com/en-us/library/ms182365.aspx CA2221 Finalizers should be protected Incorrect implementation, only possible in MSIL coding No http://msdn.microsoft.com/en-us/library/ms182340.aspx Downloadable ruleset definitions I have defined three rulesets, one called Inmeta.Memorycheck with the rules in the first list above, and Inmeta.Memorycheck.Optionals containing the rules in the second list, and the last one called Inmeta.Memorycheck.All containing the sum of the two first ones. All three rulesets can be found in the zip archive “Inmeta.Memorycheck” downloadable from here. Links to some other resources relevant to Static Code Analysis MSDN Magazine Article by Mickey Gousset on Static Code Analysis in VS2010 MSDN : Analyzing Managed Code Quality by Using Code Analysis, root of the documentation for this Preventing generated code from being analyzed using attributes Online training course on Using Code Analysis with VS2010 Blogpost by Tatham Oddie on custom code analysis rules How to write custom rules, from Microsoft Code Analysis Team Blog Microsoft Code Analysis Team Blog

Read the article
what is exact difference in java script tag like <% = some code %> and <# some code %>

- by Arvind Bhatt

what is exact difference in java script tag like <% = some code % and <# some code %

Read the article
which free code coverage tool is better cobertura or emma?

- by RN

personally I found cobertura is very easy to configure in maven and it generate decent looking report with little or more coloring scheme which I like personally than emma. So +1 for cobertura. what you guys have experienced ?

Read the article
Breadth First Vs Depth First

- by Ted Smith

When Traversing a Tree/Graph what is the difference between Breadth First and Depth first? Any coding or pseudocode examples would be great.

Read the article
rails search nested set (categories and sub categories)

- by bob

Hello, I am using the http://github.com/collectiveidea/awesome_nested_set awesome nested set plugin and currently, if I choose a sub category as my category_id for an item, I can not search by its parent. Category.parent Category.Child I choose Category.child as the category that my item is in. So now my item has category_id of 4 stored in it. If I go to a page in my rails application, lets say teh Category page and I am on the Category.parent's page, I want to show products that have category_id's of all the descendants as well. So ideally i want to have a find method that can take into account the descendants. You can get the descendants of a root by calling root.descendants (a built in plugin method). How would I go about making it so I can query a find that gets the descendants of a root instead of what its doing now which is binging up nothing unless the product had a specific category_id of the Category.parent. I hope I am being clear here. I either need to figure out a way to create a find method or named_scope that can query and return an array of objects that have id's corresponding tot he descendants of a root OR if I have any other options, what are they? I thought about creating a field in my products table like parent_id which can keep track of the parent so i can then create two named scopes one finding the parent stuff and one finding the child stuff and chaining them. I know I can create a named scope for each child and chain them together for multiple children but this seems a very tedious process and also, if you add more children, you would need to specify more named scopes.

Read the article
Stand-alone Java code formatter/beautifier/pretty printer?

- by Greg Mattes

I'm interested in learning about the available choices of high-quality, stand-alone source code formatters for Java. The formatter must be stand-alone, that is, it must support a "batch" mode that is decoupled from any particular development environment. Ideally it should be independent of any particular operating system as well. So, a built-in formatter for the IDE du jour is of little interest here (unless that IDE supports batch mode formatter invocation, perhaps from the command line). A formatter written in closed-source C/C++ that only runs on, say, Windows is not ideal, but is somewhat interesting. To be clear, a "formatter" (or "beautifier") is not the same as a "style checker." A formatter accepts source code as input, applies styling rules, and produces styled source code that is semantically equivalent to the original source code. A style checker also applies styling rules, but it simply reports rule violations without producing modified source code as output. So the picture looks like this: Formatter (produces modified source code that conforms to styling rules) Read Source Code → Apply Styling Rules → Write Styled Source Code Style Checker (does not produce modified source code) Read Source Code → Apply Styling Rules → Write Rule Violations Further Clarifications Solutions must be highly configurable. I want to be able to specify my own style, not simply select from a canned list. Also, I'm not looking for a general purpose pretty-printer written in Java that can pretty-print many things. I want to style Java code. I'm also not necessarily interested in a grand-unified formatter for many languages. I suppose it might be nice for a solution to have support for languages other than Java, but that is not a requirement. Furthermore, tools that only perform code highlighting are right out. I'm also not interested in a web service. I want a tool that I can run locally. Finally, solutions need not be restricted to open source, public domain, shareware, free software, commercial, or anything else. All forms of licensing are acceptable.

Read the article
SQL Server Search Proper Names Full Text Index vs LIKE + SOUNDEX

- by Matthew Talbert

I have a database of names of people that has (currently) 35 million rows. I need to know what is the best method for quickly searching these names. The current system (not designed by me), simply has the first and last name columns indexed and uses "LIKE" queries with the additional option of using SOUNDEX (though I'm not sure this is actually used much). Performance has always been a problem with this system, and so currently the searches are limited to 200 results (which still takes too long to run). So, I have a few questions: Does full text index work well for proper names? If so, what is the best way to query proper names? (CONTAINS, FREETEXT, etc) Is there some other system (like Lucene.net) that would be better? Just for reference, I'm using Fluent NHibernate for data access, so methods that work will with that will be preferred. I'm using SQL Server 2008 currently. EDIT I want to add that I'm very interested in solutions that will deal with things like commonly misspelled names, eg 'smythe', 'smith', as well as first names, eg 'tomas', 'thomas'. Query Plan |--Parallelism(Gather Streams) |--Nested Loops(Inner Join, OUTER REFERENCES:([testdb].[dbo].[Test].[Id], [Expr1004]) OPTIMIZED WITH UNORDERED PREFETCH) |--Hash Match(Inner Join, HASH:([testdb].[dbo].[Test].[Id])=([testdb].[dbo].[Test].[Id])) | |--Bitmap(HASH:([testdb].[dbo].[Test].[Id]), DEFINE:([Bitmap1003])) | | |--Parallelism(Repartition Streams, Hash Partitioning, PARTITION COLUMNS:([testdb].[dbo].[Test].[Id])) | | |--Index Seek(OBJECT:([testdb].[dbo].[Test].[IX_Test_LastName]), SEEK:([testdb].[dbo].[Test].[LastName] >= 'WHITDþ' AND [testdb].[dbo].[Test].[LastName] < 'WHITF'), WHERE:([testdb].[dbo].[Test].[LastName] like 'WHITE%') ORDERED FORWARD) | |--Parallelism(Repartition Streams, Hash Partitioning, PARTITION COLUMNS:([testdb].[dbo].[Test].[Id])) | |--Index Seek(OBJECT:([testdb].[dbo].[Test].[IX_Test_FirstName]), SEEK:([testdb].[dbo].[Test].[FirstName] >= 'THOMARþ' AND [testdb].[dbo].[Test].[FirstName] < 'THOMAT'), WHERE:([testdb].[dbo].[Test].[FirstName] like 'THOMAS%' AND PROBE([Bitmap1003],[testdb].[dbo].[Test].[Id],N'[IN ROW]')) ORDERED FORWARD) |--Clustered Index Seek(OBJECT:([testdb].[dbo].[Test].[PK__TEST__3214EC073B95D2F1]), SEEK:([testdb].[dbo].[Test].[Id]=[testdb].[dbo].[Test].[Id]) LOOKUP ORDERED FORWARD) SQL for above: SELECT * FROM testdb.dbo.Test WHERE LastName LIKE 'WHITE%' AND FirstName LIKE 'THOMAS%' Based on advice from Mitch, I created an index like this: CREATE INDEX IX_Test_Name_DOB ON Test (LastName ASC, FirstName ASC, BirthDate ASC) INCLUDE (and here I list the other columns) My searches are now incredibly fast for my typical search (last, first, and birth date).

Read the article
Who is code wanderer?

- by DigiMortal

In every area of life there are people with some bad habits or misbehaviors that affect the work process. Software development is also not free of this kind of people. Today I will introduce you code wanderer. Who is code wanderer? Code wandering is more like bad habit than serious diagnose. Code wanderers tend to review and “fix” source code in files written by others. When code wanderer has some free moments he starts to open the code files he or she has never seen before and starts making little fixes to these files. Why is code wanderer dangerous? These fixes seem correct and are usually first choice to do when considering nice code. But as changes are made by coder who has no idea about the code he or she “fixes” then “fixing” usually ends up with messing up working code written by others. Often these “fixes” are not found immediately because they doesn’t introduce errors detected by compilers. So these “fixes” find easily way to production environments because there is also very good chance that “fixed” code goes through all tests without any problems. How to stop code wanderer? The first thing is to talk with person and explain him or her why those changes are dangerous. It is also good to establish rules that state clearly why, when and how can somebody change the code written by other people. If this does not work it is possible to isolate this person so he or she can post his or her changes to code repository as patches and somebody reviews those changes before applying them.

Read the article
What would you add to Code Complete 3rd Edition?

- by Peter Turner

It's been quite a few years since Code Complete was published. I really love the book, I keep it in the bathroom at the office and read a little out of it once or twice a day. I was just wondering for the sake of wonderment, what kinds of things need to be added to Code Complete 3e, and for the sake of reductionism, what kinds of things would be removed. Also, what languages would you use for code examples?

Read the article
How to facilitate code reviews in a small team for embedded software?

- by Adam Lewis

Short Question Does a cost-effective tool / workflow exist to facilitate code reviews in a small team? More specifically, a small team that relies on post-commit code reviews. Background Our team currently consists of 3 full time and 1 part time software engineers, with plans on hiring more in the near future. Due to our team size and volume of projects we all must juggle, the pre-commit workflow that major tools (such as Review Board and Code Collaborator) use is not obtainable for us right now. The best we can do at the moment is to perform post-commit reviews before major releases or as time permits. Nearly all of our projects are hosted on RepositoryHosting.com (which I highly recommend) and contain a mixture of SVN and GIT repositories. Current Thoughts Since I cannot find a tool that fits our needs right now, I am turning to TRAC that is built into our repository's site. At the moment we use TRAC to file tickets and track milestones, so to me this seems like a natural fit for code review results as well. The direction I am heading in right now is to use a spread sheet(s) to log all of the bugs and comments. Do some macro magic to get it in a format that I can use TRAC's import ticket method and use TRAC's ticketing system to create the action items / bug reports automatically. The auto ticket generation is darn near a must have, adding in bugs and comments one at a time from a web-gui is really painful. Secondary Question If this workflow makes sense, is there a good / standard template to use as a code review log?

Read the article
shell scripting: search/replace & check file exist

- by johndashen

I have a perl script (or any executable) E which will take a file foo.xml and write a file foo.txt. I use a Beowulf cluster to run E for a large number of XML files, but I'd like to write a simple job server script in shell (bash) which doesn't overwrite existing txt files. I'm currently doing something like #!/bin/sh PATTERN="[A-Z]*0[1-2][a-j]"; # this matches foo in all cases todo=`ls *.xml | grep $PATTERN -o`; isdone=`ls *.txt | grep $PATTERN -o`; whatsleft=todo - isdone; # what's the unix magic? #tack on the .xml prefix with sed or something #and then call the job server; jobserve E "$whatsleft"; and then I don't know how to get the difference between $todo and $isdone. I'd prefer using sort/uniq to something like a for loop with grep inside, but I'm not sure how to do it (pipes? temporary files?) As a bonus question, is there a way to do lookahead search in bash grep? To clarify: so the simplest way to do what i'm asking is (in pseudocode) for i in `/bin/ls *.xml` do replace xml suffix with txt if [that file exists] add to whatsleft list end done

Read the article
Python SQLite FTS3 alternatives?

- by Mike Cialowicz

Are there any good alternatives to SQLite + FTS3 for python? I'm iterating over a series of text documents, and would like to categorize them according to some text queries. For example, I might want to know if a document mentions the words "rating" or "upgraded" within three words of "buy." The FTS3 syntax for this query is the following: (rating OR upgraded) NEAR/3 buy That's all well and good, but if I use FTS3, this operation seems rather expensive. The process goes something like this: # create an SQLite3 db in memory conn = sqlite3.connect(':memory:') c = conn.cursor() c.execute('CREATE VIRTUAL TABLE fts USING FTS3(content TEXT)') conn.commit() Then, for each document, do something like this: #insert the document text into the fts table, so I can run a query c.execute('insert into fts(content) values (?)', content) conn.commit() # execute my FTS query here, look at the results, etc # remove the document text from the fts table before working on the next document c.execute('delete from fts') conn.commit() This seems rather expensive to me. The other problem I have with SQLite FTS is that it doesn't appear to work with Python 2.5.4. The 'CREATE VIRTUAL TABLE' syntax is unrecognized. This means that I'd have to upgrade to Python 2.6, which means re-testing numerous existing scripts and programs to make sure they work under 2.6. Is there a better way? Perhaps a different library? Something faster? Thank you.

Read the article
How to configure PDF iFilter for SharePoint Server 2010 or Search Server 2010

How to configure PDF iFilter for SharePoint Server 2010 or Search Server 2010

Read the article
What is the correct way to implement a massive hierarchical, geographical search for news?

- by Philip Brocoum

The company I work for is in the business of sending press releases. We want to make it possible for interested parties to search for press releases based on a number of criteria, the most important being location. For example, someone might search for all news sent to New York City, Massachusetts, or ZIP code 89134, sent from a governmental institution, under the topic of "traffic". Or whatever. The problem is, we've sent, literally, hundreds of thousands of press releases. Searching is slow and complex. For example, a press release sent to Queens, NY should show up in the search I mentioned above even though it wasn't specifically sent to New York City, because Queens is a subset of New York City. We may also want to implement "and" and "or" and negation and text search to the query to create complex searches. These searches also have to be fast enough to function as dynamic RSS feeds. I really don't know anything about search theory, or how it's properly done. The way we are getting by right now is using a data mart to store the locations the releases were sent to in a single table. However, because of the subset thing mentioned above, the data mart is gigantic with millions of rows. And we haven't even implemented cities yet, and there are about 50,000 cities in the United States, which will exponentially increase the size of the data mart by so much I'm afraid it just won't work anymore. Anyway, I realize this is not a simple question and there won't be a "do this" answer. However, I'm hoping one of you can point me in the right direction where I can learn about how massive searches are done? Because I really know nothing about it. And such a search engine is turning out to be incredibly difficult to make. Thanks! I know there must be a way because if Google can search the entire internet we must be able to search our own database :-)

Read the article
Can I have managed code within native code?

- by A-ha

Can I have managed code within native code?

Read the article
average case running time of linear search algorithm

- by Brahadeesh

Hi all. I am trying to derive the average case running time for deterministic linear search algorithm. The algorithm searches an element x in an unsorted array A in the order A[1], A[2], A[3]...A[n]. It stops when it finds the element x or proceeds until it reaches the end of the array. I searched on wikipedia and the answer given was (n+1)/(k+1) where k is the number of times x is present in the array. I approached in another way and am getting a different answer. Can anyone please give me the correct proof and also let me know whats wrong with my method? E(T)= 1*P(1) + 2*P(2) + 3*P(3) ....+ n*P(n) where P(i) is the probability that the algorithm runs for 'i' time (i.e. compares 'i' elements). P(i)= (n-i)C(k-1) * (n-k)! / n! Here, (n-i)C(k-1) is (n-i) Choose (k-1). As the algorithm has reached the ith step, the rest of k-1 x's must be in the last n-i elements. Hence (n-i)C(k-i). (n-k)! is the total number of ways of arranging the rest non x numbers, and n! is the total number of ways of arranging the n elements in the array. I am not getting (n+1)/(k+1) on simplifying.

Read the article
PHP: If no Results - Split the Searchrequest and Try to find Parts of the Search

- by elmaso

Hello, i want to split the searchrequest into parts, if there's nothing to find. example: "nelly furtado ft. jimmy jones" - no results - try to find with nelly, furtado, jimmy or jones.. i have an api url.. thats the difficult part.. i show you some of the actually snippets: $query = urlencode (strip_tags ($_GET[search])); and $found = '0'; if ($source == 'all') { if (!($res = @get_url ('http://api.example.com/?key=' . $API . '&phrase=' . $query . ' . '&sort=' . $sort))) { exit ('<error>Cannot get requested information.</error>'); ; } how can i put a else request in this snippet, like if nothing found take the first word, or the second word, is this possible? or maybe you can tell me were i can read stuff about this function? thank you!!

Read the article
Binary Search Tree - Postorder logic

- by daveb

I am looking at implementing code to work out binary search tree. Before I do this I was wanting to verify my input data in postorder and preorder. I am having trouble working out what the following numbers would be in postorder and preorder I have the following numbers 4, 3, 14 ,8 ,1, 15, 9, 5, 13, 10, 2, 7, 6, 12, 11, that I am intending to put into an empty binary tree in that order. The order I arrived at for the numbers in POSTORDER is 2, 1, 6, 3, 7, 11, 12, 10, 9, 8, 13, 15, 14, 4. Have I got this right? I was wondering if anyone here would be able to kindly verify if the postorder sequence I came up with is indeed the correct sequence for my input i.e doing left subtree, right subtree and then root. The order I got for pre order (Visit root, do left subtree, do right subtree) is 4, 3, 1, 2, 5, 6, 14 , 8, 7, 9, 10, 12, 11, 15, 13. I can't be certain I got this right. Very grateful for any verification. Many Thanks

Read the article
Who is code wanderer?

- by DigiMortal

In every area of life there are people with some bad habits or misbehaviors that affect the work process. Software development is also not free of this kind of people. Today I will introduce you code wanderer. Who is code wanderer? Code wandering is more like bad habit than serious diagnose. Code wanderers tend to review and “fix” source code in files written by others. When code wanderer has some free moments he starts to open the code files he or she has never seen before and starts making little fixes to these files. Why is code wanderer dangerous? These fixes seem correct and are usually first choice to do when considering nice code. But as changes are made by coder who has no idea about the code he or she “fixes” then “fixing” usually ends up with messing up working code written by others. Often these “fixes” are not found immediately because they doesn’t introduce errors detected by compilers. So these “fixes” find easily way to production environments because there is also very good chance that “fixed” code goes through all tests without any problems. How to stop code wanderer? The first thing is to talk with person and explain him or her why those changes are dangerous. It is also good to establish rules that state clearly why, when and how can somebody change the code written by other people. If this does not work it is possible to isolate this person so he or she can post his or her changes to code repository as patches and somebody reviews those changes before applying them.

Read the article
how do i filter my lucene search results?

- by Andrew Bullock

Say my requirement is "search for all users by name, who are over 18" If i were using SQL, i might write something like: Select * from [Users] Where ([firstname] like '%' + @searchTerm + '%' OR [lastname] like '%' + @searchTerm + '%') AND [age] >= 18 However, im having difficulty translating this into lucene.net. This is what i have so far: var parser = new MultiFieldQueryParser({ "firstname", "lastname"}, new StandardAnalyser()); var luceneQuery = parser.Parse(searchterm) var query = FullTextSession.CreateFullTextQuery(luceneQuery, typeof(User)); var results = query.List<User>(); How do i add in the "where age = 18" bit? I've heard about .SetFilter(), but this only accepts LuceneQueries, and not IQueries. If SetFilter is the right thing to use, how do I make the appropriate filter? If not, what do I use and how do i do it? Thanks! P.S. This is a vastly simplified version of what I'm trying to do for clarity, my WHERE clause is actually a lot more complicated than shown here. In reality i need to check if ids exist in subqueries and check a number of unindexed properties. Any solutions given need to support this. Thanks

Read the article

< Previous Page | 47 48 49 50 51 52 53 54 55 56 57 58 | Next Page >