Search Results

Search found 8611 results on 345 pages for 'apt fast'.

Page 299/345 | < Previous Page | 295 296 297 298 299 300 301 302 303 304 305 306  | Next Page >

  • Help with Neuroph neural network

    - by user359708
    For my graduate research I am creating a neural network that trains to recognize images. I am going much more complex than just taking a grid of RGB values, downsampling, and and sending them to the input of the network, like many examples do. I actually use over 100 independently trained neural networks that detect features, such as lines, shading patterns, etc. Much more like the human eye, and it works really well so far! The problem is I have quite a bit of training data. I show it over 100 examples of what a car looks like. Then 100 examples of what a person looks like. Then over 100 of what a dog looks like, etc. This is quite a bit of training data! Currently I am running at about one week to train the network. This is kind of killing my progress, as I need to adjust and retrain. I am using Neuroph, as the low-level neural network API. I am running a dual-quadcore machine(16 cores with hyperthreading), so this should be fast. My processor percent is at only 5%. Are there any tricks on Neuroph performance? Or Java peroformance in general? Suggestions? I am a cognitive psych doctoral student, and I am decent as a programmer, but do not know a great deal about performance programming.

    Read the article

  • Django LFS - custom views

    - by owca
    For all those ligthning fast shop users. I'm trying to implement my own first page view that will list all products from shop ( under '/' address). So I have a template : {% extends "lfs/shop/shop_base.html" %} {% block content %} <div id="najnowsze_produkty"> <ul> {% for obj in objects %} <li> {{ obj.name }} </li> {% endfor %} </ul> </div> {% endblock %} and then I've edited main shop view : from lfs.catalog.models import Category from lfs.catalog.models import Product def shop_view(request, template_name="lfs/shop/shop.html"): products = Product.objects.all() shop = lfs_get_object_or_404(Shop, pk=1) return render_to_response(template_name, RequestContext(request, { "shop" : shop, "products" : products })) but it just shows nothing. When I do Product.objects.all() query in shell I get results. Any ideas what could cause the problem ? Maybe I should filter products with 'active' status only ? But I'm not sure if it can influence all objects in any way.

    Read the article

  • WPF, C# - Making Intellisense/Autocomplete list, fastest way to filter list of strings

    - by user559548
    Hello everyone, I'm writing an Intellisense/Autocomplete like the one you find in Visual Studio. It's all fine up until when the list contains probably 2000+ items. I'm using a simple LINQ statement for doing the filtering: var filterCollection = from s in listCollection where s.FilterValue.IndexOf(currentWord, StringComparison.OrdinalIgnoreCase) >= 0 orderby s.FilterValue select s; I then assign this collection to a WPF Listbox's ItemSource, and that's the end of it, works fine. Noting that, the Listbox is also virtualised as well, so there will only be at most 7-8 visual elements in memory and in the visual tree. However the caveat right now is that, when the user types extremely fast in the richtextbox, and on every key up I execute the filtering + binding, there's this semi-race condition, or out of sync filtering, like the first key stroke's filtering could still be doing it's filtering or binding work, while the fourth key stroke is also doing the same. I know I could put in a delay before applying the filter, but I'm trying to achieve a seamless filtering much like the one in Visual Studio. I'm not sure where my problem exactly lies, so I'm also attributing it to IndexOf's string operation, or perhaps my list of string's could be optimised in some kind of index, that could speed up searching. Any suggestions of code samples are much welcomed. Thanks.

    Read the article

  • Python pixel manipulation library

    - by silinter
    So I'm going through the beginning stages of producing a game in Python, and I'm looking for a library that is able to manipulate pixels and blit them relatively fast. My first thought was pygame, as it deals in pure 2D surfaces, but it only allows pixel access through pygame.get_at(), pygame.set_at() and pygame.get_buffer(), all of which lock the surface each time they're called, making them slow to use. I can also use the PixelArray and surfarray classes, but they are locked for the duration of their lifetimes, and the only way to blit them to a surface is to either copy the pixels to a new surface, or use surfarray.blit_array, which requires creating a subsurface of the screen and blitting it to that, if the array is smaller than the screen (if it's bigger I can just use a slice of the array, which is no problem). I don't have much experience with PyOpenGL or Pyglet, but I'm wondering if there is a faster library for doing pixel manipulation in, or if there is a faster method, in Pygame, for doing pixel manupilation. I did some work with SDL and OpenGL in C, and I do like the idea of adding vertex/fragment shaders to my program. My program will chiefly be dealing in loading images and writing/reading to/from surfaces.

    Read the article

  • Building a News-feed that comprises posts "created by user's connections" && "on the topics user is following"

    - by aklin81
    I am working on a project of Questions & Answers website that allows a user to follow questions on certain topics from his network. A user's news-feed wall comprises of only those questions that have been posted by his connections and tagged on the topics that he is following(his expertise topics). I am confused what database's datamodel would be most fitting for such an application. The project needs to consider the future provisions for scalability and high performance issues. I have been looking at Cassandra and MySQL solutions as of now. After my study of Cassandra I realized that Simple news-feed design that shows all the posts from network would be easy to design using Cassandra by executing fast writes to all followers of a user about the post from user. But for my kind of application where there is an additional filter of 'followed topics', (ie, the user receives posts "created by his network" && "on topics user is following"), I could not convince myself with a good schema design in Cassandra. I hope if I missed something because of my short understanding of cassandra, perhaps, can you please help me out with your suggestions of how this news-feed could be implemented in Cassandra ? Looking for a great project with Cassandra ! Edit: There are going to be maximum 5 tags allowed for tagging the question (ie, max 5 topics can be tagged on a question).

    Read the article

  • Delphi, PGDac vs Zeos, Fetch, Lookup?

    - by durumdara
    Hi! I used Zeos to test to know: is ZTable uses fetch technics, or not? May in the future we migrate our lesser system to PGSQL, and this used now "Table" components (as BDE, but it have an SQL-like server). These tables use real cursors, a "Window" with N record, so lookup is very fast, because the Locate/Lookup is started on server, and only these N records are refreshed, no matter, how many records in the lookup table. PGSQL uses fetch technics as I know, and I tested it with a table (id int, name varchar(100)), and 1 million records. (I also trying this with mysql). The adapter is Zeos. ID, sec to find, allocated memory in bytes on client. MySQL 500000 2,761 113 196 344 1000000 3,214 225 471 232 313800 0,437 225 471 232 328066 0,468 225 471 232 276374 0,390 225 471 232 905984 1,264 225 471 232 260253 0,359 225 471 232 PGSQL 500000 3,042 113 188 184 1000000 3,744 225 463 064 313800 0,436 225 463 064 328066 0,452 225 463 064 276374 0,375 225 463 064 905984 1,295 225 463 064 260253 0,359 225 463 064 142023 0,203 225 463 064 As you see the records are fetched locally, this cause the 225 MB usage, and searches are slow a little, based where is the record we must find. I want to ask more things: a.) Is PGDAC have some technics to we can use the lookups without pay the fetch with memory and secs? b.) Or is PG ODBC driver can help in this problem with ADO? (As I know ADO can use server side cursors)? c.) Have anybody some experience with lookup tables, and performance? Is this critical question or it is not? (With client memory usage too). d.) If no chance to avoid fetch hell with lookups, what we can do? Server Side Joins, and unique code for Lookup field changing without real Lookup? Thanks for your help: dd

    Read the article

  • Most efficient way to check for DBNull and then assign to a variable?

    - by ilitirit
    This question comes up occasionally but I haven't seen a satisfactory answer. A typical pattern is (row is a DataRow): if (row["value"] != DBNull.Value) { someObject.Member = row["value"]; } My first question is which is more efficient (I've flipped the condition): row["value"] == DBNull.Value; // Or row["value"] is DBNull; // Or row["value"].GetType() == typeof(DBNull) // Or... any suggestions? This indicates that .GetType() should be faster, but maybe the compiler knows a few tricks I don't? Second question, is it worth caching the value of row["value"] or does the compiler optimize the indexer away anyway? eg. object valueHolder; if (DBNull.Value == (valueHolder = row["value"])) {} Disclaimers: row["value"] exists. I don't know the column index of the column (hence the column name lookup) I'm asking specifically about checking for DBNull and then assignment (not about premature optimization etc). Edit: I benchmarked a few scenarios (time in seconds, 10000000 trials): row["value"] == DBNull.Value: 00:00:01.5478995 row["value"] is DBNull: 00:00:01.6306578 row["value"].GetType() == typeof(DBNull): 00:00:02.0138757 Object.ReferenceEquals has the same performance as "==" The most interesting result? If you mismatch the name of the column by case (eg. "Value" instead of "value", it takes roughly ten times longer (for a string): row["Value"] == DBNull.Value: 00:00:12.2792374 The moral of the story seems to be that if you can't look up a column by it's index, then ensure that the column name you feed to the indexer matches the DataColumn's name exactly. Caching the value also appears to be nearly twice as fast: No Caching: 00:00:03.0996622 With Caching: 00:00:01.5659920 So the most efficient method seems to be: object temp; string variable; if (DBNull.Value != (temp = row["value"]) { variable = temp.ToString(); } This was a good learning experience.

    Read the article

  • How to trace the connection pool in a Java Web application - DBMS_APPLICATION_INFO

    - by Cleiton Garcia
    Hello, I need improve the traceability in a Web Application that usually run on fixed db user. The DBA should have a fast access for the information about the heavy users that are degrading the database. 5 years ago, I implemented a .NET ORM engine which makes a log of user and the server using the DBMS_APPLICATION_INFO package. Using a wrapper above the connection manager with the following code: DBMS_APPLICATION_INFO.SET_MODULE('" + User + " - " + appServerMachine + "',''); Each time that a connection get a connection from the pool, the package is executed to log the information in the V$SESSION. Has anyone discover or implemented a solution for this problem using the Toplink or Hibernate? Is there a default implementation for this problem? I found here a solutions as I implemented 5 years ago, but I'd like to know with anyone have a better solution and integrated with the ORM. http://stackoverflow.com/questions/53379/using-dbmsapplicationinfo-with-jboss My application is above Spring, the DAO are implemented with JPA (using hibernate) and actually running directly in Tomcat, with plans to (next year) migrate to SAP Netwevare Application Server. Thanks.

    Read the article

  • sybase - fails to use index unless string is hard-coded

    - by Garrett
    I'm using Sybase 12.5.3 (ASE); I'm new to Sybase though I've worked with MSSQL pretty extensively. I'm running into a scenario where a stored procedure is really very slow. I've traced the issue to a single SELECT stmt for a relatively large table. Modifying that statement dramatically improves the performance of the procedure (and reverting it drastically slows it down; i.e., the SELECT stmt is definitely the culprit). -- Sybase optimizes and uses multi-column index... fast!<br> SELECT ID,status,dateTime FROM myTable WHERE status in ('NEW','SENT') ORDER BY ID -- Sybase does not use index and does very slow table scan<br> SELECT ID,status,dateTime FROM myTable WHERE status in (select status from allowableStatusValues) ORDER BY ID The code above is an adapted/simplified version of the actual code. Note that I've already tried recompiling the procedure, updating statistics, etc. I have no idea why Sybase ASE would choose an index only when strings are hard-coded and choose a table scan when choosing from another table. Someone please give me a clue, and thank you in advance.

    Read the article

  • Trouble Shoot JavaScript Function in IE

    - by CreativeNotice
    So this function works fine in geko and webkit browsers, but not IE7. I've busted my brain trying to spot the issue. Anything stick out for you? Basic premise is you pass in a data object (in this case a response from jQuery's $.getJSON) we check for a response code, set the notification's class, append a layer and show it to the user. Then reverse the process after a time limit. function userNotice(data){ // change class based on error code returned var myClass = ''; if(data.code == 200){ myClass='success'; } else if(data.code == 400){ myClass='error'; } else{ myClass='notice'; } // create message html, add to DOM, FadeIn var myNotice = '<div id="notice" class="ajaxMsg '+myClass+'">'+data.msg+'</div>'; $("body").append(myNotice); $("#notice").fadeIn('fast'); // fadeout and remove from DOM after delay var t = setTimeout(function(){ $("#notice").fadeOut('slow',function(){ $(this).remove(); }); },5000); }

    Read the article

  • Sendkeys problem from .NET program

    - by user203123
    THe code below I copied from MSDN with a bit of modification: [DllImport("user32.dll", CharSet = CharSet.Unicode)] public static extern IntPtr FindWindow(string lpClassName,string lpWindowName); DllImport("User32")] public static extern bool SetForegroundWindow(IntPtr hWnd); int cnt = 0; private void button1_Click(object sender, EventArgs e) { IntPtr calculatorHandle = FindWindow("Notepad", "Untitled - Notepad"); if (calculatorHandle == IntPtr.Zero) { MessageBox.Show("Calculator is not running."); return; } SetForegroundWindow(calculatorHandle); SendKeys.SendWait(cnt.ToString()); SendKeys.SendWait("{ENTER}"); cnt++; SendKeys.Flush(); System.Threading.Thread.Sleep(1000); } The problem is the number sequence in Notepad is not continuously. The first click always results 0 (as expected). but from the second click, the result is unpredictable (but the sequence is still in order, e.g. 3, 4, 5, 10, 14, 15, ....) If I click the button fast enough, I was able to get the result in continuous order (0,1,2,3,4,....) but sometimes it produces more than 2 same numbers (e.g. 0,1,2,3,3,3,4,5,6,6,6,7,8,9,...)

    Read the article

  • Web Workers - Transferable Objects for JSON

    - by kclem06
    HTML 5 Web workers are very slow when using worker.postMessage on a large JSON object. I'm trying to figure out how to transfer a JSON Object to a web worker - using the 'Transferable Objects' types in Chrome, in order to increase the speed of this. Here is what I'm referring to and appears it should speed this up quite a bit: http://updates.html5rocks.com/2011/12/Transferable-Objects-Lightning-Fast I'm having trouble finding a good example of this (and I don't believe I want to use an ArrayBuffer). Any help would be appreciated. I'm imagining something like this: worker = new Worker('workers.js'); var large_json = {}; for(var i = 0; i < 20000; ++i){ large_json[i] = i; large_json["test" + i] = "string"; }; //How to make this call to use Transfer Objects? Takes approx 2 seconds to serialize this for me currently. worker.webkitPostMessage(large_json);

    Read the article

  • Dynamically created operators

    - by Gero
    I created a program using dev-cpp and wxwidgets which solves a puzzle. The user must fill the operations blocks and the results blocks, and the program will solve it. Im solving it using bruteforce, i generate all non repeated 9 length number combinations using a recursive algorithm. It does it pretty fast. Up to here all is great! But the problem is when my program operates depending the character on the blocks. Its extremely slow (it never gets the answer), because of the chars comparation against +, -, *, etc. Im doing a CASE. Is there some way or some programming language wich allows dinamic creation of operators? So i can define the operator ROW1COL2 to be a +, and the same way to all other operations. I leave a screenshot of the app, so its easier to understand how the puzzle works. http://www.imageshare.web.id/images/9gg5cev8vyokp8rhlot9.png PD: The algorithm works, i tryed it with a trivial puzzle, and solved it in a second.

    Read the article

  • How to recalculate all-pairs shorthest paths on-line if nodes are getting removed?

    - by Pavel Shved
    Latest news about underground bombing made me curious about the following problem. Assume we have a weighted undirected graph, nodes of which are sometimes removed. The problem is to re-calculate shortest paths between all pairs of nodes fast after such removals. With a simple modification of Floyd-Warshall algorithm we can calculate shortest paths between all pairs. These paths may be stored in a table, where shortest[i][j] contains the index of the next node on the shortest path between i and j (or NULL value if there's no path). The algorithm requires O(n³) time to build the table, and eacho query shortest(i,j) takes O(1). Unfortunately, we should re-run this algorithm after each removal. The other alternative is to run graph search on each query. This way each removal takes zero time to update an auxiliary structure (because there's none), but each query takes O(E) time. What algorithm can be used to "balance" the query and update time for all-pairs shortest-paths problem when nodes of the graph are being removed?

    Read the article

  • Algorithm for finding the best routes for food distribution in game

    - by Tautrimas
    Hello, I'm designing a city building game and got into a problem. Imagine Sierra's Caesar III game mechanics: you have many city districts with one market each. There are several granaries over the distance connected with a directed weighted graph. The difference: people (here cars) are units that form traffic jams (here goes the graph weights). Note: in Ceasar game series, people harvested food and stockpiled it in several big granaries, whereas many markets (small shops) took food from the granaries and delivered it to the citizens. The task: tell each district where they should be getting their food from while taking least time and minimizing congestions on the city's roads. Map example Sample diagram Suppose that yellow districts need 7, 7 and 4 apples accordingly. Bluish granaries have 7 and 11 apples accordingly. Suppose edges weights to be proportional to their length. Then, the solution should be something like the gray numbers indicated on the edges. Eg, first district gets 4 apples from the 1st and 3 apples from the 2nd granary, while the last district gets 4 apples from only the 2nd granary. Here, vertical roads are first occupied to the max, and then the remaining workers are sent to the diagonal paths. Question What practical and very fast algorithm should I use? I was looking at some papers (Congestion Games: Optimization in Competition etc.) describing congestion games, but could not get the big picture. Any help is very appreciated! P. S. I can post very little links and no images because of new user restriction.

    Read the article

  • How to make a product catalog in C#?

    - by Ervin
    I need to develop a product catalog (about 4000 products) application, which would be given to clients on CD or DVD. The catalog exists in webpage format using PHP and MySQL. IMPORTANT: the application is given to clients who maight have old PC, old System. For minimal requirements I would put Windows XP and Internet Explorer 6 (if needed). I need the following features: 1 search option (after productID AND after keyword) 2 print option (by selecting multiple products) 3 shopping cart (making a list which will be sent to an email address if there is any Internet Connection on the computer) When I was asked to do it I had 2 days to realise a very basic version, so I took the whole website and exported it in HTML pages, and developed an application in C# which contains an embeded browser. So the whole website is now static and put on a CD. Everything fine so far. Now here are the problems: 1. the search option was realized by parsing the html files and reading the productID or looking for keywords inside of them. Put on a CD it was extremely slow (searching in 600MB of html files). FOR THIS I WOULD NEED A SOLUTION WITH A STATIC DATABASE (USING ACCESS OR SOMETHING) TO HAVE INDEXED ROWS, SO THE SEARCH COULD BE A VERY FAST ONE. 2. the printing option was a simply call of the embeded Internet Explorer print functions. Here are two problems: a) user needs IE7 for printing the website scaled (FIT TO PAGE), otherwise the edges of the page are cut down. b) users of this app does not have even the basic PC usage skills, so they can't set the printing settings, so there will appear in header and footer the page numbers and titles. QUESTION: can I set these settings from CSS for printing? 3. couldn't make a a shopping cart as I don't use a database, so I have static websites and content is inside the HTML. QUESTION: WHICH ARE THE BEST SOLUTIONS FOR THE PROBLEMS DESCRIBED ABOVE? PLEASE ANSWER EVEN IF YOUR ANSWER IS FOR ONE QUESTION ONLY. THANKS

    Read the article

  • What issues to consider when rolling your own data-backend for Silverlight / AJAX on non-ASP.NET ser

    - by Edward Tanguay
    I have read-only Silverlight and AJAX apps which read static text and XML files from a PHP/Apache server, which works very nicely with features such as asynchronous loading, lazy-loading only what I need for each page, loading in the background, developed a little query language to get a PHP script to create custom XML files etc. it's pragmatic read-only REST, and all works fast and fine for read-only sites. Now I want to also add the ability to write data from these apps to a database on the same PHP/Apache server. For those of you who have built similar data-access layers, what do I need to consider while building this, especially regarding security so that not just any client can write and alter my database, e.g.: check HTTP_USER_AGENT for security check REMOTE_ADDR for security require a special code for security, perhaps a list of TAN codes (such as banks use for online transactions) each which can only be used once, both the client and server have these I wonder if there is some kind of standard REST query I should lean on for e.g. building SQL-like statements in the URL parameters, e.g. http://www.thedatalayersite.com/query?insertinto=customers&... Any thoughts, notes from experience, ideas, gotchas, especially ideas on tightening down security in this endeavor would be helpful.

    Read the article

  • How to improve performance of non-scalar aggregations on denormalized tables

    - by The Lazy DBA
    Suppose we have a denormalized table with about 80 columns, and grows at the rate of ~10 million rows (about 5GB) per month. We currently have 3 1/2 years of data (~400M rows, ~200GB). We create a clustered index to best suit retrieving data from the table on the following columns that serve as our primary key... [FileDate] ASC, [Region] ASC, [KeyValue1] ASC, [KeyValue2] ASC ... because when we query the table, we always have the entire primary key. So these queries always result in clustered index seeks and are therefore very fast, and fragmentation is kept to a minimum. However, we do have a situation where we want to get the most recent FileDate for every Region, typically for reports, i.e. SELECT [Region] , MAX([FileDate]) AS [FileDate] FROM HugeTable GROUP BY [Region] The "best" solution I can come up to this is to create a non-clustered index on Region. Although it means an additional insert on the table during loads, the hit isn't minimal (we load 4 times per day, so fewer than 100,000 additional index inserts per load). Since the table is also partitioned by FileDate, results to our query come back quickly enough (200ms or so), and that result set is cached until the next load. However I'm guessing that someone with more data warehousing experience might have a solution that's more optimal, as this, for some reason, doesn't "feel right".

    Read the article

  • High performance text file parsing in .net

    - by diamandiev
    Here is the situation: I am making a small prog to parse server log files. I tested it with a log file with several thousand requests (between 10000 - 20000 don't know exactly) What i have to do is to load the log text files into memory so that i can query them. This is taking the most resources. The methods that take the most cpu time are those (worst culprits first): string.split - splits the line values into a array of values string.contains - checking if the user agent contains a specific agent string. (determine browser ID) string.tolower - various purposes streamreader.readline - to read the log file line by line. string.startswith - determine if line is a column definition line or a line with values there were some others that i was able to replace. For example the dictionary getter was taking lots of resources too. Which i had not expected since its a dictionary and should have its keys indexed. I replaced it with a multidimensional array and saved some cpu time. Now i am running on a fast dual core and the total time it takes to load the file i mentioned is about 1 sec. Now this is really bad. Imagine a site that has tens of thousands of visits a day. It's going to take minutes to load the log file. So what are my alternatives? If any, cause i think this is just a .net limitation and i can't do much about it.

    Read the article

  • MS Access: Why is ADODB.Recordset.BatchUpdate so much slower than Application.ImportXML?

    - by apenwarr
    I'm trying to run the code below to insert a whole lot of records (from a file with a weird file format) into my Access 2003 database from VBA. After many, many experiments, this code is the fastest I've been able to come up with: it does 10000 records in about 15 seconds on my machine. At least 14.5 of those seconds (ie. almost all the time) is in the single call to UpdateBatch. I've read elsewhere that the JET engine doesn't support UpdateBatch. So maybe there's a better way to do it. Now, I would just think the JET engine is plain slow, but that can't be it. After generating the 'testy' table with the code below, I right clicked it, picked Export, and saved it as XML. Then I right clicked, picked Import, and reloaded the XML. Total time to import the XML file? Less than one second, ie. at least 15x faster. Surely there's an efficient way to insert data into Access that doesn't require writing a temp file? Sub TestBatchUpdate() CurrentDb.Execute "create table testy (x int, y int)" Dim rs As New ADODB.Recordset rs.CursorLocation = adUseServer rs.Open "testy", CurrentProject.AccessConnection, _ adOpenStatic, adLockBatchOptimistic, adCmdTableDirect Dim n, v n = Array(0, 1) v = Array(50, 55) Debug.Print "starting loop", Time For i = 1 To 10000 rs.AddNew n, v Next i Debug.Print "done loop", Time rs.UpdateBatch Debug.Print "done update", Time CurrentDb.Execute "drop table testy" End Sub I would be willing to resort to C/C++ if there's some API that would let me do fast inserts that way. But I can't seem to find it. It can't be that Application.ImportXML is using undocumented APIs, can it?

    Read the article

  • How to count term frequency for set of documents?

    - by ManBugra
    i have a Lucene-Index with following documents: doc1 := { caldari, jita, shield, planet } doc2 := { gallente, dodixie, armor, planet } doc3 := { amarr, laser, armor, planet } doc4 := { minmatar, rens, space } doc5 := { jove, space, secret, planet } so these 5 documents use 14 different terms: [ caldari, jita, shield, planet, gallente, dodixie, armor, amarr, laser, minmatar, rens, jove, space, secret ] the frequency of each term: [ 1, 1, 1, 4, 1, 1, 2, 1, 1, 1, 1, 1, 2, 1 ] for easy reading: [ caldari:1, jita:1, shield:1, planet:4, gallente:1, dodixie:1, armor:2, amarr:1, laser:1, minmatar:1, rens:1, jove:1, space:2, secret:1 ] What i do want to know now is, how to obtain the term frequency vector for a set of documents? for example: Set<Documents> docs := [ doc2, doc3 ] termFrequencies = magicFunction(docs); System.out.pring( termFrequencies ); would result in the ouput: [ caldari:0, jita:0, shield:0, planet:2, gallente:1, dodixie:1, armor:2, amarr:1, laser:1, minmatar:0, rens:0, jove:0, space:0, secret:0 ] remove all zeros: [ planet:2, gallente:1, dodixie:1, armor:2, amarr:1, laser:1 ] Notice, that the result vetor contains only the term frequencies of the set of documents. NOT the overall frequencies of the whole index! The term 'planet' is present 4 times in the whole index but the source set of documents only contains it 2 times. A naive implementation would be to just iterate over all documents in the docs set, create a map and count each term. But i need a solution that would also work with a document set size of 100.000 or 500.000. Is there a feature in Lucene i can use to obtain this term vector? If there is no such feature, how would a data structure look like someone can create at index time to obtain such a term vector easily and fast? I'm not that Lucene expert so i'am sorry if the solution is obvious or trivial.

    Read the article

  • Efficient way to create a large number of SharePoint folders

    - by BeraCim
    Hi all: I'm currently creating a large number of SharePoint folders within a list (e.g. ~800 folders), with each folder containing a different number of items. The way it is currently done is that it programmatically reads off the content types, items, event listeners and the likes off the same folder from another web, then creates the same folder in the current web. That ran reasonably fine and fast on a dev environment. However when it goes to an environment with WFEs and farms, it slowed down a lot. I have checked that there are no leaks in the code, and that the code follows SharePoint coding best practices. At the moment I'm looking at it at the code level. From your experience, are there any efficient ways of creating a large number of SharePoint folders, lists and items? EDIT: I'm currently using SharePoint API, but will be looking at moving to using Web Service in the future. I'm interested in looking at both options though. Code wise, its just the general reading of a folder and its content types plus items and their details, then create the same folder in the same list with the same content types, then copy over the items using patch update. I want to know whether there are more efficient ways of doing the above. Thanks.

    Read the article

  • Solr associations

    - by Tom
    Hi all, The last couple of days we are thinking of using Solr as our search engine of choice. Most of the features we need are out of the box or can be easily configured. There is however one feature that we absolutely need that seems to be well hidden (or missing) in Solr. I'll try to explain with an example. We have lots of documents that are actually businesses: <document> <name>Apache</name> <cat>1</cat> ... </document> <document> <name>McDonalds</name> <cat>2</cat> ... </document> In addition we have another xml file with all the categories and synonyms: <cat id=1> <name>software</name> <synonym>IT<synonym> </cat> <cat id=2> <name>fast food</name> <synonym>restaurant<synonym> </cat> We want to associate both businesses and categories so we can search using the name and/or synonyms of the category. But we do not want to merge these files at indexing time because we should update the categories (adding.remioving synonyms...) without indexing all the businesses again. Is there anything in Solr that does this kind of associations or do we need to develop some specific pieces? All feedback and suggestions are welcome. Thanks in advance, Tom

    Read the article

  • Any HTTP proxies with explicit, configurable support for request/response buffering and delayed conn

    - by Carlos Carrasco
    When dealing with mobile clients it is very common to have multisecond delays during the transmission of HTTP requests. If you are serving pages or services out of a prefork Apache the child processes will be tied up for seconds serving a single mobile client, even if your app server logic is done in 5ms. I am looking for a HTTP server, balancer or proxy server that supports the following: A request arrives to the proxy. The proxy starts buffering in RAM or in disk the request, including headers and POST/PUT bodies. The proxy DOES NOT open a connection to the backend server. This is probably the most important part. The proxy server stops buffering the request when: A size limit has been reached (say, 4KB), or The request has been received completely, headers and body Only now, with (part of) the request in memory, a connection is opened to the backend and the request is relayed. The backend sends back the response. Again the proxy server starts buffering it immediately (up to a more generous size, say 64KB.) Since the proxy has a big enough buffer the backend response is stored completely in the proxy server in a matter of miliseconds, and the backend process/thread is free to process more requests. The backend connection is immediately closed. The proxy sends back the response to the mobile client, as fast or as slow as it is capable of, without having a connection to the backend tying up resources. I am fairly sure you can do 4-6 with Squid, and nginx appears to support 1-3 (and looks like fairly unique in this respect). My question is: is there any proxy server that empathizes these buffering and not-opening-connections-until-ready capabilities? Maybe there is just a bit of Apache config-fu that makes this buffering behaviour trivial? Any of them that it is not a dinosaur like Squid and that supports a lean single-process, asynchronous, event-based execution model? (Siderant: I would be using nginx but it doesn't support chunked POST bodies, making it useless for serving stuff to mobile clients. Yes cheap 50$ handsets love chunked POSTs... sigh)

    Read the article

  • app burns numbers into iPad screens, how can I prevent this?

    - by Andrew Johnson
    EDIT: My code for this is actually open source, if anyone would be able to look and comment. Things I can think of that might be an issue: using a custom font, using bright green, updating the label too fast? The repo is: https://github.com/andrewljohnson/StopWatch-of-Gaia The class for the time label: https://github.com/andrewljohnson/StopWatch-of-Gaia/blob/master/src/SWPTimeLabel.m The class that runs the timer to update the label: https://github.com/andrewljohnson/StopWatch-of-Gaia/blob/master/src/SWPViewController.m ============= My StopWatch app reportedly screen burns a number of iPads, for temporary periods. Does anyone have a suggestion about how I might prevent this screen persistence? Some known workaround to blank the pixels occasionally? I get emails all the time about it, and you can see numerous reviews here: http://itunes.apple.com/us/app/stopwatch+-timer-for-gym-kitchen/id518178439?mt=8 Apple can not advise me. I sent an email to appreview, and I was told to file a technical support request (DTS). When I filled the DTS, they told me it was not a code issue, and when I further asked for help from DTS, a "senior manager" told me that this was not an issue Apple knew about. He further advised me to file a bug with the Apple Radar bug tracker if I considered it to be a real issue. I filed the Radar bug a few weeks ago, but it has not been acknowledged. Updated radar link for Apple employees, per commenter's notes rdar://12173447

    Read the article

< Previous Page | 295 296 297 298 299 300 301 302 303 304 305 306  | Next Page >