large teams - Page 385 - Developer IT

Hibernate 3.5.0 causes extreme performance problems

- by user303396

I recently upgraded from Hibernate 3.3.1.GA to Hibernate 3.5.0 and I'm seeing severe performance problems.

As a test, I added around 8,000 entities to my DB (which in turn causes other entities to be saved). These entities are saved in batches of 20 so that the transactions don't grow too large. With Hibernate 3.3.1.GA, all 8,000 entities are saved in about 3 minutes. With Hibernate 3.5.0, it starts out slower than 3.3.1 and then gets slower and slower. At around 4,000 entities, it sometimes takes 5 minutes just to save one batch of 20.

If I then go to a MySQL console and manually type in an insert statement from the MySQL general query log, about half of them run in 0.00 seconds. The other half take a long time (maybe 40 seconds) or time out with "ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction".

Has something changed in Hibernate's transaction management in version 3.5.0 that I should be aware of? The ONLY thing I changed was to replace the following Hibernate 3.3.1.GA jar files:

    com.springsource.org.hibernate-3.3.1.GA.jar
    com.springsource.org.hibernate.annotations-3.4.0.GA.jar
    com.springsource.org.hibernate.annotations.common-3.3.0.ga.jar
    com.springsource.javassist-3.3.0.ga.jar

with the new Hibernate 3.5.0 release hibernate3.jar and javassist-3.9.0.GA.jar. Thanks.

HTTP Compression problems on IIS7

- by Jonathan Wood

I've spent quite a bit of time on this but seem to be going nowhere. I have a large page that I really want to speed up. The obvious place to start seems to be HTTP compression, but I just can't get it to work.

After considerable searching, I've tried several variations of the code below. It kind of works, but after refreshing the browser, the results fall apart. When the page used caching, the output turned to garbage. If I turn off caching, the page itself looks right, but I lose my CSS formatting (stored in a separate file) and get an error that an included JS file contains invalid characters.

Most of the resources I've found on the Web are either very old or focus on configuring IIS directly. My page runs on a shared hosting account, and I do not have direct access to the IIS7 server it's running on.

    protected void Application_BeginRequest(object sender, EventArgs e)
    {
        // Implement HTTP compression
        if (Request["HTTP_X_MICROSOFTAJAX"] == null) // Avoid compressing AJAX calls
        {
            // Retrieve accepted encodings
            string encodings = Request.Headers.Get("Accept-Encoding");
            if (encodings != null)
            {
                // Verify support for gzip or deflate (gzip takes preference)
                encodings = encodings.ToLower();
                if (encodings.Contains("gzip") || encodings == "*")
                {
                    Response.Filter = new GZipStream(Response.Filter, CompressionMode.Compress);
                    Response.AppendHeader("Content-Encoding", "gzip");
                    Response.Cache.VaryByHeaders["Accept-encoding"] = true;
                }
                else if (encodings.Contains("deflate"))
                {
                    Response.Filter = new DeflateStream(Response.Filter, CompressionMode.Compress);
                    Response.AppendHeader("Content-Encoding", "deflate");
                    Response.Cache.VaryByHeaders["Accept-encoding"] = true;
                }
            }
        }
    }

Is anyone having better success with this?

Async task ASP.net HttpContext.Current.Items is empty - How do I handle this?

- by GuruC

We are running a very large web application in ASP.NET MVC on .NET 4.0. Recently we had an audit done, and the performance team says that there were a lot of null reference exceptions. So I started investigating from the dumps and the event viewer. My understanding is as follows.

We are using async Tasks in our controllers. We rely on the HttpContext.Current.Items hashtable to store a lot of application-level values.

    Task<Articles>.Factory.StartNew(() =>
    {
        System.Web.HttpContext.Current = ControllerContext.HttpContext.ApplicationInstance.Context;
        var service = new ArticlesService(page);
        return service.GetArticles();
    }).ContinueWith(t => SetResult(t, "articles"));

So we are copying the context object onto the new thread that is spawned by the Task factory. This context.Items is then used again in the thread wherever necessary. For example:

    public class SomeClass
    {
        internal static int StreamID
        {
            get
            {
                if (HttpContext.Current != null)
                {
                    return (int)HttpContext.Current.Items["StreamID"];
                }
                else
                {
                    return DEFAULT_STREAM_ID;
                }
            }
        }
    }

This runs fine as long as the number of parallel requests is not too high. My questions are as follows:

1. When the load increases and there are too many parallel requests, I notice that HttpContext.Current.Items is empty. I am not able to figure out a reason for this, and it causes all the null reference exceptions.
2. How do we make sure it is not empty? Is there any workaround?

NOTE: I read through StackOverflow, and people have questions like "HttpContext.Current is null", but in my case it is not null; it is empty. I also read an article where the author says that sometimes the request object is terminated, which may cause problems because Dispose has already been called on its objects. I am copying the Context object, but it's just a shallow copy and not a deep copy.

Problem with Boost::Asio for C++

- by Martin Lauridsen

Hi there,

For my bachelor's thesis, I am implementing a distributed version of an algorithm for factoring large integers (finding the prime factorisation). This has applications in e.g. the security of the RSA cryptosystem.

My vision is that clients (Linux or Windows) will download an application and compute some numbers (these are independent, and thus suited for parallelization). The numbers (which are not found very often) will be sent to a master server, which collects them. Once enough numbers have been collected, the master server will do the rest of the computation, which cannot be easily parallelized.

Anyhow, to the technicalities. I was thinking of using Boost::Asio to do a socket client/server implementation for the clients' communication with the master server. Since I want to compile for both Linux and Windows, I thought Windows would be as good a place to start as any. So I downloaded the Boost library and compiled it, as described on the Boost Getting Started page:

    bootstrap
    .\bjam

It all compiled just fine. Then I tried to compile one of the tutorial examples, client.cpp, from Asio, found (here.. edit: can't post link because of restrictions). I am using the Visual C++ compiler from Microsoft Visual Studio 2008, like this:

    cl /EHsc /I D:\Downloads\boost_1_42_0 client.cpp

But I get this error:

    /out:client.exe
    client.obj
    LINK : fatal error LNK1104: cannot open file 'libboost_system-vc90-mt-s-1_42.lib'

Does anyone have any idea what could be wrong, or how I could move forward? I have been trying pretty much all week to get a simple client/server socket program for C++ working, but with no luck. Serious frustration kicking in. Thank you in advance.

String or binary data would be truncated -- Heisenberg problem

- by harpo

When you get this error, the first thing you ask is, which column? Unfortunately, SQL Server is no help here. So you start doing trial and error. Well, right now I have a statement like:

    INSERT tbl (A, B, C, D, E, F, G)
    SELECT A, B * 2, C, D, E, q.F, G
    FROM tbl, othertable q
    WHERE etc etc

Note that some values are modified or linked in from another table, but most values are coming from the original table, so they can't really cause truncation going back to the same field (that I know of).

Eliminating fields one at a time eventually makes the error go away, if I do it cumulatively, but (and here's the kicker) it doesn't matter which fields I eliminate. It's as if SQL Server is objecting to the total length of the row, which I doubt, since there are only about 40 fields in all, and nothing large. Anyone ever seen this before? Thanks.

UPDATE: I have also done "horizontal" testing, by filtering the SELECT, with much the same result. In other words, if I say:

    WHERE id BETWEEN 1 AND 100: Error
    WHERE id BETWEEN 1 AND 50: No error
    WHERE id BETWEEN 50 AND 100: No error

I tried many combinations, and it cannot be limited to a single row.

Client-server synchronization pattern / algorithm?

- by tm_lv

I have a feeling that there must be client-server synchronization patterns out there, but I totally failed to google one up.

The situation is quite simple: the server is the central node, and multiple clients connect to it and manipulate the same data. The data can be split into atoms; in case of conflict, whatever is on the server has priority (to avoid dragging the user into conflict resolution). Partial synchronization is preferred due to potentially large amounts of data.

Are there any patterns or good practices for such a situation? Or, if you don't know of any, what would be your approach?

Below is how I currently think of solving it. Parallel to the data, a modification journal will be kept, with all transactions timestamped. When a client connects, it receives all changes since its last check, in consolidated form (the server goes through the lists and removes additions that are followed by deletions, merges updates for each atom, etc.). Et voila, we are up to date.

An alternative would be to keep a modification date for each record and, instead of actually deleting data, just mark records as deleted.

Any thoughts?
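The journal consolidation described in the question could be sketched roughly as follows. This is only an illustration under stated assumptions: the `Change` record, its field names, the integer timestamps, and the three operation names are all hypothetical, and the question does not say how the journal is actually stored.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Change:
    """One journal entry (hypothetical layout, not taken from the question)."""
    timestamp: int            # assumed: a monotonically increasing server clock
    atom_id: str
    op: str                   # assumed operations: "add", "update", "delete"
    value: Optional[Any] = None


def consolidate(journal, since):
    """Collapse all entries newer than `since` into at most one change per atom."""
    net = {}
    for change in sorted(journal, key=lambda c: c.timestamp):
        if change.timestamp <= since:
            continue  # the client has already seen this entry
        previous = net.get(change.atom_id)
        if previous is None:
            net[change.atom_id] = change
        elif previous.op == "add" and change.op == "delete":
            # Added and deleted between two syncs: the client never needs it.
            del net[change.atom_id]
        elif previous.op == "add" and change.op == "update":
            # Still new to the client, so keep it as an add with the latest value.
            net[change.atom_id] = Change(change.timestamp, change.atom_id, "add", change.value)
        elif previous.op == "delete" and change.op == "add":
            # The client still holds the old atom, so a re-add is an update for it.
            net[change.atom_id] = Change(change.timestamp, change.atom_id, "update", change.value)
        else:
            net[change.atom_id] = change  # later update or delete wins
    return sorted(net.values(), key=lambda c: c.timestamp)


if __name__ == "__main__":
    journal = [
        Change(1, "a", "add", 1),
        Change(2, "b", "add", 10),
        Change(3, "b", "delete"),
        Change(4, "a", "update", 2),
    ]
    for change in consolidate(journal, since=0):
        print(change)
```

A client would remember the largest timestamp it received and pass it as `since` on its next connection. The "mark as deleted" alternative from the question avoids the journal entirely, at the cost of keeping tombstone records in the data itself.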

How should I configure grub for booting linux kernel from a USB hard drive?

- by skolima

I have a laptop hard drive in an external enclosure which I use as a large pendrive. For an added twist, I have installed Linux on it, so I can boot any machine with my distribution of choice (e.g. for data recovery, repairing a b0rked system, or just using a borrowed laptop without destroying the preinstalled Windows).

The problem is that, depending on the hardware configuration, the USB hard drive may be visible under different paths. For the grub configuration I just use (hd0,0), as it is relative to the device grub was launched from. I have UUID entries in /etc/fstab. I also specify rootwait in the kernel parameters so that the kernel waits for the USB subsystem to settle down before trying to mount the device.

What should I pass to the kernel as root= ? Currently I boot from the pendrive once, check the debug messages to see which /dev/sdX device the kernel has assigned to the USB drive, then reboot and edit the grub configuration.

I can't change anything on the PC besides enabling "Boot from USB hard drive" in the BIOS and giving it higher priority than the internal hard drives. There are various initrd-generating scripts which include support for a UUID in the root device path; unfortunately, the Gentoo native one (genkernel) does not support rootwait, and I had no luck trying to use the others.

The boot process goes like this (it is quite similar in Windows):

1. The BIOS chooses the boot device and loads whatever is in its MBR (which happens to be grub stage-1).
2. Grub loads its configuration and stage-2 files from the device it has set as root, using (hd0) for the device it was loaded from by the BIOS.
3. Grub loads and starts a kernel (still the same numbering, so I can use (hd0,0) again).
4. The kernel initializes all built-in devices (rootwait does its magic now).
5. The kernel mounts the partition it was passed as root (this is a kernel parameter, not a grub parameter).
6. init.d starts the userland booting process, including mounting things from /etc/fstab.

Part 5 is the one giving me problems.

Bulletin board - Database optimisation

- by andrew

This question is a follow-on from this Question

The project and problem

The project I am currently working on is a bulletin board for a large non-profit organisation. The bulletin board will be used to allow inter-office communication within the organisation. I am building the application and have been having trouble extracting the results that I need from my database, because I don't think it is properly normalized and because of limitations in my knowledge of relational database theory and MySQL. I would appreciate input into the design of the board in general and, in particular, ways that the database structure can be improved to facilitate efficient queries and help me develop this application and future applications faster.

Business Logic

The bulletin board will be used in the following way.

Posting bulletins and responses to bulletins

Employees or 'users' in offices around the country will be able to post messages to the bulletin board. Bulletins must be posted to a location and categorised; I'll call these 'bulletins'. Users will be able to post any number of replies to any one bulletin, and users will be able to reply to their own bulletin; I'll call these 'replies'.

Rating bulletins and replies

Users will be able to either 'like' or 'dislike' a bulletin or a reply, and the total number of likes or dislikes will be shown for each bulletin or reply.

Viewing the bulletin board and responses

Bulletins can be displayed chronologically. Users can sort bulletins chronologically, or chronologically by the latest reply to that bulletin (let me know if you need more explanation). When a particular bulletin is selected, replies to that bulletin will be displayed chronologically.

@PerformanceDBA - edited 10:34 est 28/12/10

I have begun implementing the data model. I assume that the 6th data model is the physical model because it contains the associative tables. I am going to post any questions that I have below. I will put up a database dump once I am done. I will then put up a list of all the queries that I need to run on the database and begin writing them. I hope you had a good Christmas. I'm in Canada and there's snow!

Implementation of Physical model

Programming exercises in Java inheritance for intern

- by Tenner

I work for a small software development team, working primarily in Java, for a very large company. Our new intern showed up sight-unseen (not uncommon in my company). He has some C++ experience but no Java. Worse, he's never worked with inheritance in C++. Our code has a great deal of abstraction and a heavy reliance on inheritance. We need to get him up to speed as quickly as possible. Of course the rest of the team is busy, and so we can't take the time out of our day to teach a one-student 200-level CS course. Instead, I'd like to give him an actual programming project to work on which highlights how classes, interfaces, method overrides, etc. work. I've had him look at Project Euler, but most of the solutions end up being procedural, and not object-oriented programs. Do any of you have any somewhat-straightforward (and relatively quick) projects which you would give to an intern in this situation? Or, any recent (or current) students have a school project they'd be willing to share? Anyone else had this experience?

Need help implementing this algorithm with map Hadoop MapReduce

- by Julia

Hi all! I have an algorithm that goes through a large data set, reads some text files, and searches for specific terms in their lines. I have it implemented in Java, but I didn't want to post the code so that it doesn't look like I am searching for someone to implement it for me. It is true, though, that I really need a lot of help!!! This was not planned for my project, but the data set turned out to be huge, so my teacher told me I have to do it like this.

EDIT (I did not clarify this in the previous version): The data set I have is on a Hadoop cluster, and I should write a MapReduce implementation of the algorithm.

I was reading about MapReduce and thought that I would first do the standard implementation, and then it would be more or less easy to do it with MapReduce. But that didn't happen: the algorithm is quite stupid and nothing special, and MapReduce... I can't wrap my mind around it.

So here is short pseudocode of my algorithm:

    LIST termList   (there is a method that creates this list from a Lucene index)
    FOLDER topFolder

    INPUT topFolder
    IF it is a folder and not empty
        list files (there are 30 subfolders inside)
        FOR EACH subfolder
            GET file "CheckedFile.txt"
            analyze(CheckedFile)
        ENDFOR
    END IF

    Method ANALYZE(CheckedFile)
        read CheckedFile
        WHILE CheckedFile has next line
            GET line
            FOR (loops through termList)
                GET third word from line
                IF third word = term from list
                    append whole line to string buffer
                ENDIF
            ENDFOR
        END WHILE
        OUTPUT string buffer to file

Also, as you can see, each time "analyze" is called, a new file has to be created. As I understand it, writing many outputs is difficult with MapReduce???

I understand the MapReduce intuition, and my example seems perfectly suited for MapReduce, but when it comes to actually doing this, I obviously do not know enough and I am STUCK! Please, please help.
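One way to see how the pseudocode might split into a map step and a reduce step is the plain-Python sketch below. It deliberately uses no Hadoop API at all: `map_line`, `reduce_lines`, and the `source` label (standing in for the subfolder a line came from) are hypothetical names, and it assumes words are separated by whitespace and that the term list fits in memory as a set.

```python
def map_line(source, line, terms):
    """Map step: emit (source, line) when the third word of the line is a search term."""
    words = line.split()
    if len(words) >= 3 and words[2] in terms:
        yield source, line


def reduce_lines(pairs):
    """Reduce step: collect the matching lines of each source under one key."""
    grouped = {}
    for source, line in pairs:
        grouped.setdefault(source, []).append(line)
    return grouped


if __name__ == "__main__":
    terms = {"apple", "pear"}  # stands in for termList from the question
    records = [
        ("folder01", "x y apple z"),
        ("folder01", "x y kiwi z"),
        ("folder02", "x y pear"),
    ]
    emitted = []
    for source, line in records:
        emitted.extend(map_line(source, line, terms))
    print(reduce_lines(emitted))
```

Looking each third word up in a set replaces the inner loop over `termList`, and keying the output by source is one possible answer to the "one output per file" concern: each key's lines can be written to its own output.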

How do I create a class repository in Java and do I really need it?

- by Roman

I have a large number of objects which are identified by names (strings). So, I would like to have a kind of mapping from object names to class instances. I was told that in this situation I can use a "repository" class which works like this:

    Server myServer = ServerRepository.getServer("NameOfServer");

So, if there is already an object (server) with the name "NameOfServer", it will be returned by "getServer". If such an object does not exist yet, it will be created and returned by "getServer".

So, my question is: how do I program such a "repository" class? In this class I have to be able to check whether there is an instance of a given class that has a given value of a given field. How can I do that? Do I need some kind of loop over all existing objects of a given class?

The other part of my question is: why can't I use associative arrays (associative container, map, mapping, dictionary, finite map)? (I am not sure what they are called in Java.) In more detail, I have an "array" which maps names of objects to objects. So, whenever I create a new object, I add a new element to the array:

    myArray["NameOfServer"] = new Server("NameOfServer")
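The two halves of the question are closely related: a repository of this kind is usually little more than a map with a get-or-create method in front of it, so no loop over all existing objects is needed. Below is a minimal sketch in Python; the class and method names mirror the question but are otherwise hypothetical, and in Java the dictionary would correspond to something like a `HashMap<String, Server>`.

```python
class Server:
    def __init__(self, name):
        self.name = name


class ServerRepository:
    """Get-or-create registry keyed by server name."""

    _servers = {}  # name -> Server; the "associative array" from the question

    @classmethod
    def get_server(cls, name):
        server = cls._servers.get(name)
        if server is None:
            # First request for this name: create the instance and remember it.
            server = Server(name)
            cls._servers[name] = server
        return server


if __name__ == "__main__":
    first = ServerRepository.get_server("NameOfServer")
    second = ServerRepository.get_server("NameOfServer")
    print(first is second)  # both names refer to the same instance
```

The main thing the wrapper adds over a bare map is that creation happens in exactly one place, so callers cannot accidentally construct a second object with the same name. This sketch is not thread-safe; concurrent callers would need a lock around the lookup and insert.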

What is the easiest way to add compression to WCF in Silverlight?

- by caryden

I have a Silverlight 2 beta 2 application that accesses a WCF web service. Because of this, it currently can only use basicHttp binding. The web service will return fairly large amounts of XML data. This seems fairly wasteful from a bandwidth usage standpoint, as the response, if zipped, would be smaller by a factor of 5 (I actually pasted the response into a txt file and zipped it).

The request does have the "Accept-Encoding: gzip, deflate" header. Is there any way to have the WCF service gzip (or otherwise compress) the response? I did find this link, but it sure seems a bit complex for functionality that should be handled out of the box, IMHO.

OK - at first I marked the solution using System.IO.Compression as the answer, as I could never "seem" to get the IIS7 dynamic compression to work. Well, as it turns out:

Dynamic Compression on IIS7 was working all along. It is just that Nikhil's Web Developer Helper plugin for IE did not show it working. My guess is that since SL hands the web service call off to the browser, the browser handles it "under the covers" and Nikhil's tool never sees the compressed response. I was able to confirm this by using Fiddler, which monitors traffic external to the browser application. In Fiddler, the response was, in fact, gzip compressed!!

The other problem with the System.IO.Compression solution is that System.IO.Compression does not exist in the Silverlight CLR.

So from my perspective, the EASIEST way to enable WCF compression in Silverlight is to enable Dynamic Compression in IIS7 and write no code at all.

Changing Apache2.2.11 httpd.conf has no effect

- by Adrian

Hi,

Hopefully someone can help here. I recently installed WampServer 2.0 with Apache 2.2.11. My issue is that I have some large PHP scripts which time out at the default 5 minute (300 second) browser limit (I'm using IE8). It is critical that I get this limit extended. I have tried changing the httpd.conf file to include the following:

    TimeOut 1200

My objective was to set the timeout to 1200 seconds, or 20 minutes. I chose a random location for this directive within the httpd.conf file, as I cannot find any documentation suggesting it belongs in a specific place within the file. Regardless, the changes I make do appear in the httpd.conf file that can be opened from the WampServer system tray icon, but they have no effect: the browser still times out after 5 minutes. I thought perhaps I had the capitalisation wrong, so I changed it to:

    Timeout 1200

This change had no effect either. Can someone please help? This is very frustrating. Maybe the directive can only be used within a specific module? If so, I have no idea which one, nor do I know the syntax to specify this.

Regards
Adrian.

Where to store site settings: DB? XML? CONFIG? CLASS FILES?

- by Emin

I am rebuilding a news portal which already has a large number of visits every day. One of the major concerns when rebuilding this site was to maximize performance and speed. To that end, we have done many things, from caching to all sorts of other measures, to ensure speed. Now, towards the end of the project, I am facing a dilemma about where to store my site settings so that they affect performance the least.

The site settings will include things such as: Domain, DefaultImgPath, the Google Analytics code, default emails of editors, as well as more dynamic design/display settings such as the background color of specific DIVs, the default color for links, etc.

As far as I know, I have 4 choices for storing all this info:

1. Database: Storing general settings in the DB and caching them may be a solution. However, I want to limit access to the database to only the necessary and essential functions of the project, which generally are inserting/updating/deleting news items, author articles, etc.
2. XML: I can store these settings in an XML file, but I have not done this sort of thing before, so I don't know what kind of problems, if any, I might face in the future.
3. CONFIG: I can also store these settings in web.config.
4. CLASS FILE: I can hard-code all these settings in a SiteSettings class, but since the site admin himself will be able to edit these settings, it may not be the best solution.

Currently, I am closest to choosing web.config, but letting people fiddle with it too often is something I do not want. E.g. if I somehow miss a validation for something and it breaks the web.config, the whole site will go down.

My concern basically is that I cannot foresee the possible consequences of using any of the methods above (or is there any other?). I was hoping to put this question to more experienced people here who can hopefully help me make my decision.

Haskell lazy I/O and closing files

- by Jesse

I've written a small Haskell program to print the MD5 checksums of all files in the current directory (searched recursively). Basically a Haskell version of md5deep. All is fine and dandy except if the current directory has a very large number of files, in which case I get an error like:

    <program>: <currentFile>: openBinaryFile: resource exhausted (Too many open files)

It seems Haskell's laziness is causing it not to close files, even after the corresponding line of output has been completed. The relevant code is below. The function of interest is getList.

    import qualified Data.ByteString.Lazy as BS

    main :: IO ()
    main = putStr . unlines =<< getList "."

    getList :: FilePath -> IO [String]
    getList p =
        let getFileLine path = liftM (\c -> (hex $ hash $ BS.unpack c) ++ " " ++ path) (BS.readFile path)
        in mapM getFileLine =<< getRecursiveContents p

    hex :: [Word8] -> String
    hex = concatMap (\x -> printf "%0.2x" (toInteger x))

    getRecursiveContents :: FilePath -> IO [FilePath]
    -- ^ Just gets the paths to all the files in the given directory.

Are there any ideas on how I could solve this problem? The entire program is available here: http://haskell.pastebin.com/PAZm0Dcb

Parsing multiple files at a time in Perl

- by sfactor

I have a large data set (around 90GB) to work with. There are tab-delimited data files for each hour of each day, and I need to perform operations on the entire data set, for example getting the share of each OS, which is given in one of the columns. I tried merging all the files into one huge file and performing the simple count operation, but it was simply too large for the server's memory.

So, I guess I need to perform the operation one file at a time and then add up the results at the end. I am new to Perl and am especially naive about performance issues. How do I do such operations in a case like this?

As an example, two columns of the file are:

    ID    OS
    1     Windows
    2     Linux
    3     Windows
    4     Windows

Let's do something simple: counting the share of each OS in the data set. Each .txt file has millions of these lines, and there are many such files. What would be the most efficient way to operate on all of the files?
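The "one file at a time, then add up" idea can be sketched as below. The sketch is in Python rather than Perl, and it assumes each file starts with a header row naming the columns (such as `ID` and `OS`); the function names are hypothetical. Each file is read line by line, so only the running totals are held in memory, never a whole file.

```python
import csv
from collections import Counter


def count_column(paths, column="OS"):
    """Stream every file once and tally the values found in one column."""
    totals = Counter()
    for path in paths:
        with open(path, newline="") as handle:
            # DictReader takes the column names from the first line (assumed header).
            for row in csv.DictReader(handle, delimiter="\t"):
                totals[row[column]] += 1
    return totals


def shares(totals):
    """Convert absolute counts into fractions of the grand total."""
    grand_total = sum(totals.values())
    return {key: count / grand_total for key, count in totals.items()}


if __name__ == "__main__":
    import glob

    totals = count_column(sorted(glob.glob("*.txt")))
    if totals:
        for name, share in shares(totals).items():
            print(f"{name}\t{share:.2%}")
```

The same shape carries over to Perl: a hash of counts that is updated inside a `while` loop over each file handle, with the division done once after the last file.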

Fast search of 2 dimensional array

- by Tim

I need a method of quickly searching a large 2 dimensional array. I extract the array from Excel, so one dimension represents the rows and the other the columns. I wish to obtain a list of the rows where the columns match certain criteria. I need to know the row number (or index of the array).

For example, if I extract a range from Excel, I may need to find all rows where column A = "dog" and column B = 7 and column J = "a". I only know which columns and which values to look for at run time, so I can't hard-code the column index.

I could use a simple loop, but is this efficient? I need to run it several thousand times, searching for different criteria each time.

    For r As Integer = 0 To UBound(myArray, 0) - 1
        match = True
        For c = 0 To UBound(myArray, 1) - 1
            If Not doesValueMeetCriteria(myArray(r, c)) Then
                match = False
                Exit For
            End If
        Next
        If match Then addRowToMatchedRows(r)
    Next

The doesValueMeetCriteria function is a simple function that checks the value of the array element against the query requirement, e.g. column A = dog, etc.

Is it more efficient to create a DataTable from the array and use the .Select method? Can I use LINQ in some way? Perhaps some form of dictionary or hashtable? Or is the simple loop the most efficient? Your suggestions are most welcome.
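Since the same array is searched several thousand times, one option along the "dictionary or hashtable" line is to build an index once and answer each query with set intersections. The sketch below is in Python and rests on assumptions not stated in the question: every criterion is an exact equality test, cell values are hashable, and the names `build_index` and `find_rows` are hypothetical.

```python
from collections import defaultdict


def build_index(rows):
    """Map (column, value) to the set of row numbers holding that value in that column."""
    index = defaultdict(set)
    for r, row in enumerate(rows):
        for c, value in enumerate(row):
            index[(c, value)].add(r)
    return index


def find_rows(index, criteria):
    """Return the sorted row numbers that satisfy every column == value criterion.

    `criteria` maps a column number to the required value and is assumed
    to contain at least one entry.
    """
    matches = None
    for column, value in criteria.items():
        rows = index.get((column, value), set())
        matches = rows if matches is None else matches & rows
        if not matches:
            return []  # no row can satisfy the remaining criteria
    return sorted(matches) if matches is not None else []


if __name__ == "__main__":
    data = [
        ["dog", 7, "a"],
        ["cat", 7, "a"],
        ["dog", 7, "b"],
        ["dog", 7, "a"],
    ]
    index = build_index(data)
    print(find_rows(index, {0: "dog", 1: 7, 2: "a"}))
```

Building the index costs one pass over the array, after which each query touches only the rows that match at least one criterion. If the criteria include ranges or patterns rather than equality, the plain loop (or an index on just the equality columns, followed by a loop over the survivors) may still be the simpler choice.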

How to return ArrayList results from an IntentService

- by gcl1

I have an IntentService that loads up an ArrayList with data from a network source (AWS SDB tables). The ArrayList is in a global space, accessible to both the calling Activity and the IntentService (like this: appState = ((App)getApplicationContext())). When the IntentService is done, it notifies the Activity through a ResultReceiver, and the Activity calls adapter.notifyDataSetChanged() to update the ListView.

This solution works most of the time, but it violates the rule that only the UI thread should make changes to data underlying a ListView. So as it is, I sometimes get an error: "The content of the adapter has changed but ListView did not receive a notification."

I think this must be a common situation. Please let me know if you have any suggestions or best practices for this problem. Here are three options I'm aware of:

1. Keep the IntentService, and have it store the results in another "working" ArrayList, also in the global space. When the result is ready, the IntentService calls the ResultReceiver (on the UI thread), which can then: a) copy the result to the ArrayList associated with the ListView, and b) call adapter.notifyDataSetChanged(). CONS: I don't like the idea of putting temp/working data in a global space, and copying the result list seems inefficient.
2. Keep the IntentService, and have it pass the results back through a bundle loaded with a ParcelableArrayList. CONS: I'm not sure if this approach would scale for very large result sets. It also requires copying the result list.
3. Switch to a Service which builds a local copy of the result list. Have the Activity directly access the address space of the Service in order to read the result list. CON: Still requires copying results to the ArrayList associated with the ListView.

Thank you.

scrollLeft works but scrollTop doesn't work

- by Xiao Jia

I have the following HTML, with CSS .container { overflow: scroll; } etc.

    <body>
      <div class="container">
        <div id="markers"></div>
        <img id="map" src="/img/map.jpg"/>
      </div>
    </body>

map.jpg is very large and I want to scroll to a fixed position like this:

    $(function(){
        console.log($('.container').scrollTop());
        $('.container').scrollTop(1000);
        console.log($('.container').scrollTop());
        console.log($('.container').scrollLeft());
        $('.container').scrollLeft(1750);
        console.log($('.container').scrollLeft());
    });

scrollLeft works fine but scrollTop doesn't. Below is the console output.

    0
    0
    0
    1750

I've been searching for half an hour but still don't know how to fix it...

UPDATE: CSS for .container, #markers and #map

    .container {
        overflow: scroll;
        width: 100%;
        max-width: 100%;
        height: 100%;
        max-height: 100%;
        margin: 0;
        padding: 0;
    }

    #map {
        width: 5000px;
        max-width: 5000px;
        height: 2907px;
        max-height: 2907px;
        cursor: crosshair;
    }

    #markers {
        position: relative;
        top: 0;
        left: 0;
        width: 0;
        height: 0;
        margin: 0;
        padding: 0;
    }

What is the appropriate HTML 5 element for a hero unit/showcase?

- by deb

A lot of marketing and content-heavy sites showcase the page's primary content using large text and/or images, sometimes with a slider, containing a call to action for signing up for a service, downloading an app, etc.

I'm not sure what this design element is called; I got the term hero unit from Twitter Bootstrap: http://twitter.github.com/bootstrap/components.html#typography

I think most of you know what I'm trying to describe... If it's not clear, I can add screenshots or links to this question.

I looked at a few different sites, and some put this hero unit inside an ASIDE element, while others use SECTION, ARTICLE and even HEADER. Using Twitter Bootstrap as an example again:

    <header class="jumbotron masthead">
      <div class="inner">
        <h1>Bootstrap, from Twitter</h1>
        <p>Simple and flexible HTML, CSS, and Javascript for popular user interface components and interactions.</p>
        <p class="download-info">

Is HEADER the most appropriate tag for this type of content? Or should I use ASIDE, ARTICLE or SECTION?

N-gram split function for string similarity comparison

- by Michael

As part of an exercise to better understand F#, which I am currently learning, I wrote a function to split a given string into n-grams.

1) I would like to receive feedback on my function: can it be written in a simpler or more efficient way?

2) My overall goal is to write a function that returns string similarity (on a 0.0 .. 1.0 scale) based on n-gram similarity. Does this approach work well only for comparing short strings, or can it reliably be used to compare large strings (like articles, for example)?

3) I am aware of the fact that n-gram comparisons ignore the context of the two strings. What method would you suggest to accomplish my goal?

    //s:string - target string to split into n-grams
    //n:int - n-gram size to split string into
    let ngram_split (s:string, n:int) =
        let ngram_count = s.Length - (s.Length % n)
        let ngram_list =
            List.init ngram_count (fun i ->
                if( i + n >= s.Length ) then
                    s.Substring(i,s.Length - i) + String.init ((i + n) - s.Length) (fun i -> "#")
                else
                    s.Substring(i,n)
            )
        let ngram_array_unique =
            ngram_list
            |> Seq.ofList
            |> Seq.distinct
            |> Array.ofSeq

        //produce tuples of ngrams (ngram string,how much occurrences in original string)
        Seq.init ngram_array_unique.Length (fun i ->
            (ngram_array_unique.[i],
             ngram_list
             |> List.filter(fun item -> item = ngram_array_unique.[i])
             |> List.length)
        )
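For the second point, one common way to turn n-gram counts into a 0.0 .. 1.0 score is the Dice coefficient over the two n-gram multisets. The sketch below is in Python rather than F#, and it is not a port of the function above: it uses plain overlapping n-grams without the `#` padding, and the function names are hypothetical.

```python
from collections import Counter


def ngrams(text, n=2):
    """Return a multiset (n-gram -> count) of the overlapping character n-grams."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def similarity(a, b, n=2):
    """Dice coefficient over n-gram multisets, between 0.0 and 1.0."""
    grams_a, grams_b = ngrams(a, n), ngrams(b, n)
    total = sum(grams_a.values()) + sum(grams_b.values())
    if total == 0:
        # Both strings are shorter than n, so fall back to exact comparison.
        return 1.0 if a == b else 0.0
    # Counter intersection keeps the minimum count of each shared n-gram.
    shared = sum((grams_a & grams_b).values())
    return 2.0 * shared / total


if __name__ == "__main__":
    print(similarity("night", "nacht"))
    print(similarity("context", "contact"))
```

Counting each n-gram once in a `Counter` also addresses the efficiency part of the first question: the F# version rescans the whole n-gram list for every distinct n-gram, whereas a single counting pass is linear in the length of the string.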

IntelliJ Doesn't Notice Changes in Interface

- by yar

[I've decided to give IntelliJ another go (to replace Eclipse), since its Groovy support is supposed to be the best. But back to Java...] I have an Interface that defines a constant public static final int CHANNEL_IN = 1; and about 20 classes in my Module that implement that interface. I've decided that this constant was a bad idea so I did what I do in Eclipse: I deleted the entire line. This should cause the Project tree to light up like a Christmas tree and all classes that implement that interface and use that constant to break. Instead, this is not happening. If I don't actually double-click on the relevant classes -- which I find using grep -- the module even builds correctly (using Build - Make Module). If I double-click on a relevant class, the error is shown both in the Project Tree and in the Editor. I am not able to replicate this behavior in small tests, but in large modules it works (incorrectly) this way. Is there some relevant setting in IntelliJ for this?

django join-like expansion of queryset

- by jimbob

I have a list of Persons, each of which has multiple fields that I usually filter on, using the object_list generic view. Each person can have multiple Comments attached to them, each with a datetime and a text string. What I ultimately want to do is have the option to filter comments based on dates.

    class Person(models.Model):
        name = models.CharField("Name", max_length=30)
        ## has ~30 other fields, usually filtered on as well

    class Comment(models.Model):
        date = models.DateTimeField()
        person = models.ForeignKey(Person)
        comment = models.TextField("Comment Text", max_length=1023)

What I want to do is get a queryset like

    Person.objects.filter(comment__date__gt=date(2011,1,1)).order_by('comment__date')

send that queryset to object_list, and see only the comments, ordered by date, with only so many objects on a page. E.g., if "Person A" has comments 12/3/11, 1/2/11, 1/5/11, "Person B" has no comments, and "Person C" has a comment on 1/3, I would see:

    "Person A", 1/2 - comment
    "Person C", 1/3 - comment
    "Person A", 1/5 - comment

I would strongly prefer not to switch to filtering based on Comment.objects.filter(), as that would force me to repeat large sections of code in both the view and the template. Right now, if I execute the query above, I get a queryset returning (PersonA, PersonC, PersonA), but if I render that in a template, each person's comment_set contains all of their comments, even those outside the date range. Ideally there would be some functionality to expand a Person queryset's comment_set into a larger queryset that can be sorted and ordered based on the comment and passed to an object_list generic view. This is normally fairly simple to do in SQL with a JOIN, but I don't want to abandon the ORM, which I use everywhere else.
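To make the desired "join-like expansion" concrete, here is a small sketch in plain Python. It deliberately does not use the Django ORM: the `Person` and `Comment` dataclasses and the `expand_comments` helper are hypothetical stand-ins that only illustrate the row shape being asked for, one (person, comment) pair per matching comment, sorted by comment date. In the example data, the "12/3" comment is assumed to fall before the cutoff, which is what the expected output in the question implies.

```python
import datetime
from dataclasses import dataclass, field


@dataclass
class Comment:
    date: datetime.date
    text: str


@dataclass
class Person:
    name: str
    comments: list = field(default_factory=list)


def expand_comments(people, after):
    """Return (person, comment) pairs for comments dated after `after`, oldest first."""
    pairs = [(p, c) for p in people for c in p.comments if c.date > after]
    return sorted(pairs, key=lambda pair: pair[1].date)


if __name__ == "__main__":
    person_a = Person("Person A", [
        Comment(datetime.date(2010, 12, 3), "comment"),  # assumed to be before the cutoff
        Comment(datetime.date(2011, 1, 2), "comment"),
        Comment(datetime.date(2011, 1, 5), "comment"),
    ])
    person_b = Person("Person B")
    person_c = Person("Person C", [Comment(datetime.date(2011, 1, 3), "comment")])

    cutoff = datetime.date(2011, 1, 1)
    for person, comment in expand_comments([person_a, person_b, person_c], cutoff):
        print(person.name, comment.date, comment.text)
```

Each pair carries exactly one comment, so a template iterating over the result never needs to touch the person's full comment collection.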

In SQL can I return a table with a varying number of columns

- by Matt

I have a somewhat more complicated scenario, but I think it should be possible. I have a large SPROC whose result is a set of characteristics for a set of persons. So the table would look something like this:

    Property | Client1  Client2  Client3
    -------------------------------------
    Sex      | M        F        M
    Age      | 67       56       67
    Income   | Low      Mid      Low

It's built using cursors, iterating over different datasets. The problem I am facing is that there is a varying number of Clients and Properties, so an equally valid result over different input sets might be:

    Property | Client1  Client2
    ----------------------------
    Sex      | M        F
    Age      | 67       56
    Weight   | 122      122

The different number of properties is easy; those are just extra rows. My problem is that I need to declare a temporary table with a varying number of columns. There could be 2 clients or 100. Every client is guaranteed to have every property ultimately listed. What SQL structure would satisfy this, and how can I declare it and insert things into it? I can't just flip the columns and rows either, because there is a variable number of each.
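The shape of the problem can be sketched outside SQL. The Python below assumes the data is first collected in a fixed three-column "long" form of (client, property, value) rows, which needs no variable column list, and is only pivoted into the wide layout at the end. This is an illustration of that idea, not a stored-procedure solution; the `pivot` function is hypothetical, and the sample values are the ones from the first table in the question.

```python
def pivot(rows):
    """Turn (client, property, value) triples into a header and one row per property."""
    clients, properties, cells = [], [], {}
    for client, prop, value in rows:
        # Remember clients and properties in first-seen order.
        if client not in clients:
            clients.append(client)
        if prop not in properties:
            properties.append(prop)
        cells[(prop, client)] = value
    header = ["Property"] + clients
    # A missing (property, client) combination shows up as None.
    body = [[prop] + [cells.get((prop, c)) for c in clients] for prop in properties]
    return header, body


if __name__ == "__main__":
    long_rows = [
        ("Client1", "Sex", "M"), ("Client2", "Sex", "F"), ("Client3", "Sex", "M"),
        ("Client1", "Age", 67), ("Client2", "Age", 56), ("Client3", "Age", 67),
        ("Client1", "Income", "Low"), ("Client2", "Income", "Mid"), ("Client3", "Income", "Low"),
    ]
    header, body = pivot(long_rows)
    print(header)
    for row in body:
        print(row)
```

The number of output columns follows from the data, so the same code handles 2 clients or 100 without any change to the intermediate structure.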

Warning: cast increases required alignment

- by dash-tom-bang

I've recently been working on a platform for which a legacy codebase issues a large number of "cast increases required alignment to N" warnings, where N is the size of the target of the cast.

    struct Message
    {
        int32_t id;
        int32_t type;
        int8_t data[16];
    };

    int32_t GetMessageInt(const Message& m)
    {
        return *reinterpret_cast<const int32_t*>(&m.data[0]);
    }

Hopefully it's obvious that a "real" implementation would be a bit more complex, but the basic point is that I've got data coming from somewhere, I know that it's aligned (because I need the id and type to be aligned), and yet I get the message that the cast is increasing the alignment, in the example case, to 4.

Now, I know that I can suppress the warning with an argument to the compiler, and I know that I can cast the bit inside the parentheses to void* first, but I don't really want to go through every bit of code that needs this sort of manipulation (there's a lot, because we load a lot of data off of disk, and that data comes in as char buffers so that we can easily pointer-advance). Can anyone give me any other thoughts on this problem? To me it seems like such an important and common option that you wouldn't want to warn, and if there is actually the possibility of doing it wrong, then suppressing the warning isn't going to help. Finally, can't the compiler know, as I do, how the object in question is actually aligned in the structure, so that it doesn't need to worry about the alignment of that particular object unless it got bumped a byte or two?

Search Results

Search found 11409 results on 457 pages for 'large teams'.
