Search Results

Search found 20904 results on 837 pages for 'disk performance'.

Page 798/837 | < Previous Page | 794 795 796 797 798 799 800 801 802 803 804 805  | Next Page >

  • Using PHP's IMAP library triggers Kaspersky's Antivirus

    - by TMG
    Hello, I just started today working with PHP's IMAP library, and while imap_fetchbody or imap_body are called, it is triggering my Kaspersky antivirus. The viruses are Trojan.Win32.Agent.dmyq and Trojan.Win32.FraudPack.aoda. I am running this off a local development machine with XAMPP and Kaspersky AV. Now, I am sure there are viruses there since there is spam in the box (who doesn't need a some viagra or vicodin these days?). And I know that since the raw body includes attachments and different mime-types, bad stuff can be in the body. So my question is: are there any risks using these libraries? I am assuming that the IMAP functions are retrieving the body, caching it to disk/memory and the AV scanning it sees the data. Is that correct? Are there any known security concerns using this library (I couldn't find any)? Does it clean up cached message parts perfectly or might viral files be sitting somewhere? Is there a better way to get plain text out of the body than this? Right now I am using the following code (credit to Kevin Steffer): function get_mime_type(&$structure) { $primary_mime_type = array("TEXT", "MULTIPART","MESSAGE", "APPLICATION", "AUDIO","IMAGE", "VIDEO", "OTHER"); if($structure->subtype) { return $primary_mime_type[(int) $structure->type] . '/' .$structure->subtype; } return "TEXT/PLAIN"; } function get_part($stream, $msg_number, $mime_type, $structure = false, $part_number = false) { if(!$structure) { $structure = imap_fetchstructure($stream, $msg_number); } if($structure) { if($mime_type == get_mime_type($structure)) { if(!$part_number) { $part_number = "1"; } $text = imap_fetchbody($stream, $msg_number, $part_number); if($structure->encoding == 3) { return imap_base64($text); } else if($structure->encoding == 4) { return imap_qprint($text); } else { return $text; } } if($structure->type == 1) /* multipart */ { while(list($index, $sub_structure) = each($structure->parts)) { if($part_number) { $prefix = $part_number . '.'; } $data = get_part($stream, $msg_number, $mime_type, $sub_structure,$prefix . ($index + 1)); if($data) { return $data; } } // END OF WHILE } // END OF MULTIPART } // END OF STRUTURE return false; } // END OF FUNCTION $connection = imap_open($server, $login, $password); $count = imap_num_msg($connection); for($i = 1; $i <= $count; $i++) { $header = imap_headerinfo($connection, $i); $from = $header->fromaddress; $to = $header->toaddress; $subject = $header->subject; $date = $header->date; $body = get_part($connection, $i, "TEXT/PLAIN"); }

    Read the article

  • Logging raw HTTP request/response in ASP.NET MVC & IIS7

    - by Greg Beech
    I'm writing a web service (using ASP.NET MVC) and for support purposes we'd like to be able to log the requests and response in as close as possible to the raw, on-the-wire format (i.e including HTTP method, path, all headers, and the body) into a database. What I'm not sure of is how to get hold of this data in the least 'mangled' way. I can re-constitute what I believe the request looks like by inspecting all the properties of the HttpRequest object and building a string from them (and similarly for the response) but I'd really like to get hold of the actual request/response data that's sent on the wire. I'm happy to use any interception mechanism such as filters, modules, etc. and the solution can be specific to IIS7. However, I'd prefer to keep it in managed code only. Any recommendations? Edit: I note that HttpRequest has a SaveAs method which can save the request to disk but this reconstructs the request from the internal state using a load of internal helper methods that cannot be accessed publicly (quite why this doesn't allow saving to a user-provided stream I don't know). So it's starting to look like I'll have to do my best to reconstruct the request/response text from the objects... groan. Edit 2: Please note that I said the whole request including method, path, headers etc. The current responses only look at the body streams which does not include this information. Edit 3: Does nobody read questions around here? Five answers so far and yet not one even hints at a way to get the whole raw on-the-wire request. Yes, I know I can capture the output streams and the headers and the URL and all that stuff from the request object. I already said that in the question, see: I can re-constitute what I believe the request looks like by inspecting all the properties of the HttpRequest object and building a string from them (and similarly for the response) but I'd really like to get hold of the actual request/response data that's sent on the wire. If you know the complete raw data (including headers, url, http method, etc.) simply cannot be retrieved then that would be useful to know. Similarly if you know how to get it all in the raw format (yes, I still mean including headers, url, http method, etc.) without having to reconstruct it, which is what I asked, then that would be very useful. But telling me that I can reconstruct it from the HttpRequest/HttpResponse objects is not useful. I know that. I already said it. Please note: Before anybody starts saying this is a bad idea, or will limit scalability, etc., we'll also be implementing throttling, sequential delivery, and anti-replay mechanisms in a distributed environment, so database logging is required anyway. I'm not looking for a discussion of whether this is a good idea, I'm looking for how it can be done.

    Read the article

  • How Can I create different selectors for accepting new connection in java NIO

    - by Deepak
    I want to write java tcp socket programming using java NIO. Its working fine. But I am using the same selector for accepting reading from and writing to the clients. How Can I create different selectors for accepting new connection in java NIO, reading and writing. Is there any online help. Actually when I am busy in reading or writing my selector uses more iterator. So If more number of clients are connected then performance of accepting new coneection became slow. But I donot want the accepting clients to be slow // Create a selector and register two socket channels Selector selector = null; try { // Create the selector selector = Selector.open(); // Create two non-blocking sockets. This method is implemented in // e173 Creating a Non-Blocking Socket. SocketChannel sChannel1 = createSocketChannel("hostname.com", 80); SocketChannel sChannel2 = createSocketChannel("hostname.com", 80); // Register the channel with selector, listening for all events sChannel1.register(selector, sChannel1.validOps()); sChannel2.register(selector, sChannel1.validOps()); } catch (IOException e) { } // Wait for events while (true) { try { // Wait for an event selector.select(); } catch (IOException e) { // Handle error with selector break; } // Get list of selection keys with pending events Iterator it = selector.selectedKeys().iterator(); // Process each key at a time while (it.hasNext()) { // Get the selection key SelectionKey selKey = (SelectionKey)it.next(); // Remove it from the list to indicate that it is being processed it.remove(); try { processSelectionKey(selKey); } catch (IOException e) { // Handle error with channel and unregister selKey.cancel(); } } } public void processSelectionKey(SelectionKey selKey) throws IOException { // Since the ready operations are cumulative, // need to check readiness for each operation if (selKey.isValid() && selKey.isConnectable()) { // Get channel with connection request SocketChannel sChannel = (SocketChannel)selKey.channel(); boolean success = sChannel.finishConnect(); if (!success) { // An error occurred; handle it // Unregister the channel with this selector selKey.cancel(); } } if (selKey.isValid() && selKey.isReadable()) { // Get channel with bytes to read SocketChannel sChannel = (SocketChannel)selKey.channel(); // See e174 Reading from a SocketChannel } if (selKey.isValid() && selKey.isWritable()) { // Get channel that's ready for more bytes SocketChannel sChannel = (SocketChannel)selKey.channel(); } } Thanks Deepak

    Read the article

  • XML: When to use attributes instead of child nodes?

    - by Rosarch
    For tree leaves in XML, when is it better to use attributes, and when is it better to use descendant nodes? For example, in the following XML document: <?xml version="1.0" encoding="utf-8" ?> <savedGame> <links> <link rootTagName="zombies" packageName="zombie" /> <link rootTagName="ghosts" packageName="ghost" /> <link rootTagName="players" packageName="player" /> <link rootTagName="trees" packageName="tree" /> </links> <locations> <zombies> <zombie> <positionX>41</positionX> <positionY>100</positionY> </zombie> <zombie> <positionX>55</positionX> <positionY>56</positionY> </zombie> </zombies> <ghosts> <ghost> <positionX>11</positionX> <positionY>90</positionY> </ghost> </ghosts> </locations> </savedGame> The <link> tag has attributes, but it could also be written as: <link> <rootTagName>trees</rootTagName> <packageName>tree</packageName> </link> Similarly, the location tags could be written as: <zombie positionX="55" positionY="56" /> instead of: <zombie> <positionX>55</positionX> <positionY>56</positionY> </zombie> What reasons are there to prefer one over the other? Is it just a stylistic issue? Any performance considerations?

    Read the article

  • Working with Hibernate Queries

    - by jschoen
    I am new to hibernate queries, and trying to get a grasp on how everything works. I am using Hibernate 3 with Netbeans 6.5. I have a basic project set up and have been playing around with how to do everything. I started with essentially a search query. Where the user can enter values into one or more fields. The table would be Person with the columns first_name, middle_name, last_name for the sake of the example. The first way I found was to have a method that took firstName, middleName, and lastName as parameters: Session session = HibernateUtil.getSessionFactory().getCurrentSession(); Transaction tx = session.beginTransaction(); String query = "from Person where (first_name = :firstName or :firstName is null) "+ "and (middle_name = :middleName or :middleName is null) " "and (last_name = :lastname or :lastName is null)"; Query q = session.createQuery(query); q.setString("firstName", firstName); q.setString("middleName", middleName); q.setString("lastName", lastName); List<Person> results = (List<Person>) q.list(); This did not sit well with me, since it seemed like I should not have to write that much, and well, that I was doing it wrong. So I kept digging and found another way: Session session = HibernateUtil.getSessionFactory().getCurrentSession(); Transaction tx = session.beginTransaction(); Criteria crit = session.createCriteria(Person.class); if (firstName != null) { crit.add(Expression.ge("firstName", firstName); } if (middleName != null) { crit.add(Expression.ge("middleName", middleName); } if (lastName != null) { crit.add(Expression.ge("lastName", lastName); } List<Person> results = (List<Person>) crit.list(); So what I am trying to figure out is which way is the preferred way for this type of query? Criteria or Query? Why? I am guessing that Criteria is the preferred way and you should only use Query when you need to write it by hand for performance type reasons. Am I close?

    Read the article

  • Xcache - No different after using it

    - by Charles Yeung
    Hi I have installed Xcache in my site(using xampp), I have tested more then 10 times on several page and the result is same as default(no any cache installed), is it something wrong with the configure? Updated [xcache-common] ;; install as zend extension (recommended), normally "$extension_dir/xcache.so" zend_extension = /usr/local/lib/php/extensions/non-debug-non-zts-xxx/xcache.so zend_extension_ts = /usr/local/lib/php/extensions/non-debug-zts-xxx/xcache.so ;; For windows users, replace xcache.so with php_xcache.dll zend_extension_ts = C:\xampp\php\ext\php_xcache.dll ;; or install as extension, make sure your extension_dir setting is correct ; extension = xcache.so ;; or win32: ; extension = php_xcache.dll [xcache.admin] xcache.admin.enable_auth = On xcache.admin.user = "mOo" ; xcache.admin.pass = md5($your_password) xcache.admin.pass = "" [xcache] ; ini only settings, all the values here is default unless explained ; select low level shm/allocator scheme implemenation xcache.shm_scheme = "mmap" ; to disable: xcache.size=0 ; to enable : xcache.size=64M etc (any size > 0) and your system mmap allows xcache.size = 60M ; set to cpu count (cat /proc/cpuinfo |grep -c processor) xcache.count = 1 ; just a hash hints, you can always store count(items) > slots xcache.slots = 8K ; ttl of the cache item, 0=forever xcache.ttl = 0 ; interval of gc scanning expired items, 0=no scan, other values is in seconds xcache.gc_interval = 0 ; same as aboves but for variable cache xcache.var_size = 4M xcache.var_count = 1 xcache.var_slots = 8K ; default ttl xcache.var_ttl = 0 xcache.var_maxttl = 0 xcache.var_gc_interval = 300 xcache.test = Off ; N/A for /dev/zero xcache.readonly_protection = Off ; for *nix, xcache.mmap_path is a file path, not directory. ; Use something like "/tmp/xcache" if you want to turn on ReadonlyProtection ; 2 group of php won't share the same /tmp/xcache ; for win32, xcache.mmap_path=anonymous map name, not file path xcache.mmap_path = "/dev/zero" ; leave it blank(disabled) or "/tmp/phpcore/" ; make sure it's writable by php (without checking open_basedir) xcache.coredump_directory = "" ; per request settings xcache.cacher = On xcache.stat = On xcache.optimizer = Off [xcache.coverager] ; per request settings ; enable coverage data collecting for xcache.coveragedump_directory and xcache_coverager_start/stop/get/clean() functions (will hurt executing performance) xcache.coverager = Off ; ini only settings ; make sure it's readable (care open_basedir) by coverage viewer script ; requires xcache.coverager=On xcache.coveragedump_directory = "" Thanks you

    Read the article

  • Oracle Schema Design: Seperate Schema with I/O Overhead?

    - by Guru
    We are designing database schema for a new system based on Oracle 11gR1. We have identified a main schema which would have close to 100 tables, these will be accessed from the front end Java application. We have a requirement to audit the values which got changed in close to 50 tables, this has to be done every row. Which means, it is possible that, for a single row in MYSYS.T1 there might be 50 (or more) rows in MYSYS_AUDIT.T1_AUD table. We might be having old values of every column entry and new values available from T1. DBA gave an observation, advising against this method, because he said, separate schema meant an extra I/O for every operation. Basically AUDIT schema would be used only to do some analyse and enter values (thus SELECT and INSERT). Is it true that, "a separate schema means an extra I/O" ? I could not find justification. It appears logical to me, as the AUDIT data should not be tampered with, so a separate schema. Also, we designed a separate schema for archiving some tables from MYSYS. From MYSYS_ARC the table might be backed up into tapes or deleted after sufficient time. Few stats: Few tables (close to 20, 30) in MYSYS schema could grow to around 50M rows. We have asked for a total disk space of 4 TB. MYSYS_AUDIT schema might be having 10 times that of MYSYS but we wont keep them more than 3 months. Questions Given all these, can you suggest me any improvements? Separate schema affects disc I/O? (one extra I/O for every schema ?) Any general suggestions? Figure: +-------------------+ +-------------------+ | MYSYS | | MYSYS_AUDIT | | | | | | 1. T1 | | 1. T1_AUD | | 2. T2 | | 2. T2_AUD | | 3. T3 |--------->| 3. T3_AUD | | 4. T4 |(SELECT, | 4. T4_AUD | | . | INSERT) | . | | . | | . | | . | | . | | 100. T100 | | 50. T50_AUD | +-------------------+ +-------------------+ | | | | |(INSERT) | | | * +-------------------+ | MYSYS_ARC | | | | 1. T1_ARC | | 2. T2_ARC | | 3. T3_ARC | | 4. T4_ARC | | . | | . | | . | | 100. T100_ARC | +-------------------+ Apart from this, we have two more schemas with only read only rights, but mainly they are for adhoc purpose and we dont mind the performance on them.

    Read the article

  • How to optimize my PostgreSQL DB for prefix search?

    - by asmaier
    I have a table called "nodes" with roughly 1.7 million rows in my PostgreSQL db =#\d nodes Table "public.nodes" Column | Type | Modifiers --------+------------------------+----------- id | integer | not null title | character varying(256) | score | double precision | Indexes: "nodes_pkey" PRIMARY KEY, btree (id) I want to use information from that table for autocompletion of a search field, showing the user a list of the ten titles having the highest score fitting to his input. So I used this query (here searching for all titles starting with "s") =# explain analyze select title,score from nodes where title ilike 's%' order by score desc; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------- Sort (cost=64177.92..64581.38 rows=161385 width=25) (actual time=4930.334..5047.321 rows=161264 loops=1) Sort Key: score Sort Method: external merge Disk: 5712kB -> Seq Scan on nodes (cost=0.00..46630.50 rows=161385 width=25) (actual time=0.611..4464.413 rows=161264 loops=1) Filter: ((title)::text ~~* 's%'::text) Total runtime: 5260.791 ms (6 rows) This was much to slow for using it with autocomplete. With some information from Using PostgreSQL in Web 2.0 Applications I was able to improve that with a special index =# create index title_idx on nodes using btree(lower(title) text_pattern_ops); =# explain analyze select title,score from nodes where lower(title) like lower('s%') order by score desc limit 10; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------ Limit (cost=18122.41..18122.43 rows=10 width=25) (actual time=1324.703..1324.708 rows=10 loops=1) -> Sort (cost=18122.41..18144.60 rows=8876 width=25) (actual time=1324.700..1324.702 rows=10 loops=1) Sort Key: score Sort Method: top-N heapsort Memory: 17kB -> Bitmap Heap Scan on nodes (cost=243.53..17930.60 rows=8876 width=25) (actual time=96.124..1227.203 rows=161264 loops=1) Filter: (lower((title)::text) ~~ 's%'::text) -> Bitmap Index Scan on title_idx (cost=0.00..241.31 rows=8876 width=0) (actual time=90.059..90.059 rows=161264 loops=1) Index Cond: ((lower((title)::text) ~>=~ 's'::text) AND (lower((title)::text) ~<~ 't'::text)) Total runtime: 1325.085 ms (9 rows) So this gave me a speedup of factor 4. But can this be further improved? What if I want to use '%s%' instead of 's%'? Do I have any chance of getting a decent performance with PostgreSQL in that case, too? Or should I better try a different solution (Lucene?, Sphinx?) for implementing my autocomplete feature?

    Read the article

  • LINQ aggregate left join on SQL CE

    - by P Daddy
    What I need is such a simple, easy query, it blows me away how much work I've done just trying to do it in LINQ. In T-SQL, it would be: SELECT I.InvoiceID, I.CustomerID, I.Amount AS AmountInvoiced, I.Date AS InvoiceDate, ISNULL(SUM(P.Amount), 0) AS AmountPaid, I.Amount - ISNULL(SUM(P.Amount), 0) AS AmountDue FROM Invoices I LEFT JOIN Payments P ON I.InvoiceID = P.InvoiceID WHERE I.Date between @start and @end GROUP BY I.InvoiceID, I.CustomerID, I.Amount, I.Date ORDER BY AmountDue DESC The best equivalent LINQ expression I've come up with, took me much longer to do: var invoices = ( from I in Invoices where I.Date >= start && I.Date <= end join P in Payments on I.InvoiceID equals P.InvoiceID into payments select new{ I.InvoiceID, I.CustomerID, AmountInvoiced = I.Amount, InvoiceDate = I.Date, AmountPaid = ((decimal?)payments.Select(P=>P.Amount).Sum()).GetValueOrDefault(), AmountDue = I.Amount - ((decimal?)payments.Select(P=>P.Amount).Sum()).GetValueOrDefault() } ).OrderByDescending(row=>row.AmountDue); This gets an equivalent result set when run against SQL Server. Using a SQL CE database, however, changes things. The T-SQL stays almost the same. I only have to change ISNULL to COALESCE. Using the same LINQ expression, however, results in an error: There was an error parsing the query. [ Token line number = 4, Token line offset = 9,Token in error = SELECT ] So we look at the generated SQL code: SELECT [t3].[InvoiceID], [t3].[CustomerID], [t3].[Amount] AS [AmountInvoiced], [t3].[Date] AS [InvoiceDate], [t3].[value] AS [AmountPaid], [t3].[value2] AS [AmountDue] FROM ( SELECT [t0].[InvoiceID], [t0].[CustomerID], [t0].[Amount], [t0].[Date], COALESCE(( SELECT SUM([t1].[Amount]) FROM [Payments] AS [t1] WHERE [t0].[InvoiceID] = [t1].[InvoiceID] ),0) AS [value], [t0].[Amount] - (COALESCE(( SELECT SUM([t2].[Amount]) FROM [Payments] AS [t2] WHERE [t0].[InvoiceID] = [t2].[InvoiceID] ),0)) AS [value2] FROM [Invoices] AS [t0] ) AS [t3] WHERE ([t3].[Date] >= @p0) AND ([t3].[Date] <= @p1) ORDER BY [t3].[value2] DESC Ugh! Okay, so it's ugly and inefficient when run against SQL Server, but we're not supposed to care, since it's supposed to be quicker to write, and the performance difference shouldn't be that large. But it just doesn't work against SQL CE, which apparently doesn't support subqueries within the SELECT list. In fact, I've tried several different left join queries in LINQ, and they all seem to have the same problem. Even: from I in Invoices join P in Payments on I.InvoiceID equals P.InvoiceID into payments select new{I, payments} generates: SELECT [t0].[InvoiceID], [t0].[CustomerID], [t0].[Amount], [t0].[Date], [t1].[InvoiceID] AS [InvoiceID2], [t1].[Amount] AS [Amount2], [t1].[Date] AS [Date2], ( SELECT COUNT(*) FROM [Payments] AS [t2] WHERE [t0].[InvoiceID] = [t2].[InvoiceID] ) AS [value] FROM [Invoices] AS [t0] LEFT OUTER JOIN [Payments] AS [t1] ON [t0].[InvoiceID] = [t1].[InvoiceID] ORDER BY [t0].[InvoiceID] which also results in the error: There was an error parsing the query. [ Token line number = 2, Token line offset = 5,Token in error = SELECT ] So how can I do a simple left join on a SQL CE database using LINQ? Am I wasting my time?

    Read the article

  • Can simple javascript inheritance be simplified even further?

    - by Will
    John Resig (of jQuery fame) provides a concise and elegant way to allow simple JavaScript inheritance. It was so short and sweet, in fact, that it inspired me to try and simplify it even further (see code below). I've modified his original function such that it still passes all his tests and has the potential advantage of: readability (50% less code) simplicity (you don't have to be a ninja to understand it) performance (no extra wrappers around super/base method calls) consistency with C#'s base keyword Because this seems almost too good to be true, I want to make sure my logic doesn't have any fundamental flaws/holes/bugs, or if anyone has additional suggestions to improve or refute the code (perhaps even John Resig could chime in here!). Does anyone see anything wrong with my approach (below) vs. John Resig's original approach? if (!window.Class) { window.Class = function() {}; window.Class.extend = function(members) { var prototype = new this(); for (var i in members) prototype[i] = members[i]; prototype.base = this.prototype; function object() { if (object.caller == null && this.initialize) this.initialize.apply(this, arguments); } object.constructor = object; object.prototype = prototype; object.extend = arguments.callee; return object; }; } And the tests (below) are nearly identical to the original ones except for the syntax around base/super method calls (for the reason enumerated above): var Person = Class.extend( { initialize: function(isDancing) { this.dancing = isDancing; }, dance: function() { return this.dancing; } }); var Ninja = Person.extend( { initialize: function() { this.base.initialize(false); }, dance: function() { return this.base.dance(); }, swingSword: function() { return true; } }); var p = new Person(true); alert("true? " + p.dance()); // => true var n = new Ninja(); alert("false? " + n.dance()); // => false alert("true? " + n.swingSword()); // => true alert("true? " + (p instanceof Person && p instanceof Class && n instanceof Ninja && n instanceof Person && n instanceof Class));

    Read the article

  • How to split HTML code with javascript or JQuery

    - by Dean
    Hi I'm making a website using JSP and servlets and I have to now break up a list of radio buttons to insert a textarea and a button. I have got the button and textarea to hide and show when you click on the radio button it shows the text area and button. But this only appears at the top and when there are hundreds on the page this will become awkward so i need a way for it to appear underneath. Here is what my HTML looks like when complied: <form action="addSpotlight" method="POST"> <table> <tr><td><input type="radio" value="29" name="publicationIDs" ></td><td>A System For Dynamic Server Allocation in Application Server Clusters, IEEE International Symposium on Parallel and Distributed Processsing with Applications, 2008</td> </tr> <tr><td><input type="radio" value="30" name="publicationIDs" ></td><td>Analysing BitTorrent's Seeding Strategies, 7th IEEE/IFIP International Conference on Embedded and Ubiquitous Computing (EUC-09), 2009</td> </tr> <tr><td><input type="radio" value="31" name="publicationIDs" ></td><td>The Effect of Server Reallocation Time in Dynamic Resource Allocation, UK Performance Engineering Workshop 2009, 2009</td> </tr> <tr><td><input type="radio" value="32" name="publicationIDs" ></td><td>idk, hello, 1992</td> </tr> <tr><td><input type="radio" value="33" name="publicationIDs" ></td><td>sad, safg, 1992</td> </tr> <div class="abstractWriteup"><textarea name="abstract"></textarea> <input type="submit" value="Add Spotlight"></div> </table> </form> Now here is what my JSP looks like: <form action="addSpotlight" method="POST"> <table> <%int i = 0; while(i<ids.size()){%> <tr><td><input type="radio" value="<%=ids.get(i)%>" name="publicationIDs" ></td><td><%=info.get(i)%></td> </tr> <%i++; }%> <div class="abstractWriteup"><textarea name="abstract"></textarea> <input type="submit" value="Add Spotlight"></div> </table> </form> Thanks in Advance Dean

    Read the article

  • What are the benefits of using ORM over XML Serialization/Deserialization?

    - by Tequila Jinx
    I've been reading about NHibernate and Microsoft's Entity Framework to perform Object Relational Mapping against my data access layer. I'm interested in the benefits of having an established framework to perform ORM, but I'm curious as to the performance costs of using it against standard XML Serialization and Deserialization. Right now, I develop stored procedures in Oracle and SQL Server that use XML Types for either input or output parameters and return or shred XML depending on need. I use a custom database command object that uses generics to deserialize the XML results into a specified serializable class. By using a combination of generics, xml (de)serialization and Microsoft's DAAB, I've got a process that's fairly simple to develop against regardless of the data source. Moreover, since I exclusively use Stored Procedures to perform database operations, I'm mostly protected from changes in the data structure. Here's an over-simplified example of what I've been doing. static void main() { testXmlClass test = new test(1); test.Name = "Foo"; test.Save(); } // Example Serializable Class ------------------------------------------------ [XmlRootAttribute("test")] class testXmlClass() { [XmlElement(Name="id")] public int ID {set; get;} [XmlElement(Name="name")] public string Name {set; get;} //create an instance of the class loaded with data. public testXmlClass(int id) { GenericDBProvider db = new GenericDBProvider(); this = db.ExecuteSerializable("myGetByIDProcedure"); } //save the class to the database... public Save() { GenericDBProvider db = new GenericDBProvider(); db.AddInParameter("myInputParameter", DbType.XML, this); db.ExecuteSerializableNonQuery("mySaveProcedure"); } } // Database Handler ---------------------------------------------------------- class GenericDBProvider { public T ExecuteSerializable<T>(string commandText) where T : class { XmlSerializer xml = new XmlSerializer(typeof(T)); // connection and command code is assumed for the purposes of this example. // the final results basically just come down to... return xml.Deserialize(commandResults) as T; } public void ExecuteSerializableNonQuery(string commandText) { // once again, connection and command code is assumed... // basically, just execute the command along with the specified // parameters which have been serialized. } public void AddInParameter(string name, DbType type, object value) { StringWriter w = new StringWriter(); XmlSerializer x = new XmlSerializer(value.GetType()); //handle serialization for serializable classes. if (type == DbType.Xml && (value.GetType() != typeof(System.String))) { x.Serialize(w, value); w.Close(); // store serialized object in a DbParameterCollection accessible // to my other methods. } else { //handle all other parameter types } } } I'm starting a new project which will rely heavily on database operations. I'm very curious to know whether my current practices will be sustainable in a high-traffic situation and whether or not I should consider switching to NHibernate or Microsoft's Entity Framework to perform what essentially seems to boil down to the same thing I'm currently doing. I appreciate any advice you may have.

    Read the article

  • Error in facebook.dll facebooksdk

    - by Surendar Radhakrishnan
    I got the web application working with facebooksdk and when i deployed it...it is running fine for sometime and it is throwing the error like this... Server Error in '/' Application. Could not load file or assembly 'Facebook, Version=4.1.1.0, Culture=neutral, PublicKeyToken=58cb4f2111d1e6de' or one of its dependencies. Access is denied. Description: An unhandled exception occurred during the execution of the current web request. Please review the stack trace for more information about the error and where it originated in the code. Exception Details: System.IO.FileLoadException: Could not load file or assembly 'Facebook, Version=4.1.1.0, Culture=neutral, PublicKeyToken=58cb4f2111d1e6de' or one of its dependencies. Access is denied. Source Error: An unhandled exception was generated during the execution of the current web request. Information regarding the origin and location of the exception can be identified using the exception stack trace below. Assembly Load Trace: The following information can be helpful to determine why the assembly 'Facebook, Version=4.1.1.0, Culture=neutral, PublicKeyToken=58cb4f2111d1e6de' could not be loaded. WRN: Assembly binding logging is turned OFF. To enable assembly bind failure logging, set the registry value [HKLM\Software\Microsoft\Fusion!EnableLog] (DWORD) to 1. Note: There is some performance penalty associated with assembly bind failure logging. To turn this feature off, remove the registry value [HKLM\Software\Microsoft\Fusion!EnableLog]. Stack Trace: [FileLoadException: Could not load file or assembly 'Facebook, Version=4.1.1.0, Culture=neutral, PublicKeyToken=58cb4f2111d1e6de' or one of its dependencies. Access is denied.] Secured_Login.FacebookVerification() +0 System.Web.Util.CalliHelper.EventArgFunctionCaller(IntPtr fp, Object o, Object t, EventArgs e) +25 System.Web.UI.Control.LoadRecursive() +71 System.Web.UI.Page.ProcessRequestMain(Boolean includeStagesBeforeAsyncPoint, Boolean includeStagesAfterAsyncPoint) +3048 Version Information: Microsoft .NET Framework Version:4.0.30319; ASP.NET Version:4.0.30319.1 i got this method in pageload protected void Page_Load(object sender, EventArgs e) { FacebookVerification(); } protected void FacebookVerification() { try { FacebookApp fbApp = new FacebookApp(); if (fbApp.Session != null) { dynamic myinfo = fbApp.Get("me"); String firstname = myinfo.first_name; String lastname = myinfo.last_name; lblFBStatus.Text = "you signed in as " + firstname + " " + lastname ; } else { lblFBStatus.Text = "Please sign in with facebook"; } } catch (Exception) { throw; } }

    Read the article

  • Execution Plan Optimization when where clause is removed then added back

    - by nmushov
    I have a stored procedure that uses a table valued function which executes in 9 seconds. If I alter the table valued function and remove the where clause, the stored procedure executes in 3 seconds. If I add the where clause back, the query still executes in 3 seconds. I took a look at the execution plans and it appears that after I remove the where clause, the execution plan includes parallelism and the scan count for 2 of my tables drops for 50000 and 65000 down to 5 and 3. After I add the where clause back, the optimized execution plan still runs unless I run DBCC FREEPROCCACHE. Questions 1. Why would SQL Server start using the optimized execution plan for both queries only when I first remove the where clause? Is there a way to force SQL Server to use this execution plan? Also, this is a paramaterized all-in-one query that uses the (Parameter is null or Parameter) in the where clause, which I believe is bad for performance. RETURNS TABLE AS RETURN ( SELECT TOP (@PageNumber * @PageSize) CASE WHEN @SortOrder = 'Expensive' THEN ROW_NUMBER() OVER (ORDER BY SellingPrice DESC) WHEN @SortOrder = 'Inexpensive' THEN ROW_NUMBER() OVER (ORDER BY SellingPrice ASC) WHEN @SortOrder = 'LowMiles' THEN ROW_NUMBER() OVER (ORDER BY Mileage ASC) WHEN @SortOrder = 'HighMiles' THEN ROW_NUMBER() OVER (ORDER BY Mileage DESC) WHEN @SortOrder = 'Closest' THEN ROW_NUMBER() OVER (ORDER BY P1.Distance ASC) WHEN @SortOrder = 'Newest' THEN ROW_NUMBER() OVER (ORDER BY [Year] DESC) WHEN @SortOrder = 'Oldest' THEN ROW_NUMBER() OVER (ORDER BY [Year] ASC) ELSE ROW_NUMBER() OVER (ORDER BY InventoryID ASC) END as rn, P1.InventoryID, P1.SellingPrice, P1.Distance, P1.Mileage, Count(*) OVER () RESULT_COUNT, dimCarStatus.[year] FROM (SELECT InventoryID, SellingPrice, Zip.Distance, Mileage, ColorKey, CarStatusKey, CarKey FROM facInventory JOIN @ZipCodes Zip ON Zip.DealerKey = facInventory.DealerKey) as P1 JOIN dimColor ON dimColor.ColorKey = P1.ColorKey JOIN dimCarStatus ON dimCarStatus.CarStatusKey = P1.CarStatusKey JOIN dimCar ON dimCar.CarKey = P1.CarKey WHERE (@ExteriorColor is NULL OR dimColor.ExteriorColor like @ExteriorColor) AND (@InteriorColor is NULL OR dimColor.InteriorColor like @InteriorColor) AND (@Condition is NULL OR dimCarStatus.Condition like @Condition) AND (@Year is NULL OR dimCarStatus.[Year] like @Year) AND (@Certified is NULL OR dimCarStatus.Certified like @Certified) AND (@Make is NULL OR dimCar.Make like @Make) AND (@ModelCategory is NULL OR dimCar.ModelCategory like @ModelCategory) AND (@Model is NULL OR dimCar.Model like @Model) AND (@Trim is NULL OR dimCar.Trim like @Trim) AND (@BodyType is NULL OR dimCar.BodyType like @BodyType) AND (@VehicleTypeCode is NULL OR dimCar.VehicleTypeCode like @VehicleTypeCode) AND (@MinPrice is NULL OR P1.SellingPrice >= @MinPrice) AND (@MaxPrice is NULL OR P1.SellingPrice < @MaxPrice) AND (@Mileage is NULL OR P1.Mileage < @Mileage) ORDER BY CASE WHEN @SortOrder = 'Expensive' THEN -SellingPrice WHEN @SortOrder = 'Inexpensive' THEN SellingPrice WHEN @SortOrder = 'LowMiles' THEN Mileage WHEN @SortOrder = 'HighMiles' THEN -Mileage WHEN @SortOrder = 'Closest' THEN P1.Distance WHEN @SortOrder = 'Newest' THEN -[YEAR] WHEN @SortOrder = 'Oldest' THEN [YEAR] ELSE InventoryID END )

    Read the article

  • Scala actors: receive vs react

    - by jqno
    Let me first say that I have quite a lot of Java experience, but have only recently become interested in functional languages. Recently I've started looking at Scala, which seems like a very nice language. However, I've been reading about Scala's Actor framework in Programming in Scala, and there's one thing I don't understand. In chapter 30.4 it says that using react instead of receive makes it possible to re-use threads, which is good for performance, since threads are expensive in the JVM. Does this mean that, as long as I remember to call react instead of receive, I can start as many Actors as I like? Before discovering Scala, I've been playing with Erlang, and the author of Programming Erlang boasts about spawning over 200,000 processes without breaking a sweat. I'd hate to do that with Java threads. What kind of limits am I looking at in Scala as compared to Erlang (and Java)? Also, how does this thread re-use work in Scala? Let's assume, for simplicity, that I have only one thread. Will all the actors that I start run sequentially in this thread, or will some sort of task-switching take place? For example, if I start two actors that ping-pong messages to each other, will I risk deadlock if they're started in the same thread? According to Programming in Scala, writing actors to use react is more difficult than with receive. This sounds plausible, since react doesn't return. However, the book goes on to show how you can put a react inside a loop using Actor.loop. As a result, you get loop { react { ... } } which, to me, seems pretty similar to while (true) { receive { ... } } which is used earlier in the book. Still, the book says that "in practice, programs will need at least a few receive's". So what am I missing here? What can receive do that react cannot, besides return? And why do I care? Finally, coming to the core of what I don't understand: the book keeps mentioning how using react makes it possible to discard the call stack to re-use the thread. How does that work? Why is it necessary to discard the call stack? And why can the call stack be discarded when a function terminates by throwing an exception (react), but not when it terminates by returning (receive)? I have the impression that Programming in Scala has been glossing over some of the key issues here, which is a shame, because otherwise it's a truly excellent book.

    Read the article

  • Why is cell phone software still so primitive?

    - by Tomislav Nakic-Alfirevic
    I don't do mobile development, but it strikes me as odd that features like this aren't available by default on most phones: full text search: searches all address book contents, messages, anything else being a plus better call management: e.g. a rotating audio call log, meaning you always have the last N calls recorded for your listening pleasure later (your little girl just said her first "da-da" while you were on a business trip, you had a telephone job interview, you received complex instructions to do something etc.) bluetooth remote control (like e.g. anyRemote, but available by default on a bluetooth phone) no multitasking capabilities worth mentioning and in general no e.g. weekly software updates, making the phone much more usable (even if it had to be done over USB, rather than over the network). I'm sure I was dumbfounded by the lack or design of other features as well, but they don't come to mind right now. To clarify, I'm not talking about smartphones here: my plain, 2-year old phone has a CPU an order of magnitude faster than my first PC, about as much storage space and it's ridiculous how bad (slow, unwieldy) the software is and it's not one phone or one manufacturer. What keeps the (to me) obvious software functionality vacuum on a capable hardware platform from being filled up? Edit: I believe a clarification on the multitasking point might be beneficial. I'll use my phone as an example, although the point is much more general. The phone can multitask and in fact does: you can listen to music and do something else at the same time. On the other hand, the way the software has been designed makes multitasking next to useless. (Ditto with the external touch screen: it can take touch commands, but only one application makes use of it, and only with 3 commands.) To take the multitasking example to the extreme, if I plug my phone into my laptop and it registers as an external disk, it doesn't allow any kind of operation: messages, calling, calendar, everything out of reach, although I can receive a call. No "battery life" issue there: it's charging while connected. BTW, another example of design below the current state of the art: I don't see a phone on the horizon which will remember where in an audio or video file you were when you stopped listening/watching it last time (podcasts are a good use case). Simplistic rewind/fast forward functionality only aggravates the problem.

    Read the article

  • SQL Select - adding field to Select is changing the results

    - by nycdan
    I'm stumped by this SQL problem that I suspect will be easy pickings for someone out there. I have a table that contains rows representing several daily lists of ranked items. The relevent fields are as follows: ID, ListID, ItemID, ItemName, ItemRank, Date. I have a query that returns the items that were on a list yesterday but not today (Items Off List) as follows: Select ItemID, ListID, ItemName, convert(varchar(10),MAX(date),101) as date, COUNT(ItemName) as days_on_list From Table Group By ItemID, ListID, ItemName Having Max(date) = DATEADD("d",-1,convert(varchar(10),getdate(),101)) and ListID = 1 Order By ListID, ItemName, COUNT(ItemName) Basically I'm looking for records where the max date is yesterday. It works fine and shows the number of days each item was previously on the list (although not necessarily consecutively, but that's fine for now). The problem is when I try to add ranking to see what yesterday's rank was. I tried the following: Select ItemID, ListID, ItemName, ranking, convert(varchar(10),MAX(date),101) as date, COUNT(ItemName) as days_on_list From Table Group By ItemID, ListID, ItemName, ranking Having Max(date) = DATEADD("d",-1,convert(varchar(10),getdate(),101)) and ListID = 1 Order By ListID, ItemName, ranking, COUNT(ItemName) This returns a great deal more records than the previous query so something isn't right with it. I want the same number of records, but with the ranking included. I can get the rank by doing a self-join with a subquery and getting records where the ItemID occurs yesterday but not today - but then I don't know how to get the Count any more. Appreciation in advance for any help with this. ======== SOLVED ============== Select ItemID, ListID, ItemName, ranking, convert(varchar(10),MAX(date),101) as date, COUNT(ItemName) as days_on_list from Table T Where date = DATEADD("d",-1,convert(varchar(10),getdate(),101)) and ListID = 1 and T.ItemID Not In (select T.ItemID from Table T join Table T2 on T.ItemID = T2.ItemID and T.ListID = T2.ListID where T.date = DATEADD("d",-1,convert(varchar(10),getdate(),101)) and T2.date = convert (varchar(10),getdate(),101) and T.ListID = 1) Group by ItemID, ListID, ItemName, ranking Basically, what I did was create a subquery that finds all items that appear in both days, and finds items that appeared yesterday but are not in the set of items that appeared both days. Then I was able to do the aggregate function and grouping correctly. I would NOT be surprised if this is more convoluted than necessary but I understand it and can modify it as needed and performance doesn't seem to be an issue. Thanks everyone for the assist.

    Read the article

  • Any way to speed up this hierarchical query?

    - by RenderIn
    I've got a serious performance problem with a hierarchical query that I can't seem to fix. I am modeling several organization charts in my database, each representing a virtual organization within our company. For example, we have several temporary committees that are created from time to time and there may be a Committee Organizer role at the top of this virtual hierarchy, with several people assigned to the Committee Member role beneath the organizer. Some of our virtual organizations have many levels and several branches at each level. I have a single table in which I represent all the role assignments. i.e. a ROLE_ID column and a PARENT_ROLE_ID column which is a foreign key to the ROLE_ID column. For each assignment we also store as a column the location in the company where this person has the assignment. For example, the Committee Organizer would have a company-level/ CEO assignment, while the committee members would have department-level assignments such as ACCOUNTING, MARKETING, etc. So to model the organizer/member relationship for two individuals we would have: ROLE_ID = 4 PARENT_ROLE_ID = NULL EMPLOYEE_NUMBER = 213423 COMPANY_LOCATION = CEO ROLE_ID = 5 PARENT_ROLE_ID = 4 EMPLOYEE_NUMBER = 838221 COMPANY_LOCATION = ACCOUNTING Here's where things get tricky. I have an application that every person in the organization can log in to. When they log in they should be able to view all the virtual organizations in our company. e.g. the committee members should be able to see the committee organizer and vice-versa. However, only the committee organizer should be able to edit the committee members. The difficulty is in determining whether an individual (who can have multiple role assignments) has edit access for each other assignment. While this seems simple in the example, consider a virtual organization in which we have President at the top, 5 departments directly beneath him, 2 subdepartments below each department. We only want people in the Accounting department to be able to edit individuals in the subdepartments belonging to the Accounting department. They should not have edit access to anybody in the Marketing department or its subdepartments. To determine edit access when a user views a virtual organization in our company I run a query that executes two inline views: A) Hierarchically query for all assignments in this virtual organization and using SYS_CONNECT_BY_PATH to store the entire path to each user/role/company_location and B) Hierarchically retrieve all the assignments the individual logged in has and using the SYS_CONNECT_BY_PATH to store the entire path to each of these assignments. The result of the query is all the records from A) plus a boolean determined by joining with B) which flags whether the logged in user has edit access for each record. Indexes don't seem to be helping... it simply appears that there is too much processing going on to separate all the records and then determine edit access. One issue is that I can't store the SYS_CONNECT_BY_PATH and index it... determining whether an individual record has edit access consists of comparing if: test_record_sys_path LIKE individual_record_sys_path || '%' Is a materialized view the answer?

    Read the article

  • Any tool(s) for knowing the layout (segments) of running process in Windows?

    - by claws
    I've always been curious about How exactly the process looks in memory? What are the different segments(parts) in it? How exactly will be the program (on the disk) & process (in the memory) are related? My previous question: http://stackoverflow.com/questions/1966920/more-info-on-memory-layout-of-an-executable-program-process In my quest, I finally found a answer. I found this excellent article that cleared most of my queries: http://www.linuxforums.org/articles/understanding-elf-using-readelf-and-objdump_125.html In the above article, author shows how to get different segments of the process (LINUX) & he compares it with its corresponding ELF file. I'm quoting this section here: Courious to see the real layout of process segment? We can use /proc//maps file to reveal it. is the PID of the process we want to observe. Before we move on, we have a small problem here. Our test program runs so fast that it ends before we can even dump the related /proc entry. I use gdb to solve this. You can use another trick such as inserting sleep() before it calls return(). In a console (or a terminal emulator such as xterm) do: $ gdb test (gdb) b main Breakpoint 1 at 0x8048376 (gdb) r Breakpoint 1, 0x08048376 in main () Hold right here, open another console and find out the PID of program "test". If you want the quick way, type: $ cat /proc/`pgrep test`/maps You will see an output like below (you might get different output): [1] 0039d000-003b2000 r-xp 00000000 16:41 1080084 /lib/ld-2.3.3.so [2] 003b2000-003b3000 r--p 00014000 16:41 1080084 /lib/ld-2.3.3.so [3] 003b3000-003b4000 rw-p 00015000 16:41 1080084 /lib/ld-2.3.3.so [4] 003b6000-004cb000 r-xp 00000000 16:41 1080085 /lib/tls/libc-2.3.3.so [5] 004cb000-004cd000 r--p 00115000 16:41 1080085 /lib/tls/libc-2.3.3.so [6] 004cd000-004cf000 rw-p 00117000 16:41 1080085 /lib/tls/libc-2.3.3.so [7] 004cf000-004d1000 rw-p 004cf000 00:00 0 [8] 08048000-08049000 r-xp 00000000 16:06 66970 /tmp/test [9] 08049000-0804a000 rw-p 00000000 16:06 66970 /tmp/test [10] b7fec000-b7fed000 rw-p b7fec000 00:00 0 [11] bffeb000-c0000000 rw-p bffeb000 00:00 0 [12] ffffe000-fffff000 ---p 00000000 00:00 0 Note: I add number on each line as reference. Back to gdb, type: (gdb) q So, in total, we see 12 segment (also known as Virtual Memory Area--VMA). But I want to know about Windows Process & PE file format. Any tool(s) for getting the layout (segments) of running process in Windows? Any other good resources for learning more on this subject? EDIT: Are there any good articles which shows the mapping between PE file sections & VA segments?

    Read the article

  • Can parser combination be made efficient?

    - by Jon Harrop
    Around 6 years ago, I benchmarked my own parser combinators in OCaml and found that they were ~5× slower than the parser generators on offer at the time. I recently revisited this subject and benchmarked Haskell's Parsec vs a simple hand-rolled precedence climbing parser written in F# and was surprised to find the F# to be 25× faster than the Haskell. Here's the Haskell code I used to read a large mathematical expression from file, parse and evaluate it: import Control.Applicative import Text.Parsec hiding ((<|>)) expr = chainl1 term ((+) <$ char '+' <|> (-) <$ char '-') term = chainl1 fact ((*) <$ char '*' <|> div <$ char '/') fact = read <$> many1 digit <|> char '(' *> expr <* char ')' eval :: String -> Int eval = either (error . show) id . parse expr "" . filter (/= ' ') main :: IO () main = do file <- readFile "expr" putStr $ show $ eval file putStr "\n" and here's my self-contained precedence climbing parser in F#: let rec (|Expr|) (P(f, xs)) = Expr(loop (' ', f, xs)) and shift oop f op (P(g, xs)) = let h, xs = loop (op, g, xs) loop (oop, f h, xs) and loop = function | ' ' as oop, f, ('+' | '-' as op)::P(g, xs) | (' ' | '+' | '-' as oop), f, ('*' | '/' as op)::P(g, xs) | oop, f, ('^' as op)::P(g, xs) -> let h, xs = loop (op, g, xs) let op = match op with | '+' -> (+) | '-' -> (-) | '*' -> (*) | '/' -> (/) | '^' -> pown loop (oop, op f h, xs) | _, f, xs -> f, xs and (|P|) = function | '-'::P(f, xs) -> let f, xs = loop ('~', f, xs) P(-f, xs) | '('::Expr(f, ')'::xs) -> P(f, xs) | c::xs when '0' <= c && c <= '9' -> P(int(string c), xs) My impression is that even state-of-the-art parser combinators waste a lot of time back tracking. Is that correct? If so, is it possible to write parser combinators that generate state machines to obtain competitive performance or is it necessary to use code generation?

    Read the article

  • The question about the basics of LINQ to SQL

    - by Alex
    I just started learning LINQ to SQL, and so far I'm impressed with the easy of use and good performance. I used to think that when doing LINQ queries like from Customer in DB.Customers where Customer.Age > 30 select Customer LINQ gets all customers from the database ("SELECT * FROM Customers"), moves them to the Customers array and then makes a search in that Array using .NET methods. This is very inefficient, what if there are hundreds of thousands of customers in the database? Making such big SELECT queries would kill the web application. Now after experiencing how actually fast LINQ to SQL is, I start to suspect that when doing that query I just wrote, LINQ somehow converts it to a SQL Query string SELECT * FROM Customers WHERE Age > 30 And only when necessary it will run the query. So my question is: am I right? And when is the query actually run? The reason why I'm asking is not only because I want to understand how it works in order to build good optimized applications, but because I came across the following problem. I have 2 tables, one of them is Books, the other has information on how many books were sold on certain days. My goal is to select books that had at least 50 sales/day in past 10 days. It's done with this simple query: from Book in DB.Books where (from Sale in DB.Sales where Sale.SalesAmount >= 50 && Sale.DateOfSale >= DateTime.Now.AddDays(-10) select Sale.BookID).Contains(Book.ID) select Book The point is, I have to use the checking part in several queries and I decided to create an array with IDs of all popular books: var popularBooksIDs = from Sale in DB.Sales where Sale.SalesAmount >= 50 && Sale.DateOfSale >= DateTime.Now.AddDays(-10) select Sale.BookID; BUT when I try to do the query now: from Book in DB.Books where popularBooksIDs.Contains(Book.ID) select Book It doesn't work! That's why I think that we can't use thins kinds of shortcuts in LINQ to SQL queries, like we can't use them in real SQL. We have to create straightforward queries, am I right?

    Read the article

  • Recursive N-way merge/diff algorithm for directory trees?

    - by BobMcGee
    What algorithms or Java libraries are available to do N-way, recursive diff/merge of directories? I need to be able to generate a list of folder trees that have many identical files, and have subdirectories with many similar files. I want to be able to use 2-way merge operations to quickly remove as much redundancy as possible. Goals: Find pairs of directories that have many similar files between them. Generate short list of directory pairs that can be synchronized with 2-way merge to eliminate duplicates Should operate recursively (there may be nested duplicates of higher-level directories) Run time and storage should be O(n log n) in numbers of directories and files Should be able to use an embedded DB or page to disk for processing more files than fit in memory (100,000+). Optional: generate an ancestry and change-set between folders Optional: sort the merge operations by how many duplicates they can elliminate I know how to use hashes to find duplicate files in roughly O(n) space, but I'm at a loss for how to go from this to finding partially overlapping sets between folders and their children. EDIT: some clarification The tricky part is the difference between "exact same" contents (otherwise hashing file hashes would work) and "similar" (which will not). Basically, I want to feed this algorithm at a set of directories and have it return a set of 2-way merge operations I can perform in order to reduce duplicates as much as possible with as few conflicts possible. It's effectively constructing an ancestry tree showing which folders are derived from each other. The end goal is to let me incorporate a bunch of different folders into one common tree. For example, I may have a folder holding programming projects, and then copy some of its contents to another computer to work on it. Then I might back up and intermediate version to flash drive. Except I may have 8 or 10 different versions, with slightly different organizational structures or folder names. I need to be able to merge them one step at a time, so I can chose how to incorporate changes at each step of the way. This is actually more or less what I intend to do with my utility (bring together a bunch of scattered backups from different points in time). I figure if I can do it right I may as well release it as a small open source util. I think the same tricks might be useful for comparing XML trees though.

    Read the article

  • How to convert a 32bpp image to an indexed format?

    - by Ed Swangren
    So here are the details (I am using C# BTW): I receive a 32bpp image (JPEG compressed) from a server. At some point, I would like to use the Palette property of a bitmap to color over-saturated pixels (brightness 240) red. To do so, I need to get the image into an indexed format. I have tried converting the image to a GIF, but I get quality loss. I have tried creating a new bitmap in an index format by these methods: // causes a "Parameter not valid" error Bitmap indexed = new Bitmap(orig.Width, orig.Height, PixelFormat.Indexed) // no error, but the resulting image is black due to information loss I assume Bitmap indexed = new Bitmap(orig.Width, orig.Height, PixelFormat.Format8bppIndexed) I am at a loss now. The data in this image is changed constantly by the user, so I don't want to manually set pixels that have a brightness 240 if I can avoid it. If I can set the palette once when the image is created, my work is done. If I am going about this the wrong way to begin with please let me know. EDIT: Thanks guys, here is some more detail on what I am attempting to accomplish. We are scanning a tissue slide at high resolution (pathology application). I write the interface to the actual scanner. We use a line-scan camera. To test the line rate of the camera, the user scans a very small portion and looks at the image. The image is displayed next to a track bar. When the user moves the track bar (adjusting line rate), I change the overall intensity of the image in an attempt to model what it would look like at the new line rate. I do this using an ImageAttributes and ColorMatrix object currently. When the user adjusts the track bar, I adjust the matrix. This does not give me per pixel information, but the performance is very nice. I could use LockBits and some unsafe code here, but I would rather not rewrite it if possible. When the new image is created, I would like for all pixels with a brightness value of 240 to be colored red. I was thinking that defining a palette for the bitmap up front would be a clean way of doing this.

    Read the article

  • stored as array understanding mongoid

    - by Gagan
    Hello frens, This is not a problem, but I just want to know stored as array of Mongoid better. I have following code in my model. class Company include Mongoid::Document include Mongoid::Timestamps references_many :people, :stored_as => :array, :inverse_of => :companies end class Person include Mongoid::Document include Sunspot::Mongoid references_many :companies, :stored_as => :array, :inverse_of => :people end Now in Company object we get person_ids as a result of stored_as array and company_ids in Person object. Now initially I inserted lots of person in company and the ids in person_ids fields is huge. Now I deleted most of person from company down to 8 people. Now I don't get why person_ids fields of Company object storing all the deleted ids of person. My console snapshot is follwing ruby-1.9.2-head Company.first.person_ids = [BSON::ObjectId('4d12d2907adf350695000025'), BSON::ObjectId('4d12d2907adf35069500002c'), BSON::ObjectId('4d12d2907adf350695000035'), BSON::ObjectId('4d12d2907adf35069500003f'), BSON::ObjectId('4d12d2907adf350695000048'), BSON::ObjectId('4d12d2907adf350695000052'), BSON::ObjectId('4d12d2907adf350695000059'), BSON::ObjectId('4d12d2907adf350695000062'), BSON::ObjectId('4d12d4017adf35069500008d'), BSON::ObjectId('4d12d4017adf350695000094'), BSON::ObjectId('4d12d4017adf35069500009d'), BSON::ObjectId('4d12d4017adf3506950000a7'), BSON::ObjectId('4d12d4017adf3506950000b0'), BSON::ObjectId('4d12d4017adf3506950000ba'), BSON::ObjectId('4d12d4017adf3506950000c1'), BSON::ObjectId('4d12d4017adf3506950000ca'), BSON::ObjectId('4d12d48a7adf3506950000f5'), BSON::ObjectId('4d12d48a7adf3506950000fc'), BSON::ObjectId('4d12d48a7adf350695000108'), BSON::ObjectId('4d12d48b7adf350695000115'), BSON::ObjectId('4d12d48b7adf350695000121'), BSON::ObjectId('4d12d48b7adf35069500012e'), BSON::ObjectId('4d12d48b7adf350695000135'), BSON::ObjectId('4d12d48b7adf350695000141'), BSON::ObjectId('4d12d53e7adf35069500016f'), BSON::ObjectId('4d12d53e7adf350695000176'), BSON::ObjectId('4d12d53e7adf350695000182'), BSON::ObjectId('4d12d53e7adf35069500018f'), BSON::ObjectId('4d12d53e7adf35069500019b'), BSON::ObjectId('4d12d53f7adf3506950001a8'), BSON::ObjectId('4d12d53f7adf3506950001af'), BSON::ObjectId('4d12d53f7adf3506950001bb'), BSON::ObjectId('4d12d8587adf3506950001e9'), BSON::ObjectId('4d12d8587adf3506950001f0'), BSON::ObjectId('4d12d8587adf3506950001ff'), BSON::ObjectId('4d12d8597adf35069500020f'), BSON::ObjectId('4d12d8597adf35069500021e'), BSON::ObjectId('4d12d8597adf35069500022e'), BSON::ObjectId('4d12d8597adf350695000235'), BSON::ObjectId('4d12d85a7adf350695000244'), BSON::ObjectId('4d12d9587adf35069500025b'), BSON::ObjectId('4d12db8b7adf35069500026a'), BSON::ObjectId('4d12de6f7adf3509c9000024'), BSON::ObjectId('4d12de6f7adf3509c900002b'), BSON::ObjectId('4d12de6f7adf3509c900003a'), BSON::ObjectId('4d12de707adf3509c900004a'), BSON::ObjectId('4d12de707adf3509c9000059'), BSON::ObjectId('4d12de707adf3509c9000069'), BSON::ObjectId('4d12de707adf3509c9000070'), BSON::ObjectId('4d12de717adf3509c900007f'), BSON::ObjectId('4d12e7f27adf350bd2000009'), BSON::ObjectId('4d12e81f7adf350bd2000015'), BSON::ObjectId('4d12e87f7adf350bd2000024'), BSON::ObjectId('4d12e8b87adf350bd200004c'), BSON::ObjectId('4d12e8b97adf350bd2000053'), BSON::ObjectId('4d12e8b97adf350bd200005c'), BSON::ObjectId('4d12e8b97adf350bd2000066'), BSON::ObjectId('4d12e8b97adf350bd200006f'), BSON::ObjectId('4d12e8b97adf350bd2000079'), BSON::ObjectId('4d12e8ba7adf350bd2000080'), BSON::ObjectId('4d12e8ba7adf350bd2000089'), BSON::ObjectId('4d12ee6b7adf350bd2000198'), BSON::ObjectId('4d12ee6b7adf350bd200019f'), BSON::ObjectId('4d12ee6c7adf350bd20001a5'), BSON::ObjectId('4d12ee6c7adf350bd20001ac'), BSON::ObjectId('4d12ee6c7adf350bd20001b2'), BSON::ObjectId('4d12ee6c7adf350bd20001b9'), BSON::ObjectId('4d12ee6c7adf350bd20001c0'), BSON::ObjectId('4d12ee6c7adf350bd20001c6'), BSON::ObjectId('4d141ca57adf35033e00006e'), BSON::ObjectId('4d141ca57adf35033e000075'), BSON::ObjectId('4d1420aa7adf350705000003'), BSON::ObjectId('4d1420aa7adf35070500000a'), BSON::ObjectId('4d1420f47adf350705000011'), BSON::ObjectId('4d1420f57adf350705000015'), BSON::ObjectId('4d1420f57adf350705000018'), BSON::ObjectId('4d1420f57adf35070500001c'), BSON::ObjectId('4d1420f57adf350705000023'), BSON::ObjectId('4d1420f57adf350705000026'), BSON::ObjectId('4d14215f7adf35070500004b'), BSON::ObjectId('4d14215f7adf350705000052'), BSON::ObjectId('4d14215f7adf350705000055'), BSON::ObjectId('4d14215f7adf350705000059'), BSON::ObjectId('4d14215f7adf35070500005c'), BSON::ObjectId('4d14215f7adf350705000060'), BSON::ObjectId('4d14215f7adf350705000067'), BSON::ObjectId('4d14215f7adf35070500006a')] Company.first.people.collect(&:id) = [BSON::ObjectId('4d14215f7adf35070500004b'), BSON::ObjectId('4d14215f7adf350705000052'), BSON::ObjectId('4d14215f7adf350705000055'), BSON::ObjectId('4d14215f7adf350705000059'), BSON::ObjectId('4d14215f7adf35070500005c'), BSON::ObjectId('4d14215f7adf350705000060'), BSON::ObjectId('4d14215f7adf350705000067'), BSON::ObjectId('4d14215f7adf35070500006a')] Isn't the Company.first.person_ids array be only storing the ids shown by Company.first.people.collect(&:id) It would be helpful if some one tell me when to best use stored_as = :array method. Do stored_as = :array increase querying performance? Thanks

    Read the article

  • Issue with RegConnectRegistry connecting to 64 bit machines

    - by RA
    I'm seeing a weird thing when connecting to the performance registry on 64 bit editions of Windows. The whole program stalls and callstacks becomes unreadable. After a long timeout, the connection attempts aborts and everything goes back to normal. The only solution is to make sure that only one thread at the time queries the remote registry, unless the remote machine is a 32 bit Windows XP, 2003, 2000 , then you can use as many threads as you like. Have anyone a technical explanation why this might be happening ? I've spent 2-3 days searching the web without coming up with anything. Here is a test program, run it first with one thread (connecting to a 64 bit Windows), then remove the comment in tmain and run it with 4 threads. Running it with one thread works as expected, running with 4, returns ERROR_BUSY (dwRet == 170) after stalling for a while. Remember to set a remote machine correctly in RegConnectRegistry before running the program. #define TOTALBYTES 8192 #define BYTEINCREMENT 4096 void PerfmonThread(void *pData) { DWORD BufferSize = TOTALBYTES; DWORD cbData; DWORD dwRet; PPERF_DATA_BLOCK PerfData = (PPERF_DATA_BLOCK) malloc( BufferSize ); cbData = BufferSize; printf("\nRetrieving the data..."); HKEY hKey; DWORD dwAccessRet = RegConnectRegistry(L"REMOTE_MACHINE",HKEY_PERFORMANCE_DATA,&hKey); dwRet = RegQueryValueEx( hKey,L"global",NULL,NULL,(LPBYTE) PerfData, &cbData ); while( dwRet == ERROR_MORE_DATA ) { // Get a buffer that is big enough. BufferSize += BYTEINCREMENT; PerfData = (PPERF_DATA_BLOCK) realloc( PerfData, BufferSize ); cbData = BufferSize; printf("."); dwRet = RegQueryValueEx( hKey,L"global",NULL,NULL,(LPBYTE) PerfData,&cbData ); } if( dwRet == ERROR_SUCCESS ) printf("\n\nFinal buffer size is %d\n", BufferSize); else printf("\nRegQueryValueEx failed (%d)\n", dwRet); RegCloseKey(hKey); } int _tmain(int argc, _TCHAR* argv[]) { _beginthread(PerfmonThread,0,NULL); /* _beginthread(PerfmonThread,0,NULL); _beginthread(PerfmonThread,0,NULL); _beginthread(PerfmonThread,0,NULL); */ while(1) { Sleep(2000); } }

    Read the article

< Previous Page | 794 795 796 797 798 799 800 801 802 803 804 805  | Next Page >